About 80% of the kernel is written in C.
Offline
Most of the kernel code is written in assembly language, IIRC. Thus, there may be no real performance gain over gcc.
Try compiling some data-crunching application written in C/C++, for example the LAME encoder.
Well, icc's gains over gcc (for LAME and MySQL, for example) are well documented and I am not contesting them. However,
andre.ramaciotti
asked
How does the kernel compiled by ICC compare with the GCC one?
and you responded that, although you did not use bootchart, you got the feeling that the icc kernel is slightly faster.
So, out of curiosity, I compiled an icc kernel (in spite of the huge discussion in the kernel/hardware subforum, this is a quick and easy task; in fact it took more time to read about optimizing the ICCCFLAGS and ICCCXXFLAGS flags than to do the rest of the process) and ran bootchart.
Because I have an Intel CPU, not an AMD one, I was not concerned about the various gotchas related to CPU architecture.
So in response to Andre's question:
Boot time is about the same (a 1-second difference is nothing), but disk responsiveness was lower with the icc kernel than with the gcc kernel. I don't know the reason, nor am I interested in investigating it.
In conclusion: for my humble expectations, I have not (yet) seen anything to convince me that an icc kernel makes any sense. I do have some reservations, though: I don't know which flags are best for the kernel, so maybe there is a better way to compile an icc kernel than the one I used. However, I have found that optimization flags have little or no effect on speed with gcc either.
Offline
Exactly. I said that I got the feeling, and nothing else. YMMV.
I'd like to point out as well that the CFLAGS used when compiling the kernel are often very different from the CFLAGS used when compiling anything else. The kernel is not very tolerant of "hard" CFLAGS, so stuff like -O3 and similar may actually result in a loss of performance, and will result in a much larger kernel image. Personally, I like to leave the CFLAGS for the kernel at the compiler's defaults, and have only been experimenting with -xSSE2 (since I have an SSE2 CPU) and with -ip. Again, YMMV.
Only the best is good enough.
Offline
Yes, I understood that this was only a subjective assessment. Since someone then asked about bootchart, and I was also curious about the icc kernel, I tested it. I was not trying to "correct" your feeling, just adding my experience.
Flags I used:
export CFLAGS=${ICCCFLAGS}
export CXXFLAGS=${ICCCXXFLAGS}
ICCCFLAGS="-xSSE2 -ip -parallel -gcc"
ICCCXXFLAGS="-fstack-security-check -fstack-protector -fvisibility=hidden -fvisibility-inlines-hidden"
I first had to set:
export LANG=C
Offline
No problem, I just wanted to make things clear.
Anyway, the CFLAGS and CXXFLAGS I used while compiling my kernel were just
-O2 -xSSE2 -ip
OTOH, when compiling other applications I like to experiment. Personally, I don't think you really need that -gcc flag. It's only there for compatibility, and in my experience compatibility is often inversely proportional to performance, so you might want to remove it and try again.
Also, you may want to try just the "-fast" flag. I don't like it personally, since it incorporates -O3, which is a gamble that depends on the application and (surprisingly enough) your CPU architecture.
(edit: typos)
Last edited by Wintershade (2009-03-05 00:41:45)
Only the best is good enough.
Offline
I have never seen flags that really affect kernel performance. I added these just for fun.
-fast requires SSE3, so if your machine does not support SSE3, -fast will not work anyway.
I have read the icc man page, so I know what these flags mean. I used -O3 and have not seen any impact on performance.
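Since flags like -xSSE2 or -fast only make sense if the CPU actually has the matching instruction set, it can help to check first. Below is a minimal sketch, assuming a Linux system where /proc/cpuinfo has a "flags" line; the function name cpu_has_flag is made up for illustration. Note that Linux lists SSE3 under the name "pni".

```shell
#!/bin/sh
# Report whether a CPU feature flag is listed in a cpuinfo-style file.
# Usage: cpu_has_flag FLAG [FILE]   (FILE defaults to /proc/cpuinfo)
# cpu_has_flag is a hypothetical helper name, not part of any tool.
cpu_has_flag() {
    flag=$1
    file=${2:-/proc/cpuinfo}
    # The "flags" line is a space-separated list; -w matches whole words
    # only, so "sse" does not match inside "sse2".
    grep -m1 '^flags' "$file" 2>/dev/null | grep -qw -- "$flag"
}

# Linux reports SSE3 as "pni" and SSSE3 as "ssse3".
for f in sse2 pni ssse3; do
    if cpu_has_flag "$f"; then
        echo "$f: supported"
    else
        echo "$f: not supported"
    fi
done
```

If the flag you plan to target is reported as not supported, the resulting binary may refuse to run or crash on that machine.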
Last edited by broch (2009-03-05 01:22:43)
Offline
I've seen an impact on the file size with -O3, and I remember some apps, like Firefox 2 (back when I was using Gentoo), actually became less responsive with -O3. That was a long time ago, though.
I think -O2 would have better performance with the kernel, compared to -Os; I am mentioning this one since many distros have -Os by default.
Only the best is good enough.
Offline
I have successfully compiled my Zen-Kernel 2.6.29-rc7-zen2 with ICC 10.1.018 on my Dell 8600 (2 GHz, 60 GB 7200 RPM disk) with these flags:
-O2 -ip -wpo-ipo -scalar-rep -complex-limited-range -opt-multi-version-aggressive -static -mno-ieee-fp -IPF_fp_relaxed -rcd -ftz -fp-model fast=1 -fp-port -pragma-optimization-level=GCC -ffunction-sections -auto_ilp32 -inline-calloc -restrict -ansi-alias -alias-args -alias-const -prec-sqrt -pc64 -no-prec-div -msse2 -funroll-loops -unroll-aggressive -unroll -vec-guard-write -fno-builtin -mcpu=pentium4 -mtune=pentium4 -march=i686
I didn't run bootchart or any other serious benchmark (except Google V8), BUT the kernel is running faster. Booting also seems faster to me, and apps open faster. My vmlinux has become bigger, though.
I have compiled Icecat 3.0.6 with ICC, too. It is now faster and more stable.
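To put a number on "vmlinux has become bigger", the two images can simply be compared by size. This is only a sketch; size_growth is a made-up helper name, and the file names in the usage comment are placeholders for wherever you keep the gcc-built and icc-built images.

```shell
#!/bin/sh
# Print how much larger (in whole percent) the second file is than the first.
# Usage: size_growth OLD_FILE NEW_FILE
# size_growth is a hypothetical helper; OLD_FILE must not be empty.
size_growth() {
    old=$(wc -c < "$1")
    new=$(wc -c < "$2")
    # Integer arithmetic: the result is truncated toward zero.
    echo $(( (new - old) * 100 / old ))
}

# Hypothetical usage (placeholder file names):
# size_growth vmlinux.gcc vmlinux.icc
```

A negative result means the second image is smaller than the first.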
Last edited by mothersh1p (2009-04-05 11:21:34)
Offline
Thanks, but it's kind of hard to just take your word for it. A bootchart would be nice.
Offline
Has anyone done any benchmarks? I'm downloading the compiler now (2 hours left), and will attempt to benchmark lame, for example.
What else should I benchmark?
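For anyone who wants numbers instead of feelings, a rough timing harness is enough to compare a gcc build and an icc build of the same program. This is a minimal sketch, assuming GNU date (for the %N nanosecond format); bench is a made-up function name, and the lame binaries and sample.wav in the usage comment are placeholders.

```shell
#!/bin/sh
# Run a command several times and print the mean wall-clock time in
# milliseconds. Assumes GNU date, which supports %N (nanoseconds).
# Usage: bench RUNS COMMAND [ARGS...]
# bench is a hypothetical helper name.
bench() {
    runs=$1
    shift
    total=0
    i=0
    while [ "$i" -lt "$runs" ]; do
        start=$(date +%s%N)
        "$@" >/dev/null 2>&1          # discard the command's own output
        end=$(date +%s%N)
        total=$((total + end - start))
        i=$((i + 1))
    done
    # Convert the summed nanoseconds to a mean in milliseconds.
    echo $((total / runs / 1000000))
}

# Hypothetical usage (placeholder binaries and input file):
# bench 5 ./lame-gcc --preset standard sample.wav /dev/null
# bench 5 ./lame-icc --preset standard sample.wav /dev/null
```

Run each variant several times on an otherwise idle machine; a difference of a few percent between single runs is usually just noise.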
Offline
Just a note for those kernel compilers in here, the kernel responds well to being compiled for size (-Os in GCC)... there's even a config option to make sure this happens. It lowers kernel size, theoretically may speed boot times, and reduces kernel RAM consumption.
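To see which of the two a given config selects, you can look for the option in .config. A minimal sketch, assuming the 2.6-era option name CONFIG_CC_OPTIMIZE_FOR_SIZE and the kernel's default of -O2 when it is not set; kernel_opt_level is a made-up helper name.

```shell
#!/bin/sh
# Print the optimization level a kernel .config selects.
# Assumes the 2.6-era option CONFIG_CC_OPTIMIZE_FOR_SIZE: when it is set
# the build uses -Os, otherwise -O2.
# Usage: kernel_opt_level [CONFIG_FILE]   (defaults to ./.config)
# kernel_opt_level is a hypothetical helper name.
kernel_opt_level() {
    config=${1:-.config}
    if grep -q '^CONFIG_CC_OPTIMIZE_FOR_SIZE=y' "$config"; then
        echo "-Os"
    else
        echo "-O2"
    fi
}

# Hypothetical usage from the top of a kernel tree:
# kernel_opt_level .config
```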
Offline
But doesn't it lower the overall speed of the kernel?
Offline
Compiler flags do not improve kernel speed. Check out the Gentoo forums for more detailed benchmarks. Improperly set flags may in fact degrade speed/stability.
Otherwise try it yourself, who knows...
Offline
compiler flags do not improve kernel speed
That's a complete impossibility. I would very much agree that it's highly doubtful you can get a noticeable performance increase (or even statistically significant and consistent increases) with anything over a simply-compiled kernel (-Os or -O2), but compiler flags always matter, they have to.
@Fackamoto: Firstly, -Os helps the speed of many apps, and it's one of the reasons -O3 can hurt. -Os optimizes for size, which not only means a smaller binary (theoretically less load time for the kernel, and possibly significant for other apps), but can also translate to less CPU cache and RAM used. This can make a fair difference with the kernel; you want it taking up as little RAM and especially CPU cache as possible. Firefox, for example, benefits well from -Os. It's up in the air, really... likely there'll never be a noticeable difference. I just mentioned it because it seemed there were lots of people willing to go crazy compiling their kernel.
Offline
Hm, I'd hate to post this up, but I am miserably failing at compiling the Vanilla kernel with my .config.
rm -f include/config/kernel.release
echo 2.6.29-APRZ > include/config/kernel.release
set -e; : ' CHK include/linux/version.h'; mkdir -p include/linux/; (echo \#define LINUX_VERSION_CODE 132637; echo '#define KERNEL_VERSION(a,b,c) (((a) << 16) + ((b) << 8) + (c))';) < /usr/src/linux-2.6.29/Makefile > include/linux/version.h.tmp; if [ -r include/linux/version.h ] && cmp -s include/linux/version.h include/linux/version.h.tmp; then rm -f include/linux/version.h.tmp; else : ' UPD include/linux/version.h'; mv -f include/linux/version.h.tmp include/linux/version.h; fi
set -e; : ' CHK include/linux/utsrelease.h'; mkdir -p include/linux/; if [ `echo -n "2.6.29-APRZ" | wc -c ` -gt 64 ]; then echo '"2.6.29-APRZ" exceeds 64 characters' >&2; exit 1; fi; (echo \#define UTS_RELEASE \"2.6.29-APRZ\";) < include/config/kernel.release > include/linux/utsrelease.h.tmp; if [ -r include/linux/utsrelease.h ] && cmp -s include/linux/utsrelease.h include/linux/utsrelease.h.tmp; then rm -f include/linux/utsrelease.h.tmp; else : ' UPD include/linux/utsrelease.h'; mv -f include/linux/utsrelease.h.tmp include/linux/utsrelease.h; fi
set -e; if [ -L include/asm ]; then asmlink=`readlink include/asm | cut -d '-' -f 2`; if [ "$asmlink" != "x86" ]; then echo "ERROR: the symlink include/asm points to asm-$asmlink but asm-x86 was expected"; echo " set ARCH or save .config and run 'make mrproper' to fix it"; exit 1; fi; test -e $asmlink || rm include/asm; elif [ -d include/asm ]; then echo "ERROR: include/asm is a directory but a symlink was expected"; exit 1; fi
if [ ! -L include/asm ]; then : ' SYMLINK include/asm -> include/asm-x86'; if [ ! -d include/asm-x86 ]; then mkdir -p include/asm-x86; fi; ln -fsn asm-x86 include/asm; fi
mkdir -p .tmp_versions ; rm -f .tmp_versions/*
make -f scripts/Makefile.build obj=scripts/basic
(cat /dev/null; ) > scripts/basic/modules.order
make -f scripts/Makefile.build obj=.
(cat /dev/null; ) > modules.order
mkdir -p kernel/
icc -Wp,-MD,kernel/.bounds.s.d -nostdinc -isystem /usr/lib/gcc/i686-pc-linux-gnu/4.3.3/include -Iinclude -I/usr/src/linux-2.6.29/arch/x86/include -include include/linux/autoconf.h -D__KERNEL__ -Wall -Wundef -Wstrict-prototypes -Wno-trigraphs -fno-strict-aliasing -fno-common -Werror-implicit-function-declaration -O2 -m32 -msoft-float -mregparm=3 -freg-struct-return -march=i686 -ffreestanding -DCONFIG_AS_CFI=1 -DCONFIG_AS_CFI_SIGNAL_FRAME=1 -pipe -Wno-sign-compare -fno-asynchronous-unwind-tables -Iarch/x86/include/asm/mach-default -fomit-frame-pointer -D"KBUILD_STR(s)=#s" -D"KBUILD_BASENAME=KBUILD_STR(bounds)" -D"KBUILD_MODNAME=KBUILD_STR(bounds)" -fverbose-asm -S -o kernel/bounds.s kernel/bounds.c
icc: command line warning #10156: ignoring option '-W'; no argument required
icc: command line warning #10156: ignoring option '-W'; no argument required
icc: command line error: invalid argument for option '-m'
make[1]: *** [kernel/bounds.s] Error 1
make: *** [prepare0] Error 2
Weee.... Fun fun fun!
Last edited by Aprz (2009-04-23 10:52:03)
Offline
You need to change a lot of stuff to build Linux with icc. http://www.google.com/search?q=intel+co … ile+kernel
Offline
Fun. I like it when people Google/wiki for me!
Last edited by Aprz (2009-04-23 10:55:15)
Offline
There's probably something wrong with this
make[1]: *** [kernel/bounds.s] Error 1
See where the bounds.s is in the config (via the menuconfig) and try to change the settings regarding it. Disable it if you can.
See if it compiles after that.
Only the best is good enough.
Offline
Nah, this wouldn't have to do with bounds.s, but with the -m flag being passed: probably -m32 is not compatible with icc, which causes it to die. My .config is completely monolithic and narrowed down to what my hardware is and what features it supports, so enabling something would be a slowdown and disabling something would mean a loss of features. I actually edited arch/x86/Makefile to get farther than that, but I bumped into a new error, haha, probably from doing something totally wrong. I am really just blindly experimenting now. This is as far as I've gotten so far.
CHK include/linux/version.h
CHK include/linux/utsrelease.h
SYMLINK include/asm -> include/asm-x86
CALL scripts/checksyscalls.sh
CHK include/linux/compile.h
CC arch/x86/kernel/acpi/realmode/video-vga.o
icc: command line warning #10156: ignoring option '-W'; no argument required
icc: command line warning #10121: overriding '-mtunepentium4' with '-mtunegeneric'
icc: command line warning #10156: ignoring option '-W'; no argument required
icc: command line warning #10006: ignoring unknown option '-fno-asynchronous-unwind-tables'
icc: command line warning #10156: ignoring option '-W'; no argument required
icc: command line warning #10156: ignoring option '-W'; no argument required
icc: command line warning #10156: ignoring option '-W'; no argument required
icc: command line warning #10006: ignoring unknown option '-fwrapv'
icc: command line warning #10006: ignoring unknown option '-fno-dwarf2-cfi-asm'
icc: command line warning #10006: ignoring unknown option '-fno-toplevel-reorder'
arch/x86/kernel/acpi/realmode/../../../boot/boot.h(176): remark #593: parameter "s1" was set but never used
static inline int memcmp(const void *s1, const void *s2, size_t len)
^
arch/x86/kernel/acpi/realmode/../../../boot/boot.h(176): remark #593: parameter "s2" was set but never used
static inline int memcmp(const void *s1, const void *s2, size_t len)
^
arch/x86/kernel/acpi/realmode/../../../boot/boot.h(176): remark #593: parameter "len" was set but never used
static inline int memcmp(const void *s1, const void *s2, size_t len)
^
arch/x86/kernel/acpi/realmode/../../../boot/boot.h(184): remark #593: parameter "s1" was set but never used
static inline int memcmp_fs(const void *s1, addr_t s2, size_t len)
^
arch/x86/kernel/acpi/realmode/../../../boot/boot.h(184): remark #593: parameter "s2" was set but never used
static inline int memcmp_fs(const void *s1, addr_t s2, size_t len)
^
arch/x86/kernel/acpi/realmode/../../../boot/boot.h(184): remark #593: parameter "len" was set but never used
static inline int memcmp_fs(const void *s1, addr_t s2, size_t len)
^
arch/x86/kernel/acpi/realmode/../../../boot/boot.h(191): remark #593: parameter "s1" was set but never used
static inline int memcmp_gs(const void *s1, addr_t s2, size_t len)
^
arch/x86/kernel/acpi/realmode/../../../boot/boot.h(191): remark #593: parameter "s2" was set but never used
static inline int memcmp_gs(const void *s1, addr_t s2, size_t len)
^
arch/x86/kernel/acpi/realmode/../../../boot/boot.h(191): remark #593: parameter "len" was set but never used
static inline int memcmp_gs(const void *s1, addr_t s2, size_t len)
^
arch/x86/kernel/acpi/realmode/../../../boot/video.h(133): remark #2259: non-pointer conversion from "int" to "u16={unsigned short}" may lose significant bits
return inb(port+1);
^
arch/x86/kernel/acpi/realmode/../../../boot/video.h(138): remark #2259: non-pointer conversion from "int" to "u16={unsigned short}" may lose significant bits
outw(index+(v << 8), port);
^
arch/x86/kernel/acpi/realmode/../../../boot/video.h(144): remark #2259: non-pointer conversion from "u16={unsigned short}" to "u8={unsigned char}" may lose significant bits
out_idx(port, index, v);
^
arch/x86/kernel/acpi/realmode/../../../boot/video-vga.c(129): remark #2259: non-pointer conversion from "int" to "u16={unsigned short}" may lose significant bits
return (inb(0x3cc) & 1) ? 0x3d4 : 0x3b4;
^
arch/x86/kernel/acpi/realmode/../../../boot/video-vga.c(143): remark #2259: non-pointer conversion from "int" to "u8={unsigned char}" may lose significant bits
out_idx(end, crtc, 0x12); /* Vertical display end */
^
(0): internal error: backend signals
compilation aborted for arch/x86/kernel/acpi/realmode/video-vga.c (code 4)
make[3]: *** [arch/x86/kernel/acpi/realmode/video-vga.o] Error 4
make[2]: *** [arch/x86/kernel/acpi/realmode/wakeup.bin] Error 2
make[1]: *** [arch/x86/kernel/acpi] Error 2
make: *** [arch/x86/kernel] Error 2
I am betting that even if I do get this to compile, the kernel won't work. I don't know how all these other folks simply compiled the kernel. I wish I knew what their trick was, if they truly were successful, which I do have my doubts about.
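One way a compiler wrapper could quiet some of this noise is to drop the GCC-only options before they reach icc. The sketch below only filters the four options that the log above reports as "unknown option"; it does not touch -m32 or the internal backend error, and the function name filter_flags is made up for illustration. A real wrapper would also need to preserve arguments containing spaces when re-invoking the compiler.

```shell
#!/bin/sh
# Print the given compiler arguments one per line, leaving out the
# GCC-only options that icc reported as unknown in the build log.
# Usage: filter_flags ARG...
# filter_flags is a hypothetical helper, not an existing wrapper.
filter_flags() {
    for arg in "$@"; do
        case $arg in
            -fno-asynchronous-unwind-tables | -fwrapv) ;;   # dropped
            -fno-dwarf2-cfi-asm | -fno-toplevel-reorder) ;; # dropped
            *) printf '%s\n' "$arg" ;;                      # kept
        esac
    done
}

# Example: the two GCC-only options are removed, everything else is kept.
filter_flags -O2 -fwrapv -fno-toplevel-reorder -c video-vga.c
```

Since icc already ignores these options with a warning, this only cleans up the output; it is not expected to fix the hard errors.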
Hey, this doesn't seem like an Arch topic? Shouldn't a mod move it to GNU/Linux or Try This or something? Now I don't really want to post in this post since it feels out of place, haha, I just found it using the search button. :s
Edit: Although I feel like I am wasting my time doing it this way. I think I'll try to be more professional about it starting now. XD
Last edited by Aprz (2009-04-23 11:53:43)
Offline
You need to change a lot of stuff to build Linux with icc. http://www.google.com/search?q=intel+co … ile+kernel
Actually, not anymore. All you have to do is tell the Makefile that the CC is icc and CXX is icpc. And the flags, of course.
ICC 11 has come a long way since the dreaded version 9, and there is more and more GCC compatibility with every new version. However, there are still things that cannot be compiled with it and probably never will be (not because of compatibility, but because of the nature of the issue itself), such as glibc or the xorg server. The kernel, however, is on the list of applications which compile nicely.
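The invocation described above would look roughly like the sketch below. It defaults to a dry run that only prints the command, since whether CC=icc alone is enough depends on the kernel tree and the icc version (other posts in this thread report failures on a vanilla 2.6.29 tree). The function name build_kernel_with_icc and the DRY_RUN variable are made up for illustration.

```shell
#!/bin/sh
# Build (or just print) a kernel make command that selects the Intel
# compilers. Extra arguments are passed through to make unchanged.
# Usage: [DRY_RUN=0] build_kernel_with_icc [MAKE_ARGS...]
# build_kernel_with_icc and DRY_RUN are hypothetical names.
build_kernel_with_icc() {
    set -- make CC=icc CXX=icpc "$@"
    if [ "${DRY_RUN:-1}" = 1 ]; then
        echo "$*"        # dry run: show what would be executed
    else
        "$@"             # real run: must be called inside a kernel tree
    fi
}

# Prints the command without running it.
build_kernel_with_icc -j2
```

Set DRY_RUN=0 inside a configured kernel tree to actually start the build.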
I think I'll try to be more professional about it starting now. XD
That would be a good idea. However, there are a few opinions of yours which aren't completely right.
1) enabling things in your kernel doesn't necessarily mean loss of performance.
2) a monolithic kernel also doesn't necessarily increase performance, and has a number of disadvantages compared to modular systems. modules are there for a reason (including the actual loss of performance, which is just as likely to occur as the gain).
I would suggest you try the Arch kernel with its original config. I have successfully compiled it on my 32-bit processor. Once you do that, you can start carefully tuning the settings. This has an advantage over "config-from-scratch" in that the first compile will actually go through, and further compiles will go much faster (unless you make clean or distclean).
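One way to start from the config of the kernel you are currently running is to copy it out of /proc/config.gz. This is a minimal sketch, assuming the running kernel exposes its config there (that needs CONFIG_IKCONFIG_PROC, which is an assumption about your kernel); seed_config is a made-up helper name.

```shell
#!/bin/sh
# Decompress a gzipped kernel config into a .config file.
# Usage: seed_config [SOURCE_GZ] [DEST]
# Defaults: /proc/config.gz and ./.config
# seed_config is a hypothetical helper name.
seed_config() {
    src=${1:-/proc/config.gz}
    dest=${2:-.config}
    if [ ! -r "$src" ]; then
        echo "cannot read $src" >&2
        return 1
    fi
    gzip -dc "$src" > "$dest"
}

# Hypothetical usage from the top of a kernel tree:
# seed_config && make oldconfig
```

Running make oldconfig afterwards asks only about options that are new relative to the copied config.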
Once you've compiled this with ICC and made it work, you'll probably want to make the fine tuning you desire. As a first step I'd recommend removing the drivers you don't have the hardware for, and the features of the kernel which are "experimental", "deprecated", "obsolete", and so on, unless you're absolutely positive that you cannot live without them.
P.S. if you're really into performance, you'll find that there are patchsets which will make the kernel work - at least in some aspects - a bit faster than vanilla.
(reason for editing: typos)
Last edited by Wintershade (2009-04-23 12:38:54)
Only the best is good enough.
Offline
Well, I feel like a noob, and I am about to contradict that very soon after mentioning the noob thing. I can't even compile a dummy.c with icc, so I must have installed icc incorrectly, hence it not working.
I know that it appears that I am a noob, but believe me when I say that I am fully aware of the advantages and disadvantages of a monolithic approach for a kernel. When I said that my kernel was completely monolithic and narrowed down to what my hardware supports, I was totally serious. Without the -Os flag, which is actually a slowdown (it prefers size over speed, and I have confirmed that my kernel is slower: it takes 2s longer to load and is less responsive in a few tests), my kernel is ~1.9M, almost the size of Arch's stock kernel. To literally disable something in my kernel means I lose support for something, and to literally enable something would mean that I am only increasing my kernel size for something my hardware does not support.
Last edited by Aprz (2009-04-24 08:28:05)
Offline
Currently, closed source drivers will not install out of the box, though the developers believed this issue should be easily fixable.
Boooo
NVIDIA strike again.
Offline
No way... I just cannot believe that people are compiling their kernels with icc that easily... I downloaded icc 11.0.083, I tested it and it works on things I write in C, but when I try it on the Linux kernel, it doesn't get past my previous errors using just make CC=icc CXX=icpc. Are you sure that's correct? Maybe it has to do with me using 32-bit while you might be using 64-bit or something? I cannot get this to compile at all.... <_<
Edit: Ah, found something. I think I might be able to do it now.
Edit: It's compiling with HOSTCC=icc AR=xiar LD=xild. I can't get it to compile with CC=icc, even using an online wrapper from the dna-linux site that I originally tried. That will probably have to be totally hacked up by me or something. I'll look into it more after this compiles. Of course, I'll post a bootchart too.
Edit: Nope.
Edit: Damn, I shouldn't stay up for two days... but it now does compile with the CC= option. This shall be in.... never mind. It just died on me as I was typing this. Stopped at arch/x86/kernel/acpi... Not very far.
Edit: Okay! Finally downloaded the LinuxDNA kernel, haha! I'll let you guys know if I am successful at all. It is certainly getting a lot farther than the vanilla kernel, so this is good.
Edit: Damn, I actually got it to compile with that LinuxDNA thing, no problem, by following their instructions. I am now compiling it with gcc and then going to swap in the icc vmlinux so that it can be compressed, and then try to boot that one.
Last edited by Aprz (2009-04-24 18:48:37)
Offline
What else should I benchmark?
XviD, oggenc (I believe this is in vorbis-tools), x264 (I'd really like to see a benchmark for that one), MySQL, perhaps mencoder (not its frontends, but the backbone), p7zip, lzma, gzip... I think you get the idea. Utilities that process a lot of data.
Perhaps libgl and mesa (together with the drivers for your graphics card); this is a long shot, but you may get a frame or two per second more in glxgears. The Xorg server won't compile, though, IIRC.
Only the best is good enough.
Offline