#1 2005-04-17 15:59:01

Riklaunim
Member
Registered: 2005-04-09
Posts: 106

GCC Flags & ABS - speed test

- Tested app: Dillo 0.8.4 (413.2 KB tar.bz2). Language: C
- Computer: Pentium III 846 MHz, 256 KB cache, kernel 2.6.11, GCC 3.4.3, march=i686 (Arch Linux)

Size in KB of the Arch packages generated by makepkg:
-O1 -march=pentium3 -s -mfpmath=sse -fomit-frame-pointer -pipe   189.4
-O2 -march=pentium3 -s -mfpmath=sse -fomit-frame-pointer -pipe   211.3
-O3 -march=pentium3 -s -mfpmath=sse -fomit-frame-pointer -pipe   241.7
-Os -march=pentium3 -s -mfpmath=sse -fomit-frame-pointer -pipe   183.6
-Os -march=pentium3 -s -fomit-frame-pointer -pipe   183.6
-Os -march=pentium3 -s -mfpmath=sse -pipe   185.0
-Os -march=pentium3 -mfpmath=sse -fomit-frame-pointer -pipe   189.4
-Os -march=i686 -s -mfpmath=sse -fomit-frame-pointer -pipe   183.6
-Os -march=i586 -s -fomit-frame-pointer -pipe   183.0
-Os -march=i386 -s -fomit-frame-pointer -pipe   182.4
-Os -mtune=i686 -s -fomit-frame-pointer -pipe   183.5
-Os -mtune=i586 -s -fomit-frame-pointer -pipe   183.0
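
To put the size table in relative terms, here is a minimal sketch (not part of the original test) that computes how much larger one package is than a baseline. The helper name pct_larger is made up for illustration. The sizes are the ones from the table, written with a decimal point.

```sh
# pct_larger BASE SIZE: print how much larger SIZE is than BASE, in percent.
# Hypothetical helper, not part of makepkg or of the original test.
pct_larger() {
    awk -v base="$1" -v size="$2" 'BEGIN { printf "%.1f\n", (size - base) / base * 100 }'
}

# Sizes in KB taken from the table: -Os = 183.6, -O2 = 211.3, -O3 = 241.7
echo "-O2 vs -Os: $(pct_larger 183.6 211.3)%"
echo "-O3 vs -Os: $(pct_larger 183.6 241.7)%"
```

With these numbers, -O2 comes out about 15% larger than -Os, and -O3 about 32% larger.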

Now we have to check the execution time of some of those pkgs:
(foo.sh):

#!/bin/sh
# Print the start time (seconds, then nanoseconds on the next line), then launch Dillo.
echo "Start: "
date +%s%n%N
dillo http://localhost/html/a.php

Where http://localhost/html/a.php is a page to open. The page contains:

<?php
echo microtime();
?>

- Launch Dillo with ./foo.sh
Time reported by the PHP page minus the time printed on the console is roughly the execution time.
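
The subtraction can be scripted. A minimal sketch, assuming the console printed seconds and nanoseconds on two separate lines (which is what date +%s%n%N does) and that the PHP page printed the default microtime() string of the form "fraction seconds". The helper name elapsed_ms is hypothetical, and the values in the usage line below are made up.

```sh
# elapsed_ms START_SEC START_NSEC "PHP_MICROTIME"
#   START_SEC, START_NSEC: the two lines printed by `date +%s%n%N`
#   PHP_MICROTIME: the string printed by microtime(), e.g. "0.75000000 1113746341"
# Prints the difference in whole milliseconds. Hypothetical helper.
elapsed_ms() {
    awk -v s="$1" -v ns="$2" -v mt="$3" 'BEGIN {
        split(mt, p, " ")   # p[1] = fractional part, p[2] = whole seconds
        # Subtract the seconds and the sub-second parts separately,
        # so that precision is not lost on the large epoch values.
        ms = (p[2] - s) * 1000 + (p[1] * 1e9 - ns) / 1e6
        printf "%.0f\n", ms
    }'
}
```

For example, elapsed_ms 1113746341 250000000 "0.75000000 1113746341" prints 500.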

Results (graphs)
http://www.linuks.rk.edu.pl/binary.php?id=74
http://www.linuks.rk.edu.pl/binary.php?id=75
http://www.linuks.rk.edu.pl/binary.php?id=76
"Baza" (Polish for "base"): -Os -march=pentium3 -s -mfpmath=sse -fomit-frame-pointer -pipe
"Bez" (Polish for "without"): the base set without the flag given as *

NOTES:
- -mfpmath=sse is a big plus for Pentium III and newer processors (those with the sse CPU flag), especially if the application does a lot of mathematical operations (Dillo isn't a calculator, but the effect is visible)
- -fomit-frame-pointer also affects execution time
- if march or mtune is set to your processor family, GCC generates code very similar to what march=my_cpu produces. Arch's i686 is fine for a P3; SuSE's i586 less so.
- -Os is the best size/speed optimisation flag, unless the files are very big and contain a lot of code (then check -O2)

#2 2005-04-17 20:34:11

iBertus
Member
From: Greenville, NC
Registered: 2004-11-04
Posts: 2,228

Re: GCC Flags & ABS - speed test

Does using the -mfpmath=sse flag cause the code to run correctly only on CPUs that have SSE support? If not, then why isn't it used by default?

#3 2005-04-17 20:55:53

phrakture
Arch Overlord
From: behind you
Registered: 2003-10-29
Posts: 7,879

Re: GCC Flags & ABS - speed test

sse only works if your proc supports sse - some i686s don't so this can't be done for all of arch
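
For anyone who wants to check their own box: a minimal sketch, assuming a Linux /proc/cpuinfo that has a "flags" line. The function name has_cpu_flag is made up here, and the optional second argument only exists so the check can be pointed at a saved copy of the file.

```sh
# has_cpu_flag FLAG [CPUINFO_FILE]
# Succeeds if FLAG appears as a whole word on a "flags" line, so that
# "sse" is not reported just because "sse2" is present. Hypothetical helper.
has_cpu_flag() {
    awk -v want="$1" '
        /^flags[ \t]*:/ {
            for (i = 1; i <= NF; i++) if ($i == want) found = 1
        }
        END { exit (found ? 0 : 1) }
    ' "${2:-/proc/cpuinfo}"
}

# Example:
#   has_cpu_flag sse && echo "this CPU reports SSE"
```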

this sounds like a gentoo post to me... a gain of 0.05 seconds and 2KB of space isn't warranted in my book

#4 2005-04-17 21:08:24

i3839
Member
Registered: 2004-02-04
Posts: 1,185

Re: GCC Flags & ABS - speed test

SSE is already used when -march=pentium3 is set.
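
One way to check what a given set of flags actually turns on is to dump GCC's predefined macros with -dM -E. As far as I know, __SSE__ means the SSE instruction set is enabled, while __SSE_MATH__ is only defined when SSE is also used for scalar floating-point math (that is, when -mfpmath=sse is in effect). A minimal sketch, where the helper name defines_macro is hypothetical:

```sh
# defines_macro NAME: read `gcc -dM -E` output on stdin and succeed if NAME
# is defined. Compares the whole macro name, so __SSE__ does not match __SSE2__.
# Hypothetical helper.
defines_macro() {
    awk -v name="$1" '
        $1 == "#define" && $2 == name { found = 1 }
        END { exit (found ? 0 : 1) }
    '
}

# Example (needs gcc; on an x86-64 host add -m32 to target 32-bit x86):
#   gcc -march=pentium3 -dM -E - </dev/null | defines_macro __SSE__ && echo "SSE enabled"
#   gcc -march=pentium3 -mfpmath=sse -dM -E - </dev/null | defines_macro __SSE_MATH__ && echo "SSE math"
```

Comparing the two macros for the same -march, with and without -mfpmath=sse, should show whether the extra flag changes anything on your compiler.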

Anyway, it's not really a Gentoo post, if you ignore the notes. Why? Because there are benchmark results, telling us nicely how much sense some settings make in this case. It would be one if he were demanding binaries optimized for specific CPU models or other nonsense, but he isn't.

#5 2005-04-17 21:27:43

cactus
Taco Eater
From: t͈̫̹ͨa͖͕͎̱͈ͨ͆ć̥̖̝o̫̫̼s͈̭̱̞͍̃!̰
Registered: 2004-05-25
Posts: 4,622

Re: GCC Flags & ABS - speed test

fetching performance results for binary builds by using http requests from a php instance is silly. Not to mention that rendering still occurs after php gets its microtime value.
Also, system load at the time, as well as if the tests were run consecutively, or after a clean boot, would all be determining factors.

I agree with phrak. Further, I could understand doing a web benchmark against different http servers, or benchmarking execution time of a statically compiled binary using different compiler optimizations, but using a web server to test compiler optimizations seems...very strange.

Also, anything above -O2 is silly most of the time.
*shrug*

oh. did I say silly in this post yet? just...silly..


"Be conservative in what you send; be liberal in what you accept." -- Postel's Law
"tacos" -- Cactus' Law
"t̥͍͎̪̪͗a̴̻̩͈͚ͨc̠o̩̙͈ͫͅs͙͎̙͊ ͔͇̫̜t͎̳̀a̜̞̗ͩc̗͍͚o̲̯̿s̖̣̤̙͌ ̖̜̈ț̰̫͓ạ̪͖̳c̲͎͕̰̯̃̈o͉ͅs̪ͪ ̜̻̖̜͕" -- -̖͚̫̙̓-̺̠͇ͤ̃ ̜̪̜ͯZ͔̗̭̞ͪA̝͈̙͖̩L͉̠̺͓G̙̞̦͖O̳̗͍

#6 2005-04-17 22:50:02

phrakture
Arch Overlord
From: behind you
Registered: 2003-10-29
Posts: 7,879

Re: GCC Flags & ABS - speed test

i3839 wrote:

SSE is already used when -march=pentium3 is set.

yeah, even though I think the performance increase may be nice, it's not going to be noticeable... and then you'll start losing support for some processors.  Yes, you'd start phasing out older processors, but then where does it stop? would the next step be MMX, then SSE2, then maybe we only allow AGP video cards (don't ask...)? I think it's important that Arch is baselined at i686.  Once the "additional" repos become a big thing, then it might be worthwhile to make an "sse2" repo or something... *Shrug*

#7 2005-04-18 16:12:48

jerem
Member
From: France
Registered: 2005-01-15
Posts: 310

Re: GCC Flags & ABS - speed test

-mfpmath is not a good idea, even if you have sse support.

Use -msse instead. And if you have a pIII just use -march=pentium3.


-O3 is not a good idea for big compilations (KDE)

-O2 is the best compromise

-O1 is stupid

-Os can be useful on very slow computers

Generally, avoid using exotic flags in gcc, especially with 3.4. Also note that some programs (OpenOffice) don't compile with gcc 3.3.x.

#8 2005-04-18 21:20:09

miqorz
Member
Registered: 2004-12-31
Posts: 475

Re: GCC Flags & ABS - speed test

-omgod-optimized


http://wiki2.archlinux.org/

Read it. Love it. Live it. Or die.

#9 2005-04-18 21:51:23

dp
Member
From: Zürich, Switzerland
Registered: 2003-05-27
Posts: 3,378

Re: GCC Flags & ABS - speed test

good idea for the testing ...

as a web browser is not doing much computing, i would suggest that you try to do your test with some apps that need computing ... i'll prepare a test-case ...


The impossible missions are the only ones which succeed.

#10 2005-04-18 22:03:34

dp
Member
From: Zürich, Switzerland
Registered: 2003-05-27
Posts: 3,378

Re: GCC Flags & ABS - speed test

ok, here something to try:

in this file:

http://daperi.home.solnet.ch/uni/bk/mb/ … pled.fasta

there are 2 nucleic acid sequences (DNA) that can be aligned with muscle or clustalw (both available in extra)

whereas clustalw cannot be compiled with optimisation flags (at least not with gcc 3.4.x), i compiled muscle with the standard arch flags for extra (see PKGBUILD here)

you can change the PKGBUILD and try better (= more optimised) ones, if you like

i measured it with "time" and here is an example (on a 2 GHz Pentium 4 with arch flags):

[damir@Asteraceae mb]$ time muscle -in cc3285_and_cc1842_crippled.fasta -out out.fasta

MUSCLE v3.52 by Robert C. Edgar

http://www.drive5.com/muscle
This software is donated to the public domain.
Please cite: Edgar, R.C. Nucleic Acids Res 32(5), 1792-97.

cc3285_and_cc1842_crippled 2 seqs, max length 7070, avg  length 7070
00:00:00     11 MB(1%)  Iter   1  100.00%  K-mer dist pass 1
00:00:00     11 MB(1%)  Iter   1  100.00%  K-mer dist pass 2
00:00:03   234 MB(30%)  Iter   1  100.00%  Align node
00:00:03   234 MB(30%)  Iter   1  100.00%  Root alignment

real    0m3.714s
user    0m2.927s
sys     0m0.536s
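
A single run is noisy, so it may help to repeat the measurement and look at the spread. A minimal sketch, assuming GNU date (for %N). run_timed and summarize are made-up helper names, not part of muscle or of time.

```sh
# run_timed N CMD [ARGS...]: run CMD N times and print the wall-clock time
# of each run in milliseconds, one per line. Needs GNU date for %N.
run_timed() {
    n=$1; shift
    i=0
    while [ "$i" -lt "$n" ]; do
        t0=$(date +%s%N)
        "$@" >/dev/null 2>&1
        t1=$(date +%s%N)
        echo $(( (t1 - t0) / 1000000 ))
        i=$((i + 1))
    done
}

# summarize: read one number per line on stdin, print "min mean max".
summarize() {
    awk 'NR == 1 { min = max = $1 }
         { if ($1 < min) min = $1; if ($1 > max) max = $1; sum += $1 }
         END { if (NR) printf "%.1f %.1f %.1f\n", min, sum / NR, max }'
}

# Example:
#   run_timed 5 muscle -in cc3285_and_cc1842_crippled.fasta -out out.fasta | summarize
```

Comparing the minimum across builds is usually less sensitive to background load than comparing single runs.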

NOTE: as you can see, muscle needs a LOT of RAM! (234 MB for this short piece of genes) ... it is also a good test of whether your swap works and whether your kernel handles it correctly (btw: that's why i use kernel26mm) - if you have less than 230 MB of free RAM for this experiment, simply open the example fasta file in an editor and shorten both sequences in similar ways - the original genes (in full) are on my site under uni/bk/mb/cc3285_and_cc1842.fasta (DON'T try to align them with muscle - it will need more RAM than you have (i have 768 MB and it cannot finish and stops))


#11 2005-04-18 22:10:32

phrakture
Arch Overlord
From: behind you
Registered: 2003-10-29
Posts: 7,879

Re: GCC Flags & ABS - speed test

ummm... dp sometimes your knowledge scares me

#12 2005-04-18 22:11:32

cactus
Taco Eater
From: t͈̫̹ͨa͖͕͎̱͈ͨ͆ć̥̖̝o̫̫̼s͈̭̱̞͍̃!̰
Registered: 2004-05-25
Posts: 4,622

Re: GCC Flags & ABS - speed test

that benchmark would only capture a limited slice of performance.
For a good test, you would want something that tested floating point arithmetic time, integer arithmetic time, etc.

Something like this: http://shootout.alioth.debian.org/great … rt=fullcpu
but with performance comparisons, filesize output differences, and other information based upon different compiler flags instead of different language implementations, would be very cool. Their apps for C might be useful, as they are meant to cover a wide range of things.

#13 2005-04-18 22:21:56

i3839
Member
Registered: 2004-02-04
Posts: 1,185

Re: GCC Flags & ABS - speed test

Enabling sse by default is silly indeed. Much better to focus on linker settings like -Wl,--as-needed and -Wl,-O1.

#14 2005-04-19 06:08:32

dp
Member
From: Zürich, Switzerland
Registered: 2003-05-27
Posts: 3,378

Re: GCC Flags & ABS - speed test

phrakture: is that meant ironically? i study biology, so it's not really unusual to know how to use clustalw ... and muscle is a really cool app for short sequences, but for longer ones it is no more than a memory benchmark ;-)

cactus: thanx for the link

i3839: exactly! (i still don't know how --as-needed works in detail, but from what i read it makes sense)


#15 2005-04-19 06:09:10

dp
Member
From: Zürich, Switzerland
Registered: 2003-05-27
Posts: 3,378

Re: GCC Flags & ABS - speed test

just found out that i use the right fasta:
http://shootout.alioth.debian.org/great … rt=fullcpu


#16 2005-04-19 14:18:48

i3839
Member
Registered: 2004-02-04
Posts: 1,185

Re: GCC Flags & ABS - speed test

dp wrote:

(the --as-needed i still don't know how it works in detail but from what i read it makes sense)

It is very simple, really. Normally the app is linked to all the libs you tell it to link to, whether they are needed or not. With --as-needed it only links to the libs that are really required.
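
A quick way to see the effect is to compare the NEEDED entries of a binary linked with and without the flag. A minimal sketch, assuming readelf from binutils is installed. needed_libs is a hypothetical helper name, and main.c stands for any program that never calls into libm.

```sh
# needed_libs: read `readelf -d` output on stdin and print the name of each
# shared library the binary is marked as needing (its NEEDED entries).
# Hypothetical helper.
needed_libs() {
    sed -n 's/.*(NEEDED).*\[\(.*\)\].*/\1/p'
}

# Example (needs gcc and binutils; main.c is a placeholder):
#   gcc -o plain main.c -lm
#   gcc -Wl,--as-needed -o trimmed main.c -lm
#   readelf -d plain   | needed_libs
#   readelf -d trimmed | needed_libs
```

If main.c really does not use libm, the second list should lack libm while the first may still show it. Some toolchains already pass --as-needed by default, in which case both lists look the same. Note that the flag only affects libraries named after it on the command line.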

#17 2005-04-19 15:35:10

dp
Member
From: Zürich, Switzerland
Registered: 2003-05-27
Posts: 3,378

Re: GCC Flags & ABS - speed test

i3839 wrote:
dp wrote:

(the --as-needed i still don't know how it works in detail but from what i read it makes sense)

It is very simple, really. Normally the app is linked to all the libs you tell it to link to, whether they are needed or not. With --as-needed it only links to the libs that are really required.

thank you - that's exactly what i had meant, but my confusion is that this is not the normal behaviour (in my eyes, it would be more logical to have this as the default and a flag to set if you need superfluous libs linked)
... the question i ask myself: what good is it to have something linked if it is not used at all


#18 2005-04-19 15:54:28

i3839
Member
Registered: 2004-02-04
Posts: 1,185

Re: GCC Flags & ABS - speed test

Yes, I had that struggle too: why oh why isn't it the default behaviour? I guess because you told ld to link to something, so it wouldn't be polite to ignore that request. Though in practice it's probably good to have it enabled, as most people don't know or use ld options. Thus it would make sense if gcc passed those obviously good options to ld by default.
