#1 2009-12-05 14:50:51

graysky
Wiki Maintainer
From: :wq
Registered: 2008-12-01
Posts: 10,675

kernel 2.6.32 final/x264 multicore speed increases

As an update to my previous post about how the scheduler changes in 2.6.32 affect x264 encodes on multicore CPUs, I repeated the same experiment, this time using HandBrake 0.9.4 (final) to compare 2.6.31-6-ARCH against the 2.6.32-ARCH currently in [testing], running the same 720p test encode on a quad-core machine.

$ time HandBrakeCLI --verbose --input hdtest.mpg --output test.m4v --preset="High Profile"

Arch x86_64 (kernel 2.6.31.6):

avg 16.35 fps

real    4m32.860s
user    12m42.657s
sys     0m44.547s

Arch x86_64 (kernel 2.6.32):

avg 22.73 fps

real    3m16.456s
user    11m47.144s
sys     0m44.374s

So .32 needs about 28 % less wall-clock time and is about 39 % faster based on fps.  Nice :)
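
The two percentages differ because they are not the same measure: one is the share of wall-clock time saved, the other is the gain in throughput. A rough sketch of the arithmetic, using only the numbers posted above (the function names are made up for illustration):

```python
def percent_time_saved(old_seconds, new_seconds):
    """Share of the old wall-clock time that the new run no longer needs."""
    return (old_seconds - new_seconds) / old_seconds * 100


def percent_faster(old_rate, new_rate):
    """Throughput gain of the new run relative to the old one."""
    return (new_rate / old_rate - 1) * 100


# "real" times from the two runs above, converted to seconds
old_real = 4 * 60 + 32.860   # 2.6.31.6
new_real = 3 * 60 + 16.456   # 2.6.32

print(f"time saved:       {percent_time_saved(old_real, new_real):.1f} %")
print(f"faster (by fps):  {percent_faster(16.35, 22.73):.1f} %")
# The same speed ratio, taken from the wall-clock times instead of fps
print(f"faster (by time): {percent_faster(1 / old_real, 1 / new_real):.1f} %")
```

This prints 28.0 %, 39.0 % and 38.9 %, so the wall-clock times and the fps figures agree with each other once both are expressed as a speed ratio.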

Last edited by graysky (2009-12-06 11:00:43)


CPU-optimized Linux-ck packages @ Repo-ck  • AUR packages • Zsh and other configs

#2 2009-12-05 15:57:31

tomd123
Developer
Registered: 2008-08-12
Posts: 565

Re: kernel 2.6.32 final/x264 multicore speed increases

.32 has poor ext4 performance.

Relevant links:
Explanation: http://www.phoronix.com/scan.php?page=a … ions&num=1
Benchmarks: http://www.phoronix.com/scan.php?page=a … arks&num=1

#3 2009-12-05 19:53:56

Harlequin
Member
Registered: 2009-07-28
Posts: 25

Re: kernel 2.6.32 final/x264 multicore speed increases

But according to the article, you can fix that by adding "-o nobarrier" to fstab for any ext4 filesystem, or am I getting it wrong?

@graysky
Can you please post the system you were using for the benchmark?

Last edited by Harlequin (2009-12-05 20:51:00)

#4 2009-12-05 22:34:40

graysky
Wiki Maintainer
From: :wq
Registered: 2008-12-01
Posts: 10,675

Re: kernel 2.6.32 final/x264 multicore speed increases

@Harlequin: DFI LP LT P35-TR2 (BIOS: LP35D919)
X3360 (Xeon version of the Q9550) @ 8.5x400=3.40 GHz
TWIN2X4096-8500C5D 4 x 2 GB @ 5-5-5-15 @ 1,000 MHz (4:5) @ 2.100 V using the 266/667 MHz strap


#5 2009-12-05 23:25:47

tomd123
Developer
Registered: 2008-08-12
Posts: 565

Re: kernel 2.6.32 final/x264 multicore speed increases

Harlequin wrote:

But according to the article, you can fix that by adding "-o nobarrier" to fstab for any ext4 filesystem, or am I getting it wrong?

@graysky
Can you please post the system you were using for the benchmark?

"Hey, thanks for doing the digging :)

It is required for safe behavior with volatile write caches on drives.

You could mount with -o nobarrier and it would go away, but a sequence like write->fsync->lose power->reboot may well find your file without the data that you synced, if the drive had write caches enabled.

If you know you have no write cache, or that it is safely battery backed, then you can mount with -o nobarrier, and not incur this penalty.

-Eric"

Mounting with nobarrier may not be the best idea unless you know you have "no write cache, or that it is safely battery backed". Barriers will probably be enabled by default, so the poor ext4 performance will be there by default too.
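
To check which ext4 filesystems are currently mounted without barriers, one could scan the mount table. This is only a minimal sketch, assuming a Linux system with a readable /proc/mounts; the function name is hypothetical, and it only looks for the `nobarrier` and `barrier=0` spellings of the option:

```python
def ext4_without_barriers(mounts_text):
    """Return mount points of ext4 filesystems mounted with barriers off.

    mounts_text uses the /proc/mounts layout:
    device mountpoint fstype options dump pass
    """
    found = []
    for line in mounts_text.splitlines():
        fields = line.split()
        # Skip malformed lines and anything that is not ext4
        if len(fields) < 4 or fields[2] != "ext4":
            continue
        options = fields[3].split(",")
        if "nobarrier" in options or "barrier=0" in options:
            found.append(fields[1])
    return found


if __name__ == "__main__":
    try:
        with open("/proc/mounts") as f:
            print(ext4_without_barriers(f.read()))
    except OSError:
        print("could not read /proc/mounts on this system")
```

An empty list means every ext4 mount still has barriers on. Note that in fstab itself the option goes into the options column as plain `nobarrier`; the `-o` prefix belongs to the mount command line.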

#6 2009-12-06 00:05:48

pyther
Member
Registered: 2008-01-21
Posts: 1,395

Re: kernel 2.6.32 final/x264 multicore speed increases

I wonder how this "improved" scheduler compares to bfs.


Website - Blog - arch-home
Arch User since March 2005

#7 2009-12-06 00:21:37

skottish
Forum Fellow
From: Here
Registered: 2006-06-16
Posts: 7,942

Re: kernel 2.6.32 final/x264 multicore speed increases

graysky,

On behalf of the people that actually read your opening post, nice! My primary mirror is going to sync in a few minutes and x264 is the one thing that I'm just itching to test.

@pyther,

I heard some really impressive numbers with BFS and x264. I'm curious to see how they compare.

#8 2009-12-06 05:05:16

pyther
Member
Registered: 2008-01-21
Posts: 1,395

Re: kernel 2.6.32 final/x264 multicore speed increases

I'm kinda surprised at these results...

time HandBrakeCLI --verbose --input test.mpg --output test2.mpg --preset="High Profile"

CFS (Default):

24.58fps

real  6m12.601s
user  10m59.148s
sys  1m32.075s

BFS

22.69fps

real  6m43.406s
user  11m2.861s
sys  1m31.017s

So in this case, CFS is about 8.3% faster than BFS
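
For anyone repeating the comparison, here is a small sketch that converts the `real` values printed by `time` into seconds and compares the two runs. The helper name is hypothetical, and it assumes the `XmY.YYYs` format shown above:

```python
import re


def time_to_seconds(value):
    """Convert a `time` value such as '6m12.601s' into seconds."""
    match = re.fullmatch(r"(\d+)m(\d+(?:\.\d+)?)s?", value.strip())
    if match is None:
        raise ValueError(f"unrecognised time value: {value!r}")
    minutes, seconds = match.groups()
    return int(minutes) * 60 + float(seconds)


cfs_real = time_to_seconds("6m12.601s")
bfs_real = time_to_seconds("6m43.406s")

# How much longer the BFS run took, relative to the CFS run
slower = (bfs_real / cfs_real - 1) * 100
print(f"CFS {cfs_real:.3f} s, BFS {bfs_real:.3f} s, BFS took {slower:.1f} % longer")
```

The wall-clock difference comes out at 8.3 %, which matches the ratio of the two fps figures (24.58 / 22.69).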

AMD Athlon(tm) 7750 Dual-Core Processor

Last edited by pyther (2009-12-06 19:44:53)


#9 2009-12-06 05:29:49

mikesd
Member
From: Australia
Registered: 2008-02-01
Posts: 788

Re: kernel 2.6.32 final/x264 multicore speed increases

tomd123 wrote:

The explanation is a good read. Interesting how they are using git-bisect to automate regression tracking in the kernel including automated reboots.

#10 2009-12-06 05:37:41

pyther
Member
Registered: 2008-01-21
Posts: 1,395

Re: kernel 2.6.32 final/x264 multicore speed increases

I decided to compile a vanilla 2.6.32 kernel using both CFS and BFS.

BFS was 2 seconds faster than CFS. Interesting...


#11 2009-12-06 10:58:29

graysky
Wiki Maintainer
From: :wq
Registered: 2008-12-01
Posts: 10,675

Re: kernel 2.6.32 final/x264 multicore speed increases

pyther wrote:

I wonder how this "improved" scheduler compares to bfs.

Check out the link in my first post (to my original post).  I have edited it to show the results I got with the bfs kernel.  Also see my post in the handbrake forums (same data).

Last edited by graysky (2009-12-06 11:01:16)


#12 2009-12-06 23:34:40

skottish
Forum Fellow
From: Here
Registered: 2006-06-16
Posts: 7,942

Re: kernel 2.6.32 final/x264 multicore speed increases

The performance increase is amazing. I didn't do any scientific benchmarks, but the speed increase is very obvious.

#13 2010-01-10 14:02:18

o1911
Member
From: Hobart, Australia
Registered: 2009-04-28
Posts: 106

Re: kernel 2.6.32 final/x264 multicore speed increases

I agree, handbrake flies with vanilla 2.6.32 :D


Arch x86_64

#14 2010-05-30 12:23:45

graysky
Wiki Maintainer
From: :wq
Registered: 2008-12-01
Posts: 10,675

Re: kernel 2.6.32 final/x264 multicore speed increases

Update - Now that 2.6.34 is in [testing] and 2.6.34-ck is in the AUR, I ran these again, encoding Chapter 10 from the Napoleon Dynamite DVD with the High Profile preset as shown above.  Results are more or less equal.

Using handbrake-svn 3332:
kernel26 (stock): real time = 64.85 and fps = 80.78
kernel26-ck*: real time = 62.48 and fps = 83.78
kernel26-ck**: real time = 61.28 and fps = 85.51


All values are the average of two runs.  *I compiled kernel26-ck with 1000 Hz ticks and the tickless option disabled, whereas the stock kernel is only 300 Hz.  **Here I compiled kernel26-ck with 300 Hz ticks and the tickless option enabled.

So the ck kernel is about 4-6 % faster in both benchmarks.

Last edited by graysky (2010-05-30 20:05:22)


#15 2010-05-30 13:11:51

ammon
Member
Registered: 2008-12-11
Posts: 413

Re: kernel 2.6.32 final/x264 multicore speed increases

Can you try with lower ticks, like 100 Hz or so?
I want to see if it affects performance.

#16 2010-05-30 17:59:31

graysky
Wiki Maintainer
From: :wq
Registered: 2008-12-01
Posts: 10,675
Website

Re: kernel 2.6.32 final/x264 multicore speed increases

@ammon - sorry dude, no time to experiment further... if you're up to it, go for it and post the results.  Just curious: why are you interested in lower ticks?  I haven't been able to measure a difference in power consumption between the Arch stock 2.6.34 and the kernel26-ck I compiled (300 Hz vs. 1000 Hz).  See post 29 in this thread.

EDIT: did one more (editing my post above).

Last edited by graysky (2010-05-30 20:05:49)


#17 2010-05-30 21:51:09

Ranguvar
Member
Registered: 2008-08-12
Posts: 2,563

Re: kernel 2.6.32 final/x264 multicore speed increases

Story from Dark Shikari, one of the main x264 developers (I read his blog regularly; x264 is extremely fun to learn about, and perhaps the most optimized software in existence): http://x264dev.multimedia.cx/?p=185
