As an update to my previous post about how the scheduler changes in 2.6.32 affect x264 encodes on multicore CPUs, I repeated the experiment I posted before, this time using HandBrake 0.9.4 (final). I compared 2.6.31-6-ARCH against the 2.6.32-ARCH package currently in [testing], running the same 720p test encode on a quad-core machine.
$ time HandBrakeCLI --verbose --input hdtest.mpg --output test.m4v --preset="High Profile"
Arch x86_64 (kernel 2.6.31.6):
avg 16.35 fps
real 4m32.860s
user 12m42.657s
sys 0m44.547s
Arch x86_64 (kernel 2.6.32):
avg 22.73 fps
real 3m16.456s
user 11m47.144s
sys 0m44.374s
So .32 needs about 28 % less wall-clock time and delivers about 39 % higher fps. Nice
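For anyone who wants to check the arithmetic, here is a minimal Python sketch that recomputes the percentages from the `time` output above. The helper names `to_seconds` and `cores_busy` are made up for this illustration. The "cores busy" figure is simply (user + sys) / real, a rough indicator of how many cores were kept busy on average, not something HandBrake reports itself.

```python
import re


def to_seconds(stamp: str) -> float:
    """Convert a `time` value such as '4m32.860s' to seconds."""
    match = re.fullmatch(r"(\d+)m([\d.]+)s", stamp)
    if match is None:
        raise ValueError(f"unrecognised time value: {stamp!r}")
    minutes, seconds = match.groups()
    return int(minutes) * 60 + float(seconds)


def cores_busy(real: str, user: str, sys_time: str) -> float:
    """Rough average number of busy cores: CPU time divided by wall time."""
    return (to_seconds(user) + to_seconds(sys_time)) / to_seconds(real)


# Numbers copied from the two runs posted above.
old_real = to_seconds("4m32.860s")  # kernel 2.6.31.6
new_real = to_seconds("3m16.456s")  # kernel 2.6.32

time_saved_pct = (old_real - new_real) / old_real * 100
fps_gain_pct = (22.73 / 16.35 - 1) * 100

old_busy = cores_busy("4m32.860s", "12m42.657s", "0m44.547s")
new_busy = cores_busy("3m16.456s", "11m47.144s", "0m44.374s")

print(f"wall-clock time saved: {time_saved_pct:.1f} %")
print(f"fps gain:              {fps_gain_pct:.1f} %")
print(f"cores busy, 2.6.31.6:  {old_busy:.2f}")
print(f"cores busy, 2.6.32:    {new_busy:.2f}")
```

The two percentages differ because one is measured against the old wall-clock time and the other against the old frame rate, so they are not interchangeable.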
Last edited by graysky (2009-12-06 11:00:43)
CPU-optimized Linux-ck packages @ Repo-ck • AUR packages • Zsh and other configs
Offline
.32 has poor ext4 performance.
Relevant links:
Explanation: http://www.phoronix.com/scan.php?page=a … ions&num=1
Benchmarks: http://www.phoronix.com/scan.php?page=a … arks&num=1
Offline
But according to the article you can fix that by adding "-o nobarrier" to fstab for any ext4 filesystem, or am I getting it wrong?
@graysky
Can you please post the system you were using for the benchmark?
Last edited by Harlequin (2009-12-05 20:51:00)
Offline
@Harlequin: DFI LP LT P35-TR2 (BIOS: LP35D919)
X3360 (Xeon version of the Q9550) @ 8.5x400=3.40 GHz
TWIN2X4096-8500C5D 4 x 2 GB @ 5-5-5-15 @ 1,000 MHz (4:5) @ 2.100 V using the 266/667 MHz strap
Offline
But according to the article you can fix that by adding "-o nobarrier" to fstab for any ext4 filesystem, or am I getting it wrong?
@graysky
Can you please post the system you were using for the benchmark?
"Hey, thanks for doing the digging
It is required for safe behavior with volatile write caches on drives.
You could mount with -o nobarrier and it would go away, but a sequence like write->fsync->lose power->reboot may well find your file without the data that you synced, if the drive had write caches enabled.
If you know you have no write cache, or that it is safely battery backed, then you can mount with -o nobarrier, and not incur this penalty.
-Eric"
It may not be the best idea unless you know you have "no write cache, or that it is safely battery backed". Barriers will probably stay enabled by default, so the poorer ext4 performance will be the default too.
Offline
graysky,
On behalf of the people that actually read your opening post, nice! My primary mirror is going to sync in a few minutes and x264 is the one thing that I'm just itching to test.
@pyther,
I heard some really impressive numbers with BFS and x264. I'm curious to see how they compare.
Offline
I'm kinda surprised at these results...
time HandBrakeCLI --verbose --input test.mpg --output test2.mpg --preset="High Profile"
CFS (Default):
24.58fps
real 6m12.601s
user 10m59.148s
sys 1m32.075s
BFS
22.69fps
real 6m43.406s
user 11m2.861s
sys 1m31.017s
So in this case, CFS is about 8.3% faster than BFS by fps (about 7.6% less wall-clock time).
AMD Athlon(tm) 7750 Dual-Core Processor
Last edited by pyther (2009-12-06 19:44:53)
Offline
.32 has poor ext4 performance.
Relevant links:
Explanation: http://www.phoronix.com/scan.php?page=a … ions&num=1
Benchmarks: http://www.phoronix.com/scan.php?page=a … arks&num=1
The explanation is a good read. Interesting how they are using git-bisect to automate regression tracking in the kernel including automated reboots.
Offline
I wonder how this "improved" scheduler compares to bfs.
Check out the link in my first post (to my original post). I have edited it to show the results I got with the bfs kernel. Also see my post in the handbrake forums (same data).
Last edited by graysky (2009-12-06 11:01:16)
Offline
The performance increase is amazing. I didn't do any scientific benchmarks, but the speed increase is very obvious.
Offline
I agree, HandBrake flies with vanilla 2.6.32
Arch x86_64
Offline
Update - Now that 2.6.34 is in testing and 2.6.34-ck is in the AUR, I ran these again. Results are more or less equal when encoding Chapter 10 from the Napoleon Dynamite DVD with the High Profile preset as shown above.
Using handbrake-svn 3332:
kernel26 (stock): real time = 64.85 and fps = 80.78
kernel26-ck*: real time = 62.48 and fps = 83.78
kernel26-ck**: real time = 61.28 and fps = 85.51
All values are average of two runs; *I compiled kernel26-ck with 1000 Hz ticks/tickless option disabled whereas the stock kernel is only 300 Hz. **Here I compiled the kernel26-ck with 300 Hz ticks/tickless option enabled.
So the ck kernel is about 4-6 % faster in both benchmarks.
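The relative gains can be recomputed from the three results listed above. This is a small Python sketch that takes the stock kernel as the baseline; the dictionary labels are shorthand chosen for this example, and the figures are copied straight from the post.

```python
# (real time, fps) per kernel, copied from the results above.
results = {
    "kernel26 (stock)": (64.85, 80.78),
    "kernel26-ck, 1000 Hz, tickless disabled": (62.48, 83.78),
    "kernel26-ck, 300 Hz, tickless enabled": (61.28, 85.51),
}

base_time, base_fps = results["kernel26 (stock)"]

gains = {}
for name, (real_time, fps) in results.items():
    # Percentage of wall-clock time saved relative to the stock kernel.
    time_saved_pct = (base_time - real_time) / base_time * 100
    # Percentage increase in frame rate relative to the stock kernel.
    fps_gain_pct = (fps / base_fps - 1) * 100
    gains[name] = (time_saved_pct, fps_gain_pct)
    print(f"{name}: {time_saved_pct:.1f} % less time, {fps_gain_pct:.1f} % more fps")
```

Since each value is an average of only two runs, small differences between the two ck builds should be read with some caution.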
Last edited by graysky (2010-05-30 20:05:22)
Offline
Can you try with lower ticks, like 100 Hz or so?
I want to see if it affects performance.
Offline
@ammon - sorry dude, no time to experiment further... if you're up to it, go for it and post the results. Just curious, why are you interested in lower ticks? I haven't been able to measure a difference in power consumption between the Arch stock 2.6.34 and the kernel26-ck I compiled (300 Hz vs. 1000 Hz). See post 29 in this thread.
EDIT: did one more (editing my post above).
Last edited by graysky (2010-05-30 20:05:49)
Offline
Story from Dark Shikari, one of the main x264 developers (I read his blog regularly; x264 is extremely fun to learn about and is perhaps the most optimized software in existence): http://x264dev.multimedia.cx/?p=185
Offline