The linux-pf thread; BFS/CK, TuxOnIce, BFQ, AUFS3

nous · 2011-08-18 10:51:39

xclaude wrote:

Yes, you're right, I'm using LZF.
LZO still fails with error -22 when resuming.
The repository error also persists in my VMware guest even after -Syy, yet everything works correctly on my notebook. I have no idea why.
Both the VM (bridged network) and my notebook sit behind the same router, without a proxy.

I got failures on two Core 2 machines with LZF (an error about getting 4079 bytes while expecting 4096). Oh, well. Anyway, pf2 is out, so you should be able to use the repo fine (and maybe the LZO error is fixed; I'm still building).

graysky · 2011-08-18 11:11:26

@nous - have a look at my linux-ck package, lines 173 and 196. What are your thoughts about including them in your PKGBUILD?

http://aur.archlinux.org/packages.php?ID=50911

nous · 2011-08-18 12:31:00

graysky wrote:

@nous - have a look at my linux-ck package, lines 173 and 196. What are your thoughts about including them in your PKGBUILD?

I had thought about that in the past, but take a look at these:

Pentium Pro
-mpreferred-stack-boundary=2 -march=i686 -mtune=generic -maccumulate-outgoing-args -mno-sse -mno-mmx -mno-sse2 -mno-3dnow -auxbase-strip -O2 -Wall -Wundef -Wstrict-prototypes -Wno-trigraphs -Werror=implicit-function-declaration -Wno-format-security -Wno-sign-compare -Wframe-larger-than=1024 -Wno-unused-but-set-variable -Wdeclaration-after-statement -Wno-pointer-sign -fno-strict-aliasing -fno-common -fno-delete-null-pointer-checks -freg-struct-return -ffreestanding -fstack-protector -fno-asynchronous-unwind-tables -fomit-frame-pointer -fno-strict-overflow -fconserve-stack

Core 2 64bit
-Wall -Wundef -Wstrict-prototypes -Wno-trigraphs -fno-strict-aliasing -fno-common -Werror-implicit-function-declaration -Wno-format-security -fno-delete-null-pointer-checks -O2 -m64 -march=core2 -mno-red-zone -mcmodel=kernel -funit-at-a-time -maccumulate-outgoing-args -fstack-protector -pipe -Wno-sign-compare -fno-asynchronous-unwind-tables -mno-sse -mno-mmx -mno-sse2 -mno-3dnow -Wframe-larger-than=2048 -Wno-unused-but-set-variable -fomit-frame-pointer -Wdeclaration-after-statement -Wno-pointer-sign -fno-strict-overflow -fconserve-stack

Those are the standard KCFLAGS for the respective CPUs, and you can see that, apart from the -march directive, the extra instruction-set extensions (SSE, SSE2, MMX, 3DNow!) are explicitly disabled, for good reason: kernel code must not clobber the FPU/SIMD state of userspace. Now, I'm not a gcc expert, but I reckon that the flags with the most influence on speed (arch, O-level, preferred-stack-boundary and omit-frame-pointer) are already there (-pipe only speeds up compilation). I just don't know how stable an extremely optimized kernel can be.

Besides, one can always make menuconfig before build and safely select his/her CPU.

Thanks for bringing this up though, I might add that KCFLAGS override as an option for the daring.
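Such an opt-in might look roughly like the sketch below. KCFLAGS is the kernel build system's variable for passing additional compiler options; the `_native_kcflags` switch and the `kernel_make_args` helper are hypothetical names for illustration and are not taken from the linux-pf PKGBUILD.

```shell
#!/bin/sh
# Hypothetical opt-in switch: set to "y" to tune for the build machine's CPU.
_native_kcflags=${_native_kcflags:-n}

# Print extra arguments for the kernel's make invocation.
# Prints nothing unless the switch above is set to "y".
kernel_make_args() {
    if [ "$_native_kcflags" = "y" ]; then
        printf '%s\n' 'KCFLAGS=-march=native'
    fi
}

# Usage inside a build function (word splitting of the output is intended):
#   make $(kernel_make_args) bzImage modules
```

With the switch left at its default, the make invocation is unchanged, so the stock flags shown above still apply.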

Last edited by nous (2011-08-18 12:36:18)

graysky · 2011-08-18 13:30:41

@nous - where'd you get the stuff in the code tags?

nous · 2011-08-19 06:05:45

graysky wrote:

@nous - where'd you get the stuff in the code tags?

Just 'ps axwwwww' when compiling.
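A rough sketch of picking one flag out of such a command line, in case anyone wants to check what their own build uses. The helper name `march_of` is made up for this example. Running the kernel build with `make V=1` is another way to see the full compiler command lines.

```shell
#!/bin/sh
# Print the value of the first -march= option found in a compiler
# command line passed as a single string.
march_of() {
    printf '%s\n' "$1" | tr ' ' '\n' | sed -n 's/^-march=//p' | head -n 1
}

# Usage while a kernel build is running (cc1 is gcc's compiler proper;
# the bracket keeps grep from matching its own process):
#   march_of "$(ps axww | grep '[c]c1 ' | head -n 1)"
```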

xclaude · 2011-08-19 14:44:27

i have installed the package from repo, but the error -22 (Display in tuxoniceui-text: Compress read returned -22) still exists when using LZO on my system. btw. LZF is still working fine.

Thx for adding LZF as module to your standard config in precompiled packages.

nous · 2011-08-22 19:11:01

I've been doing some performance tests this evening, as I've been experiencing alarming moments of unresponsiveness in every box I run linux-pf during heavy disk activity. My feeling of course was subjective, but a stuck mouse pointer for periods of up to 2 seconds was not.

So I investigated a little further and found that switching the I/O scheduler from BFQ, which is the default, to anything else (noop, CFQ, Deadline) solved the problem. Now, noop is not suitable for mechanical hard disks, so I tested the latter two, which behaved way better than BFQ; BFQ suffered even before my system hit swap.

I'm about to switch back to either CFQ or Deadline in the next linux-pf update, but I'd really like to hear some feedback from others. Switching to another scheduler is as easy as echoing its name to /sys/block/queue/scheduler:

echo cfq >| /sys/block/queue/scheduler
echo deadline >| /sys/block/queue/scheduler
echo bfq >| /sys/block/queue/scheduler

EDIT
Well, it seems scheduling is controlled per device. The control file is /sys/block/<device>/queue/scheduler:

echo deadline >| /sys/block/sda/queue/scheduler
echo cfq >| /sys/block/sda/queue/scheduler
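For anyone scripting this comparison, here is a minimal sketch of checking which scheduler is currently active. It assumes the usual sysfs format, where the file lists all available schedulers and marks the active one in square brackets, e.g. `noop deadline [cfq] bfq`. The function names `active_scheduler` and `show_scheduler` are made up for illustration.

```shell
#!/bin/sh
# Print the scheduler marked as active (the bracketed entry) in a line
# formatted like the contents of /sys/block/<device>/queue/scheduler.
active_scheduler() {
    printf '%s\n' "$1" | sed -n 's/.*\[\([^]]*\)].*/\1/p'
}

# Print the active scheduler for a device name such as "sda".
show_scheduler() {
    active_scheduler "$(cat "/sys/block/$1/queue/scheduler")"
}

# Usage:
#   show_scheduler sda
```

Reading the file needs no root privileges; only writing a new scheduler name to it does.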

Last edited by nous (2011-08-23 06:30:43)

m45t3r · 2011-08-22 21:11:38

nous wrote:

I've been doing some performance tests this evening, as I've been experiencing alarming moments of unresponsiveness in every box I run linux-pf during heavy disk activity. My feeling of course was subjective, but a stuck mouse pointer for periods of up to 2 seconds was not.
So I investigated a little further and found that switching the I/O scheduler from BFQ, which is the default, to anything else (noop, CFQ, Deadline) solved the problem. Now, noop is not suitable for mechanical hard disks, so I tested the latter two, which behaved way better than BFQ; BFQ suffered even before my system hit swap.
I'm about to switch back to either CFQ or Deadline in the next linux-pf update, but I'd really like to hear some feedback from others. Switching to another scheduler is as easy as echoing its name to /sys/block/queue/scheduler:
echo cfq >| /sys/block/queue/scheduler
echo deadline >| /sys/block/queue/scheduler
echo bfq >| /sys/block/queue/scheduler

I am using BFQ mainly because CFQ on other kernels (including linux-ck) gave me an unresponsive desktop under heavy disk activity. My mouse pointer didn't really get stuck, but Chrome and other software started to lock up when disk activity was high. Once disk activity returned to normal, the programs started responding again.

I haven't seen this behavior with BFQ, but I haven't really tried a heavy disk transfer yet.

J. · 2011-08-23 09:52:57

I may as well say that I just tried this for the first time yesterday; no idea whether it counts as heavy or not, but to test this out, I just started 3 multiple-GiB files copying from an external HDD to the internal one, and maxed out each CPU core at the same time for the sake of it, and my desktop's acting as if there's nothing going on. Even a cold start for something big like GIMP is no slower than usual. All of this is completely different to the stock kernel.

I doubt it can have much to do with desktop environment/window manager?

nous · 2011-08-23 10:54:12

J. wrote:

I may as well say that I just tried this for the first time yesterday; no idea whether it counts as heavy or not, but to test this out, I just started 3 multiple-GiB files copying from an external HDD to the internal one, and maxed out each CPU core at the same time for the sake of it, and my desktop's acting as if there's nothing going on. Even a cold start for something big like GIMP is no slower than usual. All of this is completely different to the stock kernel.
I doubt it can have much to do with desktop environment/window manager?

No, it happens even with xfce4. But I also doubt the transfer rate from an external hdd can bring the internal one to its limits. Could you try and launch as many memory hogs as you can until it hits swap and see how it goes?

kalpik · 2011-08-23 11:03:13

Disk activity (even light, like copying a file from the internal HDD to a USB flash drive) stalls Chrome.

J. · 2011-08-23 11:58:05

nous wrote:

Could you try and launch as many memory hogs as you can until it hits swap and see how it goes?

Switching between programs started getting painful, of course, but I didn't get any cursor or other lockups until I actually ran out of swap as well (...).

nous · 2011-08-23 12:15:40

OK, thank you all.

@kalpik Do you get better response if you "echo deadline >| /sys/block/sda/queue/scheduler"?

Cape · 2011-08-23 17:44:23

@kalpik If your external HD is NTFS or FAT32 there's nothing you can do... Support for those filesystems is very basic (and always will be), so high CPU usage and tiny throughput are to be expected. You should try with a "real" filesystem like ext2.

Meanwhile I've run some tests on my machine (Core 2 Duo, 4 GB RAM) using "stress -d 60" with Nepomuk running at full blast (for no reason, but...).
With -pf/BFQ I get a much better overall experience. Applications never freeze (I can actually use Firefox pretty smoothly). But I've also noticed higher CPU usage (from 80-85% up to 90%) and lower throughput (overall writes to sda drop from ~28Mb/s to ~26Mb/s).

With stock ARCH/CFQ and the same stress load, Firefox turns into a zombie, and even killing the stress process took some time.

So I'd definitely say BFQ is better, but I don't care that much, since in a couple of months I'm going to have a super SSD which only needs the noop scheduler ;-)

nous · 2011-08-26 16:17:57

I just read a couple of threads over at the Gentoo forums and it would seem that the -22 decompression error only concerns x86_64 kernels. Can anyone on i686 confirm that? I only have access to 64bit boxes and they all fail at decompression. Also, someone reported that non-BFS kernels don't get that error...

Last edited by nous (2011-08-26 16:22:25)

lucke · 2011-08-26 19:01:35

I get -22 errors too and I run only i686 on my computers.

nous · 2011-08-26 20:52:14

I just tested Tuxonice/LZO with BFS disabled and it works fine...

safrax · 2011-08-28 05:00:21

https://gist.github.com/1172524

Any chance we could get this added in to the PF kernel patchset? Interactive governor from android. Supposed to give a nice battery life boost.

Last edited by safrax (2011-08-28 05:00:31)

nous · 2011-08-28 09:27:40

safrax wrote:

https://gist.github.com/1172524
Any chance we could get this added in to the PF kernel patchset? Interactive governor from android. Supposed to give a nice battery life boost.

Ask postfactum if he would. If he refuses, I'll try to add it as a separate patch in PKGBUILD.

nous · 2011-08-29 09:02:14

After a mini-brainstorming I had with postfactum, we pinpointed the cause of the decompression errors to commit 407850d of tuxonice (revert "restart/stop I/O worker threads outside of freezer"). A testing pf-patchset with the aforementioned commit removed, resumed normally. The problem is that the tuxonice maintainer didn't respond to postfactum's emails. Maybe if more people "annoyed" him...

ethail · 2011-08-29 09:57:29

I've followed this issue with some interest as my laptop was not using tuxonice since the issues with 2.6.39.4. There are some things that sound odd to me:

nous wrote:

I just tested Tuxonice/LZO with BFS disabled and it works fine...

nous wrote:

After a mini-brainstorming I had with postfactum, we pinpointed the cause of the decompression errors to commit 407850d of tuxonice (revert "restart/stop I/O worker threads outside of freezer"). A testing pf-patchset with the aforementioned commit removed, resumed normally. The problem is that the tuxonice maintainer didn't respond to postfactum's emails. Maybe if more people "annoyed" him...

If it "only broke" with BFS (that's what we know for now), there is something that bothers me. Who is really to blame, BFS or tuxonice? I mean, do we know for sure that BFS works with other hibernation methods?

I just want to be sure that we're blaming the true culprit, rather than failing to report this to CK if it needs to be reported.

nous · 2011-08-29 10:40:13

ethail wrote:

I've followed this issue with some interest as my laptop was not using tuxonice since the issues with 2.6.39.4. There are some things that sound odd to me:
nous wrote:
I just tested Tuxonice/LZO with BFS disabled and it works fine...
nous wrote:
After a mini-brainstorming I had with postfactum, we pinpointed the cause of the decompression errors to commit 407850d of tuxonice (revert "restart/stop I/O worker threads outside of freezer"). A testing pf-patchset with the aforementioned commit removed, resumed normally. The problem is that the tuxonice maintainer didn't respond to postfactum's emails. Maybe if more people "annoyed" him...
If it "only broke" with BFS (that's what we know for now), there is something that bothers me. Who is really to blame, BFS or tuxonice? I mean, do we know for sure that BFS works with other hibernation methods?
I just want to be sure that we're blaming the true culprit, rather than failing to report this to CK if it needs to be reported.

Well, it's not odd, since both are out-of-mainline patches and they have grown incompatible with each other since 2.6.39.3:
- BFS and in-kernel hibernation (with really fast threaded LZO compression too) work like a charm on 3 of my boxes.
- Stock kernel and tuxonice+lzo also work in those same boxes.
- The -pf patchset is for people who want them both. Personally, I want tuxonice because of its easy way to show a nice framebuffer with a progress bar on suspend/resume.

I asked CK at his blog about BFS and tuxonice and he said "freezing all the tasks and putting the scheduler in a state where the suspend process can proceed is EXTREMELY non-trivial, so just about any sort of interaction can happen that breaks suspend / resume. Last I saw, the compression is done in weird ways like disabling interrupts. Just about anything bad might happen like that.".

It's rather obvious that one of them triggers a bug in the other (or maybe both do). Postfactum and I have only found one possibly guilty commit in tuxonice.

ethail · 2011-08-29 10:57:56

That makes a lot of sense now. I guess no more "odd things" remain there.

Thank you for the detailed explanation, nous.

post-factum · 2011-08-31 15:30:45

Well, Nigel has responded. He's going to apply both BFS and TuxOnIce and debug this issue.

nous · 2011-08-31 17:52:27

New goodies in 3.0.3-pf: aufs3 and interactive cpufreq governor.

Last edited by nous (2011-08-31 17:52:56)
