#1 2011-06-07 10:44:51

made_in_nz
Member
From: Wellington, New Zealand
Registered: 2010-01-04
Posts: 53

[SOLVED] System slows after high CPU load

After running a CPU-intensive app for some time, usually 5-10 minutes, the system becomes incredibly slow.
Typically this occurs when watching Flash videos in the browser (Firefox and Chrome), but I also experienced it last night when connecting to my laptop remotely with NX and running Eclipse and a Windows VirtualBox guest.

I originally thought I had the same problem as this post: https://bbs.archlinux.org/viewtopic.php?id=104762 because when I ran cpufreq-info while the system was slow, the CPUs had in fact dropped to 900 MHz instead of 2.66 GHz.

I tried adding the suggested processor.ignore_ppc=1 to the kernel boot line but the problem continues.
I have also tried uninstalling cpufrequtils and removing the cpufreq_* modules from MODULES in /etc/rc.conf, but the problem still continues.

Once the system slows, I have to reboot.  Logging out and restarting X doesn't fix it. 
I haven't found anything in the logs to indicate the problem.

I am running the latest Arch:

[nickh] 51-> uname -a
Linux nickh 2.6.38-ARCH #1 SMP PREEMPT Mon May 23 20:04:02 UTC 2011 i686 Intel(R) Core(TM)2 Duo CPU T9550 @ 2.66GHz GenuineIntel GNU/Linux
[nickh] 56-> dmesg | grep -i chipset
[    0.789388] DMAR: Disabling IOMMU for graphics on this chipset
[    2.255498] agpgart-intel 0000:00:00.0: Intel GM45 Chipset
[nickh] 57-> dmesg | grep -i intel
[    0.077296] CPU0: Intel(R) Core(TM)2 Duo CPU     T9550  @ 2.66GHz stepping 0a
[    0.079995] Performance Events: PEBS fmt0+, Core2 events, Intel PMU driver.
[    1.419517] intel_idle: MWAIT substates: 0x3122220
[    1.419533] intel_idle: does not run on family 6 model 23
[    2.255498] agpgart-intel 0000:00:00.0: Intel GM45 Chipset
[    2.256140] agpgart-intel 0000:00:00.0: detected gtt size: 2097152K total, 262144K mappable
[    2.260231] agpgart-intel 0000:00:00.0: detected 32768K stolen memory
[    2.261111] agpgart-intel 0000:00:00.0: AGP aperture is 256M @ 0xe0000000
[    3.445739] fbcon: inteldrmfb (fb0) is primary device
[    3.526627] fb0: inteldrmfb frame buffer device
[   13.235332] iTCO_wdt: Intel TCO WatchDog Timer Driver v1.06
[   16.176397] HDA Intel 0000:00:1b.0: PCI INT A -> GSI 21 (level, low) -> IRQ 21
[   16.176642] HDA Intel 0000:00:1b.0: irq 45 for MSI/MSI-X
[   16.176809] HDA Intel 0000:00:1b.0: setting latency timer to 64
[   16.326034] iwlagn: Intel(R) Wireless WiFi Link AGN driver for Linux, in-tree:
[   16.326048] iwlagn: Copyright(c) 2003-2010 Intel Corporation
[   16.326592] iwlagn 0000:0c:00.0: Detected Intel(R) WiFi Link 5100 AGN, REV=0x54
[   16.363170] input: HDA Intel Mic at Sep Left Jack as /devices/pci0000:00/0000:00:1b.0/sound/card0/input9
[   16.373767] input: HDA Intel Mic at Ext Right Jack as /devices/pci0000:00/0000:00:1b.0/sound/card0/input10
[   16.374489] input: HDA Intel Line Out at Sep Left Jack as /devices/pci0000:00/0000:00:1b.0/sound/card0/input11
[   16.375100] input: HDA Intel HP Out at Ext Right Jack as /devices/pci0000:00/0000:00:1b.0/sound/card0/input12

I'm happy to give more info if you can tell me what you need, so I can get to the bottom of this.
Thanks.

Last edited by made_in_nz (2011-06-14 05:48:26)

#2 2011-06-07 11:44:33

einhard
Member
From: Poland
Registered: 2010-01-05
Posts: 89

Re: [SOLVED] System slows after high CPU load

Have you checked your CPU temperature with sensors? Most processors throttle when the critical temperature is reached (90-105 C). What laptop do you have? What DE (KDE, GNOME, Xfce, etc.) and power management solution are you using? I don't see a discrete graphics card, but does your notebook have one (Nvidia Optimus?)?

Last edited by einhard (2011-06-07 11:47:21)

#3 2011-06-07 13:49:13

made_in_nz
Member
From: Wellington, New Zealand
Registered: 2010-01-04
Posts: 53

Re: [SOLVED] System slows after high CPU load

Good idea about sensors, but I have checked this and the CPU temps are in the low 60s, well below the high or critical temperature, when the issue occurs.

My laptop is a Dell Latitude E5500, and I'm running LXDE.
I don't have any power management package set up.
The Intel GM45 Chipset has integrated graphics, and I don't have any other graphics card installed.

#4 2011-06-07 14:14:56

einhard
Member
From: Poland
Registered: 2010-01-05
Posts: 89

Re: [SOLVED] System slows after high CPU load

Set the governor to performance or ondemand with cpufreq-set:

# cpufreq-set -c 0 -g ondemand
# cpufreq-set -c 1 -g ondemand

or just install laptop-mode-tools and activate/configure its cpufreq hook. You should use some power management solution on a notebook (especially with LXDE). Do you use the cpufreq daemon (not just the module)?

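Can you change the frequency when the CPU is in this "slow mode"? Select the userspace governor first, then set the frequency (cpufreq-set does not accept -f together with -g):

# cpufreq-set -c 0 -g userspace
# cpufreq-set -c 0 -f 2.66GHz
# cpufreq-set -c 1 -g userspace
# cpufreq-set -c 1 -f 2.66GHz

Could you post your MODULES and DAEMONS lines from rc.conf?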

Last edited by einhard (2011-06-07 14:32:38)

#5 2011-06-08 05:30:13

made_in_nz
Member
From: Wellington, New Zealand
Registered: 2010-01-04
Posts: 53

Re: [SOLVED] System slows after high CPU load

I do have acpid installed and it is started by hal:

[root@nickh ~]# ps -ef | grep acpid
root        19     2  0 08:03 ?        00:00:00 [kacpid]
root      2212     1  0 08:04 ?        00:00:00 /usr/sbin/acpid
hal       2267  2215  0 08:04 ?        00:00:00 hald-addon-acpi: listening on acpid socket /var/run/acpid.socket
root      3926  3919  0 08:13 pts/0    00:00:00 grep acpid

I actually uninstalled cpufrequtils as part of trying to investigate this problem - to rule out cpufrequtils being the cause of scaling the CPU down to a low speed.  Before I uninstalled cpufrequtils I did try increasing the CPU frequency with your second set of commands but this did not speed up the system.  So I uninstalled cpufrequtils and removed acpi-cpufreq from my MODULES, however the problem still occurs.

Right now in my /etc/rc.conf I have:

MODULES=(!pcspkr !snd_pcm_oss !snd_mixer_oss !snd_seq_oss vboxdrv vboxnetflt vboxnetadp !cisco_ipsec)

I previously had:

MODULES=(!pcspkr !snd_pcm_oss !snd_mixer_oss !snd_seq_oss vboxdrv vboxnetflt vboxnetadp !cisco_ipsec acpi-cpufreq cpufreq_ondemand cpufreq_powersave)

And daemons:

DAEMONS=(syslog-ng hal net-auto-wired net-auto-wireless sensors alsa cups sshd crond !httpd !transmissiond xinetd jetty)

For now I don't want to install any power management solution.  If my CPU temps are OK I don't see how adding CPU scaling into the mix can help.  However I will load the acpi-cpufreq module and watch what goes on next time with:

[root@nickh ~]# modprobe acpi-cpufreq
[root@nickh ~]# cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_available_governors
performance 
[root@nickh ~]# watch grep \"cpu MHz\" /proc/cpuinfo
Every 2.0s: grep "cpu MHz" /proc/cpuinfo               Wed Jun  8 08:26:19 2011

cpu MHz         : 2668.000
cpu MHz         : 2668.000

I'll post back any more information I gather next time it occurs.  einhard, thanks for your replies.

#6 2011-06-08 11:19:36

einhard
Member
From: Poland
Registered: 2010-01-05
Posts: 89

Re: [SOLVED] System slows after high CPU load

made_in_nz wrote: I did try increasing the CPU frequency with your second set of commands but this did not speed up the system

But has it worked? Have you checked with cpufreq-info whether the frequency changed from 900 MHz to 2.66 GHz?

The problem is probably the acpid daemon. Uninstall it (at least to confirm). You use hal? What for? LXDE doesn't need it. Hal can cause this issue too.

If you don't have a decent power management solution, the system sometimes won't know what to do and will just apply some defaults. On a desktop that's no big deal, but on a notebook it can create a real mess. If you don't need CPU scaling, just use the performance governor (the default in Arch) or permanently set the frequency to maximum. On the other hand, ondemand behaviour is probably the best for both hardware and user.

If I were you I would uninstall acpid (or look closely at its configuration) and hal, install laptop-mode-tools https://wiki.archlinux.org/index.php/Laptop_Mode_Tools, configure it (especially the cpufreq hook, where you can set everything you want it to do on AC, on battery, with the lid closed, etc.) and add laptop-mode along with dbus to DAEMONS in rc.conf. You will of course need acpi-cpufreq in MODULES.

Have you checked the CPU load during the slowdown with htop/top/lxtask? Maybe some program/service is causing this. What's surprising is that I have an old laptop with a very similar processor, a T9400 (2.53 GHz), and on powersave (800 MHz) it's not slow at all with KDE, not to mention LXDE. How do you start your LXDE session?

Last edited by einhard (2011-06-08 11:43:03)

#7 2011-06-08 14:00:59

made_in_nz
Member
From: Wellington, New Zealand
Registered: 2010-01-04
Posts: 53

Re: [SOLVED] System slows after high CPU load

einhard wrote: But has it worked? Have you checked with cpufreq-info whether the frequency changed from 900 MHz to 2.66 GHz?

No, the frequency did not change.  Since my last post I have rebooted twice due to this problem.  Most recently, I have reinstalled cpufrequtils and tried to reset the frequencies.  I also added cpufreq_userspace to MODULES.  I did the following:

[root@nickh ~]# cpufreq-set -c 0 -g userspace
[root@nickh ~]# cpufreq-set -c 1 -g userspace
[root@nickh ~]# cpufreq-set -c 0 -f 2.67GHz
[root@nickh ~]# cpufreq-set -c 1 -f 2.67GHz

However cpufreq-info reported:

[root@nickh ~]# cpufreq-info
cpufrequtils 008: cpufreq-info (C) Dominik Brodowski 2004-2009
Report errors and bugs to cpufreq@vger.kernel.org, please.
analyzing CPU 0:
  driver: acpi-cpufreq
  CPUs which run at the same hardware frequency: 0 1
  CPUs which need to have their frequency coordinated by software: 0
  maximum transition latency: 10.0 us.
  hardware limits: 800 MHz - 2.67 GHz
  available frequency steps: 2.67 GHz, 2.67 GHz, 2.13 GHz, 1.60 GHz, 800 MHz
  available cpufreq governors: userspace, performance
  current policy: frequency should be within 800 MHz and 800 MHz.
                  The governor "userspace" may decide which speed to use
                  within this range.
  current CPU frequency is 800 MHz (asserted by call to hardware).
analyzing CPU 1:
  driver: acpi-cpufreq
  CPUs which run at the same hardware frequency: 0 1
  CPUs which need to have their frequency coordinated by software: 1
  maximum transition latency: 10.0 us.
  hardware limits: 800 MHz - 2.67 GHz
  available frequency steps: 2.67 GHz, 2.67 GHz, 2.13 GHz, 1.60 GHz, 800 MHz
  available cpufreq governors: userspace, performance
  current policy: frequency should be within 800 MHz and 800 MHz.
                  The governor "userspace" may decide which speed to use
                  within this range.
  current CPU frequency is 800 MHz (asserted by call to hardware).

From this the 800 - 800 range seems wrong, so I tried:

[root@nickh ~]# cpufreq-set -c 0 -u 2.67GHz
[root@nickh ~]# cpufreq-set -c 1 -u 2.67GHz

But the upper range stayed at 800 MHz.  So I can't seem to increase the frequency with cpufreq-set.
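
For anyone following along, here is a minimal sketch of checking the same thing straight from sysfs, without cpufrequtils. It assumes the usual cpufreq layout under /sys/devices/system/cpu/cpuN/cpufreq. The bios_limit file is optional and only exposed by some drivers and kernel versions, and the function name cpufreq_ceiling is made up for illustration.

```shell
#!/bin/sh
# Compare a CPU's policy ceiling with its hardware maximum, using cpufreq sysfs files.
# cpufreq_ceiling is a hypothetical helper name, not part of cpufrequtils.
cpufreq_ceiling() {
    dir=${1:-/sys/devices/system/cpu/cpu0/cpufreq}
    if [ ! -r "$dir/cpuinfo_max_freq" ] || [ ! -r "$dir/scaling_max_freq" ]; then
        echo "no cpufreq data in $dir"
        return 1
    fi
    hw_max=$(cat "$dir/cpuinfo_max_freq")      # kHz, highest frequency the hardware offers
    policy_max=$(cat "$dir/scaling_max_freq")  # kHz, highest frequency the policy allows
    bios_limit=unknown
    # bios_limit is optional: only some drivers expose the firmware-requested cap
    if [ -r "$dir/bios_limit" ]; then
        bios_limit=$(cat "$dir/bios_limit")
    fi
    if [ "$policy_max" -lt "$hw_max" ]; then
        status=clamped
    else
        status=ok
    fi
    echo "hw_max=$hw_max policy_max=$policy_max bios_limit=$bios_limit status=$status"
}
```

Calling cpufreq_ceiling /sys/devices/system/cpu/cpu1/cpufreq prints one line for that CPU. A status of clamped together with a bios_limit below hw_max would suggest that the firmware, rather than a governor or daemon, is holding the frequency down.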

einhard wrote: The problem is probably the acpid daemon. Uninstall it (at least to confirm). You use hal? What for? LXDE doesn't need it. Hal can cause this issue too.

I use hal because I run a couple of KDE apps.

einhard wrote: Have you checked the CPU load during the slowdown with htop/top/lxtask? Maybe some program/service is causing this. What's surprising is that I have an old laptop with a very similar processor, a T9400 (2.53 GHz), and on powersave (800 MHz) it's not slow at all with KDE, not to mention LXDE. How do you start your LXDE session?

Yes, I have checked the load and basically everything is consuming more CPU time than it should.  Simple things take up a huge portion of CPU time.  For example, if I drag a terminal window left and right slowly across the screen, X consumes about 65% of my CPUs (both, not just one).  Conky will often take up more than 30% of CPU time.  It's not limited to one bad process.
I was also surprised, as I would expect a dual-core processor at 800 MHz to be able to run LXDE fine.
I start LXDE in my .xinitrc

exec ck-launch-session startlxde

I don't run a GUI login manager.

Right now I don't have processor.ignore_ppc=1 on the kernel boot line - shall I add that?
I will try uninstalling acpid and see what happens.
What I don't understand is that it was running with only the performance governor available yet still dropped to 800MHz, and couldn't be increased.

Thanks again.

#8 2011-06-08 15:12:58

einhard
Member
From: Poland
Registered: 2010-01-05
Posts: 89

Re: [SOLVED] System slows after high CPU load

What does cpufreq-info show with the performance governor?

made_in_nz wrote: I use hal because I run a couple of KDE apps.

KDE hasn't needed hal since 4.6, so KDE apps don't either. All you need is dbus in DAEMONS, since it will no longer be started automatically by hal.

Could you install trayfreq (tray application with frequency control https://wiki.archlinux.org/index.php/Trayfreq) http://aur.archlinux.org/packages.php?ID=26999 and see if you can set the max frequency with it?

I still think it's something to do with acpid and hal. Nothing else can trigger a frequency change on your system.

Now that I think about it, is your system up to date?

Last edited by einhard (2011-06-08 15:30:22)

#9 2011-06-09 06:33:01

made_in_nz
Member
From: Wellington, New Zealand
Registered: 2010-01-04
Posts: 53

Re: [SOLVED] System slows after high CPU load

Thanks, I wasn't aware that hal was no longer required by KDE.  I no longer have hal or acpid.  I have installed trayfreq, and while the system is running fine I can switch frequencies.  I will wait and see if the problem recurs and see if I can increase the frequency with trayfreq at that time.  Right now I have not loaded any additional governors.  If things settle down I will configure trayfreq to handle the scaling.

In my rc.conf I now have:

MODULES=(vboxdrv vboxnetflt vboxnetadp acpi-cpufreq)
...
DAEMONS=(syslog-ng dbus udev net-auto-wired net-auto-wireless sensors alsa cups sshd crond !httpd !transmissiond xinetd jetty)

I'll post back again once something, or hopefully *nothing*, happens...

EDIT - I should have mentioned that my system is up to date.

Last edited by made_in_nz (2011-06-09 06:33:38)

#10 2011-06-09 08:26:10

made_in_nz
Member
From: Wellington, New Zealand
Registered: 2010-01-04
Posts: 53

Re: [SOLVED] System slows after high CPU load

It has dropped again.  This time running amarok and eclipse was enough to trigger it.
trayfreq is unable to increase the frequency.  cpufreq-info output is the same as I have posted earlier when stuck at 800MHz.

To confirm, I do not have hal or acpid installed or running:

[nickh] 53-> ps -ef | grep hal
nicholas  9491  9478  0 11:17 pts/7    00:00:00 grep hal
[nickh] 54-> ps -ef | grep acpid
root        19     2  0 09:14 ?        00:00:00 [kacpid]
nicholas  9493  9478  0 11:17 pts/7    00:00:00 grep acpid
[nickh] 55->

Feels like I am running out of options.  This is my work machine so I need to resolve this somehow.

#11 2011-06-09 10:19:08

made_in_nz
Member
From: Wellington, New Zealand
Registered: 2010-01-04
Posts: 53

Re: [SOLVED] System slows after high CPU load

I'm now pretty sure this is a hardware issue and not software. 

This morning I noticed my screen flickering during boot when switching to the framebuffer, a bright grey flash.  Also when I switch to the external monitor and turn the LVDS screen off it remains on with a grey screen.

The litmus test, though, was booting up an Ubuntu LiveCD.  By the time Ubuntu had loaded the frequency had dropped to 800 MHz.

Before I call in Dell support, I have one last theory, and that is that it is very hot here in Crete at the moment.  I don't turn on air-con in my office and it is about 30-31 degrees Celsius.  Despite the sensors readings constantly displaying the CPU temps in the high 60s and reporting that the high/critical temp is 105, could it be a heat issue?  I will try running the laptop tonight when the temperature drops a bit to see what happens.

Is there any logging if the hardware is forcing the CPU frequency down?

Thanks again.

#12 2011-06-09 16:18:18

einhard
Member
From: Poland
Registered: 2010-01-05
Posts: 89

Re: [SOLVED] System slows after high CPU load

You may be right. From the start it looked like critical hardware throttling. Do you have Windows on this laptop? If yes, does this problem occur there?

You don't have coretemp in MODULES, so I don't know where you are getting the temperatures from. Dell's internal sensors? Most notebooks have at least two sets of CPU sensors: one pair of thermal probes inside the processor (which can show garbage when wrongly calibrated, not so rare with Intel CPUs) and a second pair on the processor provided by the manufacturer. If I remember correctly, the first one triggers throttling and emergency shutdown. You can check whether it's a heat problem by setting the governor to powersave; the CPU shouldn't get hot with that even in extreme conditions.

You say the temperatures are in the 60s, but you probably read them at idle; it wouldn't be surprising if they reached 100 under full load. I checked the notebookcheck tests of your laptop, and their processor (with a similar TDP) went to 85 under load at an air temperature of 22.

made_in_nz wrote: Is there any logging if the hardware is forcing the CPU frequency down?

I don't know, but you can check the logs in /var/log, especially kernel.log, dmesg.log and errors.log
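
A rough sketch of how that search could be done, for what it's worth. The exact wording of the kernel's thermal messages differs between versions, so the pattern below is deliberately loose and is a guess rather than a definitive list. The thermal_throttle counter files under sysfs are only present on some CPUs and kernels, and both function names are made up for illustration.

```shell
#!/bin/sh
# Search kernel messages for signs of thermal throttling.
# With file arguments it searches those files; without, it reads stdin (e.g. from dmesg).
# The function names and the pattern are illustrative assumptions.
find_throttle_events() {
    # -h: no file name prefix, -i: case-insensitive, -E: extended regex
    if ! grep -hiE 'thrott|temperature above threshold|critical temperature' "$@" 2>/dev/null; then
        echo "no throttling messages found"
    fi
}

# Print the per-CPU thermal throttle event counters, where the kernel exposes them.
throttle_counts() {
    cpu_root=${1:-/sys/devices/system/cpu}
    for f in "$cpu_root"/cpu[0-9]*/thermal_throttle/*count; do
        [ -r "$f" ] || continue   # skip unmatched globs and unreadable files
        echo "$f: $(cat "$f")"
    done
}
```

Typical use would be dmesg | find_throttle_events, or find_throttle_events /var/log/kernel.log /var/log/errors.log, followed by throttle_counts before and after a heavy load. A counter that goes up across a slowdown would point at thermal throttling.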

If it's a hardware problem, you can't do anything about it. It shouldn't depend on air temperature too much, though. It could be badly applied thermal paste, a problem with the cooling system or a damaged processor.

Last edited by einhard (2011-06-09 16:40:07)

#13 2011-06-10 05:52:52

made_in_nz
Member
From: Wellington, New Zealand
Registered: 2010-01-04
Posts: 53

Re: [SOLVED] System slows after high CPU load

It is certainly looking like overheating.  I took the notebook home yesterday afternoon seeing as I couldn't keep it running here.  At home, which is much cooler, it ran fine.  The screen flicker also stopped.  So I bought a cooling fan pad for the notebook to sit on, and now have the air-con on in the office, so if it still overheats I think I'll be making use of the Dell 3yr Complete Cover :)

I don't run Windows other than in VirtualBox.  From my understanding of lm_sensors, having sensors in my DAEMONS loads the coretemp module.  I do have coretemp loaded.
I'm not sure how quickly CPUs cool down when the frequency is scaled back, but you're probably right: when I checked the temps it was after the CPU had scaled down and the system was running slowly.  I really should have had a script to dump the temps every so often to see what they got to before the scaling.
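
Something along these lines would probably have done it. This is only a sketch: it assumes the ACPI thermal zones under /sys/class/thermal (values in millidegrees Celsius) and the cpufreq scaling_cur_freq files, and the function name and log format are made up. The coretemp readings live under hwmon instead, so logging the output of sensors would be an alternative.

```shell
#!/bin/sh
# Print one timestamped line with thermal zone temperatures and current CPU frequencies.
# Paths are the usual sysfs locations; the function name and format are illustrative.
log_thermal_sample() {
    thermal_root=${1:-/sys/class/thermal}
    cpu_root=${2:-/sys/devices/system/cpu}
    line=$(date '+%Y-%m-%d %H:%M:%S')
    for f in "$thermal_root"/thermal_zone*/temp; do
        [ -r "$f" ] || continue
        milli=$(cat "$f")                      # millidegrees Celsius
        line="$line temp=$((milli / 1000))C"
    done
    for f in "$cpu_root"/cpu[0-9]*/cpufreq/scaling_cur_freq; do
        [ -r "$f" ] || continue
        khz=$(cat "$f")                        # kHz
        line="$line freq=$((khz / 1000))MHz"
    done
    echo "$line"
}
```

Run in a loop such as while true; do log_thermal_sample >> ~/thermal.log; sleep 10; done, it would leave a record of what the temperatures were doing just before the frequency dropped.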

Thank you for your help einhard, it is very much appreciated.
