
#1 2019-04-05 13:59:48

jwhendy
Member
Registered: 2010-04-01
Posts: 621

After resume from suspend, one CPU core has high load/temps

Edit: updated title now that problem is more clear

I think this is a new problem, but can't be sure. I've only noticed it on the past ~3-5 boots... My fan seemed to be running higher than usual with no obvious heavy applications open. Just chromium and a terminal.

HP Zbook 15, CPU0: Intel(R) Core(TM) i7-7820HQ CPU @ 2.90GHz (family: 0x6, model: 0x9e, stepping: 0x9)

- htop, showing CPU1 at 77% with the rest at near 0%
- i7z output, fairly representative after letting it run for a bit, confirming the non-uniform load distribution:

Socket [0] - [physical cores=4, logical cores=8, max online cores ever=4]
  TURBO ENABLED on 4 Cores, Hyper Threading ON
  Max Frequency without considering Turbo 3003.10 MHz (100.10 x [30])
  Max TURBO Multiplier (if Enabled) with 1/2/3/4 Cores is  39x/37x/36x/35x
  Real Current Frequency 3876.28 MHz [100.10 x 38.72] (Max of below)
        Core [core-id]  :Actual Freq (Mult.)      C0%   Halt(C1)%  C3 %   C6 %   C7 %  Temp
        Core 1 [0]:       3876.28 (38.72x)      84.1       0       0       0       1    60
        Core 2 [1]:       3564.24 (35.61x)         1    0.166      0       1    98.7    50
        Core 3 [2]:       3626.30 (36.23x)         1    0.663      1       1    96.8    49
        Core 4 [3]:       3644.56 (36.41x)         1    2.59       1    1.15    94.7    49

Top output of the only non-zero %CPU items listed:

top - 08:48:55 up 10:33,  1 user,  load average: 0.25, 0.25, 0.31
Tasks: 243 total,   1 running, 242 sleeping,   0 stopped,   0 zombie
%Cpu(s):  0.1 us,  0.2 sy,  0.0 ni, 89.9 id,  0.1 wa,  9.7 hi,  0.0 si,  0.0 st
MiB Mem :  31960.6 total,  26571.3 free,   1835.6 used,   3553.7 buff/cache
MiB Swap:      0.0 total,      0.0 free,      0.0 used.  29648.5 avail Mem 

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND
  383 root      20   0       0      0      0 S   0.7   0.0   0:06.46 btrfs-transacti
25565 jwhendy   20   0   13124   4960   3508 S   0.7   0.0   0:04.93 htop
  353 root       0 -20       0      0      0 I   0.3   0.0   0:01.94 kworker/u17:2-kcryptd+
22793 root       0 -20       0      0      0 I   0.3   0.0   0:00.37 kworker/u17:4-kcryptd+
23444 root       0 -20       0      0      0 I   0.3   0.0   0:00.25 kworker/u17:1-kcryptd+
23605 jwhendy   20   0 1382048 328392 158120 S   0.3   1.0   0:42.53 chromium

- when polybar was running, it showed 61C as the temp.

$ acpi -t
Thermal 0: ok, 24.0 degrees C
Thermal 1: ok, 44.0 degrees C
Thermal 2: ok, 41.0 degrees C
Thermal 3: ok, 0.0 degrees C
Thermal 4: ok, 61.0 degrees C
Thermal 5: ok, 41.0 degrees C

$ sensors
coretemp-isa-0000
Adapter: ISA adapter
Package id 0:  +58.0°C  (high = +100.0°C, crit = +100.0°C)
Core 0:        +58.0°C  (high = +100.0°C, crit = +100.0°C)
Core 1:        +49.0°C  (high = +100.0°C, crit = +100.0°C)
Core 2:        +48.0°C  (high = +100.0°C, crit = +100.0°C)
Core 3:        +53.0°C  (high = +100.0°C, crit = +100.0°C)

acpitz-acpi-0
Adapter: ACPI interface
temp1:        +59.0°C  (crit = +128.0°C)
temp2:        +42.0°C  (crit = +128.0°C)
temp3:        +40.0°C  (crit = +128.0°C)
temp4:        +40.0°C  (crit = +128.0°C)
temp5:        +24.0°C  (crit = +128.0°C)
temp6:         +0.0°C  (crit = +128.0°C)

iwlwifi-virtual-0
Adapter: Virtual device
temp1:        +29.0°C  

pch_skylake-virtual-0
Adapter: Virtual device
temp1:        +43.5°C  

Questions:
- is the singled out active state on Core0 suspicious?
- if so, any suggestions on where to look next?
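One possible place to look next, as a rough sketch: the `top` summary above reports 9.7 `hi` (hardware interrupt time), and that summary line is an average over all logical CPUs. Assuming the interrupt time is concentrated on a single logical CPU, scaling it by the 8 logical cores from the i7z header lands close to the 77% htop shows for the busy core. The function names below are made up for illustration.

```python
import re

def parse_top_cpu_summary(line):
    """Parse a top '%Cpu(s):' summary line into {field: percent}."""
    # Accept both '.' and ',' as the decimal separator (locale dependent).
    pairs = re.findall(r"(\d+[.,]\d+)\s+([a-z]{2})", line)
    return {name: float(value.replace(",", ".")) for value, name in pairs}

def single_cpu_equivalent(summary_percent, logical_cpus):
    """Scale an all-CPU average to the load it would be on one logical CPU."""
    # Assumption: the whole share lands on a single logical CPU.
    return summary_percent * logical_cpus

line = "%Cpu(s):  0.1 us,  0.2 sy,  0.0 ni, 89.9 id,  0.1 wa,  9.7 hi,  0.0 si,  0.0 st"
fields = parse_top_cpu_summary(line)
print(round(single_cpu_equivalent(fields["hi"], 8), 1))  # prints 77.6
```

If that reading is right, the load would be interrupt handling rather than any process, which would explain why the process list in top looks idle.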

Last edited by jwhendy (2019-04-06 19:24:42)


#2 2019-04-05 16:58:37

d_fajardo
Member
Registered: 2017-07-28
Posts: 1,563

Re: After resume from suspend, one CPU core has high load/temps

Have you checked your BIOS to see if the 4 cores are all active and checked their P-states (or C-states)?


#3 2019-04-05 23:10:32

ewaller
Administrator
From: Pasadena, CA
Registered: 2009-07-13
Posts: 19,739

Re: After resume from suspend, one CPU core has high load/temps

d_fajardo wrote:

Have you checked your BIOS to see if the 4 cores are all active and checked their P-states (or C-states)?

I would say the i7z output confirms that.




#4 2019-04-06 17:59:08

jwhendy
Member
Registered: 2010-04-01
Posts: 621

Re: After resume from suspend, one CPU core has high load/temps

@ewaller: I concur. Of course, I've only looked at i7z because I have a reason to suspect something is off, so I don't know what I should expect to see. I googled for a CPU load simulator and found this script. The cores do all appear to scale under load, so it's not as if the other cores can't share the load:

$ python2 cpu-load-generator.py -n 8 10 test.data

Socket [0] - [physical cores=4, logical cores=8, max online cores ever=4]
  TURBO ENABLED on 4 Cores, Hyper Threading ON
  Max Frequency without considering Turbo 3003.10 MHz (100.10 x [30])
  Max TURBO Multiplier (if Enabled) with 1/2/3/4 Cores is  39x/37x/36x/35x
  Real Current Frequency 3492.02 MHz [100.10 x 34.88] (Max of below)
        Core [core-id]  :Actual Freq (Mult.)      C0%   Halt(C1)%  C3 %   C6 %   C7 %  Temp      VCore
        Core 1 [0]:       3491.99 (34.88x)        97       0       0       0       0    62      1.1290
        Core 2 [1]:       3491.96 (34.88x)        95       0       0       0       1    60      1.1293
        Core 3 [2]:       3492.02 (34.88x)      94.8       0       0       0       0    60      1.1266
        Core 4 [3]:       3492.02 (34.88x)        98       0       0       0       0    65      1.1266

Under low-demand conditions (browser and terminal)  I'm surprised that:

1) CPU0 would be at 80+% cstate C0 (top shows almost nothing above a few % cpu, so where is the load coming from?)
2) the temp is enough to cause the fans to run noticeably (you get used to the noise level of a computer doing nothing, and this is definitely running the fan more than that)
3) the other cores are dormant (if CPU0 is needed to such a degree to kick the fans on, why aren't the others helping out?)
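To see where the time on one core goes when no process accounts for it, one option is to compare two snapshots of `/proc/stat`, which breaks each CPU's time into user, system, idle, irq and so on. The sketch below uses made-up snapshot text so it runs anywhere; the helper names are hypothetical, and on a real system the two strings would be `/proc/stat` read a few seconds apart.

```python
FIELDS = ("user", "nice", "system", "idle", "iowait", "irq", "softirq", "steal")

def parse_proc_stat(text):
    """Return {cpu_name: {field: ticks}} for the per-CPU lines of /proc/stat."""
    cpus = {}
    for line in text.splitlines():
        parts = line.split()
        # Per-CPU lines look like "cpu0 ..."; skip the aggregate "cpu" line.
        if parts and parts[0].startswith("cpu") and parts[0] != "cpu":
            values = [int(v) for v in parts[1:1 + len(FIELDS)]]
            cpus[parts[0]] = dict(zip(FIELDS, values))
    return cpus

def busy_breakdown(before, after):
    """Percent of elapsed ticks per field, per CPU, between two snapshots."""
    result = {}
    for cpu, now in after.items():
        delta = {f: now[f] - before[cpu][f] for f in FIELDS}
        total = sum(delta.values()) or 1  # avoid dividing by zero
        result[cpu] = {f: 100.0 * d / total for f, d in delta.items()}
    return result

# Made-up snapshots, only to show the shape of the result.
BEFORE = """cpu  100 0 100 1800 0 0 0 0 0 0
cpu0 50 0 50 900 0 0 0 0 0 0
cpu1 50 0 50 900 0 0 0 0 0 0
"""
AFTER = """cpu  110 0 110 1890 0 90 0 0 0 0
cpu0 52 0 53 905 0 90 0 0 0 0
cpu1 58 0 57 985 0 0 0 0 0 0
"""
shares = busy_breakdown(parse_proc_stat(BEFORE), parse_proc_stat(AFTER))
for cpu, pct in sorted(shares.items()):
    print(cpu, "irq %.1f%%" % pct["irq"], "idle %.1f%%" % pct["idle"])
```

A core that is busy with a large `irq` share and almost no `user` or `system` time would point at interrupt handling rather than a runaway process.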


#5 2019-04-06 18:32:33

jwhendy
Member
Registered: 2010-04-01
Posts: 621

Re: After resume from suspend, one CPU core has high load/temps

Evidence that a high C0% on Core 1 is not the norm (i7z outputs with low C0% across all cores):
- example 1
- example 2 (third pic attached to issue)
- example 3


#6 2019-04-06 19:06:06

jwhendy
Member
Registered: 2010-04-01
Posts: 621

Re: After resume from suspend, one CPU core has high load/temps

Odd. Things are insanely better. On the last boot I futzed with powertop and upgraded. Things seem good now.

The kind of i7z output I'm seeing now:

Socket [0] - [physical cores=4, logical cores=8, max online cores ever=4]
  TURBO ENABLED on 4 Cores, Hyper Threading ON
  Max Frequency without considering Turbo 3003.10 MHz (100.10 x [30])
  Max TURBO Multiplier (if Enabled) with 1/2/3/4 Cores is  39x/37x/36x/35x
  Real Current Frequency 798.25 MHz [100.10 x 7.97] (Max of below)
        Core [core-id]  :Actual Freq (Mult.)      C0%   Halt(C1)%  C3 %   C6 %   C7 %  Temp
        Core 1 [0]:       797.94 (7.97x)        5.91      13       0       1    84.4    37
        Core 2 [1]:       798.25 (7.97x)        1.63    1.95       1       1    95.6    38
        Core 3 [2]:       797.86 (7.97x)        4.58    10.1       1    1.41    86.2    38
        Core 4 [3]:       797.88 (7.97x)           1    24.1       1    4.13    70.6    36

Looks much better!

Possible causes:
- In powertop I set all of the tunables to "Good", but after a reboot, the ones that were "Bad" are back to being bad. So, don't think it was that.

- Wakeups/second in powertop were at ~700 on the last boot (with the high Core0 load and 61C temps). I'd seen them as high as 5k, and now they're at 1200-1500 with low CPU usage, so that doesn't seem to correlate.

- upgrades on last boot:

[2019-04-06 13:49] [ALPM] upgraded arduino (1:1.8.8-1 -> 1:1.8.9-1)
[2019-04-06 13:49] [ALPM] upgraded blosc (1.16.2-1 -> 1.16.3-1)
[2019-04-06 13:49] [ALPM] upgraded openjpeg2 (2.3.0-3 -> 2.3.1-1)
[2019-04-06 13:49] [ALPM] upgraded chromium (73.0.3683.86-1 -> 73.0.3683.103-1)
[2019-04-06 13:49] [ALPM] upgraded jbig2dec (0.15-1 -> 0.16-1)
[2019-04-06 13:49] [ALPM] upgraded ghostscript (9.26-2 -> 9.27-1)
[2019-04-06 13:49] [ALPM] upgraded libmagick (7.0.8.36-1 -> 7.0.8.37-1)
[2019-04-06 13:49] [ALPM] upgraded imagemagick (7.0.8.36-1 -> 7.0.8.37-1)
[2019-04-06 13:49] [ALPM] upgraded jemalloc (1:5.1.0-1 -> 1:5.2.0-1)
[2019-04-06 13:49] [ALPM] upgraded libnotify (0.7.7-2 -> 0.7.8-1)
[2019-04-06 13:49] [ALPM] upgraded linux (5.0.5.arch1-1 -> 5.0.6.arch1-1)
[2019-04-06 13:49] [ALPM] upgraded mariadb-libs (10.3.13-4 -> 10.3.14-1)
[2019-04-06 13:49] [ALPM] upgraded nano (4.0-1 -> 4.0-2)
[2019-04-06 13:49] [ALPM] upgraded nvidia (418.56-5 -> 418.56-6)
[2019-04-06 13:49] [ALPM] upgraded python-pillow (5.4.1-1 -> 6.0.0-1)
[2019-04-06 13:49] [ALPM] upgraded python-pycryptodome (3.8.0-1 -> 3.8.1-1)
[2019-04-06 13:49] [ALPM] upgraded python-setuptools (1:40.8.0-1 -> 1:40.9.0-1)
[2019-04-06 13:49] [ALPM] upgraded python2-setuptools (1:40.8.0-1 -> 1:40.9.0-1)
[2019-04-06 13:49] [ALPM] upgraded ruby-hpricot (0.8.6-7 -> 0.8.6-8)
[2019-04-06 13:49] [ALPM] upgraded talloc (2.1.16-2 -> 2.2.0-1)
[2019-04-06 13:49] [ALPM] upgraded xfsprogs (4.19.0-2 -> 4.20.0-1)
[2019-04-06 13:49] [ALPM] upgraded xmlsec (1.2.27-1 -> 1.2.27-2)

Kernel? Nvidia? I'm going to downgrade the kernel and try again.


#7 2019-04-06 19:22:55

jwhendy
Member
Registered: 2010-04-01
Posts: 621

Re: After resume from suspend, one CPU core has high load/temps

Got it! It's something related to suspend/resume.

At startup:

  Real Current Frequency 804.79 MHz [100.10 x 8.04] (Max of below)
        Core [core-id]  :Actual Freq (Mult.)      C0%   Halt(C1)%  C3 %   C6 %   C7 %  Temp      VCore
        Core 1 [0]:       797.84 (7.97x)           1    0.465      0       0    99.5    42      0.6702
        Core 2 [1]:       802.24 (8.01x)           1    0.218      0       0    99.8    42      0.6710
        Core 3 [2]:       794.34 (7.94x)           1    1.08       0       0    98.9    42      0.6715
        Core 4 [3]:       804.79 (8.04x)           1    1.44       0       0    98.5    41      0.6710

After a suspend/resume cycle:

  Real Current Frequency 3841.43 MHz [100.14 x 38.36] (Max of below)
        Core [core-id]  :Actual Freq (Mult.)      C0%   Halt(C1)%  C3 %   C6 %   C7 %  Temp
        Core 1 [0]:       3841.43 (38.36x)      87.3       0       0       0       1    55
        Core 2 [1]:       3641.83 (36.37x)      1.71    16.9       1       1    78.9    43
        Core 3 [2]:       3539.75 (35.35x)         1       0       1       1    98.7    42
        Core 4 [3]:       3642.87 (36.38x)         1    3.24       1    2.15    93.2    42

Any tips on where I would hunt down a crazy process post-suspend? The only thing I can think of is my i3 screenlock suspend service.

$ cat /etc/systemd/system/suspend\@jwhendy.service
[Unit]
Description=User suspend actions
Before=suspend.target

[Service]
User=%I
Type=simple
Environment=DISPLAY=:0
ExecStart=/home/jwhendy/installed/scripts/slock.sh
ExecStartPost=/usr/bin/sleep 1

[Install]
WantedBy=suspend.target

I'm going to disable that and reboot.


#8 2019-04-06 20:25:01

jwhendy
Member
Registered: 2010-04-01
Posts: 621

Re: After resume from suspend, one CPU core has high load/temps

Things tried:
- disabling the screen lock suspend service above
- using the kernel option intel_pstate=disable to try to fall back to the acpi-cpufreq driver
- systemctl suspend directly vs. lid close
- removing the acpi package
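When testing with intel_pstate=disable, it may be worth confirming that the option actually took effect. A minimal sketch, assuming the usual cpufreq sysfs layout: the `scaling_driver` file would normally report `acpi-cpufreq` with intel_pstate disabled. The function name and its `sysfs_root` parameter are made up for illustration.

```python
from pathlib import Path

def scaling_driver(cpu=0, sysfs_root="/sys/devices/system/cpu"):
    """Return the cpufreq driver name for one CPU, or None if not exposed."""
    path = Path(sysfs_root) / ("cpu%d" % cpu) / "cpufreq" / "scaling_driver"
    try:
        return path.read_text().strip()
    except OSError:
        # No cpufreq directory (or not Linux): nothing to report.
        return None

print("cpu0 scaling driver:", scaling_driver())
```

Running this before and after a suspend/resume cycle would also show whether the driver itself changes across resume.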

So far the behavior is always the same; really stumped here. Not sure if I'm looking for a botched setting that isn't properly set after resume, or a process that's triggered on resume and somehow Core0 gets dedicated to it.

Some logs
- dmesg
- ps -e v before and after suspend

Last edited by jwhendy (2019-04-06 20:27:54)


#9 2019-04-06 20:29:48

V1del
Forum Moderator
Registered: 2012-10-16
Posts: 21,423

Re: After resume from suspend, one CPU core has high load/temps

Issues like these are often firmware rather than kernel related. Check and update your UEFI/BIOS.


#10 2019-04-06 21:00:16

jwhendy
Member
Registered: 2010-04-01
Posts: 621

Re: After resume from suspend, one CPU core has high load/temps

@V1del: anything in particular I'm looking for with respect to "check your UEFI/BIOS"? I can look into updating as well.

Would your suspicion change if different distros behaved differently? I dual boot arch and ubuntu bionic. Bionic does not appear to have this issue. Things resumed and didn't ramp up to high fan. Unfortunately, ubuntu forces secure modules so I'm not able to run i7z unless I recompile that module. I grabbed dmesg from both arch and ubuntu just for the suspend through resume messages. Anything stand out in this diff?


#11 2019-10-30 08:26:41

mijorus
Member
Registered: 2019-10-30
Posts: 3

Re: After resume from suspend, one CPU core has high load/temps

Hi, I'm having the exact same issue on my Dell Inspiron laptop: the CPU goes to 3.1 GHz after resuming from suspend, with no tasks keeping it under load. I noticed the following:

- it only happens on AC power
- it happens with both acpi-cpufreq and intel_pstate (I already tried disabling intel_pstate)
- it doesn't get fixed until the next reboot

This is mostly to BUMP this discussion; hopefully I can find some help. Thank you.

EDIT:
running top, I see a huge load on CPU0 caused by hardware interrupts:

%Cpu0  :  0,3 us,  1,3 sy,  0,0 ni, 64,3 id,  0,0 wa, 34,0 hi,  0,0 si,  0,0 st <= see here
%Cpu1  :  3,0 us,  1,3 sy,  0,0 ni, 95,7 id,  0,0 wa,  0,0 hi,  0,0 si,  0,0 st
%Cpu2  :  3,3 us,  1,0 sy,  0,0 ni, 95,3 id,  0,0 wa,  0,3 hi,  0,0 si,  0,0 st
%Cpu3  :  1,7 us,  1,3 sy,  0,0 ni, 97,0 id,  0,0 wa,  0,0 hi,  0,0 si,  0,0 st

When this goes away, which sometimes happens by itself after a while, the load goes away too. Honestly, this is the first time I am troubleshooting this kind of issue, and I don't really know where to start.
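One possible starting point, sketched under assumptions: `/proc/interrupts` keeps a per-CPU counter for every IRQ, so comparing two snapshots taken a few seconds apart should show which interrupt source is hammering CPU0. The snapshot text and the device name `example_touchpad_bus` below are invented sample data, and the helper names are hypothetical.

```python
def parse_interrupts(text):
    """Parse /proc/interrupts text into {irq_label: (per_cpu_counts, description)}."""
    lines = text.splitlines()
    ncpu = len(lines[0].split())  # header row: CPU0 CPU1 ...
    table = {}
    for line in lines[1:]:
        if ":" not in line:
            continue
        label, rest = line.split(":", 1)
        tokens = rest.split()
        counts = []
        # Leading numeric tokens are the per-CPU counters (ERR/MIS rows have only one).
        while tokens and tokens[0].isdigit() and len(counts) < ncpu:
            counts.append(int(tokens.pop(0)))
        table[label.strip()] = (counts, " ".join(tokens))
    return table

def fastest_growing(before, after, cpu=0, top=3):
    """Rank IRQs by how much their counter on one CPU grew between snapshots."""
    growth = []
    for label, (counts, desc) in after.items():
        old = before.get(label, ([], ""))[0]
        if cpu < len(counts):
            prev = old[cpu] if cpu < len(old) else 0
            growth.append((counts[cpu] - prev, label, desc))
    return sorted(growth, reverse=True)[:top]

# Invented sample snapshots; real ones would come from /proc/interrupts.
BEFORE = """           CPU0       CPU1
  1:        120          0   IO-APIC    1-edge      i8042
 16:       5000         10   IO-APIC   16-fasteoi   example_touchpad_bus
NMI:          3          2   Non-maskable interrupts
ERR:          0
"""
AFTER = """           CPU0       CPU1
  1:        125          0   IO-APIC    1-edge      i8042
 16:     905000         10   IO-APIC   16-fasteoi   example_touchpad_bus
NMI:          3          2   Non-maskable interrupts
ERR:          0
"""
for grown, label, desc in fastest_growing(parse_interrupts(BEFORE), parse_interrupts(AFTER)):
    print(label, grown, desc)
```

The description column of the top entry names the driver or device behind the IRQ, which gives something concrete to search for.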

EDIT2:
OK, I found a similar problem here on the forum, where another user was reporting issues with the touchpad. I found out that as soon as I touch the touchpad, the CPU gets back to normal. I know this sounds crazy, but it is exactly what happens. If I use my laptop with a mouse, it stays stuck at 3.1 GHz. I will try to look for a fix.

EDIT3 (corrected: the module to blacklist is intel_lpss_pci, sorry for that):
Found a workaround for the issue: I created a blacklist.conf file to prevent the kernel module from loading. I don't know if there are any side effects; for now it seems fine.

sudo nano /etc/modprobe.d/blacklist.conf
blacklist intel_lpss_pci

save and reboot
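After the reboot, it may be worth checking that the module really stayed out. A `blacklist` line only stops automatic loading, so the module can still end up loaded if something requests it explicitly. A small sketch that looks for it in `/proc/modules`; the helper names are made up for illustration.

```python
def loaded_modules(modules_text):
    """Return the set of module names listed in /proc/modules-style text."""
    return {line.split()[0] for line in modules_text.splitlines() if line.strip()}

def is_loaded(name, modules_text):
    """True if the named module appears in the given /proc/modules text."""
    # The kernel reports module names with underscores, so normalise dashes.
    return name.replace("-", "_") in loaded_modules(modules_text)

try:
    with open("/proc/modules") as handle:
        text = handle.read()
except OSError:
    text = ""  # not on Linux, or /proc unavailable

print("intel_lpss_pci loaded:", is_loaded("intel_lpss_pci", text))
```

If this still prints True after rebooting, the blacklist file is probably not being picked up.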

Last edited by mijorus (2019-10-30 13:00:52)


#12 2019-10-30 14:10:27

jwhendy
Member
Registered: 2010-04-01
Posts: 621

Re: After resume from suspend, one CPU core has high load/temps

@mijorus: as you can see from this and other threads, there are a myriad of possible issues. From other threads, off the top of my head, suggestions range from:

- you don't understand how P-states work, so frequency is misleading and you can only really assess by the % of time spent in the C0/C1 states
- upgrade the BIOS
- futz with acpi and other power management tools
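For the first suggestion, idle-state residency can be estimated without i7z from the kernel's cpuidle counters in sysfs, where each `stateN` directory has a `name` and a cumulative `time` in microseconds. This is a sketch under that assumption. It reflects the kernel's own accounting, which need not match i7z's hardware counters exactly. The function names and the sample numbers are made up.

```python
from pathlib import Path

def read_cpuidle(cpu=0, sysfs_root="/sys/devices/system/cpu"):
    """Return {state_name: cumulative_microseconds} for one CPU's idle states."""
    states = {}
    base = Path(sysfs_root) / ("cpu%d" % cpu) / "cpuidle"
    for state_dir in sorted(base.glob("state*")):
        name = (state_dir / "name").read_text().strip()
        states[name] = int((state_dir / "time").read_text())
    return states

def residency_percent(before, after, elapsed_us):
    """Percent of the elapsed wall time spent in each idle state."""
    return {name: 100.0 * (after[name] - before[name]) / elapsed_us
            for name in after}

# Made-up counters two seconds apart, standing in for two read_cpuidle() calls.
before = {"C1": 1_000_000, "C7s": 50_000_000}
after = {"C1": 1_100_000, "C7s": 50_200_000}
pct = residency_percent(before, after, elapsed_us=2_000_000)
print(pct, "not idle: %.1f%%" % (100.0 - sum(pct.values())))
```

Whatever share is left after summing the idle states is roughly the time the core spent awake, which is the number to compare before and after suspend.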

My issue ultimately went away, and I thought it might have been kernel related, but in re-reading this thread I see I installed powertop. I just removed that as I didn't even realize I had it installed. I'll let you know if the problem comes back.

Given you have a Dell and I'm on an HP, I'm not sure how to diagnose the mouse issue, and I'd have to find my equivalent driver or at least dig in. Sorry I can't help more. On one hand it bugs me, on the other hand my problem isn't there anymore!

