Edit: updated title now that the problem is clearer
I think this is a new problem, but can't be sure. I've only noticed it over the past ~3-5 boots. My fan seemed to be running faster than usual with no obvious heavy applications open, just chromium and a terminal.
HP Zbook 15, CPU0: Intel(R) Core(TM) i7-7820HQ CPU @ 2.90GHz (family: 0x6, model: 0x9e, stepping: 0x9)
- htop, showing CPU1 at 77% with the rest at near 0%
- pretty representative of letting i7z run for a bit, confirming the non-uniform load distribution
Socket [0] - [physical cores=4, logical cores=8, max online cores ever=4]
TURBO ENABLED on 4 Cores, Hyper Threading ON
Max Frequency without considering Turbo 3003.10 MHz (100.10 x [30])
Max TURBO Multiplier (if Enabled) with 1/2/3/4 Cores is 39x/37x/36x/35x
Real Current Frequency 3876.28 MHz [100.10 x 38.72] (Max of below)
Core [core-id] :Actual Freq (Mult.) C0% Halt(C1)% C3 % C6 % C7 % Temp
Core 1 [0]: 3876.28 (38.72x) 84.1 0 0 0 1 60
Core 2 [1]: 3564.24 (35.61x) 1 0.166 0 1 98.7 50
Core 3 [2]: 3626.30 (36.23x) 1 0.663 1 1 96.8 49
Core 4 [3]: 3644.56 (36.41x) 1 2.59 1 1.15 94.7 49
top output, listing only the items with non-zero %CPU:
top - 08:48:55 up 10:33, 1 user, load average: 0.25, 0.25, 0.31
Tasks: 243 total, 1 running, 242 sleeping, 0 stopped, 0 zombie
%Cpu(s): 0.1 us, 0.2 sy, 0.0 ni, 89.9 id, 0.1 wa, 9.7 hi, 0.0 si, 0.0 st
MiB Mem : 31960.6 total, 26571.3 free, 1835.6 used, 3553.7 buff/cache
MiB Swap: 0.0 total, 0.0 free, 0.0 used. 29648.5 avail Mem
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
383 root 20 0 0 0 0 S 0.7 0.0 0:06.46 btrfs-transacti
25565 jwhendy 20 0 13124 4960 3508 S 0.7 0.0 0:04.93 htop
353 root 0 -20 0 0 0 I 0.3 0.0 0:01.94 kworker/u17:2-kcryptd+
22793 root 0 -20 0 0 0 I 0.3 0.0 0:00.37 kworker/u17:4-kcryptd+
23444 root 0 -20 0 0 0 I 0.3 0.0 0:00.25 kworker/u17:1-kcryptd+
23605 jwhendy 20 0 1382048 328392 158120 S 0.3 1.0 0:42.53 chromium
- when polybar was running, it showed 61C as the temp.
$ acpi -t
Thermal 0: ok, 24.0 degrees C
Thermal 1: ok, 44.0 degrees C
Thermal 2: ok, 41.0 degrees C
Thermal 3: ok, 0.0 degrees C
Thermal 4: ok, 61.0 degrees C
Thermal 5: ok, 41.0 degrees C
$ sensors
coretemp-isa-0000
Adapter: ISA adapter
Package id 0: +58.0°C (high = +100.0°C, crit = +100.0°C)
Core 0: +58.0°C (high = +100.0°C, crit = +100.0°C)
Core 1: +49.0°C (high = +100.0°C, crit = +100.0°C)
Core 2: +48.0°C (high = +100.0°C, crit = +100.0°C)
Core 3: +53.0°C (high = +100.0°C, crit = +100.0°C)
acpitz-acpi-0
Adapter: ACPI interface
temp1: +59.0°C (crit = +128.0°C)
temp2: +42.0°C (crit = +128.0°C)
temp3: +40.0°C (crit = +128.0°C)
temp4: +40.0°C (crit = +128.0°C)
temp5: +24.0°C (crit = +128.0°C)
temp6: +0.0°C (crit = +128.0°C)
iwlwifi-virtual-0
Adapter: Virtual device
temp1: +29.0°C
pch_skylake-virtual-0
Adapter: Virtual device
temp1: +43.5°C
Questions:
- is the singled out active state on Core0 suspicious?
- if so, any suggestions on where to look next?
Last edited by jwhendy (2019-04-06 19:24:42)
Offline
Have you checked your BIOS to see if the 4 cores are all active and checked their P-states (or C-states)?
Offline
Have you checked your BIOS to see if the 4 cores are all active and checked their P-states (or C-states)?
I would say the i7z output confirms that.
Offline
@ewaller: I concur. Of course, I've only looked at i7z because I have a reason to suspect something is off, so I don't know what I should expect. I googled for a CPU load simulator and found this script. The cores do appear to scale under load, so it's not as if the other cores can't share the load:
$ python2 cpu-load-generator.py -n 8 10 test.data
Socket [0] - [physical cores=4, logical cores=8, max online cores ever=4]
TURBO ENABLED on 4 Cores, Hyper Threading ON
Max Frequency without considering Turbo 3003.10 MHz (100.10 x [30])
Max TURBO Multiplier (if Enabled) with 1/2/3/4 Cores is 39x/37x/36x/35x
Real Current Frequency 3492.02 MHz [100.10 x 34.88] (Max of below)
Core [core-id] :Actual Freq (Mult.) C0% Halt(C1)% C3 % C6 % C7 % Temp VCore
Core 1 [0]: 3491.99 (34.88x) 97 0 0 0 0 62 1.1290
Core 2 [1]: 3491.96 (34.88x) 95 0 0 0 1 60 1.1293
Core 3 [2]: 3492.02 (34.88x) 94.8 0 0 0 0 60 1.1266
Core 4 [3]: 3492.02 (34.88x) 98 0 0 0 0 65 1.1266
Under low-demand conditions (browser and terminal) I'm surprised that:
1) CPU0 would be at 80+% in C-state C0 (top shows almost nothing above a few % CPU, so where is the load coming from?)
2) the temperature is high enough to make the fans run noticeably (you get used to the noise level of a computer doing nothing, and this is definitely running the fan more than that)
3) the other cores are dormant (if CPU0 is needed to such a degree to kick the fans on, why aren't the others helping out?)
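One possible angle on question 1: the top summary in the first post shows 9.7 hi (hardware interrupt time) averaged over 8 logical CPUs, which would be consistent with roughly 78% of a single CPU, and that time never appears in the per-process list. Below is a minimal sketch for breaking the time down per CPU, assuming the usual Linux /proc/stat field order (user, nice, system, idle, iowait, irq, softirq, steal). The function names are made up for illustration and are not from any tool used in this thread.

```python
import time

# Field order of the per-CPU lines in /proc/stat (assumed standard Linux layout).
FIELDS = ("user", "nice", "system", "idle", "iowait", "irq", "softirq", "steal")


def parse_proc_stat(text):
    """Return {cpu_name: {field: jiffies}} for the per-CPU lines of /proc/stat."""
    cpus = {}
    for line in text.splitlines():
        parts = line.split()
        # Per-CPU lines look like "cpu0 ..."; skip the aggregate "cpu" line.
        if parts and parts[0].startswith("cpu") and parts[0] != "cpu":
            values = [int(v) for v in parts[1:1 + len(FIELDS)]]
            cpus[parts[0]] = dict(zip(FIELDS, values))
    return cpus


def percent_breakdown(before, after):
    """Percentage of time each CPU spent in each state between two snapshots."""
    result = {}
    for cpu, new in after.items():
        old = before.get(cpu)
        if old is None:
            continue
        delta = {field: new[field] - old.get(field, 0) for field in new}
        total = sum(delta.values())
        if total > 0:
            result[cpu] = {f: 100.0 * d / total for f, d in delta.items()}
    return result


def sample(interval=1.0, path="/proc/stat"):
    """Take two snapshots `interval` seconds apart and return the breakdown."""
    with open(path) as fh:
        before = parse_proc_stat(fh.read())
    time.sleep(interval)
    with open(path) as fh:
        after = parse_proc_stat(fh.read())
    return percent_breakdown(before, after)
```

Calling `sample()` on the affected machine and looking at the `irq` share for each `cpuN` entry should show whether the busy core is spending its time in interrupt handling rather than in any process.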
Offline
Evidence that the high C0% on Core 1 is not always present (i7z outputs with low C0% across all CPUs):
- example 1
- example 2 (third pic attached to issue)
- example 3
Offline
Odd. Things are insanely better. On the last boot I futzed with powertop and upgraded. Things seem good now.
The kind of i7z output I'm seeing now:
Socket [0] - [physical cores=4, logical cores=8, max online cores ever=4]
TURBO ENABLED on 4 Cores, Hyper Threading ON
Max Frequency without considering Turbo 3003.10 MHz (100.10 x [30])
Max TURBO Multiplier (if Enabled) with 1/2/3/4 Cores is 39x/37x/36x/35x
Real Current Frequency 798.25 MHz [100.10 x 7.97] (Max of below)
Core [core-id] :Actual Freq (Mult.) C0% Halt(C1)% C3 % C6 % C7 % Temp
Core 1 [0]: 797.94 (7.97x) 5.91 13 0 1 84.4 37
Core 2 [1]: 798.25 (7.97x) 1.63 1.95 1 1 95.6 38
Core 3 [2]: 797.86 (7.97x) 4.58 10.1 1 1.41 86.2 38
Core 4 [3]: 797.88 (7.97x) 1 24.1 1 4.13 70.6 36
Looks much better!
Possible causes:
- In powertop I set all of the tunables to "Good", but after a reboot the ones that were "Bad" are back to "Bad". So I don't think it was that.
- Wakeups/second in powertop were at ~700 on the last boot (with the high Core0 load and 61C temps), and I'd seen them as high as 5k; now they're at 1200-1500 with low CPU usage, so that doesn't seem to correlate.
- upgrades on last boot:
[2019-04-06 13:49] [ALPM] upgraded arduino (1:1.8.8-1 -> 1:1.8.9-1)
[2019-04-06 13:49] [ALPM] upgraded blosc (1.16.2-1 -> 1.16.3-1)
[2019-04-06 13:49] [ALPM] upgraded openjpeg2 (2.3.0-3 -> 2.3.1-1)
[2019-04-06 13:49] [ALPM] upgraded chromium (73.0.3683.86-1 -> 73.0.3683.103-1)
[2019-04-06 13:49] [ALPM] upgraded jbig2dec (0.15-1 -> 0.16-1)
[2019-04-06 13:49] [ALPM] upgraded ghostscript (9.26-2 -> 9.27-1)
[2019-04-06 13:49] [ALPM] upgraded libmagick (7.0.8.36-1 -> 7.0.8.37-1)
[2019-04-06 13:49] [ALPM] upgraded imagemagick (7.0.8.36-1 -> 7.0.8.37-1)
[2019-04-06 13:49] [ALPM] upgraded jemalloc (1:5.1.0-1 -> 1:5.2.0-1)
[2019-04-06 13:49] [ALPM] upgraded libnotify (0.7.7-2 -> 0.7.8-1)
[2019-04-06 13:49] [ALPM] upgraded linux (5.0.5.arch1-1 -> 5.0.6.arch1-1)
[2019-04-06 13:49] [ALPM] upgraded mariadb-libs (10.3.13-4 -> 10.3.14-1)
[2019-04-06 13:49] [ALPM] upgraded nano (4.0-1 -> 4.0-2)
[2019-04-06 13:49] [ALPM] upgraded nvidia (418.56-5 -> 418.56-6)
[2019-04-06 13:49] [ALPM] upgraded python-pillow (5.4.1-1 -> 6.0.0-1)
[2019-04-06 13:49] [ALPM] upgraded python-pycryptodome (3.8.0-1 -> 3.8.1-1)
[2019-04-06 13:49] [ALPM] upgraded python-setuptools (1:40.8.0-1 -> 1:40.9.0-1)
[2019-04-06 13:49] [ALPM] upgraded python2-setuptools (1:40.8.0-1 -> 1:40.9.0-1)
[2019-04-06 13:49] [ALPM] upgraded ruby-hpricot (0.8.6-7 -> 0.8.6-8)
[2019-04-06 13:49] [ALPM] upgraded talloc (2.1.16-2 -> 2.2.0-1)
[2019-04-06 13:49] [ALPM] upgraded xfsprogs (4.19.0-2 -> 4.20.0-1)
[2019-04-06 13:49] [ALPM] upgraded xmlsec (1.2.27-1 -> 1.2.27-2)
Kernel? Nvidia? I'm going to downgrade the kernel and try again.
Offline
Got it! It's something related to suspend/resume.
At startup:
Real Current Frequency 804.79 MHz [100.10 x 8.04] (Max of below)
Core [core-id] :Actual Freq (Mult.) C0% Halt(C1)% C3 % C6 % C7 % Temp VCore
Core 1 [0]: 797.84 (7.97x) 1 0.465 0 0 99.5 42 0.6702
Core 2 [1]: 802.24 (8.01x) 1 0.218 0 0 99.8 42 0.6710
Core 3 [2]: 794.34 (7.94x) 1 1.08 0 0 98.9 42 0.6715
Core 4 [3]: 804.79 (8.04x) 1 1.44 0 0 98.5 41 0.6710
After a suspend/resume cycle:
Real Current Frequency 3841.43 MHz [100.14 x 38.36] (Max of below)
Core [core-id] :Actual Freq (Mult.) C0% Halt(C1)% C3 % C6 % C7 % Temp
Core 1 [0]: 3841.43 (38.36x) 87.3 0 0 0 1 55
Core 2 [1]: 3641.83 (36.37x) 1.71 16.9 1 1 78.9 43
Core 3 [2]: 3539.75 (35.35x) 1 0 1 1 98.7 42
Core 4 [3]: 3642.87 (36.38x) 1 3.24 1 2.15 93.2 42
Any tips on where I would hunt down a crazy process post-suspend? The only thing I can think of is my i3 screenlock suspend service.
$ cat /etc/systemd/system/suspend\@jwhendy.service
[Unit]
Description=User suspend actions
Before=suspend.target
[Service]
User=%I
Type=simple
Environment=DISPLAY=:0
ExecStart=/home/jwhendy/installed/scripts/slock.sh
ExecStartPost=/usr/bin/sleep 1
[Install]
WantedBy=suspend.target
I'm going to disable that and reboot.
Offline
Things tried:
- disabling the screen lock suspend service above
- using the kernel option intel_pstate=disable to fall back to the acpi-cpufreq driver
- systemctl suspend directly vs. lid close
- removing the acpi package
So far the behavior is always the same; really stumped here. Not sure if I'm looking for a botched setting that isn't properly set after resume, or a process that's triggered on resume and somehow Core0 gets dedicated to it.
Some logs
- dmesg
- ps -e v before and after suspend
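The two `ps -e v` listings can also be compared mechanically instead of by eye. This is a minimal sketch, assuming both listings were saved as text and that COMMAND is the last column (as in procps `ps -e v` output). The function names are invented for this example. One caveat: time spent servicing hardware interrupts may not be billed to any process, so an empty result here would not rule out interrupt load.

```python
def cpu_seconds(stamp):
    """Convert a ps TIME value such as '0:06' or '1-02:03:04' to seconds."""
    days, _, clock = stamp.rpartition("-")
    total = 0
    for part in clock.split(":"):
        total = total * 60 + int(part)
    return total + 86400 * int(days or 0)


def parse_ps(text):
    """Return {pid: (command, cpu_seconds)} from a saved ps listing."""
    lines = [line for line in text.splitlines() if line.strip()]
    header = lines[0].split()
    pid_i = header.index("PID")
    time_i = header.index("TIME")
    cmd_i = header.index("COMMAND")  # assumed to be the last column
    procs = {}
    for line in lines[1:]:
        # Limit the split so a command containing spaces stays in one piece.
        cols = line.split(None, len(header) - 1)
        procs[int(cols[pid_i])] = (cols[cmd_i].strip(), cpu_seconds(cols[time_i]))
    return procs


def new_or_busier(before, after, min_gain=1):
    """List (gained_seconds, pid, command) for new or CPU-hungry processes."""
    rows = []
    for pid, (cmd, secs) in after.items():
        gain = secs - before.get(pid, ("", 0))[1]
        if pid not in before or gain >= min_gain:
            rows.append((gain, pid, cmd))
    return sorted(rows, reverse=True)
```

Feeding it the contents of the two saved listings, e.g. `new_or_busier(parse_ps(before_text), parse_ps(after_text))`, lists processes that appeared after resume or accumulated CPU time in between. PID reuse is ignored in this sketch.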
Last edited by jwhendy (2019-04-06 20:27:54)
Offline
Issues like these are often firmware rather than kernel related. Check and update your UEFI/BIOS.
Online
@V1del: anything in particular I'm looking for with respect to "check your UEFI/BIOS"? I can look into updating as well.
Would your suspicion change if different distros behaved differently? I dual boot Arch and Ubuntu Bionic. Bionic does not appear to have this issue: things resumed and the fan didn't ramp up. Unfortunately, Ubuntu forces secure modules, so I'm not able to run i7z unless I recompile that module. I grabbed dmesg from both Arch and Ubuntu for just the suspend-through-resume messages. Anything stand out in this diff?
Offline
Hi, I'm having the exact same issue on my Dell Inspiron laptop: the CPU goes to 3.1 GHz after resuming from suspend, with no tasks keeping it under load. I noticed the following:
- it only happens on AC power
- it happens with both acpi-cpufreq and intel_pstate (I already tried disabling intel_pstate)
- it doesn't get fixed until the next reboot
This is mostly to bump this discussion; hopefully I can find some help. Thank you.
EDIT:
running top, I see a huge load on CPU0 caused by hardware interrupts:
%Cpu0 : 0,3 us, 1,3 sy, 0,0 ni, 64,3 id, 0,0 wa, **34,0 hi**, 0,0 si, 0,0 st <= see here
%Cpu1 : 3,0 us, 1,3 sy, 0,0 ni, 95,7 id, 0,0 wa, 0,0 hi, 0,0 si, 0,0 st
%Cpu2 : 3,3 us, 1,0 sy, 0,0 ni, 95,3 id, 0,0 wa, 0,3 hi, 0,0 si, 0,0 st
%Cpu3 : 1,7 us, 1,3 sy, 0,0 ni, 97,0 id, 0,0 wa, 0,0 hi, 0,0 si, 0,0 st
When the hi value goes away, which sometimes happens by itself after a while, the load goes away too. Honestly, this is the first time I've had to troubleshoot an issue like this, and I don't really know where to start.
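Since the time shows up as hi (hardware interrupts), comparing two snapshots of /proc/interrupts might show which interrupt line is firing on CPU0. This is a minimal sketch, assuming the usual /proc/interrupts layout (a header row of CPU names, then one row per interrupt with a count per CPU and a description). The function names are hypothetical.

```python
def parse_interrupts(text):
    """Return {irq_label: (per_cpu_counts, description)} from /proc/interrupts text."""
    lines = text.splitlines()
    if not lines:
        return {}
    ncpu = len(lines[0].split())  # header row: CPU0 CPU1 ...
    table = {}
    for line in lines[1:]:
        label, sep, rest = line.partition(":")
        if not sep:
            continue
        fields = rest.split()
        counts = []
        # Rows such as ERR/MIS carry fewer counters than there are CPUs.
        for token in fields[:ncpu]:
            if not token.isdigit():
                break
            counts.append(int(token))
        description = " ".join(fields[len(counts):])
        table[label.strip()] = (counts, description)
    return table


def busiest_irqs(before, after, cpu=0, top=5):
    """Return the `top` rows (delta, irq_label, description) for one CPU."""
    rows = []
    for label, (counts, description) in after.items():
        if cpu >= len(counts):
            continue
        old_counts = before.get(label, ([], ""))[0]
        old = old_counts[cpu] if cpu < len(old_counts) else 0
        rows.append((counts[cpu] - old, label, description))
    rows.sort(reverse=True)
    return rows[:top]
```

Reading /proc/interrupts twice a few seconds apart and passing both texts through `parse_interrupts` and then `busiest_irqs` should put the noisiest interrupt source at the top; its description column names the driver or device to look at next.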
EDIT2:
OK, I found a similar problem here on the forum: another user was reporting issues with the touchpad. I found that as soon as I touch the touchpad, the CPU gets back to normal. I know this sounds crazy, but it is exactly what happens. If I use my laptop with a mouse, it stays stuck at 3.1 GHz. I will try to look for a fix.
EDIT3 (blacklist intel_lpss_pci, sorry for that):
Found a workaround for the issue.
I created a blacklist.conf file to prevent the kernel module from loading. I don't know whether there are any side effects; for now it seems fine.
sudo nano /etc/modprobe.d/blacklist.conf
blacklist intel_lpss_pci
Save and reboot.
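After the reboot it may be worth confirming that the module really stayed out, since a `blacklist` line only stops automatic loading by alias and the module can still come in as a dependency or by an explicit modprobe. This is a minimal sketch, assuming the usual /proc/modules format (module name as the first field) and plain `blacklist <name>` lines in the conf file. The function names are invented for illustration.

```python
def normalise(name):
    """modprobe treats '-' and '_' in module names as interchangeable."""
    return name.replace("-", "_")


def blacklisted_modules(conf_text):
    """Collect module names from 'blacklist <name>' lines of a modprobe.d file."""
    names = set()
    for line in conf_text.splitlines():
        if line.lstrip().startswith("#"):
            continue  # comment line
        parts = line.split()
        if len(parts) == 2 and parts[0] == "blacklist":
            names.add(normalise(parts[1]))
    return names


def loaded_modules(proc_modules_text):
    """Collect the names of currently loaded modules from /proc/modules text."""
    return {
        normalise(line.split()[0])
        for line in proc_modules_text.splitlines()
        if line.strip()
    }


def still_loaded(conf_text, proc_modules_text):
    """Blacklisted modules that are nevertheless loaded right now."""
    return blacklisted_modules(conf_text) & loaded_modules(proc_modules_text)
```

Passing in the contents of /etc/modprobe.d/blacklist.conf and /proc/modules, an empty set from `still_loaded` would suggest the blacklist took effect.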
Last edited by mijorus (2019-10-30 13:00:52)
Offline
@mijorus: as you can see from this and other threads, there is a myriad of possible causes. From other threads, off the top of my head, the suggestions range from:
- you don't understand how P-states work, so frequency is misleading and you can only really assess by the % of time spent in the C0-C1 states
- upgrade the BIOS
- futz with acpi and other power management tools
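On the first point, the time spent in each idle state can be read without i7z from the kernel's cpuidle sysfs files. This is a minimal sketch, assuming the cpuidle interface is present under /sys/devices/system/cpu/cpuN/cpuidle/stateX/ with `name` and `time` files (`time` being cumulative microseconds). State names differ between idle drivers, and a POLL state, where present, is really a busy-wait rather than a sleep state. The function names are made up for this example.

```python
import os


def read_idle_times(cpu=0, base="/sys/devices/system/cpu"):
    """Return {state_name: cumulative_microseconds} for one logical CPU."""
    root = os.path.join(base, f"cpu{cpu}", "cpuidle")
    times = {}
    for entry in sorted(os.listdir(root)):
        if not entry.startswith("state"):
            continue
        with open(os.path.join(root, entry, "name")) as fh:
            name = fh.read().strip()
        with open(os.path.join(root, entry, "time")) as fh:
            times[name] = int(fh.read())
    return times


def residency_percent(before, after, interval_s):
    """Share of the interval spent in each idle state.

    Whatever is not accounted for by an idle state is reported as an
    approximate C0 (busy) share.
    """
    window_us = interval_s * 1_000_000
    shares = {
        name: 100.0 * (after[name] - before.get(name, 0)) / window_us
        for name in after
    }
    shares["C0 (approx.)"] = max(0.0, 100.0 - sum(shares.values()))
    return shares
```

For example, take `before = read_idle_times(0)`, wait five seconds, take `after = read_idle_times(0)`, and call `residency_percent(before, after, 5)`. A core that is truly idle should show most of the window in its deepest state, whatever frequency is being reported.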
My issue ultimately went away, and I thought it might have been kernel related, but in re-reading this thread I see I installed powertop. I just removed that as I didn't even realize I had it installed. I'll let you know if the problem comes back.
Given you have a Dell and I'm on an HP, I'm not sure how to diagnose the mouse issue, and I'd have to find my equivalent driver or at least dig in. Sorry I can't help more. On one hand it bugs me, on the other hand my problem isn't there anymore!
Offline