Thanks for your help with this one; we are finally learning some things and making progress.
jat255 wrote: Barring a BIOS-level fix, I suppose the best thing to do would be to write some sort of resume hook that zeros out the 0x19a register during the resume process.
There have been two patches proposed. One involves adding specific MSR register initialization based on specific BIOS identifiers. I think this would be slow making its way to distribution kernels. Another was to specifically initialize these registers, but there was some opposition to that idea.
My purpose in attempting to find supporting evidence is to demonstrate that this issue is much more widespread than is thought.
Would you be willing to do another test? Using the acpi-cpufreq scaling driver, do the suspend and resume thing and see if Clock Modulation is enabled in that case. The resulting effect on performance would be much less obvious, and some users might not even notice.
Sure thing, I'd be glad to try this out, but it will have to wait a bit (should be able to get to it tomorrow or this weekend). Now that my laptop can resume again, I need to get some real work done.
jat255: I would be grateful if you would do the test. I want to confirm or refute that Clock Modulation is also applied when running with the acpi-cpufreq driver, and that most users just don't notice.
Excellent work.
I need to do some more testing, but that seems to have stopped my CPU bumping around below its minimum frequency.
Just a small note - rdmsr/wrmsr need the msr module/functionality in the kernel. On my kernel (linux-ck) that isn't loaded by default but a simple modprobe does the trick - otherwise the script does nothing.
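The module check mentioned above could be scripted roughly as follows. This is only a sketch, not part of the original script: the function names msr_available and ensure_msr are made up for illustration, and it assumes the msr driver exposes its usual device node at /dev/cpu/0/msr.

```shell
# Hypothetical helper: succeeds if the MSR device node exists.
# The optional path argument only exists so the check can be pointed elsewhere.
msr_available() {
    [ -e "${1:-/dev/cpu/0/msr}" ]
}

# Hypothetical helper: load the msr module only when the node is missing.
# modprobe needs root, so call this from the root-run resume script.
ensure_msr() {
    msr_available "$@" || modprobe msr
}
```

Calling ensure_msr before the wrmsr line should avoid the silent no-op described above on kernels that do not load the module by default.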
And.... fixed!
The following setup allowed me to run the commands upon resume from suspend, and I can confirm that after a suspend/resume cycle on battery, the CPU frequencies are no longer locked down. My laptop is usable again!
This method will depend on an installation of rdmsr/wrmsr (version 1.3) available somewhere on the path. If you are using Arch, feel free to use the following AUR package to get these commands.
Save the following as a script somewhere (I used ~/bin/zero_clock_mod_msr), and make it executable (chmod 755 ~/bin/zero_clock_mod_msr) (note, the echo commands are not necessary, but are useful to confirm the script is doing what you want):
#!/bin/bash
logfile="$HOME/zero.log"
#echo "$(date)" >> $logfile
#echo "before writing" >> $logfile
#rdmsr -a 0x19a >> $logfile
wrmsr -a 0x19a 0x0
#echo "after writing" >> $logfile
#rdmsr -a 0x19a >> $logfile
#echo "" >> $logfile
Now write a systemd unit file and place it in /etc/systemd/system/. I saved the following as /etc/systemd/system/clock_mod-fix.service:
[Unit]
Description=Flushes the CPU clock modulation MSR to release the CPU lock caused by BIOS bug
After=suspend.target

[Service]
User=root
Type=oneshot
ExecStart=/home/josh/bin/zero_clock_mod_msr
TimeoutSec=0
StandardOutput=syslog

[Install]
WantedBy=suspend.target
Enable the unit by running:
$ sudo systemctl enable clock_mod-fix.service
By telling it to run after a suspend, it will run whatever script you put in the ExecStart line upon resume from sleep. If you have the logging functions un-commented in the script, you'll see that the MSR values are all zeroed out upon resume, and the system should be snappy and responsive.
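If you would rather check the register than read the log by eye, something like the following might do. It is a sketch under assumptions: msr_all_zero is a made-up name, and it expects the plain output format of rdmsr -a 0x19a seen in this thread (one hex value per line, no 0x prefix).

```shell
# Hypothetical helper: reads `rdmsr -a 0x19a` output on stdin
# (one hex value per line, no 0x prefix) and succeeds only if
# every value is zero.
msr_all_zero() {
    while read -r value; do
        # Skip blank lines rather than treating them as values.
        [ -n "$value" ] || continue
        [ "$((0x$value))" -eq 0 ] || return 1
    done
    return 0
}
```

As root, it could be used as: rdmsr -a 0x19a | msr_all_zero || wrmsr -a 0x19a 0x0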
EDIT: Doug, do you know if there is a bug report somewhere that has anybody working on this? I would be happy to contribute the following code there or help debug so we can get a real fix, rather than this hack-ish workaround.
@Shino Nage: What were your CPU frequencies before you applied the fix? And what value did you read from the registers? Would you be willing to do the test using the acpi-cpufreq driver, as jat255 never has? Is your computer also a Dell? Were you on battery power when the issue occurred?
Last edited by Doug Smythies (2015-09-30 20:27:18)
Hi Doug,
Yes I can help out.
I can confirm I've got a Dell laptop, a Latitude E6220. I've seen that it's not running the latest BIOS from Dell. I was going to upgrade the BIOS, but I'll hold off to help out with some testing.
The problem does occur on battery power - I will have to check whether it is a problem on AC (of course my psu is in the office right now!).
I see CPU frequencies in the range 630000 - 790000, but my i5's minimum frequency is 800000.
rdmsr -a 0x19a returns 1c for each of the 4 cores.
I've tried the acpi-cpufreq and I don't see the problem, but it's probably not a valid test - even from a fresh reboot the frequency seems to be set on the minimum regardless of the load. I'm not sure if I need some daemon running to take control of the CPU frequency?
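The below-minimum readings reported above can be spotted without watching the numbers by hand. A rough sketch, assuming the standard cpufreq sysfs files scaling_cur_freq and cpuinfo_min_freq (both in kHz); the function name below_min_freq is made up for illustration.

```shell
# Hypothetical helper: succeeds if the current frequency reported for a
# cpufreq policy directory is below that CPU's advertised minimum.
# Returns 2 if either sysfs file cannot be read.
below_min_freq() {
    local dir cur min
    dir="${1:-/sys/devices/system/cpu/cpu0/cpufreq}"
    cur=$(cat "$dir/scaling_cur_freq" 2>/dev/null) || return 2
    min=$(cat "$dir/cpuinfo_min_freq" 2>/dev/null) || return 2
    [ "$cur" -lt "$min" ]
}
```

With the values quoted above (630000 current against 800000 minimum) the check succeeds, which is the symptom to look for after a resume on battery.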
@Shino Nage: O.K. thanks. All of the additional information you supplied makes sense.
For the acpi-cpufreq test: Please note that what I want is the value the register reads after a resume from suspend on battery. You wouldn't necessarily notice issues with CPU frequencies, as that driver responds differently to Clock Modulation. The theory is that this Clock Modulation issue also occurs with the acpi-cpufreq scaling driver, but most users don't even notice. I am trying to gather evidence to support the theory. There are two root issues here: first, it seems that Dell uses Clock Modulation when resuming on battery, and it shouldn't; second, in its current form, the intel_pstate driver is completely NOT compatible with Clock Modulation and will always drive the target pstate to minimum (forcing the CPU frequency to minimum, regardless of load).
By far, it seems that most people are merely disabling the intel_pstate driver and moving on using the acpi-cpufreq driver. I am attempting to determine the magnitude of the issue.
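To make sense of the raw register values being asked for above, here is a small decoder sketch. It assumes the bit layout Intel documents for IA32_CLOCK_MODULATION (0x19a): bit 4 enables on-demand clock modulation and bits 3:1 select the duty cycle. Bit 0 has an extended meaning on some newer CPUs and is ignored here. The function name decode_clock_mod is made up for illustration.

```shell
# Hypothetical helper: split a raw 0x19a value (hex, with or without a
# 0x prefix) into the enable bit (bit 4) and the duty-cycle field (bits 3:1).
decode_clock_mod() {
    local val
    val=$((0x${1#0x}))
    echo "enabled=$(( (val >> 4) & 1 )) duty_field=$(( (val >> 1) & 7 ))"
}
```

For the values seen in this thread, 1c decodes to enabled=1 duty_field=6 and 0 decodes to enabled=0 duty_field=0. If I read Intel's table correctly, a duty field of 6 corresponds to a 75% duty cycle, but treat that as an assumption rather than a measurement.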
@Doug Smythies: rdmsr returns 0x1c under acpi-cpufreq too.
Thanks for the detailed explanation.
Note that on my laptop switching to acpi-cpufreq seems to bring its own problem - the CPU is locked to minimum frequency. So if that's a solution people are using it wouldn't be acceptable for me.
This is different to pstate where the cpu frequency scales correctly until I suspend/resume the machine. Once resumed the frequencies do vary but they are reported as being less than the minimum frequency for the CPU and subjectively that's how it feels.
There's a chance that acpi-cpufreq is being affected by laptop-mode, which I've got installed, but the behaviour is different under the same conditions between acpi & pstate.
Let me know if there's anything else you'd like me to test - I'm curious to know if a BIOS upgrade will make a difference.
@Shiho Nage: Thanks again for the additional information. Please go ahead and upgrade your BIOS and report back here. I do not know what is causing your stuck CPU frequencies with the acpi-cpufreq driver; however, it is typically due to a BIOS limit. To know for sure, check:
cat /sys/devices/system/cpu/cpu0/cpufreq/bios_limit
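The value on its own only means something next to the hardware maximum, so a comparison along these lines may help. This is a sketch: bios_limited is a made-up name, it assumes bios_limit sits next to cpuinfo_max_freq in the same cpufreq directory (bios_limit is only present under acpi-cpufreq), and both values are in kHz.

```shell
# Hypothetical helper: succeeds if the BIOS-imposed limit is lower than the
# maximum frequency the hardware advertises.
# Returns 2 if either sysfs file cannot be read.
bios_limited() {
    local dir limit max
    dir="${1:-/sys/devices/system/cpu/cpu0/cpufreq}"
    limit=$(cat "$dir/bios_limit" 2>/dev/null) || return 2
    max=$(cat "$dir/cpuinfo_max_freq" 2>/dev/null) || return 2
    [ "$limit" -lt "$max" ]
}
```

If bios_limited succeeds, the BIOS is capping the frequency below what the CPU can do, which would explain a frequency that stays low regardless of load.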
@Doug Smythies: OK, I've now gone from BIOS A04 -> A13 and my cpu clock seems to be doing the right thing and rdmsr returns 0 following a resume.
Hurrah!
If there's anything else I can help with then please let me know.
@Doug Smythies.
So, today I bring my laptop up from resume and we're back to slow performance and 0x1c in the msr registers - looks like the new bios isn't a reliable fix after all....
Fortunately, the script described here still fixes the issue.
I know this is possibly dragging up a slightly old post, but just wanted to pass on my thanks to everyone for this solution.
I'm running Ubuntu 14.04 on a Dell Latitude E6420, and had the same issue with very slow performance on resume when running on battery. CPU freq was down to ~600MHz. Setting the 0x19a register back to 0x0 solved it for me as well.
Just in case it helps anyone else, I had to go a slightly different route to run this fix on resume, as Ubuntu 14.04 doesn't use systemd.
Make a new file '/etc/pm/sleep.d/zero_clock_mod_msr':
#!/bin/bash
case "$1" in
    thaw|resume)
        modprobe msr
        wrmsr -a 0x19a 0x0
        ;;
esac
Make file executable with 'chmod 755 /etc/pm/sleep.d/zero_clock_mod_msr'.
Thanks!
I have the same issue now, but wrmsr -a 0x19a 0x0 does not work for me
I had a very similar issue on my Lenovo Y410P that got fixed in commit f772b404a62f9039323faaff271ef11c6cba0aab, upstream commit 81ad4276b505e987dd8ebbdf63605f92cd172b52 shipped in 4.5.1 which has made it into the Arch repos now (https://cdn.kernel.org/pub/linux/kernel … eLog-4.5.1).
In particular, note that clock modulation had nothing to do with the issue I faced, though the symptoms were very similar.
I encountered the problem again just today, and this time I was able to use the MSR register to get my CPU (in a Dell Latitude E6430...) from 900 back to normal (thank you guys!). But the GUI still felt very, very sluggish. Perhaps the GPU (dedicated Nvidia) was also clocked down? The kernel in my case is already 4.5.1. I'll try to give it a closer look when I'm not in a hurry (it happened at work).
Hello,
I also had this issue on the Dell XPS 13 2016 9350 (lots of keywords so search engine users can land here):
After resume:
$ uname -a
Linux malta 4.6.3-1-ARCH #1 SMP PREEMPT Fri Jun 24 21:19:13 CEST 2016 x86_64 GNU/Linux
$ sudo rdmsr -a 0x19a
10
10
10
10
$ sudo cpupower frequency-info
analyzing CPU 0:
driver: intel_pstate
CPUs which run at the same hardware frequency: 0
CPUs which need to have their frequency coordinated by software: 0
maximum transition latency: Cannot determine or is not supported.
hardware limits: 400 MHz - 2.80 GHz
available cpufreq governors: performance powersave
current policy: frequency should be within 400 MHz and 2.80 GHz.
The governor "powersave" may decide which speed to use
within this range.
current CPU frequency: 264 MHz (asserted by call to hardware)
boost state support:
Supported: yes
Active: yes
$ sudo wrmsr -a 0x19a 0x0
$ sudo rdmsr -a 0x19a
0
0
0
0
$ sudo cpupower frequency-info
analyzing CPU 0:
driver: intel_pstate
CPUs which run at the same hardware frequency: 0
CPUs which need to have their frequency coordinated by software: 0
maximum transition latency: Cannot determine or is not supported.
hardware limits: 400 MHz - 2.80 GHz
available cpufreq governors: performance powersave
current policy: frequency should be within 400 MHz and 2.80 GHz.
The governor "powersave" may decide which speed to use
within this range.
current CPU frequency: 2.09 GHz (asserted by call to hardware)
boost state support:
Supported: yes
Active: yes
Yaaay! I can now switch between firefox tabs without having to go take a coffee break!
I'll go ahead and create my systemd unit.
Oh, and a little context on this issue: it is not observed every time I resume from suspend, but yesterday evening I suspended on battery, and this morning (when it was observed) I also resumed on battery. Before going ahead and writing to the MSR registers, I tried several suspend (closing lid)/resume cycles on AC, and it did not help.
And my BIOS is up to date (1.4.4) as stated here: https://wiki.archlinux.org/index.php/De … OS_updates
I will add a quick note to this wiki page, pointing here.
Many thanks to you guys for the debug information!
Wow. I "fixed" this by turning off SpeedStep in my BIOS (Dell XPS 13 9350), but that also limited my CPU to 2.2 GHz, down from the turbo speed of 3.2 GHz. Looking forward to trying the msr "fix".
Well, this only partially fixes it for me. I ended up filing a kernel bug report here:
su root -c "echo 1 > /sys/devices/system/cpu/intel_pstate/no_turbo"
su root -c "echo 0 > /sys/devices/system/cpu/intel_pstate/no_turbo"
Turbo boost is also disabled after suspend for me, so these lines have to be added to the script run by the systemd unit. The value itself is not changed (it always reads 0); you just have to toggle it.
This all happens on a Dell M4700 laptop. The toggle is also useful for tricking the BIOS if you don't use an original Dell power adapter. Mine does not report its wattage, so I get a power adapter warning on startup and the CPU is downclocked to 900 MHz. Previously I had to remove the power cord before boot, but now it is all fixed.
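For anyone adding the turbo toggle described above to the resume script, it could be wrapped like this. A sketch only: toggle_no_turbo is a made-up name, the optional path argument exists just so the function can be tried against a scratch file, and it has to run as root to write the real sysfs control.

```shell
# Hypothetical helper: write 1 and then 0 to the intel_pstate no_turbo
# control, leaving it at 0 (turbo allowed) afterwards.
toggle_no_turbo() {
    local ctl
    ctl="${1:-/sys/devices/system/cpu/intel_pstate/no_turbo}"
    echo 1 > "$ctl" && echo 0 > "$ctl"
}
```

Since the resume script already runs as root under the systemd unit, calling toggle_no_turbo after the wrmsr line should be enough, with no need for su.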