#1 2019-08-16 11:20:02

manu34
Member
Registered: 2019-08-16
Posts: 4

Ryzen CPU freq dropping to low freq and getting stuck there

Hello,

I'm experiencing a problem with my new laptop. After I boot, log in to Arch, and start using my PC, all cores at some point slow down to ~400MHz.
This always happens, consistently, as soon as I start opening and using applications (e.g. browser + compiling some sources)

Here are my laptop model and basic specs:

- Laptop Model: ASUS G GA502DU (ROG Zephyrus)
- CPU: AMD® Ryzen™ 7 3750H CPU (1.4 - 2.4 GHz with 4GHz boost mode)
- Discrete GPU: GeForce® GTX 1660 Ti

By running

 # cpupower monitor 

I get:

    | Mperf              || Idle_Stats         
 CPU| C0   | Cx   | Freq  || POLL | C1   | C2    
   0| 11.87| 88.13|   399||  0.00|  3.65| 84.70
   1|  1.09| 98.91|   399||  0.00|  0.37| 98.62
   2| 10.76| 89.24|   398||  0.00| 21.29| 68.49
   3|  1.15| 98.85|   400||  0.00|  0.21| 98.68
   4|  6.57| 93.43|   399||  0.00|  1.63| 91.86
   5|  0.20| 99.80|   398||  0.00|  0.00| 99.82
   6| 44.99| 55.01|   399||  0.00|  1.60| 53.45
   7|  2.39| 97.61|   398||  0.00|  0.86| 96.87

whereas

 # lscpu 

shows this:

Architecture:                    x86_64
CPU op-mode(s):                  32-bit, 64-bit
Byte Order:                      Little Endian
Address sizes:                   43 bits physical, 48 bits virtual
CPU(s):                          8
On-line CPU(s) list:             0-7
Thread(s) per core:              2
Core(s) per socket:              4
Socket(s):                       1
NUMA node(s):                    1
Vendor ID:                       AuthenticAMD
CPU family:                      23
Model:                           24
Model name:                      AMD Ryzen 7 3750H with Radeon Vega Mobile Gfx
Stepping:                        1
Frequency boost:                 enabled
CPU MHz:                         399.129
CPU max MHz:                     2300.0000
CPU min MHz:                     1400.0000
BogoMIPS:                        4593.97
Virtualization:                  AMD-V
L1d cache:                       128 KiB
L1i cache:                       256 KiB
L2 cache:                        2 MiB
L3 cache:                        4 MiB
NUMA node0 CPU(s):               0-7
Vulnerability L1tf:              Not affected
Vulnerability Mds:               Not affected
Vulnerability Meltdown:          Not affected
Vulnerability Spec store bypass: Mitigation; Speculative Store Bypass disabled via prctl and seccomp
Vulnerability Spectre v1:        Mitigation; usercopy/swapgs barriers and __user pointer sanitization
Vulnerability Spectre v2:        Mitigation; Full AMD retpoline, IBPB conditional, STIBP disabled, RSB filling
Flags:                           fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm constant_tsc rep_good nopl nonstop_tsc cpuid extd_apicid aperfmperf pni pclmulqdq monitor ssse3 fma cx16 sse4_1 sse4_2 movbe popcnt aes xsave avx f16c rdrand lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw skinit wdt tce topoext perfctr_core perfctr_nb bpext perfctr_llc mwaitx cpb hw_pstate sme ssbd sev ibpb vmmcall fsgsbase bmi1 avx2 smep bmi2 rdseed adx smap clflushopt sha_ni xsaveopt xsavec xgetbv1 xsaves clzero irperf xsaveerptr arat npt lbrv svm_lock nrip_save tsc_scale vmcb_clean flushbyasid decodeassists pausefilter pfthreshold avic v_vmsave_vmload vgif overflow_recov succor smca

and

# cpupower frequency-info 

outputs

analyzing CPU 0:
  driver: acpi-cpufreq
  CPUs which run at the same hardware frequency: 0
  CPUs which need to have their frequency coordinated by software: 0
  maximum transition latency:  Cannot determine or is not supported.
  hardware limits: 1.40 GHz - 2.30 GHz
  available frequency steps:  2.30 GHz, 1.70 GHz, 1.40 GHz
  available cpufreq governors: ondemand performance schedutil
  current policy: frequency should be within 1.40 GHz and 2.30 GHz.
                  The governor "ondemand" may decide which speed to use
                  within this range.
  current CPU frequency: 1.40 GHz (asserted by call to hardware)
  boost state support:
    Supported: no
    Active: no
    Boost States: 0
    Total States: 3
    Pstate-P0:  2300MHz
    Pstate-P1:  1700MHz
    Pstate-P2:  1400MHz
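
The outputs above disagree in a telling way: the lowest advertised P-state is 1.40 GHz, yet the measured frequency is ~400 MHz, which suggests the clamp comes from below the governor. A minimal sketch for spotting that condition on every core follows. It assumes `scaling_cur_freq` reports the measured frequency (as the "CPU MHz" value from lscpu appears to), and `below_min_cores` is a hypothetical helper name, not an existing tool.

```shell
#!/bin/sh
# Sketch: list cores whose current frequency is below the lowest advertised P-state.
# The sysfs root can be passed as the first argument, so the logic can be tried on sample data.
below_min_cores() {
  root=${1:-/sys/devices/system/cpu}
  for dir in "$root"/cpu[0-9]*/cpufreq; do
    # Skip cores without cpufreq data (or an unmatched glob).
    [ -r "$dir/scaling_cur_freq" ] && [ -r "$dir/cpuinfo_min_freq" ] || continue
    cur=$(cat "$dir/scaling_cur_freq")   # kHz
    min=$(cat "$dir/cpuinfo_min_freq")   # kHz
    if [ "$cur" -lt "$min" ]; then
      cpu=${dir%/cpufreq}
      echo "${cpu##*/}: $((cur / 1000)) MHz (min P-state $((min / 1000)) MHz)"
    fi
  done
}
```

Calling `below_min_cores` with no argument reads the live sysfs tree; any line of output means a core is running slower than the governor should be able to request.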

I have done some research on the web. I found some posts suggesting that it might be related to:

- Fast Boot enabled in the bios (I have disabled it, but nothing changed).
- Faulty AC charger that doesn't generate enough power to charge the laptop (mine seems to be charging it).

I also tried a few more things:

- I updated the BIOS to the latest available version with no change in behavior.
- I changed the governor from schedutil to ondemand and performance.

but I had no luck and I'm out of ideas. Does anybody have more?
Thanks

UPDATE:
I can see there are errors logged on the kernel buffer around the same time the frequency changes:

[  150.186662] ucsi_ccg 6-0008: failed to reset PPM!
[  150.186664] ucsi_ccg 6-0008: PPM init failed (-110)
[  160.356808] ACPI BIOS Error (bug): Could not resolve symbol [\_SB.PCI0.SBRG.EC0._QE8.TEMF], AE_NOT_FOUND (20190509/psargs-330)
[  160.356827] ACPI Error: Aborting method \_SB.PCI0.SBRG.EC0._QE8 due to previous error (AE_NOT_FOUND) (20190509/psparse-529)

I don't know if they are related to the issue.
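
One way to check whether these messages coincide with the drop is to stamp each one with the frequency at the moment it is logged. This is only a sketch: the patterns are taken from the log lines above, `filter_suspects` and `watch_kernel_log` are hypothetical names, and `--line-buffered` assumes GNU grep.

```shell
#!/bin/sh
# Sketch: keep only the kernel messages of interest (patterns copied from the log above).
filter_suspects() {
  grep --line-buffered -E 'ucsi_ccg|ACPI (BIOS )?Error'
}

# Follow the kernel log and prefix each matching line with the current cpu0 frequency.
# Needs permission to read the kernel log (typically root).
watch_kernel_log() {
  dmesg --follow | filter_suspects | while IFS= read -r line; do
    khz=$(cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_cur_freq)
    echo "$((khz / 1000)) MHz | $line"
  done
}
```

Running `watch_kernel_log` in a terminal while reproducing the problem should show whether the frequency is already low when the errors appear, or only afterwards.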

Last edited by manu34 (2019-08-16 15:34:43)

Offline

#2 2019-08-16 21:23:16

gnox
Member
Registered: 2013-05-18
Posts: 49

Re: Ryzen CPU freq dropping to low freq and getting stuck there

I have the same laptop. For me, the processor's boost produces spikes that push the temps above 95C, which makes the BIOS throttle to 400MHz. That happens under high load (games, compiling on all CPUs); when browsing you will only see spikes of +15-20C.

Supposedly the laptop can handle those temps (>90C). I had the same behavior on a clean Windows install until I installed the ASUS apps and services (Armoury Crate, etc.); after that it no longer occurred, even though the temps were over 90C most of the time when playing, so that software is handling it in some way.

In Arch, until we have something like that or a BIOS/microcode/firmware update, I managed to work around it with the following:

- Remember to install amd-ucode (microcode) and load it from your boot loader as an initrd image (it is not a kernel parameter).

- Disable boost when you are doing normal tasks. The max frequency will then be 2.3GHz; the temps will still rise with the load, but not as high as with boost.
/etc/tmpfiles.d/disable_boost.conf

w /sys/devices/system/cpu/cpufreq/boost - - - - 0

You can also enable/disable with a script

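#!/bin/sh
# Toggle CPU boost: pass "on" to enable it, anything else disables it. Run as root.
# Uses "=" because "==" inside [ ] is a bashism and can fail under /bin/sh.
if [ "$1" = "on" ]; then
  echo 1 > /sys/devices/system/cpu/cpufreq/boost
else
  echo 0 > /sys/devices/system/cpu/cpufreq/boost
fi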

- With boost enabled or disabled, you can limit the freqs to the lower P-states (P1 = 1.7GHz, P2 = 1.4GHz) with cpupower, which is good when running on battery: -u 1.7GHz allows P1 and P2, while -u 1.4GHz as below allows only P2. If you set the limit to 2.3GHz and the boost flag is active, the normal boost behavior is enabled again:

sudo cpupower frequency-set -u 1.4Ghz

- Blacklist ucsi_ccg. It is the USB Type-C driver, and according to https://download.nvidia.com/XFree86/Lin … ement.html , "USB Type-C UCSI controller drivers present in most Linux distributions do not fully support runtime power management":

/etc/modprobe.d/blacklist.conf
blacklist ucsi_ccg

Also, following that link, create the NVIDIA power management rules:
/etc/udev/rules.d/80-nvidia-pm.rules

# Remove NVIDIA USB xHCI Host Controller devices, if present
ACTION=="add", SUBSYSTEM=="pci", ATTR{vendor}=="0x10de", ATTR{class}=="0x0c0330", ATTR{remove}="1"

# Remove NVIDIA USB Type-C UCSI devices, if present
ACTION=="add", SUBSYSTEM=="pci", ATTR{vendor}=="0x10de", ATTR{class}=="0x0c8000", ATTR{remove}="1"

# Remove NVIDIA Audio devices, if present
ACTION=="add", SUBSYSTEM=="pci", ATTR{vendor}=="0x10de", ATTR{class}=="0x040300", ATTR{remove}="1"

# Enable runtime PM for NVIDIA VGA/3D controller devices on driver bind
ACTION=="bind", SUBSYSTEM=="pci", ATTR{vendor}=="0x10de", ATTR{class}=="0x030000", TEST=="power/control", ATTR{power/control}="auto"
ACTION=="bind", SUBSYSTEM=="pci", ATTR{vendor}=="0x10de", ATTR{class}=="0x030200", TEST=="power/control", ATTR{power/control}="auto"

# Disable runtime PM for NVIDIA VGA/3D controller devices on driver unbind
ACTION=="unbind", SUBSYSTEM=="pci", ATTR{vendor}=="0x10de", ATTR{class}=="0x030000", TEST=="power/control", ATTR{power/control}="on"
ACTION=="unbind", SUBSYSTEM=="pci", ATTR{vendor}=="0x10de", ATTR{class}=="0x030200", TEST=="power/control", ATTR{power/control}="on"

- To use the higher freqs without the throttle to 400MHz, I use the following. *BE CAREFUL WITH THIS*:
So far, my understanding from what I have read on the internet is that boost on this kind of processor works like an overclock, changing freqs and voltage, and that the high temps are "normal". I installed a program called zenstates-git (AUR) that allows me to change the freqs; I didn't want to mess with voltages:

yay -S zenstates-git

Looking at the calculator spreadsheet on this forum, I can use the higher frequencies by modifying only the FID for P-state 0:
88 = 3.4Ghz
8c = 3.5Ghz
90 = 3.6Ghz
94 = 3.7Ghz
98 = 3.8Ghz
9c = 3.9Ghz
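
The table can be reproduced with a little arithmetic. This is a sketch under two assumptions that are not stated in the post: that the core frequency on Zen follows 200 MHz * FID / DID, and that P-state 0 uses a divisor (DID) of 8, which is the value that matches the numbers above. `fid_to_mhz` is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch: convert a P-state FID (hex, without the 0x prefix) to a core frequency in MHz.
# Assumes freq = 200 MHz * FID / DID. The default DID of 8 is an assumption chosen
# because it reproduces the table; check the DID your P-state actually uses.
fid_to_mhz() {
  fid=$((0x$1))
  did=${2:-8}
  echo $((fid * 200 / did))
}

# Print the same FIDs as the table above.
for f in 88 8c 90 94 98 9c; do
  echo "$f = $(fid_to_mhz "$f") MHz"
done
```

With a DID of 8 each FID step is worth 25 MHz, which is why the table entries are 4 apart for every 100 MHz.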

#with boost disabled change pstate 0
sudo zenstates -p 0 -f 8c

This changes the max freq from 2.3GHz to 3.5GHz. The temps will increase, but not as fast and not as high as with boost.
With the schedutil governor the speed will vary between the minimum and 3.5GHz; with the performance governor it will almost always be close to 3.5GHz.

3.6-3.7GHz was a good level for me in terms of temps and fan noise under high load (games).

In summary, for this laptop:
- For normal use: I recommend disabling boost if you want low temps (45-60C); with boost enabled, expect 60-75C.
- For games/high usage: disable boost + change P-state 0 + switch to the performance governor, to avoid the throttle to 400MHz (65-90C).
- Get a good laptop cooling pad (it lowers temps by 10-20C).
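
The two software profiles in this summary could be wrapped in one script. This is a sketch only: `apply_profile` and the `DRY_RUN` variable are hypothetical names, the `zenstates` flags are the ones used earlier in this post, and the governor is switched with `cpupower frequency-set -g`.

```shell
#!/bin/sh
# Sketch of the "normal" and "games" profiles summarised above. Run as root.
# Set DRY_RUN=1 to print the commands instead of executing them.
BOOST=/sys/devices/system/cpu/cpufreq/boost

# Run a command, or just print it in dry-run mode.
run() {
  if [ "${DRY_RUN:-0}" = 1 ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Write 0 or 1 to the boost switch, or print what would be written.
set_boost() {
  if [ "${DRY_RUN:-0}" = 1 ]; then
    echo "echo $1 > $BOOST"
  else
    echo "$1" > "$BOOST"
  fi
}

# apply_profile normal|games [FID]   (FID defaults to 8c, i.e. 3.5GHz in the table)
apply_profile() {
  case "$1" in
    normal)
      set_boost 0
      ;;
    games)
      set_boost 0
      run zenstates -p 0 -f "${2:-8c}"
      run cpupower frequency-set -g performance
      ;;
    *)
      echo "usage: apply_profile normal|games [FID]" >&2
      return 1
      ;;
  esac
}
```

For example, `DRY_RUN=1; apply_profile games 90` prints the three commands for a 3.6GHz P-state 0 without touching the system, which is a reasonable first step given the warning above.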

EDIT: I noticed that a new BIOS, version 208, was published on 8/15. I tested it and the throttle to 400MHz at high temps no longer occurs. The ACPI error still appears, but the speed is only reduced to ~2.3-2.5GHz and then returns to normal, and there are no spikes above 90C. Looks like it is fixed.

Also, in the BIOS there's a new option for UMA Memory: 1G

EDIT2: Tested again today with other programs. With the new BIOS in turbo mode it only takes longer to reach the 400MHz throttling,
and there is a performance hit, so it seems the rumors were true:
Geekbench with BIOS 207, single/multi score: 4622/15289
Geekbench with BIOS 208, single/multi score: 4461/14442

Last edited by gnox (2019-08-19 14:28:11)

Offline

#3 2019-08-18 08:37:29

manu34
Member
Registered: 2019-08-16
Posts: 4

Re: Ryzen CPU freq dropping to low freq and getting stuck there

gnox wrote:

[...]

Also the latest amd-ucode 20190815.07b925b-1 breaks nvidia prime offload on that laptop.

Thanks, gnox. I tried your workarounds, unfortunately though I still get the issue. But now I noticed that the issue happens not when the CPU is under load, but when the CPU is almost entirely idle. The CPU frequency lowers to around 1.4GHz and stays there for a few minutes before dropping to 0.4GHz.

Offline

#4 2019-08-18 14:57:46

gnox
Member
Registered: 2013-05-18
Posts: 49

Re: Ryzen CPU freq dropping to low freq and getting stuck there

manu34 wrote:

Thanks, gnox. I tried your workarounds, unfortunately though I still get the issue. But now I noticed that the issue happens not when the CPU is under load, but when the CPU is almost entirely idle. The CPU frequency lowers to around 1.4GHz and stays there for a few minutes before dropping to 0.4GHz.

With BIOS 208, yesterday I left the laptop idle with just Firefox and one tab open. I went to the store, and when I came back the CPU was at 400MHz. I have rolled the BIOS back to version 207.

Offline

#5 2019-08-18 16:28:05

manu34
Member
Registered: 2019-08-16
Posts: 4

Re: Ryzen CPU freq dropping to low freq and getting stuck there

gnox wrote:
manu34 wrote:

Thanks, gnox. I tried your workarounds, unfortunately though I still get the issue. But now I noticed that the issue happens not when the CPU is under load, but when the CPU is almost entirely idle. The CPU frequency lowers to around 1.4GHz and stays there for a few minutes before dropping to 0.4GHz.

With BIOS 208, yesterday I left the laptop idle with just Firefox and one tab open. I went to the store, and when I came back the CPU was at 400MHz. I have rolled the BIOS back to version 207.

I'm currently running 208. I might downgrade as well. Can I ask roughly what the ambient temperature was where you left the laptop? I think I understand the set of conditions that cause this (with the 208 firmware and on Linux). I'm currently in quite a hot place (I hadn't noticed this problem before I came here). It seems to me that the thermal throttling checks both the CPU temperature and the CPU load. If the CPU temperature is ~50 degrees Celsius or above and doesn't decrease quickly while the load is very low, as happens after I run a benchmark, then the CPU enters that damn low-frequency state and never recovers.

I came to this conclusion because the throttling:
- Only happens at low loads
- Doesn't happen if  I just turn my PC on and I do nothing.
- Doesn't happen (or it happens but then the PC quickly recovers from it) during high loads.
- Happens a couple of minutes after I quit the benchmark or I stop using my PC.
- It doesn't happen if I use a fan to quickly cool down my PC after I run the benchmark.
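
To check these observations against numbers, temperature, load and frequency can be logged side by side. A rough sketch, assuming the CPU temperature is exposed by the k10temp hwmon driver (an assumption, it may differ on other setups); `cpu_temp_c`, `log_line` and `log_forever` are hypothetical helper names.

```shell
#!/bin/sh
# Sketch: log CPU temperature, 1-minute load and cpu0 frequency together.
# Assumes the k10temp hwmon driver provides the CPU temperature in millidegrees.

# Print the CPU temperature in whole degrees Celsius; fail if k10temp is not found.
cpu_temp_c() {
  hwmon_root=${1:-/sys/class/hwmon}
  for d in "$hwmon_root"/hwmon*; do
    [ -r "$d/name" ] || continue
    if [ "$(cat "$d/name")" = "k10temp" ]; then
      echo $(( $(cat "$d/temp1_input") / 1000 ))
      return 0
    fi
  done
  return 1
}

# Print one timestamped sample.
log_line() {
  temp=$(cpu_temp_c) || temp="n/a"
  load=$(cut -d' ' -f1 /proc/loadavg)
  khz=$(cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_cur_freq)
  echo "$(date +%T) temp=${temp}C load=${load} freq=$((khz / 1000))MHz"
}

# Sample every N seconds (default 5) until interrupted.
log_forever() {
  while :; do
    log_line
    sleep "${1:-5}"
  done
}
```

Leaving `log_forever 5 > throttle.log` running through a benchmark and the cool-down afterwards should show whether the drop to 400MHz really follows a slow temperature decline at low load.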

As far as I understand, the thermal throttling happens on top of the P-states and is entirely HW (firmware) controlled. Am I right? I'm thinking of buying a cooling stand.

Last edited by manu34 (2019-08-18 16:41:00)

Offline

#6 2019-08-18 17:01:06

gnox
Member
Registered: 2013-05-18
Posts: 49

Re: Ryzen CPU freq dropping to low freq and getting stuck there

manu34 wrote:

I'm currently running 208. I might downgrade as well. Can I ask roughly what the ambient temperature was where you left the laptop? I think I understand the set of conditions that cause this (with the 208 firmware and on Linux). I'm currently in quite a hot place (I hadn't noticed this problem before I came here). It seems to me that the thermal throttling checks both the CPU temperature and the CPU load. If the CPU temperature is ~50 degrees Celsius or above and doesn't decrease quickly while the load is very low, as happens after I run a benchmark, then the CPU enters that damn low-frequency state and never recovers.

I came to this conclusion because the throttling:
- Only happens at low loads
- Doesn't happen if  I just turn my PC on and I do nothing.
- Doesn't happen (or it happens but then the PC quickly recovers from it) during high loads.
- Happens a couple of minutes after I quit the benchmark or I stop using my PC.
- It doesn't happen if I use a fan to quickly cool down my PC after I run the benchmark.

The laptop was at 48-51C when I left (turbo was disabled); when I returned, it was at 44C and 400MHz. The ambient temperature is ~16-19C because it is winter here, and I leave the laptop on a cooling pad running at low speed. I think the 208 BIOS is broken: I rebooted 3 times and it was at 400MHz again, and the ACPI error appears at any time, with or without turbo enabled. With 207 I only get the ACPI error when using turbo and exceeding 80C, because of the spikes of +10-15C, even 20C, that the boost produces.

As far as I understand, the thermal throttling happens on top of the P-states and is entirely HW (firmware) controlled. Am I right? I'm thinking of buying a cooling station.

As I mentioned, the same happens on Windows without the ASUS software: you also get the 400MHz throttle. With the ASUS software installed you can go over 95C without problems. A cooling pad helps a lot.

Now I'm using the laptop with P-state 0 set to 3.5GHz, turbo disabled and no cooling pad, just browsing and programming, and it sits steady at 48-50C.

Offline

#7 2019-08-18 17:48:44

manu34
Member
Registered: 2019-08-16
Posts: 4

Re: Ryzen CPU freq dropping to low freq and getting stuck there

gnox wrote:

The laptop was at 48-51C when I left (turbo was disabled); when I returned, it was at 44C and 400MHz. The ambient temperature is ~16-19C because it is winter here, and I leave the laptop on a cooling pad running at low speed. I think the 208 BIOS is broken: I rebooted 3 times and it was at 400MHz again, and the ACPI error appears at any time, with or without turbo enabled. With 207 I only get the ACPI error when using turbo and exceeding 80C, because of the spikes of +10-15C, even 20C, that the boost produces.

Yes, I just downgraded to 207 and I can confirm that the issue only happens if the CPU temperature goes above ~80C. I had 205 installed when I bought the laptop and I wonder how that performs, but I can't go back that far as that firmware version is no longer available for download.

gnox wrote:

As far as I understand, the thermal throttling happens on top of the P-states and is entirely HW (firmware) controlled. Am I right? I'm thinking of buying a cooling station.

As I mentioned, the same happens on Windows without the ASUS software: you also get the 400MHz throttle. With the ASUS software installed you can go over 95C without problems. A cooling pad helps a lot.

Now I'm using the laptop with P-state 0 set to 3.5GHz, turbo disabled and no cooling pad, just browsing and programming, and it sits steady at 48-50C.

I'm running with turbo disabled too now, but without P-state changes. I can even run the Unigine Heaven benchmark from start to end like this, as it never goes above 80C (I get slightly lower FPS, of course).
I guess it's the only workaround for the moment, but I'm happy I can at least do some work on my laptop. It was totally unusable with 208.

Last edited by manu34 (2019-08-18 17:49:17)

Offline
