
#1 2021-08-03 16:09:52

BlackMastermind
Member
Registered: 2017-01-17
Posts: 45

Kernel 5.13.x amdgpu power cap not working

I have a Radeon RX 6800 XT, and when I'm not using it for gaming, I use it to mine Ethereum. To keep both noise and temperatures reasonable, I manually set a power cap of 160 W and a fixed fan speed of 50%. This usually results in edge/junction temperatures in the mid 50s (°C) and memory temperatures in the low 70s (°C).

Today, when checking my temperatures, I noticed that they were sky high (98°C memory temperature, 89°C edge temperature, 99°C junction temperature) and the GPU was consuming 240W.

To set the power cap, I used the following commands:

# Set manual performance control
echo manual > /sys/class/drm/card0/device/power_dpm_force_performance_level

# Set power cap to 160W (power1_cap takes a value in microwatts)
echo 160000000 > /sys/class/drm/card0/device/hwmon/hwmon*/power1_cap

However, when I read the value back from power1_cap, it returned this:

cat /sys/class/drm/card0/device/hwmon/hwmon*/power1_cap
1271490560

That's a 1271 W power "cap"!

My kernel version is 5.13.7-arch1-1. I tested a few other 5.13 releases as well, and they behave the same way: no matter which value I write to power1_cap, reading it back always returns 1271490560.

Then I tested with 5.12.15.arch1-1, and it behaves as expected:

cat /sys/class/drm/card0/device/hwmon/hwmon*/power1_cap
160000000
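
Since the write itself does not fail on the affected kernels, it may help to read the value back after every write. Below is a minimal sketch of that check, not a definitive fix. The function names `uw_to_w` and `set_and_verify_cap` are hypothetical helpers made up for this example, and it assumes the hwmon glob matches exactly one directory and that the script runs as root.

```shell
#!/bin/sh
# Sketch only: write an amdgpu power cap and verify the value read back.
# Helper names are hypothetical; power1_cap is expressed in microwatts.

# Convert microwatts to whole watts (integer division).
uw_to_w() {
    echo $(( $1 / 1000000 ))
}

# Usage: set_and_verify_cap <path to power1_cap> <cap in watts>
# Returns non-zero if the write fails or the read-back value differs.
set_and_verify_cap() {
    cap_file=$1
    want_uw=$(( $2 * 1000000 ))
    echo "$want_uw" > "$cap_file" || return 1
    got_uw=$(cat "$cap_file")
    if [ "$got_uw" != "$want_uw" ]; then
        echo "cap not applied: wanted $(uw_to_w "$want_uw") W," \
             "read back $(uw_to_w "${got_uw:-0}") W" >&2
        return 1
    fi
}

# Example invocation (as root, assuming a single hwmon directory):
# set_and_verify_cap /sys/class/drm/card0/device/hwmon/hwmon*/power1_cap 160
```

With the value reported above, `uw_to_w 1271490560` prints 1271, so on an affected kernel the check would fail and report the mismatch instead of silently leaving the card uncapped.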

I believe this issue could be very dangerous and damage people's hardware. My GPU ran for several hours at these temperatures before I noticed; I can only hope there is no permanent damage.


#2 2021-08-05 11:12:03

BlackMastermind

Re: Kernel 5.13.x amdgpu power cap not working

Update:

The issue has apparently already been reported here: https://gitlab.freedesktop.org/drm/amd/-/issues/1657
