#1 2018-11-14 00:55:55

Void_Walker
Member
Registered: 2009-06-17
Posts: 53

After last kernel update, AMDGPU gaming performance issues

After updating the kernel to 4.19 today (along with a lot of other packages via -Syu), my Vega 56 no longer runs games at playable framerates. It behaves as if it were capped at power state 1 on the core and state 2 on the memory
(it runs at 991 MHz core and 700 MHz memory).

I don't even know where to begin the debugging process.
This is the output while playing dota:

====================    ROCm System Management Interface    ====================
================================================================================
GPU[0] 		: GPU ID: 0x687f
================================================================================
================================================================================
GPU[0] 		: Temperature: 52.0c
================================================================================
================================================================================
GPU[0] 		: GPU Clock Level: 1 (991Mhz)
GPU[0] 		: GPU Memory Clock Level: 2 (700Mhz)
================================================================================
================================================================================
GPU[0] 		: Fan Level: 33 (12.94)%
================================================================================
================================================================================
GPU[0] 		: Current PowerPlay Level: auto
================================================================================
================================================================================
GPU[0] 		: Current GPU OverDrive value: 0%
================================================================================
================================================================================
GPU[0] 		: 
NUM        MODE_NAME BUSY_SET_POINT FPS USE_RLC_BUSY MIN_ACTIVE_LEVEL
  0 3D_FULL_SCREEN :             70  60          1              3
  1   POWER_SAVING :             90  60          0              0
  2          VIDEO*:             70  60          0              0
  3             VR :             70  90          0              0
  4        COMPUTE :             30  60          0              6
  5         CUSTOM :              0   0          0              0
================================================================================
================================================================================
GPU[0] 		: Average GPU Power: 17.0 W
================================================================================
================================================================================
GPU[0] 		: Supported GPU clock frequencies on GPU0
GPU[0] 		: 0: 852Mhz 
GPU[0] 		: 1: 991Mhz *
GPU[0] 		: 2: 1138Mhz 
GPU[0] 		: 3: 1269Mhz 
GPU[0] 		: 4: 1312Mhz 
GPU[0] 		: 5: 1474Mhz 
GPU[0] 		: 6: 1538Mhz 
GPU[0] 		: 7: 1590Mhz 
GPU[0] 		: 
GPU[0] 		: Supported GPU Memory clock frequencies on GPU0
GPU[0] 		: 0: 167Mhz 
GPU[0] 		: 1: 500Mhz 
GPU[0] 		: 2: 700Mhz *
GPU[0] 		: 3: 800Mhz 
GPU[0] 		: 
================================================================================
====================           End of ROCm SMI Log          ====================

As you can see, the power draw is 17 W instead of the 165 W limit set by the card's BIOS.
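
The active clock level can also be read straight from sysfs without rocm-smi. This is only a minimal sketch: it assumes the card is card0 and that the amdgpu driver exposes its clock table in pp_dpm_sclk, in the same "level: clock *" layout as the list above. The helper name active_level is made up for illustration.

```shell
# Hypothetical helper: print the active level from a pp_dpm_* style file,
# where each line looks like "1: 991Mhz *" and "*" marks the current level.
# $1 = file to read (defaults to card0's core clock table; an assumption).
active_level() {
    file=${1:-/sys/class/drm/card0/device/pp_dpm_sclk}
    # Keep only the starred line, strip the colon from the level number.
    awk '/\*/ { sub(":", "", $1); print "level " $1 ", " $2 }' "$file"
}
```

Calling active_level with no argument reads the core clock table; passing /sys/class/drm/card0/device/pp_dpm_mclk should do the same for the memory clock, if that file is present.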

It was working fine yesterday.
I tested Dark Souls 3, Shadow of the Tomb Raider and Dota 2.
The first two ran via Proton; Dota 2 was native.
Dark Souls 3 and Dota 2 ran fine yesterday.
Also, the 'power draw LEDs' on the card show an insignificant power draw (2 of 8 lit).

Thank you for any suggestions.

Last edited by Void_Walker (2018-11-14 01:23:35)

#2 2018-11-14 01:09:25

progandy
Member
Registered: 2012-05-17
Posts: 5,184

Re: After last kernel update, AMDGPU gaming performance issues


| alias CUTF='LANG=en_XX.UTF-8@POSIX ' |

#3 2018-11-14 01:26:21

Void_Walker
Member
Registered: 2009-06-17
Posts: 53

Re: After last kernel update, AMDGPU gaming performance issues

Thank you.
Yes, this seems to be it. The first one is really identical to my problem.
So the options are to compile a custom kernel or wait for a fix?
Alternatively, use an older kernel?

#4 2018-11-14 07:57:51

progandy
Member
Registered: 2012-05-17
Posts: 5,184

Re: After last kernel update, AMDGPU gaming performance issues

Compiling a custom kernel doesn't necessarily fix it. You need a patch first, or maybe you are lucky and it is already fixed in 4.20. 4.20-rc2 is available in the AUR as linux-mainline.


#5 2018-11-14 13:44:44

ibrokemypie
Member
Registered: 2016-06-27
Posts: 16

Re: After last kernel update, AMDGPU gaming performance issues

Could this have anything to do with artifacting?
https://bugs.freedesktop.org/show_bug.cgi?id=108671

#6 2018-11-16 10:08:08

jasondaigo
Member
Registered: 2017-04-21
Posts: 9

Re: After last kernel update, AMDGPU gaming performance issues

i noticed while playing dota it consumes very little watt. about 53 i think. but still going over 60°C on my vega56 (with strangle 60 limit). however when i start a benchmark (superposition gets the same results as always, shaderclock 1490mhz) or mining (same as before, 33Mh/s ethminer) it goes all the way up to 180 watt or 165 watt. so i have a problem specific to dota. that game also started locking up my entire pc 2 weeks ago. sound remains but video stops.

that game worked flawlessly for 2 to 3 years for me; but i switched recently from 1060 to v56.

(sadly) i dont have any other games that doesnt work too xD
mankind divided for example runs with 60fps locked and 60°C. but i will check the power consumption in multiple scenarios tonight.
now i wanna know :-)

Last edited by jasondaigo (2018-11-16 10:08:32)

#7 2018-11-16 14:53:23

Void_Walker
Member
Registered: 2009-06-17
Posts: 53

Re: After last kernel update, AMDGPU gaming performance issues

Well, rolling back to kernel 4.18.16 (if I am not wrong) did solve the low power draw issue; now all the games do torture the card. So there has to be some regression in the kernel's AMD driver.
If you have vsync enabled, then it is no wonder that the power draw is that low. Even with the bug, Dota 2 ran just fine (70-80 FPS).

As for crashing, what are your system specs? I noticed that Dota is extremely sensitive to segfaults. Do you have a Ryzen CPU? Do you have the Ryzen segfault bug? Is your RAM overclocked (even via XMP)? I had the same issues not so long ago. Then I noticed that I do indeed have a buggy CPU, and that either the XMP profile was wrong, or my RAM sticks are flawed (memtest ran OK), or my board has a shitty/old BIOS (which is the case).
To check for segfaults, use

dmesg -wH
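
As a rough sketch, the live log can also be narrowed down to segfault lines only. The helper name filter_segfaults is made up for illustration, and --line-buffered assumes GNU grep:

```shell
# Hypothetical helper: keep only log lines that mention a segfault.
# grep -i matches case-insensitively; --line-buffered makes matches
# show up immediately when the input is a live stream.
filter_segfaults() {
    grep -i --line-buffered 'segfault'
}
```

It would be used as `dmesg -wH | filter_segfaults`. Reading the kernel log may need root if kernel.dmesg_restrict is set.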

OT: How did you get the power draw on the Vega 56 to 180 W? The power BIOS has a limit of 165 W and turbo mode is available on Linux. Did you patch your kernel?

Last edited by Void_Walker (2018-11-16 14:54:29)

#8 2018-11-16 17:43:53

jasondaigo
Member
Registered: 2017-04-21
Posts: 9

Re: After last kernel update, AMDGPU gaming performance issues

u r right about dota; i was not using vsync but strangle; so the game only needs 15 watt for 60 fps then, alright; what i dont understand is that the card takes so little power and runs at 800mhz or even less for that, and still gets to 58 degrees like im running a benchmark. im not an engineer obviously but how the fuck is that correct?

for the lock ups i didnt find any log yet that shows me any issue whatsoever. the oc is mild at best. i can run any torture test all night and i even set the dimm mhz a good amount lower than the max possible.
i have a ryzen 2600x (only pbo). is that still a thing for zen+?
however with the 1060 before no problem; thats clearly not the issue here.

180w is the primary bios limit for the pulse, 165w is for 2nd bios; i think thats pretty standard no?

edit: if i limit deus ex to 80 fps and im in an area where i only get like 60 fps, it runs at 1400mhz but it still does not draw more than 90 watt; im pretty sure thats incorrect too.
if i force max mhz 1500+ it runs with 114 w consumption and not a single fps more :-)

Last edited by jasondaigo (2018-11-16 17:56:57)

#9 2018-11-16 19:01:49

ewaller
Administrator
From: Pasadena, CA
Registered: 2009-07-13
Posts: 19,739

Re: After last kernel update, AMDGPU gaming performance issues

Jasondaigo: Your shift key is broken 'I' MHz DIMM Watt Ryzen BIOS

I agree there must be something else going on.  What about active cooling?  Any chance some cooling device is waiting until the temperature rises before coming on line?
Where is the temperature measured, and what is it measuring?  Junction (die) temperature? Case temperature? heatsink temperature? Exhaust cooling air temperature?

If it is junction temperature, 56 °C is not outlandish. That is only a 33 ° rise above ambient -- that is only a 12% increase. I would expect that part to be good to at least 80°, if not 100°.


Nothing is too wonderful to be true, if it be consistent with the laws of nature -- Michael Faraday
Sometimes it is the people no one can imagine anything of who do the things no one can imagine. -- Alan Turing
---
How to Ask Questions the Smart Way

#10 2018-11-16 19:19:26

WorMzy
Forum Moderator
From: Scotland
Registered: 2010-06-16
Posts: 11,783
Website

Re: After last kernel update, AMDGPU gaming performance issues

To further ewaller's point about capitalisation, please read https://wiki.archlinux.org/index.php/Co … ow_to_post, specifically the second bullet point, and apply it to your posts going forward.

EDIT: And please use your original account, 'friet'. Duplicate accounts are not permitted. Locked old account on appeal.

Last edited by WorMzy (2018-11-17 14:07:45)


Sakura:-
Mobo: MSI MAG X570S TORPEDO MAX // Processor: AMD Ryzen 9 5950X @4.9GHz // GFX: AMD Radeon RX 5700 XT // RAM: 32GB (4x 8GB) Corsair DDR4 (@ 3000MHz) // Storage: 1x 3TB HDD, 6x 1TB SSD, 2x 120GB SSD, 1x 275GB M2 SSD

Making lemonade from lemons since 2015.

#11 2018-12-08 19:12:09

Void_Walker
Member
Registered: 2009-06-17
Posts: 53

Re: After last kernel update, AMDGPU gaming performance issues

OK, I previously managed to work around the bug by using the linux-mainline kernel (4.20rc4), but 4.20rc5 reintroduced it and now I have no idea how to fix it.
The graphics card behaves as described in the opening post. It will not go much over 60 W and stays stuck on power state 0 to 1.
I tried the kernels linux-mainline 4.20rc5, linux-amd-git, and linux from core. All of them have this weird regression. The last one that did not have this issue was linux-mainline 4.20rc4.
Is there a way to reinstall rc4?
Is there a way to debug this, or to figure out why this happens at all?

OT, unrelated to the problem:
On the other hand, I can now force a higher power limit on the Vega card, but it does not help, since it refuses to go over power state 1 (only 2 lights lit on the Vega 56 "tachometer").
After enabling OC via kernel parameter I can now run:

echo 220000000 > /sys/class/drm/card0/device/hwmon/hwmon2/power1_cap

which should allow a power draw of up to 220 W.
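
The value written to power1_cap is in microwatts, and the hwmon number (hwmon2 here) is not guaranteed to stay the same across boots or machines. A minimal sketch that does the conversion and looks the directory up with a glob, assuming the card is card0; the function names are made up for illustration:

```shell
# Convert whole watts to the microwatt value that power1_cap expects.
watts_to_microwatts() {
    echo $(( $1 * 1000000 ))
}

# Hypothetical helper: write a power cap to every power1_cap file of a card.
# $1 = cap in watts
# $2 = device directory (defaults to card0; an assumption)
set_power_cap() {
    watts=$1
    dev=${2:-/sys/class/drm/card0/device}
    for cap in "$dev"/hwmon/hwmon*/power1_cap; do
        # Skip the unexpanded pattern when no hwmon directory exists.
        [ -e "$cap" ] || continue
        watts_to_microwatts "$watts" > "$cap"
    done
}
```

Run as root, `set_power_cap 220` should be equivalent to the echo above. Whether the driver accepts the value still depends on the limits the card reports.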

Last edited by Void_Walker (2018-12-08 19:12:31)

#12 2018-12-08 19:22:46

loqs
Member
Registered: 2014-03-06
Posts: 17,192

Re: After last kernel update, AMDGPU gaming performance issues

Bisecting between 4.20-rc4 and 4.20-rc5 should be the quickest way to find the cause of the regression.
Edit:
This was the merge commit for drm for rc5 https://git.kernel.org/pub/scm/linux/ke … ac749e6b19
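
A minimal sketch of how such a bisect could be started, assuming a clone of the mainline kernel tree where the tags v4.20-rc4 and v4.20-rc5 exist. The helper name start_bisect is made up, and limiting the search to a path is optional:

```shell
# Hypothetical helper: start a bisect between a good and a bad revision.
# $1 = known-good revision, $2 = known-bad revision
# remaining arguments = optional paths that limit the search
start_bisect() {
    good=$1
    bad=$2
    shift 2
    # git bisect start takes the bad revision first, then the good one.
    git bisect start "$bad" "$good" -- "$@"
}
```

Inside the kernel checkout this would be `start_bisect v4.20-rc4 v4.20-rc5 drivers/gpu/drm`. The path only helps if the regression really is in the drm code. After each build and test boot, mark the result with `git bisect good` or `git bisect bad`, and finish with `git bisect reset`.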

Last edited by loqs (2018-12-08 19:24:55)

#13 2018-12-08 20:45:11

Void_Walker
Member
Registered: 2009-06-17
Posts: 53

Re: After last kernel update, AMDGPU gaming performance issues

Something else has to be wrong.
I rolled back to rc4, and the graphics card still won't go over power state 1.
Dota and DS3 are still playable, but DS3 isn't really that great, with a far too variable frame rate.

Any idea how to debug amdgpu?
