
#1 2021-11-14 17:20:29

sklorpion
Member
From: Cracovia
Registered: 2013-02-03
Posts: 9

[ Solved ] Random performance AMDGPU 6600 XT

Good day!

I'm seeing bizarre GPU behavior on my Arch system when launching games or benchmarks. Most of the time the card runs normally at 100%, but sometimes (completely at random) it runs at around 30% of its capacity: fps are low and power consumption is about 19 W when it should reach up to 170 W.
Example: I run Unigine Superposition "4K Optimized" and get 55-56 fps, then I quit and start the benchmark again (doing nothing else) and get 17-18 fps. It can be 5 good runs and then 3 bad ones. It is the same with regular gaming: one time it's fine, another time it's unplayable. dmesg is silent, with no new output.

What I did: I've already tried different Mesa versions, kernels and window managers, and a fresh install (3 times). I then thought something might be wrong with my card and tested it under Windows, where it ran like a charm. I went back to Linux and picked Debian (testing), and it worked well, with none of the problems I faced in Arch. I know nothing about Debian and it took me hours to set it up properly.

I really don't know where to look for a solution. I'm now sure it's an Arch problem; can something be wrong with the firmware?

I'd appreciate some hints on where to look. Tell me what you need and I'll provide everything if it helps.

Best regards,
Skłorpią

Last edited by sklorpion (2021-11-17 18:52:44)

Offline

#2 2021-11-14 17:25:50

Head_on_a_Stick
Member
From: Belsize Park
Registered: 2014-02-20
Posts: 8,250
Website

Re: [ Solved ] Random performance AMDGPU 6600 XT

sklorpion wrote:

Dmesg is silent no new output or so

How about the systemd journal?


Velocitas Eradico

Offline

#3 2021-11-14 18:53:14

sklorpion
Member
From: Cracovia
Registered: 2013-02-03
Posts: 9

Re: [ Solved ] Random performance AMDGPU 6600 XT

When I run

 journalctl -f 

I get spammed with

lis 14 19:02:46 archie sudo[25718]: pam_systemd_home(sudo:account): systemd-homed is not available: Unit dbus-org.freedesktop.home1.service not found.
lis 14 19:02:46 archie sudo[25718]:     caca : PWD=/home/caca ; USER=root ; COMMAND=/usr/bin/nvme smart-log /dev/nvme0
lis 14 19:02:46 archie sudo[25718]: pam_unix(sudo:session): session opened for user root(uid=0) by (uid=1000)
lis 14 19:02:46 archie sudo[25718]: pam_unix(sudo:session): session closed for user root
lis 14 19:02:46 archie dbus-daemon[578]: [system] Activating via systemd: service name='org.freedesktop.home1' unit='dbus-org.freedesktop.home1.service' requested by ':1.543' (uid=0 pid=25752 comm="sudo nvme smart-log /dev/nvme0 ")
lis 14 19:02:46 archie dbus-daemon[578]: [system] Activation via systemd failed for unit 'dbus-org.freedesktop.home1.service': Unit dbus-org.freedesktop.home1.service not found.

I don't know how to focus journalctl on the GPU only, but I managed to cut out that noise.
I fixed the above with https://bbs.archlinux.org/viewtopic.php?id=258297 and got down to this:

 
lis 14 19:30:25 archie sudo[27469]:     caca : PWD=/home/caca ; USER=root ; COMMAND=/usr/bin/nvme smart-log /dev/nvme0
lis 14 19:30:25 archie sudo[27469]: pam_unix(sudo:session): session opened for user root(uid=0) by (uid=1000)
lis 14 19:30:25 archie sudo[27469]: pam_unix(sudo:session): session closed for user root

I killed my conky, where I exec

 sudo nvme smart-log /dev/nvme0 | grep temperature | awk '{printf $3}'  

and it stopped spitting out messages 10 times a second. Now I could analyse the journal.

It worked 12 times in a row, but then broke for 1 pass and gave me this:

lis 14 19:42:38 archie systemd[1]: Starting Cleanup of Temporary Directories...
lis 14 19:42:38 archie systemd[1]: systemd-tmpfiles-clean.service: Deactivated successfully.
lis 14 19:42:38 archie systemd[1]: Finished Cleanup of Temporary Directories. 

Again 7 good and 1 bad, with no output at all, so my effort was pointless.
I feel like it happens less often now, but I still know nothing about the cause.
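
One way to focus the journal on the GPU is to follow only kernel messages and filter them by driver name. This is a minimal sketch, not a definitive recipe: the helper name gpu_lines and the search pattern are assumptions, and the pattern may need extending for other message sources.

```shell
# Keep only the lines that mention the GPU driver stack.
# The pattern is an assumption; extend it if your messages use other names.
gpu_lines() {
    grep -iE 'amdgpu|drm|radeon'
}

# Typical use: kernel messages of the current boot, following new ones:
#   journalctl -k -b -f | gpu_lines
```

Restricting journalctl to kernel messages with -k already removes the sudo and dbus entries, so the filter only has to separate GPU lines from other kernel output.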

Offline

#4 2021-11-14 19:02:00

Head_on_a_Stick
Member
From: Belsize Park
Registered: 2014-02-20
Posts: 8,250
Website

Re: [ Solved ] Random performance AMDGPU 6600 XT

Is the amdgpu kernel module in use when the performance is poor? The -k option for lspci will show this.

Off-topic:

sudo nvme smart-log /dev/nvme0 | grep temperature | awk '{printf $3}'

Did you know awk can pattern match?

sudo nvme smart-log /dev/nvme0 | awk '/temperature/{printf $3}'

Velocitas Eradico

Offline

#5 2021-11-14 21:53:06

sklorpion
Member
From: Cracovia
Registered: 2013-02-03
Posts: 9

Re: [ Solved ] Random performance AMDGPU 6600 XT

Is the amdgpu kernel module in use when the performance is poor? The -k option for lspci will show this.

Yes, I ran

watch -n 0.5 "lspci -k | grep 'Kernel driver in use'"

and nothing changes when performance is lower.

I need to find a better way to test this; I need to automate it. I have now done about 20 tests while running radeontop and all were good. I will update in 24h.
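
For an automated test, the bound driver can also be read from sysfs instead of parsing lspci output. A minimal sketch, assuming the card is card0; the helper name driver_in_use is made up for illustration.

```shell
# Print the kernel driver bound to a DRM device directory,
# e.g. /sys/class/drm/card0/device (the card0 index is an assumption).
driver_in_use() {
    # "driver" is a symlink to the driver's directory; its last
    # path component is the driver name.
    basename "$(readlink -f "$1/driver")"
}

# Log the driver once per benchmark run, with a timestamp:
#   echo "$(date +%T) $(driver_in_use /sys/class/drm/card0/device)"
```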

Off-topic:

Did you know awk can pattern match?

- No, thank you.

Offline

#6 2021-11-15 19:14:27

sklorpion
Member
From: Cracovia
Registered: 2013-02-03
Posts: 9

Re: [ Solved ] Random performance AMDGPU 6600 XT

So... I wrote a simple script to test my GPU. It runs Unigine Superposition 50 times; 15 runs are bad and the rest have good fps. The best part (this is hilarious) is that if I start radeontop, I get 50 out of 50 good results. I don't know if it's luck or what; I'm sure I'll go nuts with this one.
During all runs I checked whether the kernel driver in use is amdgpu: always true.

My suspicion is that the GPU sometimes enters some kind of low-power state, or some other crazy state?
It was pure luck; I just had 6 bad runs in a row...
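
A suspicion like this can be checked by logging the driver's power state during a bad run. The sketch below reads two amdgpu sysfs files, power_dpm_force_performance_level and pp_dpm_sclk (where the active core clock level is marked with '*'). The helper name gpu_state and the card0 path are assumptions.

```shell
# Print the forced performance level and the currently selected core clock
# for one device directory, e.g. /sys/class/drm/card0/device.
gpu_state() {
    dev=$1
    level=$(cat "$dev/power_dpm_force_performance_level")
    # In pp_dpm_sclk the active level is the line marked with '*'.
    sclk=$(grep -F '*' "$dev/pp_dpm_sclk")
    echo "level=$level sclk=$sclk"
}

# Sample once a second while the benchmark runs:
#   while sleep 1; do gpu_state /sys/class/drm/card0/device; done
```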

Last edited by sklorpion (2021-11-15 19:24:55)

Offline

#7 2021-11-15 20:05:05

Head_on_a_Stick
Member
From: Belsize Park
Registered: 2014-02-20
Posts: 8,250
Website

Re: [ Solved ] Random performance AMDGPU 6600 XT

We could try tweaking the module parameters, for example

amdgpu.runpm=0

^ That kernel command line parameter disables "runtime power management control for dGPUs in PX/HG laptops", which sounds like it might be relevant.

Check the parameters applied in Debian, either manually (/sys/module/amdgpu/parameters/) or with modinfo(8).
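
To compare the two systems, the parameter values can be dumped to a file on each and then diffed. A minimal sketch; the helper name dump_params and the output file names are hypothetical.

```shell
# Print "name=value" for every readable parameter file in a directory,
# sorted by name, so that two systems can be compared with diff.
dump_params() {
    dir=$1
    for f in "$dir"/*; do
        [ -r "$f" ] || continue
        printf '%s=%s\n' "$(basename "$f")" "$(cat "$f")"
    done | sort
}

# On each system:
#   dump_params /sys/module/amdgpu/parameters > amdgpu-params-arch.txt
# then copy both files to one machine and run diff on them.
```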


Velocitas Eradico

Offline

#8 2021-11-16 04:52:12

Ropid
Member
Registered: 2015-03-09
Posts: 1,069

Re: [ Solved ] Random performance AMDGPU 6600 XT

I'm running a script at boot on my 6700XT that sets things to "3D_FULL_SCREEN" or "COMPUTE" in a file "pp_power_profile_mode" in /sys/class/drm/card0/device/. That's solving some weird stutter issues I have here.

Doing that change manually on the command line looks like this:

cd /sys/class/drm/card0/device/
echo manual | sudo tee power_dpm_force_performance_level
echo 1 | sudo tee pp_power_profile_mode

While you are in that /sys/class/drm/card0/device/ location, take a look at the contents of the pp_power_profile_mode file. There's a '*' next to the name of the profile that's currently used by the driver.
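
Picking out the starred line can be scripted. This is only a sketch: the exact layout of pp_power_profile_mode differs between GPU generations, so the parsing below assumes lines that start with the profile index followed by the profile name, and the helper name active_profile is made up.

```shell
# Print the index and name of the active power profile, i.e. the line
# containing '*' in a pp_power_profile_mode file.
active_profile() {
    awk '/\*/ {
        name = $2
        sub(/[*:].*/, "", name)   # strip a trailing "*" and/or ":"
        print $1, name
    }' "$1"
}

# Example:
#   active_profile /sys/class/drm/card0/device/pp_power_profile_mode
```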

Here's an example shell script that searches for the right sub-folder in /sys/class/drm and does the change:

#!/bin/bash

if (( $UID != 0 )); then
    echo "$0: needs to run as root!" 1>&2
    exit 1
fi

for device in /sys/class/drm/card?/device; do
    if [[ -e "$device"/pp_power_profile_mode ]]; then

        echo manual > "$device"/power_dpm_force_performance_level
        echo 1 > "$device"/pp_power_profile_mode

        # The other power profile modes are:
        #   1 = 3D_FULL_SCREEN
        #   4 = VR
        #   5 = COMPUTE

    fi
done

For my card here, the "3D_FULL_SCREEN" setting mostly solves the stutter issues but I found examples where it didn't fully work. The problem was still there with old games that don't stress the card much. Using the "COMPUTE" setting solves the stutter issues fully for me.

Using "COMPUTE", the card doesn't seem to use any in-between speeds for the core clock. It's either the lowest when idle on the desktop, or it's max clock speed. The memory clock speed still seems to work like normal, with the card using lower speeds for example when scrolling in the web browser. I was uncomfortable with the card's max core boost speed being used with the COMPUTE mode, so I then looked into how undervolting/underclocking works and limited the max core clock of my card by a lot.

There's a bug report that might be about this problem:

https://gitlab.freedesktop.org/drm/amd/-/issues/1500

Here's the documentation for the files in /sys/class/drm:

https://www.kernel.org/doc/html/v5.15/gpu/amdgpu.html

Offline

#9 2021-11-16 21:08:42

sklorpion
Member
From: Cracovia
Registered: 2013-02-03
Posts: 9

Re: [ Solved ] Random performance AMDGPU 6600 XT

Ropid, thank you. The first tests are promising. Great post, links and hints.
I tried switching between "0 BOOTUP_DEFAULT", "1 3D_FULL_SCREEN*", "2 POWER_SAVING", "3 VIDEO" and "5 COMPUTE". There is no difference in performance between them, but when I choose "0 BOOTUP_DEFAULT" I hit my problem from time to time, and when I choose anything else it works well.
Strangely, while running

watch -n 1 cat pp_power_profile_mode

and benchmarking on "0 BOOTUP_DEFAULT" with poor fps, the system does not change pp_power_profile_mode.
I'm 50 runs in and all were correct; I hope to do about 100 more to see if it really helped. YES, after 100 more benchmarks it still works.

Head_on_a_Stick, thank you. This

 amdgpu.runpm=0 

doesn't help. I'll look at modinfo tomorrow.

If someone is looking for a test script that runs Unigine Superposition (windowed, 1920x1080) in a loop, 20 seconds per run, reads the average fps and then exits, this is my rough code:

#!/bin/bash
cd ~/path/where/to/write/screenshots || exit 1
x=1
while [ "$x" -le 50 ]
do
    # notify about the test number
    notify-send "Starting test number $x"
    sleep 1
    # click the RUN button in Unigine Superposition; the values are the X and Y
    # of the button, which you can get with: xdotool getmouselocation --shell
    xdotool mousemove 2450 1352
    xdotool click 1
    # I need 20 sec for Unigine to start and run; depends on overall system speed.
    sleep 20
    # GPU power consumption; this line is optional and can be cut out.
    wartoscWatt=$(sensors | grep 'power1:' | cut -c15-16)

    # take a screenshot cropped to a 130x81 rectangle whose upper left corner
    # is at X:2424 Y:344 on my screen; that area shows FPS, Min, Max, Avg FPS
    import -window root -crop 130x81+2424+344 "$x".png

    # negate the colors (black to white) for better OCR and reduce the quality
    # (which truly is pointless)
    convert "$x".png -quality 80% -negate "$x".png
    # run OCR without error output; it turns 1.png into t1.txt
    tesseract -c debug_file=/dev/null "$x".png t"$x"
    # close Unigine by pressing the Esc key
    xdotool key 'Escape'

    # print the test number, avg fps, and power consumption
    echo "Pass no. $x $(grep Avg t"$x".txt) power consumption $wartoscWatt"
    x=$((x + 1))
done
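
The good/bad tally over many runs can be automated as well. A minimal sketch, assuming the average fps of each run has been collected one value per line; the helper name count_runs and the threshold of 30 fps are assumptions, chosen because good runs here land around 55 fps and bad ones around 17.

```shell
# Count runs at or above an fps threshold ("good") and below it ("bad").
# Reads one average-fps value per line on stdin; blank lines are skipped.
count_runs() {
    awk -v t="$1" '
        NF == 0      { next }
        $1 + 0 >= t  { good++; next }
                     { bad++ }
        END          { printf "good=%d bad=%d\n", good, bad }
    '
}

# Example, with the values stored in a file avg-fps.txt (hypothetical name):
#   count_runs 30 < avg-fps.txt
```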

Last edited by sklorpion (2021-11-16 21:49:03)

Offline

#10 2021-11-17 18:32:43

sklorpion
Member
From: Cracovia
Registered: 2013-02-03
Posts: 9

Re: [ Solved ] Random performance AMDGPU 6600 XT

This is it! The thread can be marked as SOLVED!

I ran 50 tests with "0 BOOTUP_DEFAULT": 10 of them had low fps.
I ran 200 tests with "1 3D_FULL_SCREEN": all good.

Ropid's post #8 is the solution, thank you!

Offline
