#1 2024-08-08 17:13:58

baal
Member
Registered: 2023-11-02
Posts: 32

Regression; Video instability after *recent* update, AMDGPU

Hi,

After a *recent* update I have been experiencing intermittent video-related instability. After a fresh reboot everything looks fine, but after *some* time, or after suspend/resume cycles, problems appear.

I start seeing artifacts in hardware-accelerated playback of H.264 videos in Firefox, see below:

20240808-173956.jpg

20240808-174030.jpg

Sometimes X hangs and restarts by itself so that it throws me to the X login.
Sometimes the computer freezes with a black screen and I have to hard reset.

The troubles started after an update. The regression happened in the past month or so. It is hard to pinpoint which exact update caused it as I do not update too often.

Does anyone experience similar issues? Any help is appreciated. Cheers.

vainfo 
Trying display: wayland
Trying display: x11
vainfo: VA-API version: 1.22 (libva 2.22.0)
vainfo: Driver version: Mesa Gallium driver 24.1.5-arch1.1 for ATI FirePro W5000 (radeonsi, pitcairn, LLVM 18.1.8, DRM 3.57, 6.10.3-arch1-2)
vainfo: Supported profile and entrypoints
      VAProfileMPEG2Simple            :	VAEntrypointVLD
      VAProfileMPEG2Main              :	VAEntrypointVLD
      VAProfileVC1Simple              :	VAEntrypointVLD
      VAProfileVC1Main                :	VAEntrypointVLD
      VAProfileVC1Advanced            :	VAEntrypointVLD
      VAProfileH264ConstrainedBaseline:	VAEntrypointVLD
      VAProfileH264Main               :	VAEntrypointVLD
      VAProfileH264High               :	VAEntrypointVLD
      VAProfileNone                   :	VAEntrypointVideoProc

#2 2024-08-08 19:08:34

cryptearth
Member
Registered: 2024-02-03
Posts: 1,110

Re: Regression; Video instability after *recent* update, AMDGPU

looks like a broken gpu to me

#3 2024-08-10 13:31:30

baal
Member
Registered: 2023-11-02
Posts: 32

Re: Regression; Video instability after *recent* update, AMDGPU

Sometimes it is the easiest to just blame the hardware, isn't it? I downgraded to mid-May-era drivers, kernel, etc., and now the system seems to be working fine.

vainfo 
Trying display: wayland
Trying display: x11
vainfo: VA-API version: 1.21 (libva 2.21.0)
vainfo: Driver version: Mesa Gallium driver 24.0.6-arch1.2 for ATI FirePro W5000 (radeonsi, pitcairn, LLVM 17.0.6, DRM 3.57, 6.8.9-arch1-2)
vainfo: Supported profile and entrypoints
      VAProfileMPEG2Simple            :	VAEntrypointVLD
      VAProfileMPEG2Main              :	VAEntrypointVLD
      VAProfileVC1Simple              :	VAEntrypointVLD
      VAProfileVC1Main                :	VAEntrypointVLD
      VAProfileVC1Advanced            :	VAEntrypointVLD
      VAProfileH264ConstrainedBaseline:	VAEntrypointVLD
      VAProfileH264Main               :	VAEntrypointVLD
      VAProfileH264High               :	VAEntrypointVLD
      VAProfileNone                   :	VAEntrypointVideoProc
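
To see at a glance what changed between the failing setup in post #1 and this working one, the two "Driver version" lines can be compared field by field. The following is a minimal Python sketch, not a tool anyone in the thread used; the regular expression and the helper names (parse_driver_line, changed_fields) are assumptions for illustration, derived only from the two lines quoted here.

```python
import re

# Pattern inferred from the vainfo lines in this thread; other drivers or
# Mesa versions may format the string differently.
DRIVER_RE = re.compile(
    r"Mesa Gallium driver (?P<mesa>\S+) for (?P<device>.+?) "
    r"\((?P<gallium>\w+), (?P<chip>\w*), LLVM (?P<llvm>[\d.]+), "
    r"DRM (?P<drm>[\d.]+), (?P<kernel>[^)]+)\)"
)


def parse_driver_line(line):
    """Split a vainfo 'Driver version' line into named fields."""
    match = DRIVER_RE.search(line)
    if match is None:
        raise ValueError("unrecognised driver line: " + line)
    return match.groupdict()


def changed_fields(old, new):
    """Return {field: (old_value, new_value)} for every field that differs."""
    return {key: (old[key], new[key]) for key in old if old[key] != new[key]}


# Both lines are copied from posts #3 (working) and #1 (failing).
working = ("vainfo: Driver version: Mesa Gallium driver 24.0.6-arch1.2 for "
           "ATI FirePro W5000 (radeonsi, pitcairn, LLVM 17.0.6, DRM 3.57, "
           "6.8.9-arch1-2)")
failing = ("vainfo: Driver version: Mesa Gallium driver 24.1.5-arch1.1 for "
           "ATI FirePro W5000 (radeonsi, pitcairn, LLVM 18.1.8, DRM 3.57, "
           "6.10.3-arch1-2)")

diff = changed_fields(parse_driver_line(working), parse_driver_line(failing))
for field, (before, after) in diff.items():
    print(f"{field}: {before} -> {after}")
```

Mesa, LLVM and the kernel all differ between the two setups while the DRM interface version stays at 3.57, so this comparison alone does not single out one component.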

#4 2024-08-12 15:36:52

noctavian
Member
Registered: 2013-07-11
Posts: 17

Re: Regression; Video instability after *recent* update, AMDGPU

I've been experiencing issues with my AMD iGPU (Ryzen 9 7900 CPU).

After kernel 6.10 was released, my system started to freeze occasionally when playing videos in either VLC or Firefox, and I've had to hard reset the system. These freezes are fairly frequent, although I don't know how to reproduce them. I've also been experiencing some minor artifacts when playing videos. Looking at the systemd logs, it seems the amdgpu driver hits a bug when trying to reset the GPU, hence the whole system freezes rather than just the application or the GNOME session crashing. I'm also seeing a lot of DMCUB and ring vcn_dec_0 timeout related errors.

Is there a bug report somewhere upstream? I couldn't find one.
After further digging, it looks like this might be the fix, coming to an updated Mesa version in a few days?

Aug 12 17:38:26 takina kernel: [drm] Register(0) [mmUVD_POWER_STATUS] failed to reach value 0x00000001 != 0x00000002n
Aug 12 17:38:26 takina kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring vcn_dec_0 timeout, signaled seq=261292, emitted seq=261292
Aug 12 17:38:26 takina kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* Process information: process RDD Process pid 5690 thread firefox:cs0 pid 10943
Aug 12 17:38:26 takina kernel: amdgpu 0000:13:00.0: amdgpu: GPU reset begin!
Aug 12 17:38:26 takina kernel: ------------[ cut here ]------------
Aug 12 17:38:26 takina kernel: WARNING: CPU: 20 PID: 110804 at drivers/gpu/drm/amd/amdgpu/amdgpu_irq.c:630 amdgpu_irq_put+0x46/0x70 [amdgpu]
Aug 12 17:38:26 takina kernel: Modules linked in: veth nf_conntrack_netlink xt_nat xt_tcpudp xt_conntrack xt_MASQUERADE xfrm_user xfrm_algo xt_addrtype nft_c>
Aug 12 17:38:26 takina kernel:  crct10dif_pclmul crc32_pclmul crc32c_intel polyval_clmulni polyval_generic amdxcp gf128mul drm_ttm_helper ghash_clmulni_intel>
Aug 12 17:38:26 takina kernel: CPU: 20 PID: 110804 Comm: kworker/u96:1 Not tainted 6.10.3-arch1-2 #1 20bffa7dc84b9a89fd543afbd712f49dca71b693
Aug 12 17:38:26 takina kernel: Hardware name: Gigabyte Technology Co., Ltd. B650 GAMING X AX/B650 GAMING X AX, BIOS F30 05/22/2024
Aug 12 17:38:26 takina kernel: Workqueue: amdgpu-reset-dev drm_sched_job_timedout [gpu_sched]
Aug 12 17:38:26 takina kernel: RIP: 0010:amdgpu_irq_put+0x46/0x70 [amdgpu]
[...]
Aug 12 17:38:26 takina kernel: amdgpu 0000:13:00.0: amdgpu: MODE2 reset
Aug 12 17:38:26 takina kernel: amdgpu 0000:13:00.0: amdgpu: GPU reset succeeded, trying to resume
Aug 12 17:38:26 takina kernel: [drm] PCIE GART of 1024M enabled (table at 0x000000F41FC00000).
Aug 12 17:38:26 takina kernel: [drm] VRAM is lost due to GPU reset!
Aug 12 17:38:26 takina kernel: amdgpu 0000:13:00.0: amdgpu: PSP is resuming...
Aug 12 17:38:26 takina kernel: amdgpu 0000:13:00.0: amdgpu: reserve 0xa00000 from 0xf41e000000 for PSP TMR
Aug 12 17:38:26 takina kernel: amdgpu 0000:13:00.0: amdgpu: RAS: optional ras ta ucode is not available
Aug 12 17:38:26 takina kernel: amdgpu 0000:13:00.0: amdgpu: RAP: optional rap ta ucode is not available
Aug 12 17:38:26 takina kernel: amdgpu 0000:13:00.0: amdgpu: SECUREDISPLAY: securedisplay ta ucode is not available
Aug 12 17:38:26 takina kernel: amdgpu 0000:13:00.0: amdgpu: SMU is resuming...
Aug 12 17:38:26 takina kernel: amdgpu 0000:13:00.0: amdgpu: SMU is resumed successfully!
Aug 12 17:38:26 takina kernel: [drm] DMUB hardware initialized: version=0x05001900
Aug 12 17:38:26 takina kernel: amdgpu 0000:13:00.0: [drm] *ERROR* dc_dmub_srv_log_diagnostic_data: DMCUB error - collecting diagnostic data
Aug 12 17:38:26 takina kernel: [drm] kiq ring mec 2 pipe 1 q 0
Aug 12 17:38:26 takina kernel: amdgpu 0000:13:00.0: [drm:amdgpu_ring_test_helper [amdgpu]] *ERROR* ring vcn_dec_0 test failed (-110)
Aug 12 17:38:26 takina kernel: [drm:amdgpu_device_ip_resume_phase2 [amdgpu]] *ERROR* resume of IP block <vcn_v3_0> failed -110
Aug 12 17:38:26 takina kernel: amdgpu 0000:13:00.0: amdgpu: GPU reset(2) failed
Aug 12 17:38:26 takina kernel: amdgpu 0000:13:00.0: amdgpu: GPU reset end with ret = -110
Aug 12 17:38:26 takina kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* GPU Recovery Failed: -110
Aug 12 17:38:27 takina kernel: [drm] Register(0) [mmUVD_POWER_STATUS] failed to reach value 0x00000001 != 0x00000002n
Aug 12 17:38:28 takina kernel: [drm] Register(0) [mmUVD_RBC_RB_RPTR] failed to reach value 0x00000010 != 0x00000000n
Aug 12 17:38:36 takina kernel: [drm] Register(0) [mmUVD_POWER_STATUS] failed to reach value 0x00000001 != 0x00000002n
Aug 12 17:38:36 takina kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring vcn_dec_0 timeout, signaled seq=261292, emitted seq=261292
Aug 12 17:38:36 takina kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* Process information: process RDD Process pid 5
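
A long journal like the one above is easier to compare between reports once it is boiled down to which ring timed out and whether recovery failed. Below is a minimal Python sketch of that idea. The two regular expressions and the summarize helper are assumptions written against the exact messages in this post; other kernel versions may word them differently.

```python
import re
from collections import Counter

# Patterns written against the messages quoted in this post.
RING_TIMEOUT = re.compile(r"\*ERROR\* ring (\S+) timeout")
RECOVERY_FAILED = re.compile(r"GPU Recovery Failed: (-?\d+)")


def summarize(log_text):
    """Count ring timeouts per ring and collect recovery error codes."""
    timeouts = Counter(RING_TIMEOUT.findall(log_text))
    failures = RECOVERY_FAILED.findall(log_text)
    return timeouts, failures


# Three lines copied from the journal above.
SAMPLE = """\
Aug 12 17:38:26 takina kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring vcn_dec_0 timeout, signaled seq=261292, emitted seq=261292
Aug 12 17:38:26 takina kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* GPU Recovery Failed: -110
Aug 12 17:38:36 takina kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring vcn_dec_0 timeout, signaled seq=261292, emitted seq=261292
"""

timeouts, failures = summarize(SAMPLE)
print(dict(timeouts))  # rings that timed out and how often
print(failures)        # error codes; 110 is ETIMEDOUT on Linux
```

To run it against a real log, the output of something like journalctl -k could be passed to summarize instead of SAMPLE.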

Last edited by noctavian (2024-08-12 16:04:22)

#5 2024-08-13 00:56:50

justJack
Member
Registered: 2024-05-03
Posts: 5

Re: Regression; Video instability after *recent* update, AMDGPU

I also have this issue on a HD7770, artifacts and freezes when playing videos using VAAPI.
It started with kernel 6.10 and it's still not fixed in 6.10.4. No problems on LTS kernel.

#6 2024-08-13 01:08:43

cryptearth
Member
Registered: 2024-02-03
Posts: 1,110

Re: Regression; Video instability after *recent* update, AMDGPU

justJack wrote:

I also have this issue on a HD7770

RX 7000 and HD 7000 are two very different technologies
RX 7000 is from Q4/22
HD 7000 is from Q4/11 - an architecture 12 years old now - and the oldest series supported by amdgpu (only if forced, otherwise defaults to old radeon driver)

#7 2024-08-13 01:16:26

dobie2564
Member
Registered: 2011-09-05
Posts: 27

Re: Regression; Video instability after *recent* update, AMDGPU

This started happening when I upgraded to 6.10.3 and got worse with 6.10.4. I've downgraded to 6.10.2 and it seems to be stable now. The issue leads to random system reboots, which I reported in the other topic.
                  -`                    tim@rosetta
                  .o+`                   -----------
                 `ooo/                   OS: Arch Linux x86_64
                `+oooo:                  Host: A7
               `+oooooo:                 Kernel: 6.10.2-arch1-2
               -+oooooo+:                Uptime: 3 hours, 10 mins
             `/:-:++oooo+:               Packages: 1104 (pacman), 7 (flatpak)
            `/++++/+++++++:              Shell: bash 5.2.32
           `/++++++++++++++:             Resolution: 3840x2160
          `/+++ooooooooooooo/`           DE: GNOME 46.4
         ./ooosssso++osssssso+`          WM: Mutter
        .oossssso-````/ossssss+`         WM Theme: Adwaita
       -osssssso.      :ssssssso.        Theme: Adwaita-dark [GTK2/3]
      :osssssss/        osssso+++.       Icons: Adwaita [GTK2/3]
     /ossssssss/        +ssssooo/-       Terminal: gnome-terminal
   `/ossssso+/:-        -:/+osssso+-     CPU: AMD Ryzen 9 7940HS w/ Radeon 780M Graphics (16) @ 5.263GHz
  `+sso+:-`                 `.-/+oso:    GPU: AMD ATI 65:00.0 Phoenix1
`++:.                           `-/+/   Memory: 5505MiB / 31377MiB

Last edited by dobie2564 (2024-08-13 01:16:58)

#8 2024-08-13 06:49:06

cryptearth
Member
Registered: 2024-02-03
Posts: 1,110

Re: Regression; Video instability after *recent* update, AMDGPU

@OP

baal wrote:
ATI FirePro W5000

as I see it just now - W5000 - yet another card from 2012
guys - what's wrong with all of you? you run decade-old GPUs and expect them to work fine with the latest 2024 drivers designed for CURRENT hardware
also

baal wrote:

Sometimes it is the easiest to just blame the hardware isn't it.

well - when we talk about a 12 year old gpu - then, yea, it's quite possible that in fact it really is faulty hardware

TLDR: please either upgrade your GPUs or use older drivers / hardware-era appropriate linux versions
you can't expect that driver devs still have 10+ year old hardware and do test them if they break by changes meant to improve current hardware

#9 2024-08-13 07:04:12

seth
Member
Registered: 2012-09-03
Posts: 60,080

Re: Regression; Video instability after *recent* update, AMDGPU

Sounds eerily related to https://bbs.archlinux.org/viewtopic.php … 5#p2189645
The thread has a bunch of bisection kernels, you could try whether your issue falls into the same sector.

#10 2024-08-13 09:58:51

justJack
Member
Registered: 2024-05-03
Posts: 5

Re: Regression; Video instability after *recent* update, AMDGPU

Yes, HD7770, a 12-year-old card, and it's not faulty hardware since it works fine on the LTS kernel.

AMDGPU is forced, but it worked fine until kernel 6.10, and I wouldn't have brought up the issue if I hadn't seen other people with a similar problem.

#11 2024-08-13 10:50:52

cryptearth
Member
Registered: 2024-02-03
Posts: 1,110

Re: Regression; Video instability after *recent* update, AMDGPU

have you tried with the default radeon driver meant for this hardware? there's a reason why the switch from radeon to amdgpu has to be enforced by kernel parameter
point is: current amdgpu development seems not to be compatible with decade-old hardware - why should development for current hardware be slowed down by a few who still use hardware from 10+ years ago which no dev has at hand anymore to test against?
to me demanding a regression fix just for a few oldtimers isn't justified when compared to the vast majority of users with current hardware which the current amdgpu is meant for
if you want to keep using your old hardware use time-appropriate software - like some pre-previous debian from the 4.x or 5.x era
your hardware is just too old for arch - deal with it - and don't stop others

#12 2024-08-13 11:47:48

WorMzy
Administrator
From: Scotland
Registered: 2010-06-16
Posts: 12,499

Re: Regression; Video instability after *recent* update, AMDGPU

Does downgrading vulkan-radeon to 1:24.1.2-1 resolve the problem?

pacman -U https://archive.archlinux.org/packages/v/vulkan-radeon/vulkan-radeon-1%3A24.1.2-1-x86_64.pkg.tar.zst
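
The %3A in that URL is the percent-encoded colon of the package epoch (1:24.1.2-1). For anyone wanting to try other archived versions, here is a small Python sketch that rebuilds the URL from its parts. The path layout is inferred from this single URL, and archive_url is a made-up helper name, so treat both as assumptions and check that the file exists in the archive before relying on it.

```python
from urllib.parse import quote

ARCHIVE = "https://archive.archlinux.org/packages"


def archive_url(name, version, arch="x86_64", ext="pkg.tar.zst"):
    """Build an archive URL using the layout seen in the post above."""
    # safe="" makes quote() also encode ":" (the epoch separator) as %3A.
    filename = f"{name}-{quote(version, safe='')}-{arch}.{ext}"
    # Packages appear to be grouped by the first letter of their name.
    return f"{ARCHIVE}/{name[0]}/{name}/{filename}"


url = archive_url("vulkan-radeon", "1:24.1.2-1")
print(url)
```

The result can then be handed to pacman -U exactly as in the command above.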

Sakura:-
Mobo: MSI MAG X570S TORPEDO MAX // Processor: AMD Ryzen 9 5950X @4.9GHz // GFX: AMD Radeon RX 5700 XT // RAM: 32GB (4x 8GB) Corsair DDR4 (@ 3000MHz) // Storage: 1x 3TB HDD, 6x 1TB SSD, 2x 120GB SSD, 1x 275GB M2 SSD

Making lemonade from lemons since 2015.

#13 2024-08-13 13:04:31

seth
Member
Registered: 2012-09-03
Posts: 60,080

Re: Regression; Video instability after *recent* update, AMDGPU

Sometimes X hangs and restarts by itself so that it throws me to the X login.
Sometimes the computer freezes with a black screen and I have to hard reset.

Do you use xf86-video-amdgpu or the modesetting driver and does it make any difference?

@noctavian & esp. @dobie2564 you may face a completely different issue, see https://bbs.archlinux.org/viewtopic.php … 5#p2189645

@cryptearth

can't expect that driver devs still have 10+ year old hardware and do test them if they break by changes meant to improve current hardware

no, but that's completely unrelated to the question whether you can expect the driver to be maintained and bugs be fixed and

your hardware is just too old for arch - deal with it - and don't stop others

I suggest we're not going there again (since we know where it's gonna end…), but just fyi, right now there's -most likely- an AMDGPU related situation that's *way* worse than the one in this thread and affects newer/st hardware. In-tree drivers are supposed to be maintained, if something falls out of maintenance it's also gonna fall out of the kernel tree. "Your luser hardware is too old lol" is not an available bug resolution.

#14 2024-08-13 13:59:22

justJack
Member
Registered: 2024-05-03
Posts: 5

Re: Regression; Video instability after *recent* update, AMDGPU

cryptearth wrote:

have you tried with the default radeon driver meant for this hardware? there's a reason why to switch from radeon to amdgpu..s

The radeon driver sucks: VAAPI hardware acceleration didn't work, and neither did 3D acceleration for older games and emulators.

Again, I'm not complaining, this PC is used mostly for some office stuff and for streaming movies and emulators to the TV. It was my first PC and I'm kinda attached to it and I don't want to throw it away while it works. I have a gaming PC which has new hardware, but I don't have an AMD GPU to test the issue.

I find it funny that I ran from Windows with its forced TPM and hardware requirements to find Linux is kinda the same. I mean the distributions where you can do some up-to-date stuff, like using hardware acceleration, playing some old games using Proton, etc. Also the people in the community are the same, kinda salty because I don't upgrade my hardware.

#15 2024-08-13 14:26:02

baal
Member
Registered: 2023-11-02
Posts: 32

Re: Regression; Video instability after *recent* update, AMDGPU

Thank you for all your input and for confirming similar behaviour in some cases.

To try WorMzy's suggestion, I upgraded my system to current. Since my very first post a driver and a kernel update took place:

vainfo
Trying display: wayland
Trying display: x11
vainfo: VA-API version: 1.22 (libva 2.22.0)
vainfo: Driver version: Mesa Gallium driver 24.1.5-arch1.2 for ATI FirePro W5000 (radeonsi, pitcairn, LLVM 18.1.8, DRM 3.57, 6.10.4-arch2-1)
vainfo: Supported profile and entrypoints
      VAProfileMPEG2Simple            :	VAEntrypointVLD
      VAProfileMPEG2Main              :	VAEntrypointVLD
      VAProfileVC1Simple              :	VAEntrypointVLD
      VAProfileVC1Main                :	VAEntrypointVLD
      VAProfileVC1Advanced            :	VAEntrypointVLD
      VAProfileH264ConstrainedBaseline:	VAEntrypointVLD
      VAProfileH264Main               :	VAEntrypointVLD
      VAProfileH264High               :	VAEntrypointVLD
      VAProfileNone                   :	VAEntrypointVideoProc

At this point vulkaninfo gave a healthy output.

Right after the upgrade I found that Vaapi acceleration did not work in Firefox. And it did not work in Vlc either. Below is some error trace from Vlc:

[00007e2d90c12360] avcodec decoder: Using Mesa Gallium driver 24.1.5-arch1.2 for ATI FirePro W5000 (radeonsi, pitcairn, LLVM 18.1.8, DRM 3.57, 6.10.4-arch2-1) for hardware decoding
amdgpu: The CS has been rejected, see dmesg for more information (-22).
Aborted (core dumped)

And the relevant part of dmesg:

[   89.302331] [drm:amdgpu_uvd_cs_pass2 [amdgpu]] *ERROR* msg/fb buffer ff00eb6000-ff00eb8000 out of 256MB segment!
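
The addresses in that message can be checked by hand. 256MB is 2^28 bytes, so the 256MB segment an address falls into is the address shifted right by 28 bits. The short Python sketch below does that arithmetic for the range in the log line. The helper name segment_index is made up for illustration, and what the driver actually compares the segment against is not shown anywhere in this thread.

```python
SEGMENT_SHIFT = 28  # 256MB == 2**28 bytes


def segment_index(addr):
    """Return the index of the 256MB segment that contains addr."""
    return addr >> SEGMENT_SHIFT


# Range copied from the dmesg line above.
start, end = 0xFF00EB6000, 0xFF00EB8000

print(hex(end - start))                             # buffer size in bytes
print(hex(segment_index(start)), hex(segment_index(end)))
```

Both ends land in the same segment, so the 8KiB range does not itself straddle a 256MB boundary. Presumably the driver rejects it relative to some other base address, but that is an assumption rather than something the log confirms.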

I did try downgrading vulkan-radeon to 1:24.1.2-1, but it did not make any difference.


As per cryptearth's suggestion, I tried the radeon driver. Vaapi acceleration now seems to work in both Vlc and Firefox. I am yet to find about overall stability though...

vainfo 
Trying display: wayland
Trying display: x11
vainfo: VA-API version: 1.22 (libva 2.22.0)
vainfo: Driver version: Mesa Gallium driver 24.1.5-arch1.2 for PITCAIRN (radeonsi, , LLVM 18.1.8, DRM 2.50, 6.10.4-arch2-1)
vainfo: Supported profile and entrypoints
      VAProfileMPEG2Simple            :	VAEntrypointVLD
      VAProfileMPEG2Main              :	VAEntrypointVLD
      VAProfileVC1Simple              :	VAEntrypointVLD
      VAProfileVC1Main                :	VAEntrypointVLD
      VAProfileVC1Advanced            :	VAEntrypointVLD
      VAProfileH264ConstrainedBaseline:	VAEntrypointVLD
      VAProfileH264ConstrainedBaseline:	VAEntrypointEncSlice
      VAProfileH264Main               :	VAEntrypointVLD
      VAProfileH264Main               :	VAEntrypointEncSlice
      VAProfileH264High               :	VAEntrypointVLD
      VAProfileH264High               :	VAEntrypointEncSlice
      VAProfileNone                   :	VAEntrypointVideoProc

At this point vulkaninfo does not give a healthy output. Vulkan does not seem to work. I will look into troubleshooting that later. (Although I may find that I do not need Vulkan at all...)

@seth:
I have an AMD FirePro W5000, which is Southern Islands, Graphics Core Next 1 (GCN1). It is essentially the AMD HD7000 series. I had to force the AMDGPU driver with boot time parameters (radeon.si_support=0 amdgpu.si_support=1); otherwise the system prefers starting the Radeon driver. (I also have xf86-video-amdgpu installed; I believe the way it works is that the kernel parameter 'turns it on'.)
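
Since several posts here depend on whether those two parameters are actually set, here is a minimal Python sketch that reads them out of a kernel command line. The helper names and the example command line (including its root= value) are hypothetical; the fallback to radeon simply mirrors the behaviour described in this post and ignores anything else that may influence driver binding.

```python
def module_params(cmdline):
    """Collect module.param=value tokens from a kernel command line."""
    params = {}
    for token in cmdline.split():
        key, sep, value = token.partition("=")
        module, dot, name = key.partition(".")
        if sep and dot:  # keep only tokens shaped like module.param=value
            params.setdefault(module, {})[name] = value
    return params


def requested_si_driver(cmdline):
    """Guess which driver the command line requests for SI cards."""
    params = module_params(cmdline)
    amdgpu_on = params.get("amdgpu", {}).get("si_support") == "1"
    radeon_off = params.get("radeon", {}).get("si_support") == "0"
    # Assumption taken from the post: without both switches, radeon is used.
    return "amdgpu" if amdgpu_on and radeon_off else "radeon"


# Hypothetical command line; on a live system read /proc/cmdline instead.
example = "root=/dev/sda2 rw radeon.si_support=0 amdgpu.si_support=1"
print(requested_si_driver(example))
```

This only reports what was requested at boot. Which module is actually bound to the card still has to be checked on the running system.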


@justJack
Back when I installed Arch on this computer I tried the Radeon driver first. Vaapi did not work, so I moved to AMDGPU. Maybe at the time of installation the Radeon driver had a buggy spell for Vaapi on SI/GCN1 or something, but it seems to be working now. I do not know about the 3D acceleration for older games, but Vaapi seems to be ok. You should give it a try. (I am yet to see about the stability though, but so far so good...)

Last edited by baal (2024-08-13 14:28:32)

#16 2024-08-13 14:44:25

seth
Member
Registered: 2012-09-03
Posts: 60,080

Re: Regression; Video instability after *recent* update, AMDGPU

I also have xf86-video-amdgpu installed; I believe the way it works is that the kernel parameter 'turns it on'.

No, the driver figures whether it can operate your hardware - feel free to try the amdgpu kernel module but w/o the xf86-video-amdgpu X11 driver

#17 2024-08-13 18:50:30

justJack
Member
Registered: 2024-05-03
Posts: 5

Re: Regression; Video instability after *recent* update, AMDGPU

@baal 
Everything works fine for me using the LTS kernel and amdgpu driver.

Yes, Vulkan was an issue as well with the radeon driver: games would not start or ran at 10-20 fps, while with amdgpu they ran at 50-60 fps.
I'm also using radeon.si_support=0 amdgpu.si_support=1 amdgpu.dc=0, the last parameter is needed for the sound to work through HDMI.

#18 2024-08-17 08:33:54

Beiruty
Member
Registered: 2024-08-17
Posts: 14

Re: Regression; Video instability after *recent* update, AMDGPU

I have the exact same issue with my 12-year-old AMD [6818] Pitcairn XT [Radeon HD 7870 GHz Edition]
Basically, the AV HD Decode is broken with Kernel 6.10.x series.
I am using the AMDGPU module first, then radeon.
I rolled back to 6.9.7 and everything works as before.

Operating System: EndeavourOS
KDE Plasma Version: 6.1.4
KDE Frameworks Version: 6.5.0
Qt Version: 6.7.2
Kernel Version: 6.9.7-x64v2-xanmod1-MANJARO (64-bit)
Graphics Platform: Wayland
Processors: 8 × Intel® Core™ i7-2600K CPU @ 3.40GHz
Memory: 15.5 GiB of RAM
Graphics Processor: AMD Radeon HD 7800 Series

Last edited by Beiruty (2024-08-17 08:34:28)

#19 2024-08-19 04:21:49

Beiruty
Member
Registered: 2024-08-17
Posts: 14

Re: Regression; Video instability after *recent* update, AMDGPU

If I can help to debug and test this regression, I am all ears and willing to help.

#20 2024-08-19 08:17:54

seth
Member
Registered: 2012-09-03
Posts: 60,080

Re: Regression; Video instability after *recent* update, AMDGPU

baal wrote:

I tried the radeon driver. Vaapi acceleration now seems to work in both Vlc and Firefox. I am yet to find about overall stability though...
I also have xf86-video-amdgpu installed

Beiruty wrote:

I rolled back to 6.9.7 and everything works as before.

Since this is most likely related to the amdgpu kernel module you should first and foremost clarify whether you're using that and then whether you're also using the xf86-video-amdgpu driver on top.
If you're using the latter, remove it and see whether the issue remains.
If yes, you'd want to narrow down to the exact kernel breaking this.
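
Narrowing down the breaking kernel is a binary search over an ordered list of builds, with "boot it and play a video" as the test. The Python sketch below shows only the search logic. The build labels between 6.9.7 and 6.10.3 are placeholders and the is_bad check is simulated; in practice each check means installing that kernel and testing by hand.

```python
def first_bad(versions, is_bad):
    """Binary-search for the first bad entry.

    versions must be ordered oldest to newest, with versions[0] known
    good and versions[-1] known bad.
    """
    lo, hi = 0, len(versions) - 1
    tested = []
    while hi - lo > 1:
        mid = (lo + hi) // 2
        tested.append(versions[mid])
        if is_bad(versions[mid]):
            hi = mid  # breakage is at mid or earlier
        else:
            lo = mid  # breakage is after mid
    return versions[hi], tested


# Placeholder builds between a reported-good and a reported-bad kernel.
CANDIDATES = ["6.9.7", "build-a", "build-b", "build-c",
              "build-d", "build-e", "6.10.3"]
SIMULATED_BAD = {"build-d", "build-e", "6.10.3"}  # made up for the demo

culprit, tested = first_bad(CANDIDATES, lambda v: v in SIMULATED_BAD)
print(culprit, tested)
```

With seven candidates only two builds need to be booted, which is why bisection is practical even when each test takes a while to show the problem.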

I am using the AMDGPU module first, then radeon.

That isn't a thing: you're using either one or the other, even though lspci -k will *also* list all modules that support the device.
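
To make that distinction concrete: in lspci -k output the "Kernel driver in use" line names the one driver bound to the device, while "Kernel modules" lists every module that could drive it. Here is a minimal Python sketch that separates the two. The sample text is illustrative, not output posted by anyone in this thread, and parse_lspci_k is a made-up helper that assumes a single device in the input.

```python
def parse_lspci_k(text):
    """Return (driver_in_use, candidate_modules) for one device."""
    in_use, modules = None, []
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("Kernel driver in use:"):
            in_use = line.split(":", 1)[1].strip()
        elif line.startswith("Kernel modules:"):
            modules = [m.strip() for m in line.split(":", 1)[1].split(",")]
    return in_use, modules


# Illustrative sample in the usual lspci -k layout.
SAMPLE = """\
03:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Pitcairn XT [Radeon HD 7870 GHz Edition]
    Kernel driver in use: amdgpu
    Kernel modules: radeon, amdgpu
"""

driver, candidates = parse_lspci_k(SAMPLE)
print("in use:", driver)
print("could also bind:", [m for m in candidates if m != driver])
```

Only the "in use" value says which kernel module is actually driving the card.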

#21 2024-08-19 17:03:25

baal
Member
Registered: 2023-11-02
Posts: 32

Re: Regression; Video instability after *recent* update, AMDGPU

baal wrote:

I am yet to find about overall stability though...

Well, with the Radeon driver I had sporadic freezes. The freezes came out of the blue, i.e. they were not triggered by Vaapi playback etc. They were not completely hard freezes: I could ssh in and get a dmesg, which is below.

The temporary solution for me was to revert everything back (packages, kernel etc.) to some mid-May era date, and continue using Amdgpu.

[ 2855.755388] radeon 0000:03:00.0: ring 0 stalled for more than 10130msec
[ 2855.755409] radeon 0000:03:00.0: GPU lockup (current fence id 0x000000000004f492 last fence id 0x000000000004f4b1 on ring 0)
[ 2855.888693] radeon 0000:03:00.0: ring 3 stalled for more than 10134msec
[ 2855.888715] radeon 0000:03:00.0: GPU lockup (current fence id 0x0000000000011fa2 last fence id 0x0000000000011fed on ring 3)
[ 2856.262044] radeon 0000:03:00.0: ring 5 stalled for more than 10134msec
[ 2856.262051] radeon 0000:03:00.0: ring 0 stalled for more than 10637msec
[ 2856.262059] radeon 0000:03:00.0: GPU lockup (current fence id 0x0000000000001b0d last fence id 0x0000000000001b0e on ring 5)
[ 2856.262066] radeon 0000:03:00.0: GPU lockup (current fence id 0x000000000004f492 last fence id 0x000000000004f4b2 on ring 0)
[ 2856.395375] radeon 0000:03:00.0: ring 3 stalled for more than 10640msec
[ 2856.395397] radeon 0000:03:00.0: GPU lockup (current fence id 0x0000000000011fa2 last fence id 0x0000000000011fed on ring 3)
[ 2856.768690] radeon 0000:03:00.0: ring 0 stalled for more than 11144msec
[ 2856.768702] radeon 0000:03:00.0: ring 5 stalled for more than 10640msec
[ 2856.768709] radeon 0000:03:00.0: GPU lockup (current fence id 0x000000000004f492 last fence id 0x000000000004f4b4 on ring 0)
[ 2856.768717] radeon 0000:03:00.0: GPU lockup (current fence id 0x0000000000001b0d last fence id 0x0000000000001b0e on ring 5)
[ 2856.902030] radeon 0000:03:00.0: ring 3 stalled for more than 11147msec
[ 2856.902053] radeon 0000:03:00.0: GPU lockup (current fence id 0x0000000000011fa2 last fence id 0x0000000000011ff1 on ring 3)
[ 2857.275343] radeon 0000:03:00.0: ring 5 stalled for more than 11147msec
[ 2857.275357] radeon 0000:03:00.0: GPU lockup (current fence id 0x0000000000001b0d last fence id 0x0000000000001b0e on ring 5)
[ 2857.275396] radeon 0000:03:00.0: ring 0 stalled for more than 11650msec
[ 2857.275400] radeon 0000:03:00.0: GPU lockup (current fence id 0x000000000004f492 last fence id 0x000000000004f4b6 on ring 0)
[ 2857.408709] radeon 0000:03:00.0: ring 3 stalled for more than 11654msec
[ 2857.408729] radeon 0000:03:00.0: GPU lockup (current fence id 0x0000000000011fa2 last fence id 0x0000000000011ff7 on ring 3)
[ 2857.782037] radeon 0000:03:00.0: ring 0 stalled for more than 12157msec
[ 2857.782049] radeon 0000:03:00.0: ring 5 stalled for more than 11654msec
[ 2857.782053] radeon 0000:03:00.0: GPU lockup (current fence id 0x000000000004f492 last fence id 0x000000000004f4b6 on ring 0)
[ 2857.782064] radeon 0000:03:00.0: GPU lockup (current fence id 0x0000000000001b0d last fence id 0x0000000000001b0e on ring 5)
[ 2857.915338] radeon 0000:03:00.0: ring 3 stalled for more than 12160msec
[ 2857.915359] radeon 0000:03:00.0: GPU lockup (current fence id 0x0000000000011fa2 last fence id 0x0000000000011ff7 on ring 3)
[ 2858.288688] radeon 0000:03:00.0: ring 5 stalled for more than 12160msec
[ 2858.288694] radeon 0000:03:00.0: ring 0 stalled for more than 12664msec
[ 2858.288704] radeon 0000:03:00.0: GPU lockup (current fence id 0x000000000004f492 last fence id 0x000000000004f4b9 on ring 0)
[ 2858.288709] radeon 0000:03:00.0: GPU lockup (current fence id 0x0000000000001b0d last fence id 0x0000000000001b0e on ring 5)
[ 2858.421996] radeon 0000:03:00.0: ring 3 stalled for more than 12667msec
[ 2858.422009] radeon 0000:03:00.0: GPU lockup (current fence id 0x0000000000011fa2 last fence id 0x0000000000012000 on ring 3)
[ 2858.795348] radeon 0000:03:00.0: ring 5 stalled for more than 12667msec
[ 2858.795371] radeon 0000:03:00.0: GPU lockup (current fence id 0x0000000000001b0d last fence id 0x0000000000001b0e on ring 5)
[ 2858.795379] radeon 0000:03:00.0: ring 0 stalled for more than 13170msec
[ 2858.795393] radeon 0000:03:00.0: GPU lockup (current fence id 0x000000000004f492 last fence id 0x000000000004f4ba on ring 0)
[ 2858.928697] radeon 0000:03:00.0: ring 3 stalled for more than 13174msec
[ 2858.928712] radeon 0000:03:00.0: GPU lockup (current fence id 0x0000000000011fa2 last fence id 0x0000000000012003 on ring 3)
[ 2859.302020] radeon 0000:03:00.0: ring 0 stalled for more than 13677msec
[ 2859.302035] radeon 0000:03:00.0: GPU lockup (current fence id 0x000000000004f492 last fence id 0x000000000004f4bb on ring 0)
[ 2859.302035] radeon 0000:03:00.0: ring 5 stalled for more than 13174msec
[ 2859.302050] radeon 0000:03:00.0: GPU lockup (current fence id 0x0000000000001b0d last fence id 0x0000000000001b0e on ring 5)
[ 2859.435335] radeon 0000:03:00.0: ring 3 stalled for more than 13680msec
[ 2859.435350] radeon 0000:03:00.0: GPU lockup (current fence id 0x0000000000011fa2 last fence id 0x0000000000012006 on ring 3)
[ 2859.808636] radeon 0000:03:00.0: ring 5 stalled for more than 13680msec
[ 2859.808651] radeon 0000:03:00.0: GPU lockup (current fence id 0x0000000000001b0d last fence id 0x0000000000001b0e on ring 5)
[ 2859.808689] radeon 0000:03:00.0: ring 0 stalled for more than 14184msec
[ 2859.808695] radeon 0000:03:00.0: GPU lockup (current fence id 0x000000000004f492 last fence id 0x000000000004f4bc on ring 0)
[ 2859.942001] radeon 0000:03:00.0: ring 3 stalled for more than 14187msec
[ 2859.942017] radeon 0000:03:00.0: GPU lockup (current fence id 0x0000000000011fa2 last fence id 0x0000000000012008 on ring 3)
[ 2860.315321] radeon 0000:03:00.0: ring 5 stalled for more than 14187msec
[ 2860.315335] radeon 0000:03:00.0: GPU lockup (current fence id 0x0000000000001b0d last fence id 0x0000000000001b0e on ring 5)
[ 2860.315372] radeon 0000:03:00.0: ring 0 stalled for more than 14690msec
[ 2860.315377] radeon 0000:03:00.0: GPU lockup (current fence id 0x000000000004f492 last fence id 0x000000000004f4bd on ring 0)
[ 2860.448652] radeon 0000:03:00.0: ring 3 stalled for more than 14694msec
[ 2860.448666] radeon 0000:03:00.0: GPU lockup (current fence id 0x0000000000011fa2 last fence id 0x000000000001200c on ring 3)
[ 2860.821999] radeon 0000:03:00.0: ring 5 stalled for more than 14694msec
[ 2860.822007] radeon 0000:03:00.0: ring 0 stalled for more than 15197msec
[ 2860.822013] radeon 0000:03:00.0: GPU lockup (current fence id 0x0000000000001b0d last fence id 0x0000000000001b0e on ring 5)
[ 2860.822021] radeon 0000:03:00.0: GPU lockup (current fence id 0x000000000004f492 last fence id 0x000000000004f4bd on ring 0)
[ 2860.955311] radeon 0000:03:00.0: ring 3 stalled for more than 15200msec
[ 2860.955325] radeon 0000:03:00.0: GPU lockup (current fence id 0x0000000000011fa2 last fence id 0x000000000001200c on ring 3)
[ 2861.328629] radeon 0000:03:00.0: ring 5 stalled for more than 15200msec
[ 2861.328642] radeon 0000:03:00.0: GPU lockup (current fence id 0x0000000000001b0d last fence id 0x0000000000001b0e on ring 5)
[ 2861.328674] radeon 0000:03:00.0: ring 0 stalled for more than 15704msec
[ 2861.328688] radeon 0000:03:00.0: GPU lockup (current fence id 0x000000000004f492 last fence id 0x000000000004f4be on ring 0)
[ 2861.461961] radeon 0000:03:00.0: ring 3 stalled for more than 15707msec
[ 2861.461975] radeon 0000:03:00.0: GPU lockup (current fence id 0x0000000000011fa2 last fence id 0x0000000000012010 on ring 3)
[ 2861.835328] radeon 0000:03:00.0: ring 5 stalled for more than 15707msec
[ 2861.835336] radeon 0000:03:00.0: ring 0 stalled for more than 16210msec
[ 2861.835342] radeon 0000:03:00.0: GPU lockup (current fence id 0x0000000000001b0d last fence id 0x0000000000001b0e on ring 5)
[ 2861.835350] radeon 0000:03:00.0: GPU lockup (current fence id 0x000000000004f492 last fence id 0x000000000004f4bf on ring 0)
[ 2861.968644] radeon 0000:03:00.0: ring 3 stalled for more than 16214msec
[ 2861.968658] radeon 0000:03:00.0: GPU lockup (current fence id 0x0000000000011fa2 last fence id 0x0000000000012014 on ring 3)
[ 2862.341959] radeon 0000:03:00.0: ring 5 stalled for more than 16214msec
[ 2862.341966] radeon 0000:03:00.0: ring 0 stalled for more than 16717msec
[ 2862.341973] radeon 0000:03:00.0: GPU lockup (current fence id 0x0000000000001b0d last fence id 0x0000000000001b0e on ring 5)
[ 2862.341981] radeon 0000:03:00.0: GPU lockup (current fence id 0x000000000004f492 last fence id 0x000000000004f4bf on ring 0)
[ 2862.475325] radeon 0000:03:00.0: ring 3 stalled for more than 16720msec
[ 2862.475342] radeon 0000:03:00.0: GPU lockup (current fence id 0x0000000000011fa2 last fence id 0x0000000000012014 on ring 3)
[ 2862.848637] radeon 0000:03:00.0: ring 5 stalled for more than 16720msec
[ 2862.848653] radeon 0000:03:00.0: GPU lockup (current fence id 0x0000000000001b0d last fence id 0x0000000000001b0e on ring 5)
[ 2862.848680] radeon 0000:03:00.0: ring 0 stalled for more than 17224msec
[ 2862.848694] radeon 0000:03:00.0: GPU lockup (current fence id 0x000000000004f492 last fence id 0x000000000004f4c0 on ring 0)
[ 2862.981975] radeon 0000:03:00.0: ring 3 stalled for more than 17227msec
[ 2862.981989] radeon 0000:03:00.0: GPU lockup (current fence id 0x0000000000011fa2 last fence id 0x0000000000012017 on ring 3)
[ 2863.355287] radeon 0000:03:00.0: ring 5 stalled for more than 17227msec
[ 2863.355293] radeon 0000:03:00.0: ring 0 stalled for more than 17730msec
[ 2863.355296] radeon 0000:03:00.0: GPU lockup (current fence id 0x0000000000001b0d last fence id 0x0000000000001b0e on ring 5)
[ 2863.355303] radeon 0000:03:00.0: GPU lockup (current fence id 0x000000000004f492 last fence id 0x000000000004f4c1 on ring 0)
[ 2863.488630] radeon 0000:03:00.0: ring 3 stalled for more than 17734msec
[ 2863.488643] radeon 0000:03:00.0: GPU lockup (current fence id 0x0000000000011fa2 last fence id 0x000000000001201a on ring 3)
[ 2863.861944] radeon 0000:03:00.0: ring 5 stalled for more than 17734msec
[ 2863.861957] radeon 0000:03:00.0: GPU lockup (current fence id 0x0000000000001b0d last fence id 0x0000000000001b0e on ring 5)
[ 2863.861978] radeon 0000:03:00.0: ring 0 stalled for more than 18237msec
[ 2863.861992] radeon 0000:03:00.0: GPU lockup (current fence id 0x000000000004f492 last fence id 0x000000000004f4c1 on ring 0)
[ 2863.995294] radeon 0000:03:00.0: ring 3 stalled for more than 18240msec
[ 2863.995309] radeon 0000:03:00.0: GPU lockup (current fence id 0x0000000000011fa2 last fence id 0x000000000001201a on ring 3)
[ 2864.368586] radeon 0000:03:00.0: ring 5 stalled for more than 18240msec
[ 2864.368600] radeon 0000:03:00.0: GPU lockup (current fence id 0x0000000000001b0d last fence id 0x0000000000001b0e on ring 5)
[ 2864.368633] radeon 0000:03:00.0: ring 0 stalled for more than 18744msec
[ 2864.368647] radeon 0000:03:00.0: GPU lockup (current fence id 0x000000000004f492 last fence id 0x000000000004f4c2 on ring 0)
[ 2864.501949] radeon 0000:03:00.0: ring 3 stalled for more than 18747msec
[ 2864.501963] radeon 0000:03:00.0: GPU lockup (current fence id 0x0000000000011fa2 last fence id 0x000000000001201d on ring 3)
[ 2864.875263] radeon 0000:03:00.0: ring 5 stalled for more than 18747msec
[ 2864.875277] radeon 0000:03:00.0: GPU lockup (current fence id 0x0000000000001b0d last fence id 0x0000000000001b0e on ring 5)
[ 2864.875315] radeon 0000:03:00.0: ring 0 stalled for more than 19250msec
[ 2864.875320] radeon 0000:03:00.0: GPU lockup (current fence id 0x000000000004f492 last fence id 0x000000000004f4c4 on ring 0)
[ 2865.008609] radeon 0000:03:00.0: ring 3 stalled for more than 19254msec
[ 2865.008622] radeon 0000:03:00.0: GPU lockup (current fence id 0x0000000000011fa2 last fence id 0x0000000000012021 on ring 3)
[ 2865.381945] radeon 0000:03:00.0: ring 0 stalled for more than 19757msec
[ 2865.381959] radeon 0000:03:00.0: GPU lockup (current fence id 0x000000000004f492 last fence id 0x000000000004f4c5 on ring 0)
[ 2865.381966] radeon 0000:03:00.0: ring 5 stalled for more than 19254msec
[ 2865.381980] radeon 0000:03:00.0: GPU lockup (current fence id 0x0000000000001b0d last fence id 0x0000000000001b0e on ring 5)
[ 2865.515265] radeon 0000:03:00.0: ring 3 stalled for more than 19760msec
[ 2865.515279] radeon 0000:03:00.0: GPU lockup (current fence id 0x0000000000011fa2 last fence id 0x0000000000012024 on ring 3)
[ 2865.888572] radeon 0000:03:00.0: ring 5 stalled for more than 19760msec
[ 2865.888588] radeon 0000:03:00.0: GPU lockup (current fence id 0x0000000000001b0d last fence id 0x0000000000001b0e on ring 5)
[ 2865.888629] radeon 0000:03:00.0: ring 0 stalled for more than 20264msec
[ 2865.888633] radeon 0000:03:00.0: GPU lockup (current fence id 0x000000000004f492 last fence id 0x000000000004f4c6 on ring 0)
[ 2866.021955] radeon 0000:03:00.0: ring 3 stalled for more than 20267msec
[ 2866.021968] radeon 0000:03:00.0: GPU lockup (current fence id 0x0000000000011fa2 last fence id 0x0000000000012026 on ring 3)
[ 2866.395246] radeon 0000:03:00.0: ring 5 stalled for more than 20267msec
[ 2866.395253] radeon 0000:03:00.0: ring 0 stalled for more than 20770msec
[ 2866.395260] radeon 0000:03:00.0: GPU lockup (current fence id 0x0000000000001b0d last fence id 0x0000000000001b0e on ring 5)
[ 2866.395268] radeon 0000:03:00.0: GPU lockup (current fence id 0x000000000004f492 last fence id 0x000000000004f4c7 on ring 0)
[ 2866.528630] radeon 0000:03:00.0: ring 3 stalled for more than 20774msec
[ 2866.528644] radeon 0000:03:00.0: GPU lockup (current fence id 0x0000000000011fa2 last fence id 0x000000000001202a on ring 3)
[ 2866.901920] radeon 0000:03:00.0: ring 5 stalled for more than 20774msec
[ 2866.901942] radeon 0000:03:00.0: GPU lockup (current fence id 0x0000000000001b0d last fence id 0x0000000000001b0e on ring 5)
[ 2866.901955] radeon 0000:03:00.0: ring 0 stalled for more than 21277msec
[ 2866.901969] radeon 0000:03:00.0: GPU lockup (current fence id 0x000000000004f492 last fence id 0x000000000004f4c7 on ring 0)
[ 2867.035258] radeon 0000:03:00.0: ring 3 stalled for more than 21280msec
[ 2867.035272] radeon 0000:03:00.0: GPU lockup (current fence id 0x0000000000011fa2 last fence id 0x000000000001202e on ring 3)
[ 2867.408652] radeon 0000:03:00.0: ring 0 stalled for more than 21784msec
[ 2867.408661] radeon 0000:03:00.0: GPU lockup (current fence id 0x000000000004f492 last fence id 0x000000000004f4c9 on ring 0)
[ 2867.411868] radeon 0000:03:00.0: ring 5 stalled for more than 21284msec
[ 2867.411882] radeon 0000:03:00.0: GPU lockup (current fence id 0x0000000000001b0d last fence id 0x0000000000001b0e on ring 5)
[ 2867.541947] radeon 0000:03:00.0: ring 3 stalled for more than 21787msec
[ 2867.541961] radeon 0000:03:00.0: GPU lockup (current fence id 0x0000000000011fa2 last fence id 0x000000000001202f on ring 3)
[ 2867.915230] radeon 0000:03:00.0: ring 5 stalled for more than 21787msec
[ 2867.915245] radeon 0000:03:00.0: GPU lockup (current fence id 0x0000000000001b0d last fence id 0x0000000000001b0e on ring 5)
[ 2867.915297] radeon 0000:03:00.0: ring 0 stalled for more than 22290msec
[ 2867.915326] radeon 0000:03:00.0: GPU lockup (current fence id 0x000000000004f492 last fence id 0x000000000004f4cb on ring 0)
[ 2868.048573] radeon 0000:03:00.0: ring 3 stalled for more than 22294msec
[ 2868.048583] radeon 0000:03:00.0: GPU lockup (current fence id 0x0000000000011fa2 last fence id 0x0000000000012036 on ring 3)
[ 2868.421909] radeon 0000:03:00.0: ring 0 stalled for more than 22797msec
[ 2868.421911] radeon 0000:03:00.0: ring 5 stalled for more than 22294msec
[ 2868.421921] radeon 0000:03:00.0: GPU lockup (current fence id 0x0000000000001b0d last fence id 0x0000000000001b0e on ring 5)
[ 2868.421927] radeon 0000:03:00.0: GPU lockup (current fence id 0x000000000004f492 last fence id 0x000000000004f4cb on ring 0)
[ 2868.558560] radeon 0000:03:00.0: ring 3 stalled for more than 22804msec
[ 2868.558574] radeon 0000:03:00.0: GPU lockup (current fence id 0x0000000000011fa2 last fence id 0x0000000000012039 on ring 3)
[ 2868.928584] radeon 0000:03:00.0: ring 0 stalled for more than 23304msec
[ 2868.928598] radeon 0000:03:00.0: GPU lockup (current fence id 0x000000000004f492 last fence id 0x000000000004f4cd on ring 0)
[ 2868.928601] radeon 0000:03:00.0: ring 5 stalled for more than 22800msec
[ 2868.928615] radeon 0000:03:00.0: GPU lockup (current fence id 0x0000000000001b0d last fence id 0x0000000000001b0e on ring 5)
[ 2869.061910] radeon 0000:03:00.0: ring 3 stalled for more than 23307msec
[ 2869.061924] radeon 0000:03:00.0: GPU lockup (current fence id 0x0000000000011fa2 last fence id 0x000000000001203a on ring 3)
[ 2869.435185] radeon 0000:03:00.0: ring 5 stalled for more than 23307msec
[ 2869.435199] radeon 0000:03:00.0: GPU lockup (current fence id 0x0000000000001b0d last fence id 0x0000000000001b0e on ring 5)
[ 2869.435231] radeon 0000:03:00.0: ring 0 stalled for more than 23810msec
[ 2869.435245] radeon 0000:03:00.0: GPU lockup (current fence id 0x000000000004f492 last fence id 0x000000000004f4cf on ring 0)
[ 2869.568568] radeon 0000:03:00.0: ring 3 stalled for more than 23814msec
[ 2869.568581] radeon 0000:03:00.0: GPU lockup (current fence id 0x0000000000011fa2 last fence id 0x000000000001203f on ring 3)
[ 2869.941883] radeon 0000:03:00.0: ring 5 stalled for more than 23814msec
[ 2869.941896] radeon 0000:03:00.0: GPU lockup (current fence id 0x0000000000001b0d last fence id 0x0000000000001b0e on ring 5)
[ 2869.941934] radeon 0000:03:00.0: ring 0 stalled for more than 24317msec
[ 2869.941939] radeon 0000:03:00.0: GPU lockup (current fence id 0x000000000004f492 last fence id 0x000000000004f4cf on ring 0)
[ 2870.075235] radeon 0000:03:00.0: ring 3 stalled for more than 24320msec
[ 2870.075248] radeon 0000:03:00.0: GPU lockup (current fence id 0x0000000000011fa2 last fence id 0x0000000000012042 on ring 3)
[ 2870.448551] radeon 0000:03:00.0: ring 5 stalled for more than 24320msec
[ 2870.448558] radeon 0000:03:00.0: ring 0 stalled for more than 24824msec
[ 2870.448565] radeon 0000:03:00.0: GPU lockup (current fence id 0x0000000000001b0d last fence id 0x0000000000001b0e on ring 5)
[ 2870.448572] radeon 0000:03:00.0: GPU lockup (current fence id 0x000000000004f492 last fence id 0x000000000004f4d0 on ring 0)
[ 2870.581897] radeon 0000:03:00.0: ring 3 stalled for more than 24827msec
[ 2870.581911] radeon 0000:03:00.0: GPU lockup (current fence id 0x0000000000011fa2 last fence id 0x0000000000012042 on ring 3)
[ 2870.955222] radeon 0000:03:00.0: ring 5 stalled for more than 24827msec
[ 2870.955235] radeon 0000:03:00.0: GPU lockup (current fence id 0x0000000000001b0d last fence id 0x0000000000001b0e on ring 5)
[ 2870.955268] radeon 0000:03:00.0: ring 0 stalled for more than 25330msec
[ 2870.955282] radeon 0000:03:00.0: GPU lockup (current fence id 0x000000000004f492 last fence id 0x000000000004f4d1 on ring 0)
[ 2871.088549] radeon 0000:03:00.0: ring 3 stalled for more than 25334msec
[ 2871.088563] radeon 0000:03:00.0: GPU lockup (current fence id 0x0000000000011fa2 last fence id 0x0000000000012045 on ring 3)
[ 2871.461891] radeon 0000:03:00.0: ring 5 stalled for more than 25334msec
[ 2871.461904] radeon 0000:03:00.0: GPU lockup (current fence id 0x0000000000001b0d last fence id 0x0000000000001b0e on ring 5)
[ 2871.461926] radeon 0000:03:00.0: ring 0 stalled for more than 25837msec
[ 2871.461940] radeon 0000:03:00.0: GPU lockup (current fence id 0x000000000004f492 last fence id 0x000000000004f4d1 on ring 0)
[ 2871.595211] radeon 0000:03:00.0: ring 3 stalled for more than 25840msec
[ 2871.595225] radeon 0000:03:00.0: GPU lockup (current fence id 0x0000000000011fa2 last fence id 0x0000000000012048 on ring 3)
[ 2871.968546] radeon 0000:03:00.0: ring 0 stalled for more than 26344msec
[ 2871.968560] radeon 0000:03:00.0: GPU lockup (current fence id 0x000000000004f492 last fence id 0x000000000004f4d2 on ring 0)
[ 2871.968564] radeon 0000:03:00.0: ring 5 stalled for more than 25840msec
[ 2871.968578] radeon 0000:03:00.0: GPU lockup (current fence id 0x0000000000001b0d last fence id 0x0000000000001b0e on ring 5)
[ 2872.101875] radeon 0000:03:00.0: ring 3 stalled for more than 26347msec
[ 2872.101889] radeon 0000:03:00.0: GPU lockup (current fence id 0x0000000000011fa2 last fence id 0x0000000000012048 on ring 3)
[ 2872.475149] radeon 0000:03:00.0: ring 5 stalled for more than 26347msec
[ 2872.475163] radeon 0000:03:00.0: GPU lockup (current fence id 0x0000000000001b0d last fence id 0x0000000000001b0e on ring 5)
[ 2872.475184] radeon 0000:03:00.0: ring 0 stalled for more than 26850msec
[ 2872.475198] radeon 0000:03:00.0: GPU lockup (current fence id 0x000000000004f492 last fence id 0x000000000004f4d3 on ring 0)
[ 2872.608544] radeon 0000:03:00.0: ring 3 stalled for more than 26854msec
[ 2872.608557] radeon 0000:03:00.0: GPU lockup (current fence id 0x0000000000011fa2 last fence id 0x000000000001204c on ring 3)
[ 2872.981847] radeon 0000:03:00.0: ring 5 stalled for more than 26854msec
[ 2872.981860] radeon 0000:03:00.0: GPU lockup (current fence id 0x0000000000001b0d last fence id 0x0000000000001b0e on ring 5)
[ 2872.981899] radeon 0000:03:00.0: ring 0 stalled for more than 27357msec
[ 2872.981903] radeon 0000:03:00.0: GPU lockup (current fence id 0x000000000004f492 last fence id 0x000000000004f4d3 on ring 0)
[ 2873.115197] radeon 0000:03:00.0: ring 3 stalled for more than 27360msec
[ 2873.115211] radeon 0000:03:00.0: GPU lockup (current fence id 0x0000000000011fa2 last fence id 0x000000000001204f on ring 3)
[ 2873.488515] radeon 0000:03:00.0: ring 5 stalled for more than 27360msec
[ 2873.488520] radeon 0000:03:00.0: ring 0 stalled for more than 27864msec
[ 2873.488532] radeon 0000:03:00.0: GPU lockup (current fence id 0x000000000004f492 last fence id 0x000000000004f4d4 on ring 0)
[ 2873.488537] radeon 0000:03:00.0: GPU lockup (current fence id 0x0000000000001b0d last fence id 0x0000000000001b0e on ring 5)
[ 2873.621881] radeon 0000:03:00.0: ring 3 stalled for more than 27867msec
[ 2873.621896] radeon 0000:03:00.0: GPU lockup (current fence id 0x0000000000011fa2 last fence id 0x000000000001204f on ring 3)
[ 2873.995198] radeon 0000:03:00.0: ring 5 stalled for more than 27867msec
[ 2873.995205] radeon 0000:03:00.0: ring 0 stalled for more than 28370msec
[ 2873.995215] radeon 0000:03:00.0: GPU lockup (current fence id 0x000000000004f492 last fence id 0x000000000004f4d5 on ring 0)
[ 2873.995216] radeon 0000:03:00.0: GPU lockup (current fence id 0x0000000000001b0d last fence id 0x0000000000001b0e on ring 5)
[ 2874.128526] radeon 0000:03:00.0: ring 3 stalled for more than 28374msec
[ 2874.128540] radeon 0000:03:00.0: GPU lockup (current fence id 0x0000000000011fa2 last fence id 0x0000000000012054 on ring 3)
[ 2874.501848] radeon 0000:03:00.0: ring 5 stalled for more than 28374msec
[ 2874.501856] radeon 0000:03:00.0: ring 0 stalled for more than 28877msec
[ 2874.501862] radeon 0000:03:00.0: GPU lockup (current fence id 0x0000000000001b0d last fence id 0x0000000000001b0e on ring 5)
[ 2874.501869] radeon 0000:03:00.0: GPU lockup (current fence id 0x000000000004f492 last fence id 0x000000000004f4d6 on ring 0)
[ 2874.635191] radeon 0000:03:00.0: ring 3 stalled for more than 28880msec
[ 2874.635205] radeon 0000:03:00.0: GPU lockup (current fence id 0x0000000000011fa2 last fence id 0x0000000000012059 on ring 3)
[ 2875.008541] radeon 0000:03:00.0: ring 0 stalled for more than 29384msec
[ 2875.008556] radeon 0000:03:00.0: GPU lockup (current fence id 0x000000000004f492 last fence id 0x000000000004f4d8 on ring 0)
[ 2875.008560] radeon 0000:03:00.0: ring 5 stalled for more than 28880msec
[ 2875.008574] radeon 0000:03:00.0: GPU lockup (current fence id 0x0000000000001b0d last fence id 0x0000000000001b0e on ring 5)
[ 2875.141872] radeon 0000:03:00.0: ring 3 stalled for more than 29387msec
[ 2875.141886] radeon 0000:03:00.0: GPU lockup (current fence id 0x0000000000011fa2 last fence id 0x0000000000012059 on ring 3)
[ 2875.515197] radeon 0000:03:00.0: ring 5 stalled for more than 29387msec
[ 2875.515203] radeon 0000:03:00.0: ring 0 stalled for more than 29890msec
[ 2875.515215] radeon 0000:03:00.0: GPU lockup (current fence id 0x0000000000001b0d last fence id 0x0000000000001b0e on ring 5)
[ 2875.515220] radeon 0000:03:00.0: GPU lockup (current fence id 0x000000000004f492 last fence id 0x000000000004f4d9 on ring 0)
[ 2875.648534] radeon 0000:03:00.0: ring 3 stalled for more than 29894msec
[ 2875.648547] radeon 0000:03:00.0: GPU lockup (current fence id 0x0000000000011fa2 last fence id 0x000000000001205d on ring 3)
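
For anyone skimming a log like the one above, here is a rough Python sketch that condenses the repeated lines into one summary per ring. The function name `summarize_lockups` and the `SAMPLE` excerpt are only for illustration; the regular expressions assume the exact message wording shown in this dmesg output. To my knowledge the radeon driver numbers its rings so that 0 is GFX, 3 is DMA and 5 is UVD, but treat that mapping as an assumption.

```python
import re
from collections import defaultdict

# Patterns assume the message wording seen in the dmesg output above.
STALL_RE = re.compile(r"ring (\d+) stalled for more than (\d+)msec")
FENCE_RE = re.compile(
    r"current fence id 0x([0-9a-f]+) last fence id 0x([0-9a-f]+) on ring (\d+)"
)


def summarize_lockups(log_text):
    """Return {ring: {"max_stall_ms", "current_fence", "last_fence"}}."""
    rings = defaultdict(
        lambda: {"max_stall_ms": 0, "current_fence": None, "last_fence": None}
    )
    for line in log_text.splitlines():
        m = STALL_RE.search(line)
        if m:
            ring, ms = int(m.group(1)), int(m.group(2))
            rings[ring]["max_stall_ms"] = max(rings[ring]["max_stall_ms"], ms)
            continue
        m = FENCE_RE.search(line)
        if m:
            ring = int(m.group(3))
            # Keep the most recent fence pair reported for this ring.
            rings[ring]["current_fence"] = int(m.group(1), 16)
            rings[ring]["last_fence"] = int(m.group(2), 16)
    return dict(rings)


# Last six lines of the log above, used as a small self-test.
SAMPLE = """\
[ 2875.515197] radeon 0000:03:00.0: ring 5 stalled for more than 29387msec
[ 2875.515203] radeon 0000:03:00.0: ring 0 stalled for more than 29890msec
[ 2875.515215] radeon 0000:03:00.0: GPU lockup (current fence id 0x0000000000001b0d last fence id 0x0000000000001b0e on ring 5)
[ 2875.515220] radeon 0000:03:00.0: GPU lockup (current fence id 0x000000000004f492 last fence id 0x000000000004f4d9 on ring 0)
[ 2875.648534] radeon 0000:03:00.0: ring 3 stalled for more than 29894msec
[ 2875.648547] radeon 0000:03:00.0: GPU lockup (current fence id 0x0000000000011fa2 last fence id 0x000000000001205d on ring 3)
"""

if __name__ == "__main__":
    for ring, info in sorted(summarize_lockups(SAMPLE).items()):
        pending = info["last_fence"] - info["current_fence"]
        print(f"ring {ring}: stalled {info['max_stall_ms']} ms, "
              f"{pending} fences pending")
```

In the full log the "current fence id" of each ring never changes while the stall time keeps growing, which suggests all three rings are stuck rather than just slow. Feeding the whole output of `dmesg` into `summarize_lockups` instead of `SAMPLE` should give the same picture in three lines.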

Offline

#22 2024-08-19 20:37:09

Beiruty
Member
Registered: 2024-08-17
Posts: 14

Re: Regression; Video instability after *recent* update, AMDGPU

lspci -k
01:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Pitcairn XT [Radeon HD 7870 GHz Edition]
        Subsystem: Gigabyte Technology Co., Ltd Device 2554
        Kernel driver in use: amdgpu
        Kernel modules: radeon, amdgpu

vainfo
Trying display: wayland
vainfo: VA-API version: 1.22 (libva 2.22.0)
vainfo: Driver version: Mesa Gallium driver 24.3.0-devel for AMD Radeon HD 7800 Series (radeonsi, pitcairn, ACO, DRM 3.57, 6.9.7-x64v2-xanmod1-MANJARO)
vainfo: Supported profile and entrypoints
      VAProfileMPEG2Simple            : VAEntrypointVLD
      VAProfileMPEG2Main              : VAEntrypointVLD
      VAProfileVC1Simple              : VAEntrypointVLD
      VAProfileVC1Main                : VAEntrypointVLD
      VAProfileVC1Advanced            : VAEntrypointVLD
      VAProfileH264ConstrainedBaseline: VAEntrypointVLD
      VAProfileH264Main               : VAEntrypointVLD
      VAProfileH264High               : VAEntrypointVLD
      VAProfileNone                   : VAEntrypointVideoProc

extra/xf86-video-amdgpu 23.0.0-2 (69.3 KiB 166.2 KiB) [xorg-drivers] (Installed)
I have removed that package.

Last edited by Beiruty (2024-08-19 21:01:10)

Offline

#23 2024-08-19 20:51:22

Beiruty
Member
Registered: 2024-08-17
Posts: 14

Re: Regression; Video instability after *recent* update, AMDGPU

Booted into kernel 6.10.5-x64v2-xanmod1.

mpv crash log:

/usr/bin/mpv --no-quiet --terminal --no-msg-color --input-ipc-server=/tmp/smplayer-mpv-15d9 --msg-level=ffmpeg/demuxer=error --video-rotate=no --no-config --no-fs --vd-lavc-threads=8 --hwdec=auto-copy-safe --sub-auto=fuzzy --vo=gpu-next, --gpu-context=x11egl --ao=pipewire, --no-stop-screensaver --no-input-default-bindings --input-vo-keyboard=no --no-input-cursor --cursor-autohide=no --no-keepaspect --wid=8388804 --monitorpixelaspect=1 --osd-level=1 --osd-scale=1 --osd-bar-align-y=0.6 --sub-ass --embeddedfonts --sub-ass-line-spacing=0 --sub-scale=1 --sub-font=Arial --sub-color=#ffffffff --sub-shadow-color=#ff000000 --sub-border-color=#ff000000 --sub-border-size=0.75 --sub-shadow-offset=2.5 --sub-font-size=50 --sub-bold=no --sub-italic=no --sub-margin-y=8 --sub-margin-x=20 --sub-codepage=ISO-8859-1 --vid=1 --sid=auto --sub-pos=100 --vf-add=@veq:lavfi=[eq=1:0:1:1,hue=h=0] --volume=110 --cache=auto --start=51 --screenshot-template=cap_%F_%p_%02n --screenshot-format=jpg --screenshot-directory=/home/********/Pictures/smplayer_screenshots --audio-pitch-correction=yes --volume-max=110 --term-playing-msg=MPV_VERSION=${=mpv-version:}
INFO_VIDEO_WIDTH=${=width}
INFO_VIDEO_HEIGHT=${=height}
INFO_VIDEO_ASPECT=${=video-params/aspect}
INFO_VIDEO_FPS=${=container-fps:${=fps}}
INFO_VIDEO_FORMAT=${=video-format}
INFO_VIDEO_CODEC=${=video-format}
INFO_DEMUX_ROTATION=${=track-list/0/demux-rotation}
INFO_AUDIO_FORMAT=${=audio-codec-name}
INFO_AUDIO_CODEC=${=audio-codec-name}
INFO_AUDIO_RATE=${=audio-params/samplerate}
INFO_AUDIO_NCH=${=audio-params/channel-count}
INFO_LENGTH=${=duration:${=length}}
INFO_DEMUXER=${=current-demuxer:${=demuxer}}
INFO_SEEKABLE=${=seekable}
INFO_TITLES=${=disc-titles}
INFO_CHAPTERS=${=chapters}
INFO_TRACKS_COUNT=${=track-list/count}
METADATA_TITLE=${metadata/by-key/title:}
METADATA_ARTIST=${metadata/by-key/artist:}
METADATA_ALBUM=${metadata/by-key/album:}
METADATA_GENRE=${metadata/by-key/genre:}
METADATA_DATE=${metadata/by-key/date:}
METADATA_TRACK=${metadata/by-key/track:}
METADATA_COPYRIGHT=${metadata/by-key/copyright:}
INFO_MEDIA_TITLE=${=media-title:}
INFO_STREAM_PATH=${stream-path}
 --audio-client-name=SMPlayer --term-status-msg=STATUS: ${=time-pos} / ${=duration:${=length:0}} P: ${=pause} B: ${=paused-for-cache} I: ${=core-idle} VB: ${=video-bitrate:0} AB: ${=audio-bitrate:0} /home/********/Videos/I Am Legend - Trailer.mp4

 (+) Video --vid=1 (*) (h264 1920x816 23.976fps)
     Video --vid=2 [P] (mjpeg 1.000fps)
 (+) Audio --aid=1 (*) (aac 6ch 48000Hz)
File tags:
 Artist: Warner Bros.
 Date: 2007
 Genre: Science-Fiction
 Title: I Am Legend - Trailer
[ffmpeg] AVHWDeviceContext: cu->cuInit(0) failed -> CUDA_ERROR_NO_DEVICE: no CUDA-capable device is detected
crocus: driver missing
[vaapi] libva: /usr/lib/dri/radeonsi_drv_video.so init failed
amdgpu: The CS has been rejected, see dmesg for more information (-22).

sudo dmesg | grep amdgpu

[    1.678600] [drm] amdgpu kernel modesetting enabled.
[    1.679037] amdgpu: Virtual CRAT table created for CPU
[    1.679386] amdgpu: Topology: Add CPU node
[    1.705309] amdgpu 0000:01:00.0: amdgpu: Fetched VBIOS from ROM BAR
[    1.705313] amdgpu: ATOM BIOS: 113-xxx-xxx
[    1.705325] kfd kfd: amdgpu: PITCAIRN  not supported in kfd
[    1.749050] amdgpu 0000:01:00.0: vgaarb: deactivate vga console
[    1.928234] amdgpu 0000:01:00.0: amdgpu: Trusted Memory Zone (TMZ) feature not supported
[    1.928238] amdgpu 0000:01:00.0: amdgpu: PCIE atomic ops is not supported
[    1.929037] amdgpu 0000:01:00.0: amdgpu: VRAM: 2048M 0x000000F400000000 - 0x000000F47FFFFFFF (2048M used)
[    1.929050] amdgpu 0000:01:00.0: amdgpu: GART: 1024M 0x000000FF00000000 - 0x000000FF3FFFFFFF
[    1.929137] [drm] amdgpu: 2048M of VRAM memory ready
[    1.929152] [drm] amdgpu: 7935M of GTT memory ready.
[    1.929804] amdgpu 0000:01:00.0: amdgpu: PCIE GART of 1024M enabled (table at 0x000000F400400000).
[    1.931283] [drm] amdgpu: dpm initialized
[    2.491772] amdgpu 0000:01:00.0: amdgpu: SE 2, SH per SE 2, CU per SH 5, active_cu_number 20
[    2.791878] amdgpu 0000:01:00.0: amdgpu: overdrive feature is not supported
[    2.792523] amdgpu 0000:01:00.0: amdgpu: Runtime PM not available
[    2.793338] [drm] Initialized amdgpu 3.57.0 20150101 for 0000:01:00.0 on minor 2
[    2.874498] fbcon: amdgpudrmfb (fb1) is primary device
[    3.268751] amdgpu 0000:01:00.0: [drm] fb1: amdgpudrmfb frame buffer device
[   33.990294] [drm:amdgpu_uvd_cs_pass2 [amdgpu]] *ERROR* msg/fb buffer ff0125c000-ff0125e000 out of 256MB segment!
[   37.518643] [drm:amdgpu_uvd_cs_pass2 [amdgpu]] *ERROR* msg/fb buffer ff012b6000-ff012b8000 out of 256MB segment!
[   52.270266] [drm:amdgpu_uvd_cs_pass2 [amdgpu]] *ERROR* msg/fb buffer ff01348000-ff0134a000 out of 256MB segment!
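
To make the three UVD errors above easier to read, here is a small Python sketch that pulls the buffer addresses out of the messages and compares them with the VRAM and GART ranges printed earlier in the same dmesg output. The names `describe_uvd_errors` and `classify` are made up for this example. The only arithmetic assumption is that a 256MB segment corresponds to `address >> 28`; what exactly the driver compares that segment against is not visible in this log, so the sketch does not try to reproduce the kernel check.

```python
import re

# Wording taken from the amdgpu_uvd_cs_pass2 error lines above.
BUF_RE = re.compile(
    r"msg/fb buffer ([0-9a-f]+)-([0-9a-f]+) out of 256MB segment"
)

# Ranges copied from the "VRAM:" and "GART:" lines of the same dmesg output.
VRAM = (0x000000F400000000, 0x000000F47FFFFFFF)
GART = (0x000000FF00000000, 0x000000FF3FFFFFFF)


def classify(addr):
    """Tell which of the two logged address ranges contains addr."""
    if VRAM[0] <= addr <= VRAM[1]:
        return "VRAM"
    if GART[0] <= addr <= GART[1]:
        return "GART"
    return "unknown"


def describe_uvd_errors(log_text):
    """Return one dict per rejected msg/fb buffer found in log_text."""
    found = []
    for m in BUF_RE.finditer(log_text):
        start, end = int(m.group(1), 16), int(m.group(2), 16)
        found.append({
            "start": start,
            "size": end - start,
            "segment": start >> 28,  # 256MB = 2**28 bytes
            "region": classify(start),
        })
    return found


SAMPLE = """\
[   33.990294] [drm:amdgpu_uvd_cs_pass2 [amdgpu]] *ERROR* msg/fb buffer ff0125c000-ff0125e000 out of 256MB segment!
[   37.518643] [drm:amdgpu_uvd_cs_pass2 [amdgpu]] *ERROR* msg/fb buffer ff012b6000-ff012b8000 out of 256MB segment!
[   52.270266] [drm:amdgpu_uvd_cs_pass2 [amdgpu]] *ERROR* msg/fb buffer ff01348000-ff0134a000 out of 256MB segment!
"""

if __name__ == "__main__":
    for buf in describe_uvd_errors(SAMPLE):
        print(f"start=0x{buf['start']:x} size={buf['size']} bytes "
              f"segment=0x{buf['segment']:x} region={buf['region']}")
```

All three rejected buffers are 8 KiB and fall inside the GART range from the log rather than inside VRAM. That matches the "CS has been rejected" message mpv printed, since each error line corresponds to a command submission the kernel refused.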

/etc/modprobe.d/radeon.conf
options radeon si_support=0
options radeon cik_support=0

/etc/modprobe.d/amdgpu.conf
options amdgpu si_support=1
options amdgpu cik_support=0
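
A quick way to double-check that these options are what the running kernel actually uses is to compare them with sysfs. The Python sketch below is only an illustration: `parse_modprobe_options`, `to_kernel_cmdline` and `read_live_param` are hypothetical helper names, the parser handles plain `options` lines only, and it assumes both modules expose their parameters under `/sys/module/<module>/parameters/`. The same settings can also be given on the kernel command line in `module.parameter=value` form, which is what the second helper prints.

```python
from pathlib import Path


def parse_modprobe_options(text):
    """Parse 'options <module> key=value ...' lines into {(module, key): value}."""
    opts = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments
        parts = line.split()
        if len(parts) < 3 or parts[0] != "options":
            continue
        module = parts[1]
        for item in parts[2:]:
            key, sep, value = item.partition("=")
            if sep:
                opts[(module, key)] = value
    return opts


def to_kernel_cmdline(opts):
    """Render the options in kernel command line form (module.key=value)."""
    return " ".join(f"{m}.{k}={v}" for (m, k), v in sorted(opts.items()))


def read_live_param(module, key, sys_root="/sys"):
    """Read the value the loaded module reports, or None if unavailable."""
    path = Path(sys_root) / "module" / module / "parameters" / key
    try:
        return path.read_text().strip()
    except OSError:
        return None


# The four option lines from the two files above.
SAMPLE = """\
options radeon si_support=0
options radeon cik_support=0
options amdgpu si_support=1
options amdgpu cik_support=0
"""

if __name__ == "__main__":
    wanted = parse_modprobe_options(SAMPLE)
    print(to_kernel_cmdline(wanted))
    for (module, key), value in sorted(wanted.items()):
        live = read_live_param(module, key)
        print(f"{module}.{key}: configured={value} live={live}")
```

A `live=None` result just means the module is not loaded or the parameter is not exported, which would be expected for radeon here since lspci shows amdgpu as the driver in use.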

Last edited by Beiruty (2024-08-19 21:01:50)

Offline

#24 2024-08-19 20:57:05

seth
Member
Registered: 2012-09-03
Posts: 60,080

Re: Regression; Video instability after *recent* update, AMDGPU

Please use [code][/code] tags. Edit your post in this regard.
Please post your Xorg log, https://wiki.archlinux.org/title/Xorg#General
Also the output of "glxinfo -B"

Offline

#25 2024-08-19 21:05:24

Beiruty
Member
Registered: 2024-08-17
Posts: 14

Re: Regression; Video instability after *recent* update, AMDGPU

I am running Wayland. Is the Xorg log relevant? If not, is there a Wayland log?

glxinfo -B
name of display: :0
display: :0  screen: 0
direct rendering: Yes
Extended renderer info (GLX_MESA_query_renderer):
    Vendor: AMD (0x1002)
    Device: AMD Radeon HD 7800 Series (radeonsi, pitcairn, ACO, DRM 3.57, 6.10.5-x64v2-xanmod1) (0x6818)
    Version: 24.3.0
    Accelerated: yes
    Video memory: 2048MB
    Unified memory: no
    Preferred profile: core (0x1)
    Max core profile version: 4.6
    Max compat profile version: 4.6
    Max GLES1 profile version: 1.1
    Max GLES[23] profile version: 3.2
Memory info (GL_ATI_meminfo):
    VBO free memory - total: 835 MB, largest block: 835 MB
    VBO free aux. memory - total: 7713 MB, largest block: 7713 MB
    Texture free memory - total: 835 MB, largest block: 835 MB
    Texture free aux. memory - total: 7713 MB, largest block: 7713 MB
    Renderbuffer free memory - total: 835 MB, largest block: 835 MB
    Renderbuffer free aux. memory - total: 7713 MB, largest block: 7713 MB
Memory info (GL_NVX_gpu_memory_info):
    Dedicated video memory: 2048 MB
    Total available memory: 9983 MB
    Currently available dedicated video memory: 835 MB
OpenGL vendor string: AMD
OpenGL renderer string: AMD Radeon HD 7800 Series (radeonsi, pitcairn, ACO, DRM 3.57, 6.10.5-x64v2-xanmod1)
OpenGL core profile version string: 4.6 (Core Profile) Mesa 24.3.0-devel (git-93e96da945)
OpenGL core profile shading language version string: 4.60
OpenGL core profile context flags: (none)
OpenGL core profile profile mask: core profile

OpenGL version string: 4.6 (Compatibility Profile) Mesa 24.3.0-devel (git-93e96da945)
OpenGL shading language version string: 4.60
OpenGL context flags: (none)
OpenGL profile mask: compatibility profile

OpenGL ES profile version string: OpenGL ES 3.2 Mesa 24.3.0-devel (git-93e96da945)
OpenGL ES profile shading language version string: OpenGL ES GLSL ES 3.20

Offline
