
#1 2023-12-29 07:25:44

rootpeer
Member
Registered: 2019-04-07
Posts: 45

AMD 7950x iGPU crashes

Hello, my iGPU keeps crashing, to the point that the system is unusable.

Kernel 6.6.5-arch1-1

Dec 28 19:15:11 sarch kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx_0.0.0 timeout, signaled seq=3001, emitted seq=3004
Dec 28 19:15:11 sarch kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* Process information: process kwin_x11 pid 6095 thread kwin_x11:cs0 pid 6134
Dec 28 19:15:11 sarch kernel: amdgpu 0000:0f:00.0: amdgpu: GPU reset begin!
Dec 28 19:15:11 sarch kernel: amdgpu 0000:0f:00.0: amdgpu: MODE2 reset
Dec 28 19:15:11 sarch kernel: amdgpu 0000:0f:00.0: amdgpu: GPU reset succeeded, trying to resume
Dec 28 19:15:11 sarch kernel: [drm] PCIE GART of 1024M enabled (table at 0x000000F41FC00000).
Dec 28 19:15:11 sarch kernel: [drm] PSP is resuming...
Dec 28 19:15:11 sarch kernel: [drm] reserve 0xa00000 from 0xf41e000000 for PSP TMR
Dec 28 19:15:11 sarch kernel: amdgpu 0000:0f:00.0: amdgpu: RAS: optional ras ta ucode is not available
Dec 28 19:15:11 sarch kernel: amdgpu 0000:0f:00.0: amdgpu: RAP: optional rap ta ucode is not available
Dec 28 19:15:11 sarch kernel: amdgpu 0000:0f:00.0: amdgpu: SECUREDISPLAY: securedisplay ta ucode is not available
Dec 28 19:15:11 sarch kernel: amdgpu 0000:0f:00.0: amdgpu: SMU is resuming...
Dec 28 19:15:11 sarch kernel: amdgpu 0000:0f:00.0: amdgpu: SMU is resumed successfully!
Dec 28 19:15:11 sarch kernel: [drm] DMUB hardware initialized: version=0x05000F00
Dec 28 19:15:11 sarch kernel: [drm] kiq ring mec 2 pipe 1 q 0
Dec 28 19:15:11 sarch kernel: [drm] VCN decode and encode initialized successfully(under DPG Mode).
Dec 28 19:15:11 sarch kernel: [drm] JPEG decode initialized successfully.
Dec 28 19:15:11 sarch kernel: amdgpu 0000:0f:00.0: amdgpu: ring gfx_0.0.0 uses VM inv eng 0 on hub 0
Dec 28 19:15:11 sarch kernel: amdgpu 0000:0f:00.0: amdgpu: ring comp_1.0.0 uses VM inv eng 1 on hub 0
Dec 28 19:15:11 sarch kernel: amdgpu 0000:0f:00.0: amdgpu: ring comp_1.1.0 uses VM inv eng 4 on hub 0
Dec 28 19:15:11 sarch kernel: amdgpu 0000:0f:00.0: amdgpu: ring comp_1.2.0 uses VM inv eng 5 on hub 0
Dec 28 19:15:11 sarch kernel: amdgpu 0000:0f:00.0: amdgpu: ring comp_1.3.0 uses VM inv eng 6 on hub 0
Dec 28 19:15:11 sarch kernel: amdgpu 0000:0f:00.0: amdgpu: ring comp_1.0.1 uses VM inv eng 7 on hub 0
Dec 28 19:15:11 sarch kernel: amdgpu 0000:0f:00.0: amdgpu: ring comp_1.1.1 uses VM inv eng 8 on hub 0
Dec 28 19:15:11 sarch kernel: amdgpu 0000:0f:00.0: amdgpu: ring comp_1.2.1 uses VM inv eng 9 on hub 0
Dec 28 19:15:11 sarch kernel: amdgpu 0000:0f:00.0: amdgpu: ring comp_1.3.1 uses VM inv eng 10 on hub 0
Dec 28 19:15:11 sarch kernel: amdgpu 0000:0f:00.0: amdgpu: ring kiq_0.2.1.0 uses VM inv eng 11 on hub 0
Dec 28 19:15:11 sarch kernel: amdgpu 0000:0f:00.0: amdgpu: ring sdma0 uses VM inv eng 12 on hub 0
Dec 28 19:15:11 sarch kernel: amdgpu 0000:0f:00.0: amdgpu: ring vcn_dec_0 uses VM inv eng 0 on hub 8
Dec 28 19:15:11 sarch kernel: amdgpu 0000:0f:00.0: amdgpu: ring vcn_enc_0.0 uses VM inv eng 1 on hub 8
Dec 28 19:15:11 sarch kernel: amdgpu 0000:0f:00.0: amdgpu: ring vcn_enc_0.1 uses VM inv eng 4 on hub 8
Dec 28 19:15:11 sarch kernel: amdgpu 0000:0f:00.0: amdgpu: ring jpeg_dec uses VM inv eng 5 on hub 8
Dec 28 19:15:11 sarch sanoid[6193]: INFO: cache expired - updating from zfs list.
Dec 28 19:15:11 sarch kernel: amdgpu 0000:0f:00.0: amdgpu: recover vram bo from shadow start
Dec 28 19:15:11 sarch kernel: amdgpu 0000:0f:00.0: amdgpu: recover vram bo from shadow done
Dec 28 19:15:11 sarch kernel: amdgpu 0000:0f:00.0: amdgpu: GPU reset(6) succeeded!
Dec 28 19:15:11 sarch kernel: [drm] Skip scheduling IBs!
Dec 28 19:15:11 sarch dbus-daemon[1942]: [system] Activating via systemd: service name='org.freedesktop.ModemManager1' unit='dbus-org.freedesktop.ModemManager1.service' requested by ':1.143' (uid=1000 pid=6094 comm="/usr/bin/kded5")
Dec 28 19:15:11 sarch dbus-daemon[1942]: [system] Activation via systemd failed for unit 'dbus-org.freedesktop.ModemManager1.service': Unit dbus-org.freedesktop.ModemManager1.service not found.
Dec 28 19:15:11 sarch dbus-daemon[1942]: [system] Activating via systemd: service name='org.freedesktop.ModemManager1' unit='dbus-org.freedesktop.ModemManager1.service' requested by ':1.143' (uid=1000 pid=6094 comm="/usr/bin/kded5")
Dec 28 19:15:11 sarch dbus-daemon[1942]: [system] Activation via systemd failed for unit 'dbus-org.freedesktop.ModemManager1.service': Unit dbus-org.freedesktop.ModemManager1.service not found.
Dec 28 19:15:11 sarch kded5[6094]: kf.modemmanagerqt: Failed enumerating MM objects: "org.freedesktop.systemd1.NoSuchUnit" 
                                    "Unit dbus-org.freedesktop.ModemManager1.service not found."
Dec 28 19:15:11 sarch kalendarac[6434]: org.kde.pim.akonadicore: Job error:  "" for collection: QVector()
Dec 28 19:15:11 sarch kwin_x11[6095]: kwin_scene_opengl: A graphics reset attributable to the current GL context occurred.
Dec 28 19:15:11 sarch plasmashell[6163]: Cyclic dependency detected between "file:///usr/share/plasma/plasmoids/org.kde.plasma.notifications/contents/ui/global/Globals.qml" and "file:///usr/share/plasma/plasmoids/org.kde.plasma.notifications/contents/ui/NotificationHeader.qml"
Dec 28 19:15:11 sarch plasmashell[6163]: Cyclic dependency detected between "file:///usr/share/plasma/plasmoids/org.kde.plasma.notifications/contents/ui/global/Globals.qml" and "file:///usr/share/plasma/plasmoids/org.kde.plasma.notifications/contents/ui/ThumbnailStrip.qml"
Dec 28 19:15:11 sarch dbus-daemon[3510]: [session uid=1000 pid=3510] Successfully activated service 'org.freedesktop.Notifications'
Dec 28 19:15:11 sarch systemd[1]: Started Process Core Dump (PID 6842/UID 0).
Dec 28 19:15:11 sarch kded5[6094]: kf.coreaddons: "Could not load plugin from /usr/lib/qt/plugins/plasma/geolocationprovider/plasma-geolocation-gps.so: Cannot load library /usr/lib/qt/plugins/plasma/geolocationprovider/plasma-geolocation-gps.so: (libgps.so.30: cannot open shared object file: No such file or directory)"
Dec 28 19:15:11 sarch kded5[6094]: Failed to load GeolocationProvider: "/usr/lib/qt/plugins/plasma/geolocationprovider/plasma-geolocation-gps.so" "Could not load plugin from /usr/lib/qt/plugins/plasma/geolocationprovider/plasma-geolocation-gps.so: Cannot load library /usr/lib/qt/plugins/plasma/geolocationprovider/plasma-geolocation-gps.so: (libgps.so.30: cannot open shared object file: No such file or directory)"
Dec 28 19:15:11 sarch kded5[6094]: "location"
Dec 28 19:15:11 sarch systemd[1]: sanoid.service: Deactivated successfully.
Dec 28 19:15:11 sarch systemd[1]: Finished Snapshot ZFS Pool.
Dec 28 19:15:11 sarch systemd[1]: Starting sanoid-prune.service...
Dec 28 19:15:11 sarch systemd-coredump[6845]: [?] Process 5913 (Xorg) of user 0 dumped core.
                                              
                                              Stack trace of thread 5916:
                                              #0  0x00007ff6705fe83c n/a (libc.so.6 + 0x8e83c)
                                              #1  0x00007ff6705ae668 raise (libc.so.6 + 0x3e668)
                                              #2  0x00007ff6705964b8 abort (libc.so.6 + 0x264b8)
                                              #3  0x00005568eea31a00 OsAbort (Xorg + 0x159a00)
                                              #4  0x00005568eea31d3b FatalError (Xorg + 0x159d3b)
                                              #5  0x00005568eea29cb6 n/a (Xorg + 0x151cb6)
                                              #6  0x00007ff6705ae710 n/a (libc.so.6 + 0x3e710)
                                              #7  0x00007ff670b3dfd0 n/a (libpixman-1.so.0 + 0x6dfd0)
                                              #8  0x00007ff670ae4a0b pixman_fill (libpixman-1.so.0 + 0x14a0b)
                                              #9  0x00005568eeaa3b8b fbFill (Xorg + 0x1cbb8b)
                                              #10 0x00005568eeaa3e6e fbPolyFillRect (Xorg + 0x1cbe6e)
                                              #11 0x00007ff66f8a96e9 n/a (libglamoregl.so + 0x216e9)
                                              #12 0x00005568ee9a711d n/a (Xorg + 0xcf11d)
                                              #13 0x00007ff66fbd51d8 n/a (amdgpu_drv.so + 0xe1d8)
                                              #14 0x00007ff66fbdb9c2 n/a (amdgpu_drv.so + 0x149c2)
                                              #15 0x00005568eea60198 n/a (Xorg + 0x188198)
                                              #16 0x00007ff66fc026b8 n/a (libglx.so + 0x136b8)
                                              #17 0x00005568eea4c596 ddxGiveUp (Xorg + 0x174596)
                                              #18 0x00005568eea31dec FatalError (Xorg + 0x159dec)
                                              #19 0x00005568eea29cb6 n/a (Xorg + 0x151cb6)
                                              #20 0x00007ff6705ae710 n/a (libc.so.6 + 0x3e710)
                                              #21 0x00007ff6705fe83c n/a (libc.so.6 + 0x8e83c)
                                              #22 0x00007ff6705ae668 raise (libc.so.6 + 0x3e668)
                                              #23 0x00007ff6705964b8 abort (libc.so.6 + 0x264b8)
                                              #24 0x00007ff66de9a497 n/a (radeonsi_dri.so + 0x8a5497)
                                              #25 0x00007ff66dea1485 n/a (radeonsi_dri.so + 0x8ac485)
                                              #26 0x00007ff66d70891d n/a (radeonsi_dri.so + 0x11391d)
                                              #27 0x00007ff66d6ffb8c n/a (radeonsi_dri.so + 0x10ab8c)
                                              #28 0x00007ff6705fc9eb n/a (libc.so.6 + 0x8c9eb)
                                              #29 0x00007ff6706807cc n/a (libc.so.6 + 0x1107cc)
                                              
                                              Stack trace of thread 5917:
                                              #0  0x00007ff6705f94ae n/a (libc.so.6 + 0x894ae)
                                              #1  0x00007ff6705fbd40 pthread_cond_wait (libc.so.6 + 0x8bd40)
                                              #2  0x00007ff66d70885c n/a (radeonsi_dri.so + 0x11385c)
                                              #3  0x00007ff66d6ffb8c n/a (radeonsi_dri.so + 0x10ab8c)
                                              #4  0x00007ff6705fc9eb n/a (libc.so.6 + 0x8c9eb)
                                              #5  0x00007ff6706807cc n/a (libc.so.6 + 0x1107cc)
                                              
                                              Stack trace of thread 5913:
                                              #0  0x00007ff67067e73d syscall (libc.so.6 + 0x10e73d)
                                              #1  0x00007ff66d7077fe n/a (radeonsi_dri.so + 0x1127fe)
                                              #2  0x00007ff66de9ee71 n/a (radeonsi_dri.so + 0x8a9e71)
                                              #3  0x00007ff66de727e0 n/a (radeonsi_dri.so + 0x87d7e0)
                                              #4  0x00007ff66d76577a n/a (radeonsi_dri.so + 0x17077a)
                                              #5  0x00007ff66d8b4421 n/a (radeonsi_dri.so + 0x2bf421)
                                              #6  0x00007ff66fbd4eae n/a (amdgpu_drv.so + 0xdeae)
                                              #7  0x00005568ee9b76c3 n/a (Xorg + 0xdf6c3)
                                              #8  0x00005568ee9b4aa4 n/a (Xorg + 0xdcaa4)
                                              #9  0x00005568eea5ef2d n/a (Xorg + 0x186f2d)
                                              #10 0x00005568ee986174 n/a (Xorg + 0xae174)
                                              #11 0x00005568ee93992c miValidateTree (Xorg + 0x6192c)
                                              #12 0x00005568ee9824b2 UnmapWindow (Xorg + 0xaa4b2)
                                              #13 0x00005568ee98250f DeleteWindow (Xorg + 0xaa50f)
                                              #14 0x00005568ee979b6e n/a (Xorg + 0xa1b6e)
                                              #15 0x00005568ee979cf5 FreeResource (Xorg + 0xa1cf5)
                                              #16 0x00005568ee94d319 n/a (Xorg + 0x75319)

Same thing happens on Wayland.
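
For anyone comparing their own logs against the one above, here is a minimal sketch that pulls the ring name, the signaled/emitted sequence numbers, and the offending process out of the two amdgpu_job_timedout lines. The function name summarize_timeouts and the "pending" field are made up for illustration, and the regexes assume the exact message wording shown in this log; other kernel versions may word it differently.

```python
import re

# Matches the "ring ... timeout" line printed when a GPU job hangs.
TIMEOUT_RE = re.compile(
    r"ring (?P<ring>\S+) timeout, "
    r"signaled seq=(?P<signaled>\d+), emitted seq=(?P<emitted>\d+)"
)
# Matches the follow-up line naming the process that submitted the job.
PROCESS_RE = re.compile(
    r"Process information: process (?P<proc>\S+) pid (?P<pid>\d+)"
)


def summarize_timeouts(log_text):
    """Return one dict per ring timeout found in log_text (hypothetical helper)."""
    events = []
    for line in log_text.splitlines():
        m = TIMEOUT_RE.search(line)
        if m:
            signaled = int(m["signaled"])
            emitted = int(m["emitted"])
            events.append({
                "ring": m["ring"],
                "signaled": signaled,
                "emitted": emitted,
                # Gap between emitted and signaled sequence numbers.
                "pending": emitted - signaled,
                "process": None,
            })
            continue
        m = PROCESS_RE.search(line)
        # Attach the process to the most recent timeout that has none yet.
        if m and events and events[-1]["process"] is None:
            events[-1]["process"] = m["proc"]
    return events


sample = """\
Dec 28 19:15:11 sarch kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx_0.0.0 timeout, signaled seq=3001, emitted seq=3004
Dec 28 19:15:11 sarch kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* Process information: process kwin_x11 pid 6095 thread kwin_x11:cs0 pid 6134
"""

for event in summarize_timeouts(sample):
    print(event)
```

The same regexes do not depend on the timestamp prefix, so they should also work on dmesg-style output.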

I also have a 7900 XTX as a dGPU. Disabling the iGPU and using the 7900 XTX instead gets rid of the errors.

I have tried the kernel parameters "amdgpu.ppfeaturemask=0xfffd3fff" and "amdgpu.ppfeaturemask=0xffffbffd", but the problem persists.
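
To make it easier to see what those two masks actually change, here is a small sketch that lists which bit positions are cleared in each value. It assumes the mask is treated as a 32-bit field where a cleared bit turns one power feature off; cleared_bits is a hypothetical helper, and what each bit position means is not assumed here and should be checked against the amdgpu source for the kernel in use.

```python
def cleared_bits(mask, width=32):
    """Return the bit positions that are 0 in mask, lowest first.

    width=32 is an assumption about how wide the feature mask is.
    """
    return [bit for bit in range(width) if not (mask >> bit) & 1]


for text in ("0xfffd3fff", "0xffffbffd"):
    mask = int(text, 16)
    print(text, "clears bits", cleared_bits(mask))
```

The two values only share one cleared bit, so they are not testing the same combination of features.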


#2 2023-12-31 18:33:42

V1del
Forum Moderator
Registered: 2012-10-16
Posts: 25,167

Re: AMD 7950x iGPU crashes

Can you reproduce this on the latest kernel? There has been a lot of work on integrated amdgpu in the last three stable releases.


#3 2024-01-03 17:19:25

rootpeer
Member
Registered: 2019-04-07
Posts: 45

Re: AMD 7950x iGPU crashes

V1del wrote:

Can you reproduce this on the latest kernel? There has been a lot of work on integrated amdgpu in the last three stable releases.

Yes, same issue on 6.6.9-arch1-1.

[  118.041834] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx_0.0.0 timeout, signaled seq=28978, emitted seq=28980
[  118.041993] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* Process information: process kwin_x11 pid 2765 thread kwin_x11:cs0 pid 2791
[  118.042124] amdgpu 0000:0f:00.0: amdgpu: GPU reset begin!
[  118.095307] amdgpu 0000:0f:00.0: amdgpu: MODE2 reset
[  118.102454] amdgpu 0000:0f:00.0: amdgpu: GPU reset succeeded, trying to resume
[  118.102553] [drm] PCIE GART of 1024M enabled (table at 0x000000F41FC00000).
[  118.102600] [drm] PSP is resuming...
[  118.124186] [drm] reserve 0xa00000 from 0xf41e000000 for PSP TMR
[  118.314140] amdgpu 0000:0f:00.0: amdgpu: RAS: optional ras ta ucode is not available
[  118.319585] amdgpu 0000:0f:00.0: amdgpu: RAP: optional rap ta ucode is not available
[  118.319586] amdgpu 0000:0f:00.0: amdgpu: SECUREDISPLAY: securedisplay ta ucode is not available
[  118.319588] amdgpu 0000:0f:00.0: amdgpu: SMU is resuming...
[  118.319842] amdgpu 0000:0f:00.0: amdgpu: SMU is resumed successfully!
[  118.320383] [drm] DMUB hardware initialized: version=0x05000F00
[  118.331379] [drm] kiq ring mec 2 pipe 1 q 0
[  118.333788] [drm] VCN decode and encode initialized successfully(under DPG Mode).
[  118.333829] [drm] JPEG decode initialized successfully.
[  118.333831] amdgpu 0000:0f:00.0: amdgpu: ring gfx_0.0.0 uses VM inv eng 0 on hub 0
[  118.333832] amdgpu 0000:0f:00.0: amdgpu: ring comp_1.0.0 uses VM inv eng 1 on hub 0
[  118.333833] amdgpu 0000:0f:00.0: amdgpu: ring comp_1.1.0 uses VM inv eng 4 on hub 0
[  118.333834] amdgpu 0000:0f:00.0: amdgpu: ring comp_1.2.0 uses VM inv eng 5 on hub 0
[  118.333835] amdgpu 0000:0f:00.0: amdgpu: ring comp_1.3.0 uses VM inv eng 6 on hub 0
[  118.333836] amdgpu 0000:0f:00.0: amdgpu: ring comp_1.0.1 uses VM inv eng 7 on hub 0
[  118.333836] amdgpu 0000:0f:00.0: amdgpu: ring comp_1.1.1 uses VM inv eng 8 on hub 0
[  118.333837] amdgpu 0000:0f:00.0: amdgpu: ring comp_1.2.1 uses VM inv eng 9 on hub 0
[  118.333838] amdgpu 0000:0f:00.0: amdgpu: ring comp_1.3.1 uses VM inv eng 10 on hub 0
[  118.333839] amdgpu 0000:0f:00.0: amdgpu: ring kiq_0.2.1.0 uses VM inv eng 11 on hub 0
[  118.333840] amdgpu 0000:0f:00.0: amdgpu: ring sdma0 uses VM inv eng 12 on hub 0
[  118.333841] amdgpu 0000:0f:00.0: amdgpu: ring vcn_dec_0 uses VM inv eng 0 on hub 8
[  118.333841] amdgpu 0000:0f:00.0: amdgpu: ring vcn_enc_0.0 uses VM inv eng 1 on hub 8
[  118.333842] amdgpu 0000:0f:00.0: amdgpu: ring vcn_enc_0.1 uses VM inv eng 4 on hub 8
[  118.333843] amdgpu 0000:0f:00.0: amdgpu: ring jpeg_dec uses VM inv eng 5 on hub 8
[  118.338095] amdgpu 0000:0f:00.0: amdgpu: recover vram bo from shadow start
[  118.338096] amdgpu 0000:0f:00.0: amdgpu: recover vram bo from shadow done
[  118.338103] amdgpu 0000:0f:00.0: amdgpu: GPU reset(2) succeeded!
[  118.338118] [drm] Skip scheduling IBs!
[  118.338213] [drm] Skip scheduling IBs!
[  118.342309] [drm] Skip scheduling IBs!


#4 2024-01-04 14:50:29

rootpeer
Member
Registered: 2019-04-07
Posts: 45

Re: AMD 7950x iGPU crashes

Update: My display was set to 4K120, which the motherboard (ASUS B650 ProArt) does not support.

After setting it back to 4K60, the crashes stopped.

Edit: It is not fixed. It still crashes, just less often, it seems.
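
For a rough sense of why 4K120 is so much more demanding on the display output than 4K60, here is a back-of-the-envelope sketch of the uncompressed pixel data rate. It assumes "4K" means 3840x2160 at 24 bits per pixel and ignores blanking intervals and link encoding overhead, so real link requirements are higher. It says nothing about what the board's ports are actually rated for. raw_pixel_rate_gbps is a hypothetical helper.

```python
def raw_pixel_rate_gbps(width, height, refresh_hz, bits_per_pixel=24):
    """Uncompressed active-pixel data rate in Gbit/s (blanking ignored)."""
    return width * height * refresh_hz * bits_per_pixel / 1e9


for hz in (60, 120):
    rate = raw_pixel_rate_gbps(3840, 2160, hz)
    print(f"3840x2160 @ {hz} Hz: {rate:.2f} Gbit/s")
```

Under these assumptions the rate doubles from about 11.94 Gbit/s at 60 Hz to about 23.89 Gbit/s at 120 Hz.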

Last edited by rootpeer (2024-01-04 15:26:59)

