Hello, I have been getting crashes caused by my iGPU, to the point that the system is unusable.
Kernel 6.6.5-arch1-1
Dec 28 19:15:11 sarch kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx_0.0.0 timeout, signaled seq=3001, emitted seq=3004
Dec 28 19:15:11 sarch kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* Process information: process kwin_x11 pid 6095 thread kwin_x11:cs0 pid 6134
Dec 28 19:15:11 sarch kernel: amdgpu 0000:0f:00.0: amdgpu: GPU reset begin!
Dec 28 19:15:11 sarch kernel: amdgpu 0000:0f:00.0: amdgpu: MODE2 reset
Dec 28 19:15:11 sarch kernel: amdgpu 0000:0f:00.0: amdgpu: GPU reset succeeded, trying to resume
Dec 28 19:15:11 sarch kernel: [drm] PCIE GART of 1024M enabled (table at 0x000000F41FC00000).
Dec 28 19:15:11 sarch kernel: [drm] PSP is resuming...
Dec 28 19:15:11 sarch kernel: [drm] reserve 0xa00000 from 0xf41e000000 for PSP TMR
Dec 28 19:15:11 sarch kernel: amdgpu 0000:0f:00.0: amdgpu: RAS: optional ras ta ucode is not available
Dec 28 19:15:11 sarch kernel: amdgpu 0000:0f:00.0: amdgpu: RAP: optional rap ta ucode is not available
Dec 28 19:15:11 sarch kernel: amdgpu 0000:0f:00.0: amdgpu: SECUREDISPLAY: securedisplay ta ucode is not available
Dec 28 19:15:11 sarch kernel: amdgpu 0000:0f:00.0: amdgpu: SMU is resuming...
Dec 28 19:15:11 sarch kernel: amdgpu 0000:0f:00.0: amdgpu: SMU is resumed successfully!
Dec 28 19:15:11 sarch kernel: [drm] DMUB hardware initialized: version=0x05000F00
Dec 28 19:15:11 sarch kernel: [drm] kiq ring mec 2 pipe 1 q 0
Dec 28 19:15:11 sarch kernel: [drm] VCN decode and encode initialized successfully(under DPG Mode).
Dec 28 19:15:11 sarch kernel: [drm] JPEG decode initialized successfully.
Dec 28 19:15:11 sarch kernel: amdgpu 0000:0f:00.0: amdgpu: ring gfx_0.0.0 uses VM inv eng 0 on hub 0
Dec 28 19:15:11 sarch kernel: amdgpu 0000:0f:00.0: amdgpu: ring comp_1.0.0 uses VM inv eng 1 on hub 0
Dec 28 19:15:11 sarch kernel: amdgpu 0000:0f:00.0: amdgpu: ring comp_1.1.0 uses VM inv eng 4 on hub 0
Dec 28 19:15:11 sarch kernel: amdgpu 0000:0f:00.0: amdgpu: ring comp_1.2.0 uses VM inv eng 5 on hub 0
Dec 28 19:15:11 sarch kernel: amdgpu 0000:0f:00.0: amdgpu: ring comp_1.3.0 uses VM inv eng 6 on hub 0
Dec 28 19:15:11 sarch kernel: amdgpu 0000:0f:00.0: amdgpu: ring comp_1.0.1 uses VM inv eng 7 on hub 0
Dec 28 19:15:11 sarch kernel: amdgpu 0000:0f:00.0: amdgpu: ring comp_1.1.1 uses VM inv eng 8 on hub 0
Dec 28 19:15:11 sarch kernel: amdgpu 0000:0f:00.0: amdgpu: ring comp_1.2.1 uses VM inv eng 9 on hub 0
Dec 28 19:15:11 sarch kernel: amdgpu 0000:0f:00.0: amdgpu: ring comp_1.3.1 uses VM inv eng 10 on hub 0
Dec 28 19:15:11 sarch kernel: amdgpu 0000:0f:00.0: amdgpu: ring kiq_0.2.1.0 uses VM inv eng 11 on hub 0
Dec 28 19:15:11 sarch kernel: amdgpu 0000:0f:00.0: amdgpu: ring sdma0 uses VM inv eng 12 on hub 0
Dec 28 19:15:11 sarch kernel: amdgpu 0000:0f:00.0: amdgpu: ring vcn_dec_0 uses VM inv eng 0 on hub 8
Dec 28 19:15:11 sarch kernel: amdgpu 0000:0f:00.0: amdgpu: ring vcn_enc_0.0 uses VM inv eng 1 on hub 8
Dec 28 19:15:11 sarch kernel: amdgpu 0000:0f:00.0: amdgpu: ring vcn_enc_0.1 uses VM inv eng 4 on hub 8
Dec 28 19:15:11 sarch kernel: amdgpu 0000:0f:00.0: amdgpu: ring jpeg_dec uses VM inv eng 5 on hub 8
Dec 28 19:15:11 sarch sanoid[6193]: INFO: cache expired - updating from zfs list.
Dec 28 19:15:11 sarch kernel: amdgpu 0000:0f:00.0: amdgpu: recover vram bo from shadow start
Dec 28 19:15:11 sarch kernel: amdgpu 0000:0f:00.0: amdgpu: recover vram bo from shadow done
Dec 28 19:15:11 sarch kernel: amdgpu 0000:0f:00.0: amdgpu: GPU reset(6) succeeded!
Dec 28 19:15:11 sarch kernel: [drm] Skip scheduling IBs!
Dec 28 19:15:11 sarch dbus-daemon[1942]: [system] Activating via systemd: service name='org.freedesktop.ModemManager1' unit='dbus-org.freedesktop.ModemManager1.service' requested by ':1.143' (uid=1000 pid=6094 comm="/usr/bin/kded5")
Dec 28 19:15:11 sarch dbus-daemon[1942]: [system] Activation via systemd failed for unit 'dbus-org.freedesktop.ModemManager1.service': Unit dbus-org.freedesktop.ModemManager1.service not found.
Dec 28 19:15:11 sarch dbus-daemon[1942]: [system] Activating via systemd: service name='org.freedesktop.ModemManager1' unit='dbus-org.freedesktop.ModemManager1.service' requested by ':1.143' (uid=1000 pid=6094 comm="/usr/bin/kded5")
Dec 28 19:15:11 sarch dbus-daemon[1942]: [system] Activation via systemd failed for unit 'dbus-org.freedesktop.ModemManager1.service': Unit dbus-org.freedesktop.ModemManager1.service not found.
Dec 28 19:15:11 sarch kded5[6094]: kf.modemmanagerqt: Failed enumerating MM objects: "org.freedesktop.systemd1.NoSuchUnit"
"Unit dbus-org.freedesktop.ModemManager1.service not found."
Dec 28 19:15:11 sarch kalendarac[6434]: org.kde.pim.akonadicore: Job error: "" for collection: QVector()
Dec 28 19:15:11 sarch kwin_x11[6095]: kwin_scene_opengl: A graphics reset attributable to the current GL context occurred.
Dec 28 19:15:11 sarch plasmashell[6163]: Cyclic dependency detected between "file:///usr/share/plasma/plasmoids/org.kde.plasma.notifications/contents/ui/global/Globals.qml" and "file:///usr/share/plasma/plasmoids/org.kde.plasma.notifications/contents/ui/NotificationHeader.qml"
Dec 28 19:15:11 sarch plasmashell[6163]: Cyclic dependency detected between "file:///usr/share/plasma/plasmoids/org.kde.plasma.notifications/contents/ui/global/Globals.qml" and "file:///usr/share/plasma/plasmoids/org.kde.plasma.notifications/contents/ui/ThumbnailStrip.qml"
Dec 28 19:15:11 sarch dbus-daemon[3510]: [session uid=1000 pid=3510] Successfully activated service 'org.freedesktop.Notifications'
Dec 28 19:15:11 sarch systemd[1]: Started Process Core Dump (PID 6842/UID 0).
Dec 28 19:15:11 sarch kded5[6094]: kf.coreaddons: "Could not load plugin from /usr/lib/qt/plugins/plasma/geolocationprovider/plasma-geolocation-gps.so: Cannot load library /usr/lib/qt/plugins/plasma/geolocationprovider/plasma-geolocation-gps.so: (libgps.so.30: cannot open shared object file: No such file or directory)"
Dec 28 19:15:11 sarch kded5[6094]: Failed to load GeolocationProvider: "/usr/lib/qt/plugins/plasma/geolocationprovider/plasma-geolocation-gps.so" "Could not load plugin from /usr/lib/qt/plugins/plasma/geolocationprovider/plasma-geolocation-gps.so: Cannot load library /usr/lib/qt/plugins/plasma/geolocationprovider/plasma-geolocation-gps.so: (libgps.so.30: cannot open shared object file: No such file or directory)"
Dec 28 19:15:11 sarch kded5[6094]: "location"
Dec 28 19:15:11 sarch systemd[1]: sanoid.service: Deactivated successfully.
Dec 28 19:15:11 sarch systemd[1]: Finished Snapshot ZFS Pool.
Dec 28 19:15:11 sarch systemd[1]: Starting sanoid-prune.service...
Dec 28 19:15:11 sarch systemd-coredump[6845]: [?] Process 5913 (Xorg) of user 0 dumped core.
Stack trace of thread 5916:
#0 0x00007ff6705fe83c n/a (libc.so.6 + 0x8e83c)
#1 0x00007ff6705ae668 raise (libc.so.6 + 0x3e668)
#2 0x00007ff6705964b8 abort (libc.so.6 + 0x264b8)
#3 0x00005568eea31a00 OsAbort (Xorg + 0x159a00)
#4 0x00005568eea31d3b FatalError (Xorg + 0x159d3b)
#5 0x00005568eea29cb6 n/a (Xorg + 0x151cb6)
#6 0x00007ff6705ae710 n/a (libc.so.6 + 0x3e710)
#7 0x00007ff670b3dfd0 n/a (libpixman-1.so.0 + 0x6dfd0)
#8 0x00007ff670ae4a0b pixman_fill (libpixman-1.so.0 + 0x14a0b)
#9 0x00005568eeaa3b8b fbFill (Xorg + 0x1cbb8b)
#10 0x00005568eeaa3e6e fbPolyFillRect (Xorg + 0x1cbe6e)
#11 0x00007ff66f8a96e9 n/a (libglamoregl.so + 0x216e9)
#12 0x00005568ee9a711d n/a (Xorg + 0xcf11d)
#13 0x00007ff66fbd51d8 n/a (amdgpu_drv.so + 0xe1d8)
#14 0x00007ff66fbdb9c2 n/a (amdgpu_drv.so + 0x149c2)
#15 0x00005568eea60198 n/a (Xorg + 0x188198)
#16 0x00007ff66fc026b8 n/a (libglx.so + 0x136b8)
#17 0x00005568eea4c596 ddxGiveUp (Xorg + 0x174596)
#18 0x00005568eea31dec FatalError (Xorg + 0x159dec)
#19 0x00005568eea29cb6 n/a (Xorg + 0x151cb6)
#20 0x00007ff6705ae710 n/a (libc.so.6 + 0x3e710)
#21 0x00007ff6705fe83c n/a (libc.so.6 + 0x8e83c)
#22 0x00007ff6705ae668 raise (libc.so.6 + 0x3e668)
#23 0x00007ff6705964b8 abort (libc.so.6 + 0x264b8)
#24 0x00007ff66de9a497 n/a (radeonsi_dri.so + 0x8a5497)
#25 0x00007ff66dea1485 n/a (radeonsi_dri.so + 0x8ac485)
#26 0x00007ff66d70891d n/a (radeonsi_dri.so + 0x11391d)
#27 0x00007ff66d6ffb8c n/a (radeonsi_dri.so + 0x10ab8c)
#28 0x00007ff6705fc9eb n/a (libc.so.6 + 0x8c9eb)
#29 0x00007ff6706807cc n/a (libc.so.6 + 0x1107cc)
Stack trace of thread 5917:
#0 0x00007ff6705f94ae n/a (libc.so.6 + 0x894ae)
#1 0x00007ff6705fbd40 pthread_cond_wait (libc.so.6 + 0x8bd40)
#2 0x00007ff66d70885c n/a (radeonsi_dri.so + 0x11385c)
#3 0x00007ff66d6ffb8c n/a (radeonsi_dri.so + 0x10ab8c)
#4 0x00007ff6705fc9eb n/a (libc.so.6 + 0x8c9eb)
#5 0x00007ff6706807cc n/a (libc.so.6 + 0x1107cc)
Stack trace of thread 5913:
#0 0x00007ff67067e73d syscall (libc.so.6 + 0x10e73d)
#1 0x00007ff66d7077fe n/a (radeonsi_dri.so + 0x1127fe)
#2 0x00007ff66de9ee71 n/a (radeonsi_dri.so + 0x8a9e71)
#3 0x00007ff66de727e0 n/a (radeonsi_dri.so + 0x87d7e0)
#4 0x00007ff66d76577a n/a (radeonsi_dri.so + 0x17077a)
#5 0x00007ff66d8b4421 n/a (radeonsi_dri.so + 0x2bf421)
#6 0x00007ff66fbd4eae n/a (amdgpu_drv.so + 0xdeae)
#7 0x00005568ee9b76c3 n/a (Xorg + 0xdf6c3)
#8 0x00005568ee9b4aa4 n/a (Xorg + 0xdcaa4)
#9 0x00005568eea5ef2d n/a (Xorg + 0x186f2d)
#10 0x00005568ee986174 n/a (Xorg + 0xae174)
#11 0x00005568ee93992c miValidateTree (Xorg + 0x6192c)
#12 0x00005568ee9824b2 UnmapWindow (Xorg + 0xaa4b2)
#13 0x00005568ee98250f DeleteWindow (Xorg + 0xaa50f)
#14 0x00005568ee979b6e n/a (Xorg + 0xa1b6e)
#15 0x00005568ee979cf5 FreeResource (Xorg + 0xa1cf5)
#16 0x00005568ee94d319 n/a (Xorg + 0x75319)
Same thing happens on Wayland.
I also have a 7900XTX as a dGPU. Disabling the iGPU and using the 7900XTX does get rid of the errors.
I have tried the kernel parameters "amdgpu.ppfeaturemask=0xfffd3fff" and "amdgpu.ppfeaturemask=0xffffbffd", but the problem still persists.
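For anyone comparing the two masks: ppfeaturemask is a bitmask, and a bit set to 0 switches the corresponding power feature off. Here is a minimal sketch that lists which bit positions each of the two values above clears. The helper name cleared_bits is made up for illustration. Which feature each bit maps to depends on the kernel version, so check the PP_FEATURE_MASK enum in the kernel's amd_shared.h rather than trusting a table from memory.

```python
def cleared_bits(mask, width=32):
    """Return the bit positions that are 0 in mask (hypothetical helper).

    For amdgpu.ppfeaturemask a 0 bit means the feature is disabled.
    """
    return [bit for bit in range(width) if not (mask >> bit) & 1]


# The two values tried in this thread.
for mask in (0xfffd3fff, 0xffffbffd):
    print(f"{mask:#010x} clears bits {cleared_bits(mask)}")
```

So the two attempts disable different sets of features, and only bit 14 is cleared by both.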
Can you reproduce on the latest kernel? There has been a bunch of amdgpu work for integrated GPUs in the last three stable releases.
Yes, same issue on 6.6.9-arch1-1.
[ 118.041834] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx_0.0.0 timeout, signaled seq=28978, emitted seq=28980
[ 118.041993] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* Process information: process kwin_x11 pid 2765 thread kwin_x11:cs0 pid 2791
[ 118.042124] amdgpu 0000:0f:00.0: amdgpu: GPU reset begin!
[ 118.095307] amdgpu 0000:0f:00.0: amdgpu: MODE2 reset
[ 118.102454] amdgpu 0000:0f:00.0: amdgpu: GPU reset succeeded, trying to resume
[ 118.102553] [drm] PCIE GART of 1024M enabled (table at 0x000000F41FC00000).
[ 118.102600] [drm] PSP is resuming...
[ 118.124186] [drm] reserve 0xa00000 from 0xf41e000000 for PSP TMR
[ 118.314140] amdgpu 0000:0f:00.0: amdgpu: RAS: optional ras ta ucode is not available
[ 118.319585] amdgpu 0000:0f:00.0: amdgpu: RAP: optional rap ta ucode is not available
[ 118.319586] amdgpu 0000:0f:00.0: amdgpu: SECUREDISPLAY: securedisplay ta ucode is not available
[ 118.319588] amdgpu 0000:0f:00.0: amdgpu: SMU is resuming...
[ 118.319842] amdgpu 0000:0f:00.0: amdgpu: SMU is resumed successfully!
[ 118.320383] [drm] DMUB hardware initialized: version=0x05000F00
[ 118.331379] [drm] kiq ring mec 2 pipe 1 q 0
[ 118.333788] [drm] VCN decode and encode initialized successfully(under DPG Mode).
[ 118.333829] [drm] JPEG decode initialized successfully.
[ 118.333831] amdgpu 0000:0f:00.0: amdgpu: ring gfx_0.0.0 uses VM inv eng 0 on hub 0
[ 118.333832] amdgpu 0000:0f:00.0: amdgpu: ring comp_1.0.0 uses VM inv eng 1 on hub 0
[ 118.333833] amdgpu 0000:0f:00.0: amdgpu: ring comp_1.1.0 uses VM inv eng 4 on hub 0
[ 118.333834] amdgpu 0000:0f:00.0: amdgpu: ring comp_1.2.0 uses VM inv eng 5 on hub 0
[ 118.333835] amdgpu 0000:0f:00.0: amdgpu: ring comp_1.3.0 uses VM inv eng 6 on hub 0
[ 118.333836] amdgpu 0000:0f:00.0: amdgpu: ring comp_1.0.1 uses VM inv eng 7 on hub 0
[ 118.333836] amdgpu 0000:0f:00.0: amdgpu: ring comp_1.1.1 uses VM inv eng 8 on hub 0
[ 118.333837] amdgpu 0000:0f:00.0: amdgpu: ring comp_1.2.1 uses VM inv eng 9 on hub 0
[ 118.333838] amdgpu 0000:0f:00.0: amdgpu: ring comp_1.3.1 uses VM inv eng 10 on hub 0
[ 118.333839] amdgpu 0000:0f:00.0: amdgpu: ring kiq_0.2.1.0 uses VM inv eng 11 on hub 0
[ 118.333840] amdgpu 0000:0f:00.0: amdgpu: ring sdma0 uses VM inv eng 12 on hub 0
[ 118.333841] amdgpu 0000:0f:00.0: amdgpu: ring vcn_dec_0 uses VM inv eng 0 on hub 8
[ 118.333841] amdgpu 0000:0f:00.0: amdgpu: ring vcn_enc_0.0 uses VM inv eng 1 on hub 8
[ 118.333842] amdgpu 0000:0f:00.0: amdgpu: ring vcn_enc_0.1 uses VM inv eng 4 on hub 8
[ 118.333843] amdgpu 0000:0f:00.0: amdgpu: ring jpeg_dec uses VM inv eng 5 on hub 8
[ 118.338095] amdgpu 0000:0f:00.0: amdgpu: recover vram bo from shadow start
[ 118.338096] amdgpu 0000:0f:00.0: amdgpu: recover vram bo from shadow done
[ 118.338103] amdgpu 0000:0f:00.0: amdgpu: GPU reset(2) succeeded!
[ 118.338118] [drm] Skip scheduling IBs!
[ 118.338213] [drm] Skip scheduling IBs!
[ 118.342309] [drm] Skip scheduling IBs!
Update: My display was set to 4K at 120 Hz, which the motherboard (ASUS B650 ProArt) does not support.
After setting it back to 4K at 60 Hz, the crashes stopped.
Edit: It is not fixed. It is still crashing, just not as often, it seems.
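To put a number on "not as often", one option is to count the ring timeout lines per boot. A rough sketch, assuming the message format stays the same as in the logs above. The names TIMEOUT_RE and parse_timeouts are made up for illustration, and the sample lines are the two timeouts already posted in this thread.

```python
import re

# Matches the amdgpu_job_timedout line as it appears in the logs in this thread.
TIMEOUT_RE = re.compile(
    r"ring (?P<ring>\S+) timeout, "
    r"signaled seq=(?P<signaled>\d+), emitted seq=(?P<emitted>\d+)"
)


def parse_timeouts(lines):
    """Collect ring timeout events from kernel log lines (hypothetical helper)."""
    events = []
    for line in lines:
        match = TIMEOUT_RE.search(line)
        if match is None:
            continue
        signaled = int(match["signaled"])
        emitted = int(match["emitted"])
        events.append({
            "ring": match["ring"],
            "signaled": signaled,
            "emitted": emitted,
            # Jobs submitted to the ring that had not completed at timeout.
            "pending": emitted - signaled,
        })
    return events


sample = [
    "Dec 28 19:15:11 sarch kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* "
    "ring gfx_0.0.0 timeout, signaled seq=3001, emitted seq=3004",
    "[ 118.041834] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* "
    "ring gfx_0.0.0 timeout, signaled seq=28978, emitted seq=28980",
]

for event in parse_timeouts(sample):
    print(event)
```

Feeding it the lines from "journalctl -k -b" for each boot instead of the sample list would give a timeout count per boot to compare before and after the refresh rate change.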
Last edited by rootpeer (2024-01-04 15:26:59)