
#1 2024-08-22 12:51:33

emil.s
Member
Registered: 2011-05-22
Posts: 11

Corrupt/artifacting screens after suspend [Wayland | Nvidia | Sddm]

Hello!

I have a very strange issue on my workstation. After resuming from suspend, my screens always show some kind of corruption or artifacts.

Example:
https://imgur.com/a/64aAs17

Sometimes both screens are artifacting, and sometimes it's just some windows. But there is always something broken, which makes suspend totally useless for me.
If I switch to the virtual console and restart sddm, everything is back to normal.

The problem is that I don't even know where to start troubleshooting. Is it a driver issue, a display manager issue or a Wayland issue?

Any logs that are of particular interest? (Nothing really obvious in dmesg).

Package versions:
nvidia 555.58.02-17
wayland 1.23.0-1
kwayland 6.1.4-1
sddm 0.21.0-4
linux 6.10.6.arch1-1

Any idea about what could cause this?

Best regards


#2 2024-08-22 15:26:09

seth
Member
Registered: 2012-09-03
Posts: 60,792


#3 2024-08-22 20:49:22

emil.s
Member
Registered: 2011-05-22
Posts: 11

Re: Corrupt/artifacting screens after suspend [Wayland | Nvidia | Sddm]

Ah, thanks!

Yeah, we are absolutely onto something.

root@ThinkStation: /home/emil #> cat /etc/modprobe.d/nvidia.conf 
options nvidia NVreg_PreserveVideoMemoryAllocations=1

Both services are loaded:

root@ThinkStation: /home/emil #> systemctl status nvidia-suspend.service nvidia-resume.service 
○ nvidia-suspend.service - NVIDIA system suspend actions
     Loaded: loaded (/usr/lib/systemd/system/nvidia-suspend.service; enabled; preset: disabled)
     Active: inactive (dead)

○ nvidia-resume.service - NVIDIA system resume actions
     Loaded: loaded (/usr/lib/systemd/system/nvidia-resume.service; enabled; preset: disabled)
     Active: inactive (dead)

And I'm getting some results. However, not the desired ones...

After suspending, the computer automatically tries to resume after just a few seconds. But this time the screens stay black and the machine is totally unresponsive. (I can't even log in over SSH.)

And I see the following in the journal:

aug 22 22:28:49 ThinkStation systemd-logind[743]: The system will suspend now!
....
aug 22 22:28:50 ThinkStation systemd[1]: Reached target Sleep.
aug 22 22:28:50 ThinkStation systemd[1]: Starting NVIDIA system suspend actions...
aug 22 22:28:50 ThinkStation suspend[3473]: nvidia-suspend.service
aug 22 22:28:50 ThinkStation logger[3473]: <13>Aug 22 22:28:50 suspend: nvidia-suspend.service
aug 22 22:28:50 ThinkStation kwin_wayland[914]: kwin_wayland_drm: Presentation failed! Permission denied
aug 22 22:28:50 ThinkStation kwin_wayland[914]: kwin_wayland_drm: Presentation failed! Permission denied
aug 22 22:28:50 ThinkStation kernel: snd_hda_codec_hdmi hdaudioC2D0: HDMI: invalid ELD data byte 71
aug 22 22:28:50 ThinkStation kernel: ------------[ cut here ]------------
aug 22 22:28:50 ThinkStation kernel: WARNING: CPU: 9 PID: 3475 at include/linux/rwsem.h:80 follow_pte+0x1de/0x200
aug 22 22:28:50 ThinkStation kernel: Modules linked in: rfkill lm92 intel_rapl_msr intel_rapl_common intel_uncore_frequency intel_uncore_frequency_common sb_edac x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm snd_hda_codec_realtek snd_hda_codec_generic crct10dif_pclmul snd_hda_scodec_component snd_hda_codec_hdmi crc32_pclmul polyval_clmulni polyval_generic gf128mul uvcvideo snd_hda_intel raid456 ghash_clmulni_intel videobuf2_vmalloc vboxnetflt(OE) sha512_ssse3 nls_iso8859_1 snd_intel_dspcfg nvidia_drm(POE) vboxnetadp(OE) snd_usb_audio vfat async_raid6_recov uvc sha1_ssse3 snd_intel_sdw_acpi nvidia_modeset(POE) fat videobuf2_memops aesni_intel async_memcpy cdc_ether snd_usbmidi_lib vboxdrv(OE) snd_hda_codec iTCO_wdt async_pq videobuf2_v4l2 crypto_simd snd_ump video intel_pmc_bxt usbnet async_xor ee1004 pkcs8_key_parser snd_hda_core snd_rawmidi cryptd mei_wdt iTCO_vendor_support async_tx videodev snd_seq_device rapl ixgbe snd_hwdep r8152 videobuf2_common i2c_i801 think_lmi nvidia_uvm(POE) snd_pcm mii i2c_smbus intel_cstate
aug 22 22:28:50 ThinkStation kernel:  mdio_devres rtsx_usb_ms snd_timer memstick intel_uncore libphy intel_wmi_thunderbolt mc firmware_attributes_class wmi_bmof pcspkr intel_pch_thermal i2c_mux mdio e1000e snd mei_me lpc_ich ptp soundcore mei pps_core dca mousedev joydev md_mod mac_hid nvidia(POE) i2c_dev crypto_user loop dm_mod nfnetlink ip_tables x_tables rtsx_usb_sdmmc hid_generic mmc_core usbhid rtsx_usb btrfs blake2b_generic libcrc32c crc32c_generic xor raid6_pq mxm_wmi nvme crc32c_intel sha256_ssse3 sr_mod ata_generic nvme_core cdrom xhci_pci pata_acpi xhci_pci_renesas nvme_auth wmi
aug 22 22:28:50 ThinkStation kernel: CPU: 9 PID: 3475 Comm: nvidia-sleep.sh Tainted: P           OE      6.10.6-arch1-1 #1 703d152c24f1971e36f16e505405e456fc9e23f8
aug 22 22:28:50 ThinkStation kernel: Hardware name: LENOVO 30B4S01W00/102F, BIOS S00KT73A 05/24/2022
aug 22 22:28:50 ThinkStation kernel: RIP: 0010:follow_pte+0x1de/0x200
aug 22 22:28:50 ThinkStation kernel: Code: cc cc cc 48 81 e2 00 00 00 c0 48 09 c2 48 f7 d2 48 85 fa 75 20 e8 b2 f5 ff ff 48 8b 35 6b f1 5c 01 48 81 e6 00 00 00 c0 eb 8d <0f> 0b 48 3b 1f 0f 83 50 fe ff ff bd ea ff ff ff eb b6 49 8b 3c 24
aug 22 22:28:50 ThinkStation kernel: RSP: 0018:ffffbc3804753a70 EFLAGS: 00010246
aug 22 22:28:50 ThinkStation kernel: RAX: 0000000000000000 RBX: 000076ca89100000 RCX: ffffbc3804753ab0
aug 22 22:28:50 ThinkStation kernel: RDX: ffffbc3804753aa8 RSI: 000076ca89100000 RDI: ffff9bf2b62ae450
aug 22 22:28:50 ThinkStation kernel: RBP: ffffbc3804753af0 R08: ffffbc3804753c48 R09: 0000000000000000
aug 22 22:28:50 ThinkStation kernel: R10: ffff9bf1d934b000 R11: ffffffffc39a0080 R12: ffffbc3804753ab0
aug 22 22:28:50 ThinkStation kernel: R13: ffffbc3804753aa8 R14: ffff9bf08b4e2100 R15: 0000000000000000
aug 22 22:28:50 ThinkStation kernel: FS:  000072adf1e58b80(0000) GS:ffff9bf9d7c80000(0000) knlGS:0000000000000000
aug 22 22:28:50 ThinkStation kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
aug 22 22:28:50 ThinkStation kernel: CR2: 000020a0128f34a0 CR3: 0000000336766002 CR4: 00000000003706f0
aug 22 22:28:50 ThinkStation kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
aug 22 22:28:50 ThinkStation kernel: DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
aug 22 22:28:50 ThinkStation kernel: Call Trace:
....

And then about 10k lines of crash dumps. :-|

There is a "kwin_wayland_drm: Presentation failed! Permission denied" message just before the crash; not sure if it's related?


#4 2024-08-23 06:40:14

seth
Member
Registered: 2012-09-03
Posts: 60,792

Re: Corrupt/artifacting screens after suspend [Wayland | Nvidia | Sddm]

aug 22 22:28:50 ThinkStation kernel: Call Trace:
....

You cut off the most interesting part…
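To post the part that was cut off without the other ~10k lines, something like the following might help. It is only a sketch: `kernel_traces` is a made-up helper name, and it assumes each kernel warning block in the journal starts at a "[ cut here ]" line and ends at an "[ end trace" line.

```shell
#!/bin/sh
# Hypothetical helper: read journal text on stdin and print only the
# kernel warning/oops blocks, from the "[ cut here ]" marker through
# the "[ end trace" marker (assumed to close each block).
kernel_traces() {
    awk '/\[ cut here \]/ { show = 1 }
         show             { print }
         /\[ end trace/   { show = 0 }'
}
```

After the forced reboot, the kernel messages of the previous boot could then be filtered with `journalctl -k -b -1 --no-pager | kernel_traces > trace.txt`. Only the first block is usually needed.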
The 550xx and 555xx drivers are frequently crashing, S3 cycles seem to be a trigger: https://bbs.archlinux.org/viewtopic.php?id=293400&p=4
There's an unofficial 550xx driver that nvidia claims fixes this; otherwise you can try the 535xx drivers or nvidia-open (which might still have general issues w/ S3)

If you have nvidia in the initramfs, make sure you regenerated it after editing the modprobe config (though the services are supposed to complain if the parameter isn't set)
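A rough way to check whether the option actually reached the loaded module is to read it back from the driver. This is a sketch, not a definitive test: `nvreg_value` is a made-up helper name, and it assumes the driver lists its parameters as "Name: value" lines in /proc/driver/nvidia/params.

```shell
#!/bin/sh
# Hypothetical helper: print the value of one driver parameter.
#   $1 = parameter name as listed in the params file (without the NVreg_ prefix)
#   $2 = params file; defaults to the live driver's (assumed "Name: value" lines)
# Exits non-zero if the parameter is not listed.
nvreg_value() {
    awk -F': *' -v name="$1" \
        '$1 == name { print $2; found = 1 } END { exit !found }' \
        "${2:-/proc/driver/nvidia/params}"
}
```

`nvreg_value PreserveVideoMemoryAllocations` should print 1 if the modprobe option took effect. If it prints 0 even though /etc/modprobe.d/nvidia.conf sets it, the initramfs probably still carries the old config; assuming mkinitcpio is in use, `mkinitcpio -P` followed by a reboot would regenerate it.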

