#101 2024-03-11 01:31:53

allencch
Member
Registered: 2011-03-25
Posts: 118

Re: NVIDIA - cannot resume from suspend with PreserveVideoMemoryAllocation

obap74 wrote:

I'm getting a black screen with a functional mouse cursor.

I have this issue as well. I found that if I kill "picom" before suspending, the screen shows up properly on resume.

Offline

#102 2024-03-11 07:57:48

seth
Member
Registered: 2012-09-03
Posts: 51,671

Re: NVIDIA - cannot resume from suspend with PreserveVideoMemoryAllocation

That's what the entire PreserveVideoMemoryAllocation is meant to overcome.
If the VRAM doesn't get stored in RAM, it won't get refreshed during S3 and starts to decay, and your GL textures become black (invalid) or garbage (noise).

Offline

#103 2024-03-11 15:46:56

allencch
Member
Registered: 2011-03-25
Posts: 118

Re: NVIDIA - cannot resume from suspend with PreserveVideoMemoryAllocation

I tried "options nvidia NVreg_PreserveVideoMemoryAllocations=1" in the module config file, but switching between TTYs still gives me the black screen with a mouse cursor while picom is running.
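
One way to confirm that the option actually reached the driver is to read the parameter dump the nvidia module exposes under /proc. A minimal sketch, assuming the usual "Name: value" layout of /proc/driver/nvidia/params; the function name preserve_vram_enabled is made up for illustration:

```shell
#!/bin/sh
# Report whether VRAM preservation is active according to the driver's parameter dump.
# $1 (optional): params file to read; defaults to /proc/driver/nvidia/params,
# which only exists while the nvidia kernel module is loaded.
# Returns 0 if enabled, 1 if not enabled, 2 if the file cannot be read.
preserve_vram_enabled() {
    params_file="${1:-/proc/driver/nvidia/params}"
    [ -r "$params_file" ] || return 2
    grep -Eq '^PreserveVideoMemoryAllocations:[[:space:]]*1[[:space:]]*$' "$params_file"
}

if preserve_vram_enabled; then
    echo "PreserveVideoMemoryAllocations is active"
else
    echo "PreserveVideoMemoryAllocations is not active (or the nvidia module is not loaded)"
fi
```

If this reports "not active" although the modprobe file is in place, the module may have been loaded from an initramfs that was built before the file was added.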

Offline

#104 2024-03-11 15:52:20

seth
Member
Registered: 2012-09-03
Posts: 51,671

Re: NVIDIA - cannot resume from suspend with PreserveVideoMemoryAllocation

Did you also enable the relevant services?
Potentially redirect the storage destination (in case you have lots of VRAM and little free RAM/swap)?
https://wiki.archlinux.org/title/NVIDIA … er_suspend
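
The two steps asked about here can be sketched as follows. This is only an illustration: the helper name nvidia_suspend_conf, the file name /etc/modprobe.d/nvidia-suspend.conf and the /var/tmp destination are assumptions, while NVreg_TemporaryFilePath and the three systemd units are what the driver package provides.

```shell
#!/bin/sh
# Print a modprobe.d line that enables VRAM preservation.
# $1 (optional): directory where the driver should save the VRAM contents;
# when omitted, the driver's default location is used.
nvidia_suspend_conf() {
    line='options nvidia NVreg_PreserveVideoMemoryAllocations=1'
    if [ -n "${1:-}" ]; then
        line="$line NVreg_TemporaryFilePath=$1"
    fi
    printf '%s\n' "$line"
}

# Review the line before installing it, for example as
# /etc/modprobe.d/nvidia-suspend.conf (hypothetical file name):
nvidia_suspend_conf /var/tmp

# The services that save and restore VRAM around suspend/hibernate:
#   sudo systemctl enable nvidia-suspend.service nvidia-hibernate.service nvidia-resume.service
```

The chosen directory needs enough free space to hold the allocated VRAM, which is why a disk-backed path can help on machines with little RAM and swap.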

Offline

#105 2024-03-18 01:03:52

juneidy
Member
Registered: 2023-11-14
Posts: 9

Re: NVIDIA - cannot resume from suspend with PreserveVideoMemoryAllocation

seth wrote:

That's what the entire PreserveVideoMemoryAllocation is meant to overcome.
If the VRAM doesn't get stored in RAM, it won't get refreshed during S3 and starts to decay, and your GL textures become black (invalid) or garbage (noise).

Thanks, that solved my problem completely!

Installed
* core/linux 6.8.1.arch1-1
* extra/nvidia 550.54.14-7

this morning, and I still had the same issue where the lock screen would not show until I switched to Ctrl+Alt+F2 and then Ctrl+Alt+F7. That only went away once I applied the parameter seth suggested.

I am now able to sleep and wake normally!

Offline

#106 2024-03-18 11:51:37

lorenzol36
Member
Registered: 2018-10-29
Posts: 14

Re: NVIDIA - cannot resume from suspend with PreserveVideoMemoryAllocation

juneidy wrote:

I am now able to sleep and wake normally!

Can you suspend fine if you don't have all the PreserveVideoMemoryAllocations settings set up? Some of us don't use the kernel parameter and yet are unable to suspend (while we could with previous driver versions).

EDIT: I've just had a look at the changelog of version 550.67 and noticed this:

Fixed a bug that caused "Flip event timeout" messages to be printed to the system log when the system is suspended without using /usr/bin/nvidia-sleep.sh when nvidia-drm is loaded with the `fbdev=1` kernel module parameter.

I was using nvidia-drm.fbdev=1 as a kernel parameter, and this made me suspicious. I removed it and can now resume the system again without any problems on version 550.54.14.
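
To see which nvidia-drm options the running kernel was actually booted with, the command line can be filtered. A small sketch; the function name nvidia_drm_params is hypothetical, and it only inspects the kernel command line, not options set through modprobe.d:

```shell
#!/bin/sh
# List nvidia-drm options found in a kernel command line, one per line.
# $1 (optional): command line text; defaults to the running kernel's /proc/cmdline.
# '-' and '_' are interchangeable in module names, so both spellings are matched.
nvidia_drm_params() {
    cmdline="${1:-$(cat /proc/cmdline 2>/dev/null || true)}"
    printf '%s\n' "$cmdline" | tr ' ' '\n' | grep -E '^nvidia[-_]drm\.' || true
}

# Prints nothing if no nvidia-drm option was passed at boot.
nvidia_drm_params
```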

Last edited by lorenzol36 (2024-03-21 01:22:21)

Offline

#107 2024-04-03 17:19:44

bertieb
Member
Registered: 2023-11-29
Posts: 9

Re: NVIDIA - cannot resume from suspend with PreserveVideoMemoryAllocation

I upgraded to Linux 6.8.2 (from 6.5.8) and nvidia 550.67-1 (from 535.113.01-6) on Apr 1st, and was able to suspend and resume 3-4 times.

However, on resume after a suspend of ~1.5h this afternoon I once again experienced the same symptoms as before and as others in the thread have had: black screens, no input from keyboard, and no response to ssh. I used the ACPI reset button to reboot.

Last boot log shows nothing after 'resuming' from the last suspend, but here it is (trying to include it in a code block caused this tab to be very laggy when responding to input): https://0x0.st/XzkO.log

Offline

#108 2024-04-03 18:20:59

lorenzol36
Member
Registered: 2018-10-29
Posts: 14

Re: NVIDIA - cannot resume from suspend with PreserveVideoMemoryAllocation

Unfortunately, I'm also experiencing problems again after some days of working suspend. If I use the nvidia-drm.fbdev=1 and nvidia-drm.modeset=1 kernel parameters, I have the same problem as before: a black screen, and I have to switch to tty2 first and then to tty7 to finally see the screen. If I don't use the kernel parameters, sometimes I can resume, and sometimes I get a black screen with an unresponsive mouse and keyboard that forces me to use the hardware reset button. At least that's what has been happening to me while testing.

Offline

#109 2024-04-03 18:32:18

obap74
Member
Registered: 2021-03-18
Posts: 79

Re: NVIDIA - cannot resume from suspend with PreserveVideoMemoryAllocation

bertieb wrote:

Last boot log shows nothing after 'resuming' from the last suspend

Same here every time this issue occurs.

Thanks for the feedback with 6.8.2 / 550. I'm still on 6.1 / 535 since suspend/hibernate works reliably.

The more time goes by, the less likely it is that this issue will be fixed.
535 will be EOL in June 2026, 6.1 will be EOL in December 2026. If it's not fixed till then, I'll have to get a new GPU or stop suspending/hibernating I guess.

Offline

#110 2024-04-21 07:16:47

verbbis
Member
Registered: 2009-09-02
Posts: 27

Re: NVIDIA - cannot resume from suspend with PreserveVideoMemoryAllocation

Gooberslot wrote:

I'm using a GTX 980 Ti and I also can't resume from suspend with anything newer than 535.

Also a GTX 980 Ti user here. Just tested with the latest kernel and nvidia packages:

core/linux 6.8.7.arch1-1
extra/nvidia 550.76-1

I've been playing around debugging kernel resume with the help of pm_trace, but I guess it's the Nvidia driver specifically which makes this even harder. My test scenario: no display manager/X11, just console, nvidia_drm.modeset=1. Force suspend with:

sudo sh -c "sync && echo 1 > /sys/power/pm_trace && systemctl suspend"
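
When a resume does hang hard, pm_trace leaves its result for the next boot: the kernel log then contains a "Magic number" line and, if a device could be identified, "hash matches" lines. A minimal sketch for pulling those out; the function name pm_trace_hits is made up, and the exact wording of the log lines is assumed from the kernel's suspend debugging documentation:

```shell
#!/bin/sh
# Filter a kernel log read from stdin for the lines pm_trace leaves behind.
# Prints nothing (and still succeeds) when there is no match.
pm_trace_hits() {
    grep -E 'Magic number|hash matches' || true
}

# After rebooting from a hung resume (reading the kernel log may need root):
#   sudo dmesg | pm_trace_hits
```

Note that pm_trace stores its marker in the RTC, so the hardware clock will be wrong after such a test and needs to be set again.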

At least resuming no longer results in a hard lockup as it used to; e.g., networking still works. I do get just a black screen, though. These are the only lines in dmesg which look relevant:

[   61.626350] nvidia-modeset: ERROR: GPU:0: Failed to bind display engine notify surface descriptor: 0x1a (Ran out of a critical resource, other than memory [NV_ERR_INSUFFICIENT_RESOURCES])
[   61.626484] nvidia-modeset: ERROR: GPU:0: Failed to allocate display engine core DMA push buffer
[   61.626767] nvidia-modeset: ERROR: GPU:0: Failed to bind display engine notify surface descriptor: 0x1a (Ran out of a critical resource, other than memory [NV_ERR_INSUFFICIENT_RESOURCES])
[   61.627006] nvidia-modeset: ERROR: GPU:0: Failed to allocate display engine core DMA push buffer

EDIT: Interestingly, enabling Nvidia's framebuffer implementation with nvidia_drm.fbdev=1 gets rid of that error and resume succeeds. I have to run with this for a while and see how reliable it is. Gnome/X11 does crash horribly, but the errors look like something enabling NVreg_PreserveVideoMemoryAllocations might fix.

EDIT2: Enabled NVreg_PreserveVideoMemoryAllocations and the relevant systemd hooks. Still not reliable; I get errors like these:

[ 1343.717038] [drm:__nv_drm_gem_nvkms_map [nvidia_drm]] *ERROR* [nvidia-drm] [GPU ID 0x00000700] Failed to map NvKmsKapiMemory 0x00000000a2a719a1
[ 1597.364247] INFO: task nvidia-modeset/:474 blocked for more than 122 seconds.
[ 1597.364256]       Tainted: P           OE      6.8.7-arch1-1 #1
[ 1597.364261] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 1597.364265] task:nvidia-modeset/ state:D stack:0     pid:474   tgid:474   ppid:2      flags:0x00004000
[ 1597.364269] Call Trace:
[ 1597.364271]  <TASK>
[ 1597.364273]  __schedule+0x3e6/0x1520
[ 1597.364282]  schedule+0x32/0xd0
[ 1597.364286]  schedule_preempt_disabled+0x15/0x30
[ 1597.364289]  rwsem_down_read_slowpath+0x2aa/0x540
[ 1597.364295]  ? __pfx__main_loop+0x10/0x10 [nvidia_modeset 3fcb72663fb07e8d23115012bbd6cac6605a279b]
[ 1597.364315]  down_read+0x48/0xb0
[ 1597.364318]  nvkms_kthread_q_callback+0x149/0x170 [nvidia_modeset 3fcb72663fb07e8d23115012bbd6cac6605a279b]
[ 1597.364336]  _main_loop+0x99/0x170 [nvidia_modeset 3fcb72663fb07e8d23115012bbd6cac6605a279b]
[ 1597.364355]  kthread+0xe8/0x120
[ 1597.364359]  ? __pfx_kthread+0x10/0x10
[ 1597.364363]  ret_from_fork+0x34/0x50
[ 1597.364366]  ? __pfx_kthread+0x10/0x10
[ 1597.364369]  ret_from_fork_asm+0x1b/0x30
[ 1597.364375]  </TASK>

Sigh. Back to disabling suspend I guess.

Last edited by verbbis (2024-04-21 15:15:49)

Offline
