#1 2023-02-02 04:00:48

victrid
Member
Registered: 2021-01-04
Posts: 5

nvidia-resume not working when recovering from hibernate

Hello, I'm trying to make hibernation and sleep work on my desktop PC. The PC uses only a dedicated NVIDIA GPU (RTX 3080); the integrated GPU is disabled.

Sleep, i.e. suspend to RAM, works without problems. However, resuming from hibernation, i.e. suspend to disk, fails with the journal message below, and the system falls back to a fresh boot, losing the saved state.

When resuming from hibernation, journalctl and dmesg show:

PreserveVideoMemoryAllocations module parameter is set. System Power Management attempted without driver procfs suspend interface. Please refer to the 'Configuring Power Management Support' section in the driver README.

The NVIDIA documentation only hints that the power-management-related services should be enabled, but hibernation still doesn't work with them enabled. Neither enabling nvidia-resume.service (as instructed by NVIDIA) nor disabling it (as instructed by the ArchWiki) helped.

I have only one monitor and use sddm to handle X-related stuff.

To enable hibernation and sleep, I've consulted:

ArchWiki: Preserve video memory after suspend
[solved] nvidia - GPU has fallen off the bus when returning from sleep

The related services are enabled:

# systemctl list-unit-files | grep nvidia
nvidia-hibernate.service                                                  enabled         disabled
nvidia-persistenced.service                                               disabled        disabled
nvidia-powerd.service                                                     disabled        disabled
nvidia-resume.service                                                     enabled         disabled
nvidia-suspend.service                                                    enabled         disabled

I added the resume parameter to the kernel parameters (the resume device is a fully provisioned LVM volume), and the pcie_aspm parameter as hinted in "[solved] nvidia - GPU has fallen off the bus when returning from sleep".

root=/dev/mapper/base-root rw resume=UUID=5faea641-3d4b-43b3-a7ee-b485c949ae9e rw loglevel=3 ibt=off nvidia_drm.modeset=1 pcie_aspm=off

The NVIDIA-related parameters are set in a modprobe config file (the directory /var/tmp is located on the root fs, i.e. /dev/mapper/base-root), which should be picked up by the mkinitcpio modconf hook. The initcpio image is regenerated after every modification.

# cat /etc/modprobe.d/nvidia-power-management.conf 
options nvidia NVreg_PreserveVideoMemoryAllocations=1 NVreg_TemporaryFilePath=/var/tmp
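
A minimal sketch for double-checking that the loaded module actually picked up these options. It assumes the driver exposes its current parameters in /proc/driver/nvidia/params, where the names appear without the NVreg_ prefix. The helper name nvidia_pm_params is made up for illustration.

```shell
# Hypothetical helper: print the power-management parameters of the loaded
# nvidia module. The file is an argument so a saved copy can be checked too.
nvidia_pm_params() {
    params_file="${1:-/proc/driver/nvidia/params}"
    grep -E '^(PreserveVideoMemoryAllocations|TemporaryFilePath):' "$params_file"
}

# Only query the live system when the driver's procfs file is present.
if [ -r /proc/driver/nvidia/params ]; then
    nvidia_pm_params || echo "no matching parameters found"
fi
```

If the values printed here differ from the modprobe config, the options were probably not applied when the module was loaded (for example, because the initramfs was not regenerated).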

The initcpio is configured to enable early KMS. The effective lines are:

MODULES=(nvidia nvidia_modeset nvidia_uvm nvidia_drm tpm_crb)
BINARIES=()
FILES=()
HOOKS=(base systemd sd-encrypt autodetect modconf block lvm2 filesystems resume keyboard fsck)

The driver and kernel information:

# uname -a
Linux Victrid-Desktop 6.1.8-arch1-1 #1 SMP PREEMPT_DYNAMIC Tue, 24 Jan 2023 21:07:04 +0000 x86_64 GNU/Linux
# nvidia-smi
Thu Feb  2 11:44:21 2023       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 525.85.05    Driver Version: 525.85.05    CUDA Version: 12.0     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA GeForce ...  Off  | 00000000:01:00.0  On |                  N/A |
|  0%   22C    P8    13W / 340W |    590MiB / 10240MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

The hibernation-related part of journalctl -xb:

Feb 01 22:21:05 archlinux kernel: PM: Image signature found, resuming
Feb 01 22:21:05 archlinux kernel: PM: hibernation: resume from hibernation
Feb 01 22:21:05 archlinux kernel: random: crng reseeded on system resumption
Feb 01 22:21:43 archlinux kernel: Freezing user space processes ... (elapsed 0.001 seconds) done.
Feb 01 22:21:43 archlinux kernel: OOM killer disabled.
Feb 01 22:21:43 archlinux kernel: Freezing remaining freezable tasks ... (elapsed 0.001 seconds) done.
Feb 01 22:21:43 archlinux kernel: PM: hibernation: Marking nosave pages: [mem 0x00000000-0x00000fff]
Feb 01 22:21:43 archlinux kernel: PM: hibernation: Marking nosave pages: [mem 0x0009e000-0x0009efff]
Feb 01 22:21:43 archlinux kernel: PM: hibernation: Marking nosave pages: [mem 0x000a0000-0x000fffff]
Feb 01 22:21:43 archlinux kernel: PM: hibernation: Marking nosave pages: [mem 0x47e95000-0x47e95fff]
Feb 01 22:21:43 archlinux kernel: PM: hibernation: Marking nosave pages: [mem 0x47e9c000-0x47e9cfff]
Feb 01 22:21:43 archlinux kernel: PM: hibernation: Marking nosave pages: [mem 0x5d141000-0x5d141fff]
Feb 01 22:21:43 archlinux kernel: PM: hibernation: Marking nosave pages: [mem 0x5d14e000-0x5d14ffff]
Feb 01 22:21:43 archlinux kernel: PM: hibernation: Marking nosave pages: [mem 0x5d175000-0x5d175fff]
Feb 01 22:21:43 archlinux kernel: PM: hibernation: Marking nosave pages: [mem 0x6ad4c000-0x6ae3ffff]
Feb 01 22:21:43 archlinux kernel: PM: hibernation: Marking nosave pages: [mem 0x6dc4b000-0x6dc4bfff]
Feb 01 22:21:43 archlinux kernel: PM: hibernation: Marking nosave pages: [mem 0x705ae000-0x75ffefff]
Feb 01 22:21:43 archlinux kernel: PM: hibernation: Marking nosave pages: [mem 0x76000000-0xffffffff]
Feb 01 22:21:43 archlinux kernel: PM: hibernation: Basic memory bitmaps created
Feb 01 22:21:43 archlinux kernel: PM: Using 3 thread(s) for decompression
Feb 01 22:21:43 archlinux kernel: PM: Loading and decompressing image data (2529087 pages)...
Feb 01 22:21:43 archlinux kernel: PM: Image loading progress:   0%
Feb 01 22:21:43 archlinux kernel: PM: Image loading progress:  10%
Feb 01 22:21:43 archlinux kernel: PM: Image loading progress:  20%
Feb 01 22:21:43 archlinux kernel: PM: Image loading progress:  30%
Feb 01 22:21:43 archlinux kernel: PM: Image loading progress:  40%
Feb 01 22:21:43 archlinux kernel: PM: Image loading progress:  50%
Feb 01 22:21:43 archlinux kernel: PM: Image loading progress:  60%
Feb 01 22:21:43 archlinux kernel: PM: Image loading progress:  70%
Feb 01 22:21:43 archlinux kernel: PM: Image loading progress:  80%
Feb 01 22:21:43 archlinux kernel: PM: Image loading progress:  90%
Feb 01 22:21:43 archlinux kernel: PM: Image loading progress: 100%
Feb 01 22:21:43 archlinux kernel: PM: Image loading done
Feb 01 22:21:43 archlinux kernel: PM: hibernation: Read 10116348 kbytes in 19.79 seconds (511.18 MB/s)
Feb 01 22:21:43 archlinux kernel: PM: Image successfully loaded
Feb 01 22:21:43 archlinux kernel: printk: Suspending console(s) (use no_console_suspend to debug)
Feb 01 22:21:43 archlinux kernel: serial 00:01: disabled
Feb 01 22:21:43 archlinux kernel: NVRM: GPU 0000:01:00.0: PreserveVideoMemoryAllocations module parameter is set. System Power Management attempted without driver procfs suspend interface. Please refer to the 'Configuring Power Management Support' section in the driver README.
Feb 01 22:21:43 archlinux kernel: nvidia 0000:01:00.0: PM: pci_pm_freeze(): nv_pmops_freeze+0x0/0x20 [nvidia] returns -5
Feb 01 22:21:43 archlinux kernel: nvidia 0000:01:00.0: PM: dpm_run_callback(): pci_pm_freeze+0x0/0xc0 returns -5
Feb 01 22:21:43 archlinux kernel: nvidia 0000:01:00.0: PM: failed to quiesce async: error -5
Feb 01 22:21:43 archlinux kernel: serial 00:01: activated
Feb 01 22:21:43 archlinux kernel: pcieport 10000:e0:06.0: can't derive routing for PCI INT A
Feb 01 22:21:43 archlinux kernel: PM: hibernation: Failed to load image, recovering.
Feb 01 22:21:43 archlinux kernel: nvme 10000:e1:00.0: PCI INT A: no GSI
Feb 01 22:21:43 archlinux kernel: nvme nvme0: Shutdown timeout set to 10 seconds
Feb 01 22:21:43 archlinux kernel: nvme nvme0: 18/0/0 default/read/poll queues
Feb 01 22:21:43 archlinux kernel: PM: hibernation: Basic memory bitmaps freed
Feb 01 22:21:43 archlinux kernel: OOM killer enabled.
Feb 01 22:21:43 archlinux kernel: Restarting tasks ... done.
Feb 01 22:21:43 archlinux kernel: PM: hibernation: resume failed (-5)

And the journalctl from the previous boot, i.e. the one in which hibernate was invoked:

Feb 01 22:20:02 Victrid-Desktop systemd[1]: Reached target Sleep.
-- Subject: A start job for unit sleep.target has finished successfully
-- Defined-By: systemd
-- Support: https://lists.freedesktop.org/mailman/listinfo/systemd-devel
-- 
-- A start job for unit sleep.target has finished successfully.
-- 
-- The job identifier is 3565.
Feb 01 22:20:02 Victrid-Desktop systemd[1]: Starting NVIDIA system hibernate actions...
-- Subject: A start job for unit nvidia-hibernate.service has begun execution
-- Defined-By: systemd
-- Support: https://lists.freedesktop.org/mailman/listinfo/systemd-devel
-- 
-- A start job for unit nvidia-hibernate.service has begun execution.
-- 
-- The job identifier is 3566.
Feb 01 22:20:02 Victrid-Desktop hibernate[6282]: nvidia-hibernate.service
Feb 01 22:20:02 Victrid-Desktop logger[6282]: <13>Feb  1 22:20:02 hibernate: nvidia-hibernate.service
Feb 01 22:20:02 Victrid-Desktop bluetoothd[978]: Endpoint unregistered ....
Feb 01 22:20:03 Victrid-Desktop systemd[1]: nvidia-hibernate.service: Deactivated successfully.
-- Subject: Unit succeeded
-- Defined-By: systemd
-- Support: https://lists.freedesktop.org/mailman/listinfo/systemd-devel
-- 
-- The unit nvidia-hibernate.service has successfully entered the 'dead' state.
Feb 01 22:20:03 Victrid-Desktop systemd[1]: Finished NVIDIA system hibernate actions.
-- Subject: A start job for unit nvidia-hibernate.service has finished successfully
-- Defined-By: systemd
-- Support: https://lists.freedesktop.org/mailman/listinfo/systemd-devel
-- 
-- A start job for unit nvidia-hibernate.service has finished successfully.
-- 
-- The job identifier is 3566.
Feb 01 22:20:03 Victrid-Desktop audit[1]: SERVICE_START pid=1 uid=0 auid=4294967295 ses=4294967295 msg='unit=nvidia-hibernate comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
Feb 01 22:20:03 Victrid-Desktop audit[1]: SERVICE_STOP pid=1 uid=0 auid=4294967295 ses=4294967295 msg='unit=nvidia-hibernate comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
Feb 01 22:20:03 Victrid-Desktop systemd[1]: Starting Hibernate...
-- Subject: A start job for unit systemd-hibernate.service has begun execution
-- Defined-By: systemd
-- Support: https://lists.freedesktop.org/mailman/listinfo/systemd-devel
-- 
-- A start job for unit systemd-hibernate.service has begun execution.
-- 
-- The job identifier is 3562.
Feb 01 22:20:03 Victrid-Desktop systemd-sleep[6291]: Entering sleep state 'hibernate'...
-- Subject: System sleep state hibernate entered
-- Defined-By: systemd
-- Support: https://lists.freedesktop.org/mailman/listinfo/systemd-devel
-- 
-- The system has now entered the hibernate sleep state.
Feb 01 22:20:03 Victrid-Desktop kernel: PM: hibernation: hibernation entry

#2 2023-02-25 17:36:57

aaronrancsik
Member
Registered: 2020-01-27
Posts: 4

Re: nvidia-resume not working when recovering from hibernate

Hi, does anyone have the solution? I have the same exact problem.

Last edited by aaronrancsik (2023-02-25 17:38:16)

#3 2023-02-26 00:53:50

Soyman
Member
Registered: 2017-06-23
Posts: 25

Re: nvidia-resume not working when recovering from hibernate

I too am facing the exact same problem.
Setting NVreg_PreserveVideoMemoryAllocations=0 makes hibernation work somewhat, but it brings back the issues I was trying to alleviate by setting NVreg_PreserveVideoMemoryAllocations=1.
Namely, my GPU stops being visible as a CUDA device after resume and shows other somewhat erratic behavior.

#4 2023-02-27 11:02:27

V1del
Forum Moderator
Registered: 2012-10-16
Posts: 23,044

Re: nvidia-resume not working when recovering from hibernate

What path have you set for nvidia to store its memory in when the option is enabled? By default this is /tmp, which is a tmpfs that is cleared on shutdown, so nvidia won't find the stored memory anymore. Change NVreg_TemporaryFilePath to a path with sufficient free space that isn't under /tmp.
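
A rough sketch of how the chosen path could be checked, assuming GNU stat and df are available. The helper names fs_type_of and survives_shutdown are hypothetical, and /var/tmp is simply the value from the first post.

```shell
# Hypothetical helper: print the filesystem type backing a directory.
# Relies on GNU stat; -f queries the filesystem, %T prints its type name.
fs_type_of() {
    stat -f -c %T "$1"
}

# Hypothetical helper: succeed only if the directory is NOT on a tmpfs.
survives_shutdown() {
    [ "$(fs_type_of "$1")" != "tmpfs" ]
}

dir=/var/tmp   # the NVreg_TemporaryFilePath value from the first post
if [ -d "$dir" ]; then
    if survives_shutdown "$dir"; then
        echo "$dir is on $(fs_type_of "$dir"), not a tmpfs"
    else
        echo "$dir is a tmpfs and is cleared on shutdown"
    fi
    df -h "$dir"   # free space available for the saved video memory
fi
```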

#5 2023-11-05 13:36:45

ifaigios
Member
Registered: 2012-07-09
Posts: 8

Re: nvidia-resume not working when recovering from hibernate

Sorry for necrobumping. I had the same issue and found the solution, so I thought it worthwhile to post it here.

The solution was to have the exact setup as the OP, but also remove all nvidia-related modules from the MODULES array of /etc/mkinitcpio.conf, as well as the kms hook from the HOOKS array. However, I kept nvidia_drm.modeset=1 in the kernel parameters.

Now I still have a full-resolution fb at boot, and both suspend and hibernate work properly. This is with the latest closed-source nvidia package and linux-6.6.
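
A quick way to check a configuration against the change described above. This is only a sketch: it assumes MODULES is a plain single-line MODULES=(...) assignment, and the helper name modules_list_nvidia is made up for illustration.

```shell
# Hypothetical helper: succeed if the MODULES line of a mkinitcpio.conf still
# names an nvidia module. Assumes a plain single-line MODULES=(...) assignment.
modules_list_nvidia() {
    conf="${1:-/etc/mkinitcpio.conf}"
    grep -E '^MODULES=\(' "$conf" | grep -q 'nvidia'
}

# Only inspect the live config when it exists on this machine.
if [ -r /etc/mkinitcpio.conf ]; then
    if modules_list_nvidia; then
        echo "MODULES still lists nvidia modules"
    else
        echo "no nvidia modules in MODULES"
    fi
fi
```

After editing the file, the initramfs has to be regenerated (e.g. with mkinitcpio -P) for the change to take effect.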

Last edited by ifaigios (2023-11-05 14:09:31)

#6 2023-11-05 14:12:39

seth
Member
Registered: 2012-09-03
Posts: 58,118

Re: nvidia-resume not working when recovering from hibernate

https://bbs.archlinux.org/viewtopic.php?id=285508

Furthermore, I removed nvidia_drm.modeset=1 from the kernel parameters.

You probably want to restore that.

#7 2024-09-28 13:40:09

Soyman
Member
Registered: 2017-06-23
Posts: 25

Re: nvidia-resume not working when recovering from hibernate

V1del wrote:

What path have you set for nvidia to store its memory in when the option is enabled? By default this is /tmp, which is a tmpfs that is cleared on shutdown, so nvidia won't find the stored memory anymore. Change NVreg_TemporaryFilePath to a path with sufficient free space that isn't under /tmp.

OP's was set to /var/tmp and mine was set to /opt/tmp.
There was always enough space on my drive so I'd be surprised if that was the issue.

Hibernation seems to work for me now, but I don't use it much anymore.
