
#26 2025-06-27 14:37:11

theo2438
Member
Registered: 2025-06-25
Posts: 13

Re: suspend/resume not working on nvidia gpu

seth wrote:

Do you get away w/

mt7921e.disable_aspm=1

Just tested this & sadly no. I still need pcie_aspm=off to get it to work.

Offline

#27 2025-06-27 15:00:05

seth
Member
From: Don't DM me only for attention
Registered: 2012-09-03
Posts: 68,405

Re: suspend/resume not working on nvidia gpu

Have you tested the S3 ("deep" sleep) behavior after the BIOS update?

Offline

#28 2025-06-28 08:10:51

theo2438
Member
Registered: 2025-06-25
Posts: 13

Re: suspend/resume not working on nvidia gpu

seth wrote:

Have you tested the S3 ("deep" sleep) behavior after the BIOS update?

Yes, it wouldn't wake up. In fact, even SysRq didn't work.

Offline

#29 2025-06-28 16:21:49

seth
Member
From: Don't DM me only for attention
Registered: 2012-09-03
Posts: 68,405

Re: suspend/resume not working on nvidia gpu

'key :(
You want to make sure that the system draws considerably less power during s2idle than when you're just letting it sit around (on a lazy day you could compare both conditions over a 60-minute interval and see what impact each has on the battery).
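
A rough sketch of that comparison, assuming the battery exposes energy_now in µWh under /sys/class/power_supply (some batteries only expose charge_now). BAT1 is the battery name from the inxi output in this thread; the function names avg_draw_mw and read_energy are made up for illustration:

```shell
#!/bin/sh
# Hypothetical helper: average power draw in mW between two battery readings.
# Arguments: energy before (µWh), energy after (µWh), elapsed time (s).
avg_draw_mw() {
    # µWh * 3600 / s gives µW; dividing by 1000 gives mW (integer arithmetic)
    echo $(( ($1 - $2) * 3600 / $3 / 1000 ))
}

# Read the current battery energy in µWh.
# Assumption: the energy_now attribute exists for this battery.
read_energy() {
    cat "/sys/class/power_supply/${1:-BAT1}/energy_now"
}

# Possible usage, once for a suspended hour and once for an idle hour:
#   before=$(read_energy); t0=$(date +%s)
#   ... suspend and resume, or leave the machine idle ...
#   after=$(read_energy); t1=$(date +%s)
#   avg_draw_mw "$before" "$after" $((t1 - t0))
```

Run it on battery power both times, otherwise the readings mostly reflect charging.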

Offline

#30 2025-06-29 17:19:40

theo2438
Member
Registered: 2025-06-25
Posts: 13

Re: suspend/resume not working on nvidia gpu

seth wrote:

'key :(
You want to make sure that the system draws considerably less power during s2idle than when you're just letting it sit around (on a lazy day you could compare both conditions over a 60-minute interval and see what impact each has on the battery).

Yes, that's what I was thinking of doing too. Thanks a ton for helping out!

Offline

#31 2025-09-21 16:23:59

freenull
Member
Registered: 2025-09-21
Posts: 2

Re: suspend/resume not working on nvidia gpu

Surprisingly enough, this is actually an NVIDIA issue (or at least it is in the case of the 4070 Mobile, see below).
I have the same laptop model (15ARP9), but with RTX 4070 Mobile graphics instead of the 4060.
I went through essentially the same debugging steps as you, @theo2438, and likewise never got any wakeup logs. I finally landed on the

pcie_aspm=off

workaround, but I didn't like the idea of turning off ASPM and sacrificing battery for something that works OOTB on Windows.
In every single one of the dozens of journals I've looked at, I noticed this peculiar line:

ACPI BIOS Error (bug): Could not resolve symbol [\_SB.PCI0.GPP0.PEGP.GPS.NVD1], AE_NOT_FOUND (20250404/psargs-332)
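
To check whether a particular boot logged the same message, the kernel log can be filtered for it. A minimal sketch, assuming journalctl is available; the function name acpi_symbol_errors is made up for illustration:

```shell
#!/bin/sh
# Keep only ACPI symbol-resolution errors from kernel log lines on stdin.
acpi_symbol_errors() {
    grep -F 'ACPI BIOS Error' | grep -F 'Could not resolve symbol'
}

# Possible usage for the current boot:
#   journalctl -k -b | acpi_symbol_errors
```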

I initially disregarded it as an unrelated bug in the BIOS (or in NVIDIA drivers, but I disregarded NVIDIA as the cause just like you have). However, having no other option I searched more and eventually stumbled on this post on Lenovo forums:
https://forums.lenovo.com/t5/Gaming-Lap … p/10001433

It mentions this exact log line, mentions an infinite loop (this is what really caught my attention) and links to a bug report in NVIDIA open source drivers:

if nvidia driver is allowed to turn off dGPU memory - results in a cycle D0->D3Cold->D0->D3Cold->D0... preventing dGPU from sleeping

https://github.com/NVIDIA/open-gpu-kern … issues/905

I'm a bit lost on the GPU driver terminology, but at this point it's clear that this issue somehow affects suspend:

dGPU can enter suspended state and turn off memory but won't stay cold. dGPU is woken up immediately after ending transition to D3cold. Waiting 15s causes GPU to sleep again, only to repeat the cycle. This behaviour persists even after killing all graphical interface (gdm/gnome/wayland/xorg) and no applications are using the GPU.
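
One way to observe such a cycle is to sample the dGPU's runtime PM status from sysfs and count how often it changes. This is only a sketch: the PCI address 0000:01:00.0 is an assumption (check lspci for the real one), and both function names are made up for illustration:

```shell
#!/bin/sh
# Count how often consecutive samples differ; reads one status per line from stdin.
count_transitions() {
    awk 'NR > 1 && $0 != prev { n++ } { prev = $0 } END { print n + 0 }'
}

# Print the runtime PM status of a PCI device once per second.
# Arguments: PCI address (assumed default below), number of samples.
sample_runtime_status() {
    dev=${1:-0000:01:00.0}
    count=${2:-60}
    i=0
    while [ "$i" -lt "$count" ]; do
        cat "/sys/bus/pci/devices/$dev/power/runtime_status"
        sleep 1
        i=$((i + 1))
    done
}

# Possible usage, sampling for two minutes with no GPU clients running:
#   sample_runtime_status 0000:01:00.0 120 | count_transitions
```

A GPU that stays asleep should show few or no transitions over the interval; a steadily growing count would match the behaviour described in the report.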

I decided to give it a try. There are workarounds in the replies, with the latest workaround involving setting these options in /etc/modprobe.d/nvidia.conf:

options nvidia NVreg_DynamicPowerManagement=0x02
options nvidia NVreg_DynamicPowerManagementVideoMemoryThreshold=0

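And the following udev rules (place them in a file under /etc/udev/rules.d/ whose name ends in .rules, e.g. /etc/udev/rules.d/80-nvidia-pm.rules; udev ignores files with other extensions such as .conf):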

# Enable runtime PM for NVIDIA VGA/3D controller devices on driver bind
ACTION=="bind", SUBSYSTEM=="pci", ATTR{vendor}=="0x10de", ATTR{class}=="0x030000", TEST=="power/control", ATTR{power/control}="auto"
ACTION=="bind", SUBSYSTEM=="pci", ATTR{vendor}=="0x10de", ATTR{class}=="0x030200", TEST=="power/control", ATTR{power/control}="auto"

# Disable runtime PM for NVIDIA VGA/3D controller devices on driver unbind
ACTION=="unbind", SUBSYSTEM=="pci", ATTR{vendor}=="0x10de", ATTR{class}=="0x030000", TEST=="power/control", ATTR{power/control}="on"
ACTION=="unbind", SUBSYSTEM=="pci", ATTR{vendor}=="0x10de", ATTR{class}=="0x030200", TEST=="power/control", ATTR{power/control}="on"

If I understand imaGuru correctly, setting dynamic power management to 0x02 will cause the driver to start with no power management. Then, we can use udev to turn on the power management. For some reason, starting with no power management "either triggers a different code path or just delays gpu suspend/wakeup" in comparison to 0x03 (power management on), which fixes the infinite loop the GPU runs into.
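
After a reboot, it may be worth confirming that both halves of the workaround took effect. The sketch below assumes the "Name: value" layout of /proc/driver/nvidia/params and, again, a dGPU at PCI address 0000:01:00.0; nv_param and pm_policy are hypothetical helper names:

```shell
#!/bin/sh
# Print the value of one driver parameter.
# Arguments: parameter name, params file (defaults to the live driver's file).
nv_param() {
    awk -v key="$1" -F': *' '$1 == key { print $2 }' \
        "${2:-/proc/driver/nvidia/params}"
}

# Print the kernel's runtime PM policy for a PCI device ("auto" or "on").
# The default address is an assumption; check lspci for the real one.
pm_policy() {
    cat "/sys/bus/pci/devices/${1:-0000:01:00.0}/power/control"
}

# Possible usage:
#   nv_param DynamicPowerManagement
#   nv_param DynamicPowerManagementVideoMemoryThreshold
#   pm_policy 0000:01:00.0
```

If the udev rules were picked up, pm_policy should report auto once the driver has bound to the device.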

I tried it - and it works! Without disabling ASPM or any power management functions. Now according to the NVIDIA docs setting DynamicPowerManagementVideoMemoryThreshold to 0 stops the driver from ever completely turning off video memory, which still may incur some extra power draw, but it should be less than disabling ASPM across the board.
Also, while I am using nvidia-open and this issue is reported for the open source drivers, replies indicate that it's reproducible on the proprietary drivers as well.

Offline

#32 2025-09-25 08:57:18

freenull
Member
Registered: 2025-09-21
Posts: 2

Re: suspend/resume not working on nvidia gpu

I have never had an issue this bizarre.
I installed the latest BIOS (as of 25.09.2025) through Lenovo Vantage a couple of days ago. Soon afterwards, perhaps coincidentally, Bluetooth became so unstable on both Windows and Linux that gamepads were unusable. After a power drain (holding the power button for 60s while the laptop is turned off) I've been able to get it to work well again, on Linux at least.
Yesterday I realized that my workaround for suspend no longer worked and that suspending locked up the system just like before. I rebooted with pcie_aspm=off, and to my surprise, disabling ASPM didn't fix it either. I spent another 8 hours scanning through basically everything there is online to find out how to *at the very least* restore suspend with pcie_aspm=off, to no avail until I reset the BIOS (see below).

Some of the things I have tried (including in combination) which did not immediately fix it:
- Restoring linux and nvidia versions to the ones that had worked before (I'd done a pacman update after the BIOS update too, although only the patch version of linux and nvidia changed)
- Switching between nvidia and nvidia-open
- Disabling the ideapad_laptop module
- Setting acpi_osi to "linux", "Linux", "Windows 2022" or "Windows 2024"
- Uninstalling power-profiles-daemon (used placeholder driver), see here
- Forcing NVIDIA into the kernel driver callback mode by configuring NVreg_PreserveVideoMemoryAllocations=0 and directly requesting sleep through echo mem > /sys/power/state, see NVIDIA docs
- Debugging with pm_test - inconsistent behavior, but typically broke at platform. No logs could be gathered when it hung.
- Disabling pm_async
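
For anyone repeating the pm_test step from the list above, a single test cycle can be sketched as follows. It must run as root and needs a kernel with PM debugging enabled. The sysfs directory is a parameter only so the function can be dry-run against a scratch directory; that parameter and both function names are assumptions of this sketch:

```shell
#!/bin/sh
# Test levels from shallowest to deepest, as listed in the kernel's
# basic PM debugging documentation.
pm_test_levels() {
    echo "freezer devices platform processors core"
}

# Run one suspend cycle at the given test level, then switch testing off again.
# Arguments: test level, sysfs power directory (defaults to /sys/power).
run_pm_test() {
    level=$1
    sys=${2:-/sys/power}
    echo "$level" > "$sys/pm_test" || return 1
    echo mem > "$sys/state"
    rc=$?
    echo none > "$sys/pm_test"
    return $rc
}

# Possible usage, stopping at the first level that fails:
#   for l in $(pm_test_levels); do run_pm_test "$l" || { echo "failed at $l"; break; }; done
```

If the machine hangs outright at some level, as described above, the loop never returns; the last level attempted is then the only clue.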

During this time the issue was getting worse/more unpredictable - sometimes, the laptop would reboot upon wakeup instead of entering the previous hang state. I've also noticed that the power button LED behaved "sleepy", it would enter the flashing animation basically every time that the backlight turned off even for a second, before quickly going back to normal operation, like it was dozing off. In particular this would happen on reboot and during modesetting, and sometimes this would continue while the display was off after a resume attempt.

Ultimately and surprisingly what fixed it for me was to enter the BIOS and reset to default, then save and quit. It was a last resort, and it actually brought me back to a state where suspend worked with pcie_aspm=off.

While troubleshooting I'd touched a couple of things in the hidden settings* which I was fairly certain I had restored. Unfortunately I have no clue what resetting the BIOS settings actually did.
The power button LEDs are no longer dozing off when the display goes off, so I strongly suspect that BIOS was somehow set in a confused state. Perhaps I did actually leave one of the hidden settings set to the wrong value, or maybe Lenovo Vantage or the BIOS update put the BIOS into a confused state, or maybe Windows Update did. Perhaps the BIOS was in a confused state out of the box.

*To get hidden BIOS settings, you have to enter the settings, press Fn+R+N like 5 times, F10 to save and quit, then immediately reenter BIOS by spamming F2. You should see new tabs like AMD PSP, which is where you can actually configure power saving. Don't get excited for the "S3 Enable" option in Power Management though, it works even worse than S2idle (doesn't wake up, and REISUB doesn't work).
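
Whether the kernel will use s2idle or S3 ("deep") for the next suspend can be read from /sys/power/mem_sleep, where the active mode is the bracketed word. A small sketch; the function name active_sleep_mode is made up for illustration:

```shell
#!/bin/sh
# Print the sleep mode currently selected for "mem" suspend.
# Argument: file to read (defaults to /sys/power/mem_sleep).
active_sleep_mode() {
    # The file looks like "[s2idle] deep"; keep only the bracketed entry.
    tr ' ' '\n' < "${1:-/sys/power/mem_sleep}" | sed -n 's/^\[\(.*\)\]$/\1/p'
}

# Possible usage:
#   active_sleep_mode
```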

Interestingly enough, I actually don't need any power management workaround now to suspend properly! All it takes is this option in /etc/modprobe.d/nvidia.conf:

options nvidia NVreg_EnableS0ixPowerManagement=1

This is just an option to turn on S0ix power saving in the GPU. I confirmed that EnableS0ixPowerManagement was set to 0 by default. Setting it to 1, rebuilding the initramfs (mkinitcpio -P) and rebooting produced a system that can suspend with ASPM and with GPU power saving :)
I have absolutely no clue if this is because of the BIOS update or if I could've done this before if I reset the BIOS settings.

The final setup is as follows:

cat /proc/cmdline
# root=PARTUUID=XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX rw add_efi_memmap quiet splash initrd=\initramfs-linux.img

pacman -Qi linux | grep Version
# Version         : 6.16.8.arch3-1

pacman -Qi nvidia-open | grep Version
# Version         : 580.82.09-5

sudo cat /sys/power/pm_async
# 1
# (1 is the default)

sudo cat /proc/driver/nvidia/params | grep -i s0ix
# EnableS0ixPowerManagement: 1
# S0ixPowerManagementVideoMemoryThreshold: 256

rfkill list
# 0: ideapad_wlan: Wireless LAN
#         Soft blocked: no
#         Hard blocked: no
# 1: ideapad_bluetooth: Bluetooth
#         Soft blocked: no
#         Hard blocked: no
# 2: hci0: Bluetooth
#         Soft blocked: no
#         Hard blocked: no
# 3: phy0: Wireless LAN
#         Soft blocked: no
#         Hard blocked: no
#
# (Sometimes bluetooth/wlan drivers are blamed for suspend issues, but all of mine are unblocked)

Inxi:

System:
  Kernel: 6.16.8-arch3-1 arch: x86_64 bits: 64
  Desktop: KDE Plasma v: 6.4.5 Distro: Arch Linux
Machine:
  Type: Laptop System: LENOVO product: 83JC v: LOQ 15ARP9
    serial: <superuser required>
  Mobo: LENOVO model: LNVNB161216 v: SDK0T76463 WIN
    serial: <superuser required> UEFI: LENOVO v: PQCN24WW date: 06/02/2025
Battery:
  ID-1: BAT1 charge: 21.5 Wh (35.3%) condition: 60.9/60 Wh (101.6%) volts: 15
    min: 15.44
CPU:
  Info: 8-core model: AMD Ryzen 7 7435HS bits: 64 type: MT MCP cache:
    L2: 4 MiB
  Speed (MHz): avg: 1097 min/max: 412/4554 cores: 1: 1097 2: 1097 3: 1097
    4: 1097 5: 1097 6: 1097 7: 1097 8: 1097 9: 1097 10: 1097 11: 1097 12: 1097
    13: 1097 14: 1097 15: 1097 16: 1097
Graphics:
  Device-1: NVIDIA AD106M [GeForce RTX 4070 Max-Q / Mobile] driver: nvidia
    v: 580.82.09
  Device-2: Bison Integrated Camera driver: uvcvideo type: USB
  Display: wayland server: X.org v: 1.21.1.18 with: Xwayland v: 24.1.8
    compositor: kwin_wayland driver: gpu: nv_platform,nvidia,nvidia-nvswitch
    resolution: 1920x1080~144Hz
  API: EGL Message: EGL data requires eglinfo. Check --recommends.
  Info: Tools: de: kscreen-console,kscreen-doctor gpu: nvidia-smi
    x11: xprop,xrandr
Audio:
  Device-1: NVIDIA AD106M High Definition Audio driver: snd_hda_intel
  Device-2: Advanced Micro Devices [AMD] Audio Coprocessor
    driver: snd_pci_acp6x
  Device-3: Advanced Micro Devices [AMD] Family 17h/19h/1ah HD Audio
    driver: snd_hda_intel
  API: ALSA v: k6.16.8-arch3-1 status: kernel-api
  Server-1: PipeWire v: 1.4.8 status: active
Network:
  Device-1: Realtek RTL8111/8168/8211/8411 PCI Express Gigabit Ethernet
    driver: r8169
  IF: enp2s0 state: down mac: <filter>
  Device-2: MEDIATEK MT7921 802.11ax PCI Express Wireless Network Adapter
    driver: mt7921e
  IF: wlan0 state: up mac: <filter>
Bluetooth:
  Device-1: Foxconn / Hon Hai MediaTek Bluetooth Adapter driver: btusb
    type: USB
  Report: btmgmt ID: hci0 state: up address: <filter> bt-v: 5.3
Drives:
  Local Storage: total: 953.87 GiB used: 118.34 GiB (12.4%)
  ID-1: /dev/nvme0n1 vendor: Micron model: MTFDKCD1T0QFM-1BD1AABLA
    size: 953.87 GiB
Partition:
  ID-1: / size: 97.87 GiB used: 16.09 GiB (16.4%) fs: ext4 dev: /dev/nvme0n1p3
  ID-2: /boot size: 1022 MiB used: 776.9 MiB (76.0%) fs: vfat
    dev: /dev/nvme0n1p1
  ID-3: /home size: 404.46 GiB used: 101.49 GiB (25.1%) fs: ext4
    dev: /dev/nvme0n1p4
Swap:
  ID-1: swap-1 type: partition size: 8 GiB used: 0 KiB (0.0%)
    dev: /dev/nvme0n1p2
Sensors:
  System Temperatures: cpu: 48.0 C mobo: 36.0 C
  Fan Speeds (rpm): fan-1: 0
Info:
  Memory: total: 16 GiB available: 15.3 GiB used: 2.42 GiB (15.8%)
  Processes: 328 Uptime: 55m Shell: fish inxi: 3.3.39

Note: There don't seem to be any changes between kernel -arch1 and -arch3 that would affect this device (link), and nvidia-open -arch3 to -arch5 is just a change in the Linux version. So there are two possibilities: either the BIOS update fixed something but required a reset to work properly (the Lenovo release notes don't seem to mention anything PM-related, but you can't really trust them to be complete), or resetting the BIOS would have fixed suspend with S0ix enabled in the nvidia driver even before the update, meaning something was broken out of the box or by installing/using/updating Windows.
Also, it works perfectly fine when booting into Windows, sleeping there, resuming and then rebooting back into Linux.

Additionally, I can confirm the device enters S0i3 properly (so it's not a farce). I suspended it for a few seconds, resumed and printed /sys/kernel/debug/amd_pmc/smu_fw_info:

=== SMU Statistics ===
Table Version: 3
Hint Count: 1
Last S0i3 Status: Success
Time (in us) to S0i3: 409183
Time (in us) in S0i3: 17675353
Time (in us) to resume from S0i3: 369574

=== Active time (in us) ===
DISPLAY  : 0
VDD      : 0
ACP      : 0
VCN      : 0
DF       : 0
USB3_0   : 0
USB3_1   : 0

17675353us is about 17.7s, which sounds right.
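
The conversion can be scripted so the check is easy to repeat after each suspend. A minimal sketch that reads the smu_fw_info text on stdin; the function name s0i3_seconds is made up for illustration:

```shell
#!/bin/sh
# Print the time spent in S0i3 during the last suspend, in seconds.
# Reads the contents of smu_fw_info on stdin.
s0i3_seconds() {
    awk 'index($0, "Time (in us) in S0i3:") == 1 {
        sub(/^[^:]*: */, "")    # drop the label, keep the number
        printf "%.1f\n", $0 / 1000000
    }'
}

# Possible usage (debugfs needs root):
#   sudo cat /sys/kernel/debug/amd_pmc/smu_fw_info | s0i3_seconds
```

A value close to the wall-clock time spent suspended suggests the platform really stayed in S0i3 rather than just idling.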

Last edited by freenull (2025-09-25 10:48:00)

Offline

#33 2025-09-25 13:42:26

seth
Member
From: Don't DM me only for attention
Registered: 2012-09-03
Posts: 68,405

Re: suspend/resume not working on nvidia gpu

You should probably record your findings for this model at https://wiki.archlinux.org/title/Laptop/Lenovo so they won't get lost in this thread and time ;)

Offline
