Resume issue - nvme related ?

a-curious-crow · 2025-06-03 03:53:50

OK, I think I'm done tinkering for tonight. None of this is working:

GRUB_CMDLINE_LINUX_DEFAULT="loglevel=3 quiet splash nvme_core.default_ps_max_latency_us=0 pcie_aspm=off pcie_port_pm=off iommu=soft amd_iommu=off nvme.noacpi=1 acpi_rev_override=1 acpi_osi=Linux mem_sleep_default=deep acpiphp.disabled=1"

I also tried:

> lspci
01:00.0 Non-Volatile memory controller: Intel Corporation PCIe Data Center SSD (rev 01)
> echo 0 > /sys/bus/pci/devices/0000:01:00.0/d3cold_allowed

as I found in some other forums.
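
A note for anyone repeating the d3cold_allowed step: the redirect in `echo 0 > ...` is carried out by the calling shell, so it only succeeds from a root shell, and the value is runtime state that is gone after a reboot. A minimal sketch of a write-and-read-back helper follows; `set_d3cold_off` and its optional second argument (an alternative sysfs root, there only so the function can be tried against a scratch directory) are hypothetical names, not part of any existing tool.

```shell
#!/bin/sh
# Hypothetical helper: disable D3cold for one PCI device and print the value read back.
# Usage: set_d3cold_off 0000:01:00.0 [sysfs_root]
set_d3cold_off() {
    dev="$1"
    root="${2:-/sys}"          # assumption: second argument only exists for testing
    attr="$root/bus/pci/devices/$dev/d3cold_allowed"

    if [ ! -e "$attr" ]; then
        echo "no such attribute: $attr" >&2
        return 1
    fi

    if [ -w "$attr" ]; then
        # Already privileged (or a scratch file): write directly.
        printf '0\n' > "$attr" || return 1
    else
        # tee does the write itself, so sudo applies to it;
        # with "sudo echo 0 > file" the redirect would still run unprivileged.
        printf '0\n' | sudo tee "$attr" > /dev/null || return 1
    fi

    # Read the attribute back to confirm the write took effect.
    cat "$attr"
}
```

Calling `set_d3cold_off 0000:01:00.0` should print `0` if the write was accepted. Whether that changes the resume behaviour on this board is a separate question.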

My drive is `ID-1: /dev/nvme0n1 vendor: Intel model: SSDPEDMW012T4 size: 1.09 TiB`

It seems like the logical next step is to try transferring my install to a new drive and see what happens. I would really prefer not to do this if possible, as I want to use this drive to store stuff anyway and AFAICT it will fail to work on resume even if it is not the root drive. It's so frustrating that everything was working with my old cpu.

seth · 2025-06-03 06:53:18

Do the GRUB_CMDLINE_LINUX_DEFAULT edits show up in /proc/cmdline after a reboot? (Editing /etc/default/grub alone doesn't do anything.)
Install busybox and start it in a terminal emulator (make it xterm, nothing GL or overly fancy) and also in a console on a different TTY *before* sending the system to sleep.
Likewise "dmesg -w".
The processes in RAM should™ not be affected if you're losing the root partition; dmesg might record some errors, and busybox provides many standard tools (in the running process) that will allow you to inspect the situation a bit: https://busybox.net/downloads/BusyBox.html
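
The /proc/cmdline check above can be sketched as a small function. This is only an illustration: `check_cmdline` is a hypothetical helper, and taking the file path as its first argument is an assumption made so it can be tried on a copy of the file. On Arch the GRUB config is normally regenerated with `grub-mkconfig -o /boot/grub/grub.cfg` before rebooting.

```shell
#!/bin/sh
# Hypothetical helper: report which expected kernel parameters are present
# in a kernel command line file. Returns non-zero if any are missing.
# Usage: check_cmdline /proc/cmdline pcie_aspm=off iommu=soft
check_cmdline() {
    cmdline_file="$1"
    shift
    # Pad with spaces so that only whole parameters match
    # (e.g. "iommu=soft" must not match inside "amd_iommu=soft").
    cmdline=" $(cat "$cmdline_file") "
    missing=0
    for param in "$@"; do
        case "$cmdline" in
            *" $param "*) echo "present: $param" ;;
            *)            echo "MISSING: $param"; missing=1 ;;
        esac
    done
    return "$missing"
}
```

For example, `check_cmdline /proc/cmdline nvme_core.default_ps_max_latency_us=0 pcie_aspm=off` prints one line per parameter and fails if the running kernel did not receive one of them.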

a-curious-crow · 2025-06-04 04:33:54

They do show up in /proc/cmdline; I'm running a grub command to regenerate the config after my changes.

> cat /proc/cmdline
BOOT_IMAGE=/boot/vmlinuz-linux root=UUID=3e3bed23-02b9-4d03-83e2-98fd9afebf37 rw loglevel=3 quiet splash nvme_core.default_ps_max_latency_us=0 pcie_aspm=off pcie_port_pm=off iommu=soft amd_iommu=off nvme.noacpi=1 acpi_rev_override=1 acpi_osi=Linux mem_sleep_default=deep acpiphp.disabled=1

xterm stays alive through the sleep/wake! And I'm finally getting some logs!!

[  206.179491] r8169 0000:05:00.0 enp5s0: Link is Down
[  206.209622] PM: suspend entry (deep)
[  206.211562] Filesystems sync: 0.001 seconds
[  206.212168] Freezing user space processes
[  206.213338] Freezing user space processes completed (elapsed 0.001 seconds)
[  206.213341] OOM killer disabled.
[  206.213342] Freezing remaining freezable tasks
[  206.214388] Freezing remaining freezable tasks completed (elapsed 0.001 seconds)
[  206.214408] printk: Suspending console(s) (use no_console_suspend to debug)
[  206.389634] serial 00:05: disabled
[  206.400972] sd 0:0:0:0: [sda] Synchronizing SCSI cache
[  206.400973] sd 5:0:0:0: [sdd] Synchronizing SCSI cache
[  206.400989] sd 1:0:0:0: [sdb] Synchronizing SCSI cache
[  206.400991] sd 10:0:0:0: [sde] Synchronizing SCSI cache
[  206.400991] sd 4:0:0:0: [sdc] Synchronizing SCSI cache
[  206.401316] ata2.00: Entering standby power mode
[  206.401369] ata5.00: Entering standby power mode
[  206.401616] ata1.00: Entering standby power mode
[  206.403417] ata6.00: Entering standby power mode
[  208.837799] amdgpu 0000:08:00.0: amdgpu: MODE1 reset
[  208.837810] amdgpu 0000:08:00.0: amdgpu: GPU mode1 reset
[  208.838329] amdgpu 0000:08:00.0: amdgpu: GPU smu mode1 reset
[  209.354568] amdgpu 0000:08:00.0: Refused to change power state from D0 to D3hot
[  209.354910] ACPI: PM: Preparing to enter system sleep state S3
[  209.860531] ACPI: PM: Saving platform NVS memory
[  209.860981] Disabling non-boot CPUs ...
[  209.863086] smpboot: CPU 11 is now offline
[  209.865693] smpboot: CPU 10 is now offline
[  209.868121] smpboot: CPU 9 is now offline
[  209.870115] smpboot: CPU 8 is now offline
[  209.872019] smpboot: CPU 7 is now offline
[  209.873999] smpboot: CPU 6 is now offline
[  209.874404] Spectre V2 : Update user space SMT mitigation: STIBP off
[  209.876035] smpboot: CPU 5 is now offline
[  209.877867] smpboot: CPU 4 is now offline
[  209.879821] smpboot: CPU 3 is now offline
[  209.881587] smpboot: CPU 2 is now offline
[  209.883399] smpboot: CPU 1 is now offline
[  209.884271] ACPI: PM: Low-level resume complete
[  209.884296] ACPI: PM: Restoring platform NVS memory
[  209.884472] LVT offset 0 assigned for vector 0x400
[  209.885329] Enabling non-boot CPUs ...
[  209.885521] smpboot: Booting Node 0 Processor 1 APIC 0x2
[  209.888464] CPU1 is up
[  209.888798] smpboot: Booting Node 0 Processor 2 APIC 0x4
[  209.891674] CPU2 is up
[  209.891842] smpboot: Booting Node 0 Processor 3 APIC 0x6
[  209.894625] CPU3 is up
[  209.894768] smpboot: Booting Node 0 Processor 4 APIC 0x8
[  209.897565] CPU4 is up
[  209.897721] smpboot: Booting Node 0 Processor 5 APIC 0xa
[  209.900546] CPU5 is up
[  209.900660] smpboot: Booting Node 0 Processor 6 APIC 0x1
[  209.903506] Spectre V2 : Update user space SMT mitigation: STIBP always-on
[  209.903510] CPU6 is up
[  209.903620] smpboot: Booting Node 0 Processor 7 APIC 0x3
[  209.906510] CPU7 is up
[  209.906688] smpboot: Booting Node 0 Processor 8 APIC 0x5
[  209.909561] CPU8 is up
[  209.909673] smpboot: Booting Node 0 Processor 9 APIC 0x7
[  209.912553] CPU9 is up
[  209.912663] smpboot: Booting Node 0 Processor 10 APIC 0x9
[  209.915569] CPU10 is up
[  209.915675] smpboot: Booting Node 0 Processor 11 APIC 0xb
[  209.918601] CPU11 is up
[  209.919845] ACPI: PM: Waking up from system sleep state S3
[  209.920887] nvme 0000:01:00.0: Unable to change power state from D3hot to D0, device inaccessible
[  209.983844] xhci_hcd 0000:02:00.0: xHC error in resume, USBSTS 0x401, Reinit
[  209.983849] usb usb1: root hub lost power or was reset
[  209.983852] usb usb2: root hub lost power or was reset
[  209.984164] [drm] PCIE GART of 512M enabled (table at 0x0000008000F00000).
[  209.984195] amdgpu 0000:08:00.0: amdgpu: PSP is resuming...
[  209.984732] serial 00:05: activated
[  210.039108] nvme 0000:01:00.0: Unable to change power state from D3cold to D0, device inaccessible
[  210.039169] nvme nvme0: Disabling device after reset failure: -19
[  210.062750] amdgpu 0000:08:00.0: amdgpu: reserve 0xa00000 from 0x82fd000000 for PSP TMR
[  210.079069] EXT4-fs (nvme0n1p4): shut down requested (2)
[  210.079072] Aborting journal on device nvme0n1p4-8.
[  210.079078] Buffer I/O error on dev nvme0n1p4, logical block 17334272, lost sync page write
[  210.079081] JBD2: I/O error when updating journal superblock for nvme0n1p4-8.
[  210.165412] amdgpu 0000:08:00.0: amdgpu: RAS: optional ras ta ucode is not available
[  210.179255] amdgpu 0000:08:00.0: amdgpu: SECUREDISPLAY: securedisplay ta ucode is not available
[  210.179259] amdgpu 0000:08:00.0: amdgpu: SMU is resuming...
[  210.179265] amdgpu 0000:08:00.0: amdgpu: smu driver if version = 0x0000000e, smu fw if version = 0x00000012, smu fw program = 0, version = 0x00413e00 (65.62.0)
[  210.179269] amdgpu 0000:08:00.0: amdgpu: SMU driver if version not matched
[  210.179356] amdgpu 0000:08:00.0: amdgpu: use vbios provided pptable
[  210.241293] amdgpu 0000:08:00.0: amdgpu: SMU is resumed successfully!
[  210.243217] [drm] kiq ring mec 2 pipe 1 q 0
[  210.250489] [drm] DMUB hardware initialized: version=0x02020020
[  210.292316] ata10: SATA link down (SStatus 0 SControl 300)
[  210.293316] ata9: SATA link down (SStatus 0 SControl 300)
[  210.442184] usb 1-8: reset low-speed USB device number 3 using xhci_hcd
[  210.447152] ata6: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
[  210.447460] sd 5:0:0:0: [sdd] Starting disk
[  210.447727] ata6.00: configured for UDMA/133
[  210.523234] [drm] DM_MST: Differing MST start on aconnector: 00000000806968ff [id: 127]
[  210.926184] amdgpu 0000:08:00.0: amdgpu: ring gfx_0.0.0 uses VM inv eng 0 on hub 0
[  210.926188] amdgpu 0000:08:00.0: amdgpu: ring gfx_0.1.0 uses VM inv eng 1 on hub 0
[  210.926190] amdgpu 0000:08:00.0: amdgpu: ring comp_1.0.0 uses VM inv eng 4 on hub 0
[  210.926191] amdgpu 0000:08:00.0: amdgpu: ring comp_1.1.0 uses VM inv eng 5 on hub 0
[  210.926193] amdgpu 0000:08:00.0: amdgpu: ring comp_1.2.0 uses VM inv eng 6 on hub 0
[  210.926195] amdgpu 0000:08:00.0: amdgpu: ring comp_1.3.0 uses VM inv eng 7 on hub 0
[  210.926196] amdgpu 0000:08:00.0: amdgpu: ring comp_1.0.1 uses VM inv eng 8 on hub 0
[  210.926198] amdgpu 0000:08:00.0: amdgpu: ring comp_1.1.1 uses VM inv eng 9 on hub 0
[  210.926199] amdgpu 0000:08:00.0: amdgpu: ring comp_1.2.1 uses VM inv eng 10 on hub 0
[  210.926201] amdgpu 0000:08:00.0: amdgpu: ring comp_1.3.1 uses VM inv eng 11 on hub 0
[  210.926202] amdgpu 0000:08:00.0: amdgpu: ring kiq_0.2.1.0 uses VM inv eng 12 on hub 0
[  210.926204] amdgpu 0000:08:00.0: amdgpu: ring sdma0 uses VM inv eng 13 on hub 0
[  210.926206] amdgpu 0000:08:00.0: amdgpu: ring sdma1 uses VM inv eng 14 on hub 0
[  210.926207] amdgpu 0000:08:00.0: amdgpu: ring vcn_dec_0 uses VM inv eng 0 on hub 8
[  210.926209] amdgpu 0000:08:00.0: amdgpu: ring vcn_enc_0.0 uses VM inv eng 1 on hub 8
[  210.926211] amdgpu 0000:08:00.0: amdgpu: ring vcn_enc_0.1 uses VM inv eng 4 on hub 8
[  210.926212] amdgpu 0000:08:00.0: amdgpu: ring jpeg_dec uses VM inv eng 5 on hub 8
[  210.971122] usb 1-9: reset high-speed USB device number 5 using xhci_hcd
[  211.335132] usb 1-4: reset high-speed USB device number 2 using xhci_hcd
[  211.899125] usb 1-4.1: reset high-speed USB device number 4 using xhci_hcd
[  212.125114] r8152-cfgselector 1-4.2: reset high-speed USB device number 6 using xhci_hcd
[  212.517019] r8152 1-4.2:1.0: skip request firmware
[  212.799069] ata2: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
[  212.816435] sd 1:0:0:0: [sdb] Starting disk
[  212.820613] ata2.00: configured for UDMA/133
[  213.397169] usb 1-4.1.3: WARN: invalid context state for evaluate context command.
[  213.492059] usb 1-4.1.3: reset full-speed USB device number 8 using xhci_hcd
[  213.743392] OOM killer enabled.
[  213.743394] Restarting tasks ... 
[  213.744207] EXT4-fs warning (device nvme0n1p4): dx_probe:823: inode #3407883: lblock 0: comm systemd-udevd: error -5 reading directory block
[  213.744325] done.
[  213.744336] random: crng reseeded on system resumption
[  213.744345] PM: suspend exit
[  213.747456] EXT4-fs warning (device nvme0n1p4): dx_probe:823: inode #3407881: lblock 0: comm python: error -5 reading directory block
[  213.747480] EXT4-fs warning (device nvme0n1p4): dx_probe:823: inode #3407883: lblock 0: comm python: error -5 reading directory block
[  213.747486] EXT4-fs warning (device nvme0n1p4): dx_probe:823: inode #3407881: lblock 0: comm python: error -5 reading directory block
[  213.747492] EXT4-fs warning (device nvme0n1p4): dx_probe:823: inode #3407881: lblock 0: comm python: error -5 reading directory block
[  213.747497] EXT4-fs warning (device nvme0n1p4): dx_probe:823: inode #3407881: lblock 0: comm python: error -5 reading directory block
[  213.748254] EXT4-fs warning (device nvme0n1p4): dx_probe:823: inode #3571976: lblock 0: comm python: error -5 reading directory block
[  213.748280] EXT4-fs warning (device nvme0n1p4): dx_probe:823: inode #3571976: lblock 0: comm python: error -5 reading directory block
[  213.749725] coredump: 1039(udisksd): |/usr/lib/systemd/systemd-coredump pipe failed
[  213.749802] EXT4-fs warning (device nvme0n1p4): dx_probe:823: inode #3407881: lblock 0: comm python: error -5 reading directory block
[  213.749818] EXT4-fs warning (device nvme0n1p4): dx_probe:823: inode #3407883: lblock 0: comm python: error -5 reading directory block
[  213.782037] Generic FE-GE Realtek PHY r8169-0-500:00: attached PHY driver (mii_bus:phy_addr=r8169-0-500:00, irq=MAC)
[  213.859031] usb 1-4.3: new high-speed USB device number 9 using xhci_hcd
[  213.920122] r8169 0000:05:00.0 enp5s0: Link is Down
[  213.997156] usb 1-4.3: New USB device found, idVendor=1b20, idProduct=0400, bcdDevice= 1.00
[  213.997160] usb 1-4.3: New USB device strings: Mfr=1, Product=2, SerialNumber=3
[  213.997162] usb 1-4.3: Product: BillBoard
[  213.997164] usb 1-4.3: Manufacturer: MSTAR
[  213.997166] usb 1-4.3: SerialNumber: 12345
[  214.387196] usb 1-4.3: USB disconnect, device number 9
[  214.479010] ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
[  214.484506] sd 0:0:0:0: [sda] Starting disk
[  214.490030] ata1.00: configured for UDMA/133
[  215.292979] ata5: link is slow to respond, please be patient (ready=0)
[  217.084137] r8169 0000:05:00.0 enp5s0: Link is Up - 1Gbps/Full - flow control rx/tx
[  217.086946] coredump: 1357(nm-applet): |/usr/lib/systemd/systemd-coredump pipe failed
[  218.756365] EXT4-fs warning: 57 callbacks suppressed
[  218.756370] EXT4-fs warning (device nvme0n1p4): dx_probe:823: inode #3407881: lblock 0: comm python: error -5 reading directory block

This looks to confirm my nvme drive's failure to recover from suspend, but I haven't yet found anything that to my eyes points in the direction of a solution.

Last edited by a-curious-crow (2025-06-04 04:36:59)
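
Most of a log like the one above is unrelated amdgpu and USB chatter. A rough sketch of a filter that keeps only the suspend/resume markers plus the nvme and ext4 lines; `filter_resume_log` is a hypothetical name and the pattern is a guess at what is relevant here, not an exhaustive list.

```shell
#!/bin/sh
# Hypothetical helper: read a kernel log on stdin and keep only the
# suspend/resume markers and the nvme / ext4 / journal messages.
filter_resume_log() {
    grep -E 'PM: suspend (entry|exit)|ACPI: PM:|nvme|EXT4-fs|JBD2'
}
```

Typical use would be `dmesg | filter_resume_log` in the xterm that survived the resume.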

seth · 2025-06-04 05:47:08

"confirm my nvme drive's failure to recover from suspend"

Yup, I've reported your first post ITT itr - this probably should go into a separate thread since for all we know here, it's a pretty vast off-topic distraction.

"nvme.noacpi=1 acpi_rev_override=1 acpi_osi=Linux acpiphp.disabled=1"

Where and why is that stuff coming from?

"mem_sleep_default=deep"

do you otherwise default to s2idle and does the nvme pull the same stunt on that mode?

Lone_Wolf · 2025-06-04 12:03:21

Moderator Note
Split off from https://bbs.archlinux.org/viewtopic.php?id=290126

Last edited by Lone_Wolf (2025-06-04 12:05:09)

a-curious-crow · 2025-06-04 23:29:55

"Where and why is that stuff coming from?" - random threads on the internet I found. I've been adding them incrementally, so I tested sleeping first with just your suggestions.

"do you otherwise default to s2idle and does the nvme pull the same stunt on that mode?" - I default to deep sleep, but I tried s2idle and just get a black screen when waking from that mode. I added mem_sleep_default=deep redundantly on the off chance that it would do something.
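
For reference, the sleep mode the kernel will actually use is the bracketed entry in /sys/power/mem_sleep (for example `s2idle [deep]`). A minimal sketch for reading it; `active_sleep_mode` is a hypothetical helper, and its optional file argument is an assumption added only so it can be run against a sample file.

```shell
#!/bin/sh
# Hypothetical helper: print the currently selected suspend mode,
# i.e. the entry shown in square brackets in /sys/power/mem_sleep.
# Usage: active_sleep_mode [file]
active_sleep_mode() {
    file="${1:-/sys/power/mem_sleep}"
    # Capture whatever sits between "[" and "]".
    sed -n 's/.*\[\([^]]*\)].*/\1/p' "$file"
}
```

The mode can also be switched for a single test without touching the boot parameters, e.g. `echo s2idle | sudo tee /sys/power/mem_sleep` followed by `systemctl suspend`.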

seth · 2025-06-05 06:38:43

https://bbs.archlinux.org/viewtopic.php … 3#p2138283 - do you have more than one nvme slot?
See whether you still get the exact same nvme failure w/ only "nvme_core.default_ps_max_latency_us=0 pcie_aspm=off pcie_port_pm=off iommu=soft" and, wrt your CPU change, "processor.max_cstate=1" - https://wiki.archlinux.org/title/Ryzen#Troubleshooting

a-curious-crow · 2025-06-06 05:37:45

I have several nvme slots, but I'm actually currently using none of them. My nvme drive is inserted into one of my PCI slots, the other being taken up by my graphics card.

And those settings didn't work.

> cat /proc/cmdline
BOOT_IMAGE=/boot/vmlinuz-linux root=UUID=3e3bed23-02b9-4d03-83e2-98fd9afebf37 rw loglevel=3 quiet splash nvme_core.default_ps_max_latency_us=0 pcie_aspm=off pcie_port_pm=off iommu=soft processor.max_cstate=1

Last edited by a-curious-crow (2025-06-06 05:41:48)

seth · 2025-06-06 06:52:42

Ah, it's the intel 750 w/ a pcie connection from 2015, right? Not an m.2 card.
And it shows up on 01:00 while the nvidia GPU is on 06:00… can you swap them?
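
Before and after a slot swap it can help to note which upstream port each card hangs off. In sysfs the entry under /sys/bus/pci/devices is a symlink into the device tree, so the parent directory of the resolved path names the bridge above the device. A small sketch under that assumption; `pci_parent` is a hypothetical helper and the optional sysfs root argument exists only for trying it on a scratch tree.

```shell
#!/bin/sh
# Hypothetical helper: print the PCI address of the bridge/root port
# directly above a device, by resolving its sysfs symlink.
# Usage: pci_parent 0000:01:00.0 [sysfs_root]
pci_parent() {
    dev="$1"
    root="${2:-/sys}"          # assumption: second argument only exists for testing
    path=$(readlink -f "$root/bus/pci/devices/$dev") || return 1
    # The resolved path ends in .../<parent>/<dev>; print <parent>.
    basename "$(dirname "$path")"
}
```

Running `pci_parent 0000:01:00.0` before and after moving the drive shows whether it really ended up behind a different port.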
