#1 2020-09-08 06:58:08

zyt33
Member
Registered: 2020-09-08
Posts: 3

NVMe drive not detectable by system after resume from suspend

Hi,

I have run into a weird hardware error after upgrading my desktop to a new motherboard: the system cannot resume from suspend. The system logs give no indication of what the error is. The last few lines of the log are:

Sep 08 02:14:27 yyf systemd[1]: Reached target Sleep.
Sep 08 02:14:27 yyf systemd[1]: Starting Suspend...
Sep 08 02:14:27 yyf systemd-sleep[1938]: Suspending system...

However, I am able to reproduce the issue using the Arch ISO. I think the reason there are no logs on my own system is that the NVMe drive can't be detected after resume, so presumably nothing can be written to it.

Sep 08 01:59:18 archiso kernel: nouveau 0000:09:00.0: DRM: base-0: timeout
Sep 08 01:59:20 archiso kernel: nvme nvme0: Device not ready; aborting reset, CSTS=0x1
Sep 08 01:59:20 archiso kernel: nvme nvme0: Removing after probe failure status: -19
Sep 08 01:59:20 archiso kernel: nouveau 0000:09:00.0: DRM: base-1: timeout
Sep 08 01:59:31 archiso kernel: nvme nvme0: Device not ready; aborting reset, CSTS=0x1
Sep 08 01:59:31 archiso kernel: nvme nvme0: failed to set APST feature (-19)
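
For anyone reading the excerpt above: the CSTS value is the NVMe Controller Status register, and status -19 is -ENODEV on Linux. Below is a minimal sketch that decodes the register value from a log line. The function names are my own, and the bit positions are taken from the NVMe base specification (bit 0 RDY, bit 1 CFS, bits 3:2 SHST); treat it as an illustration, not as what the kernel driver does internally.

```python
import re

def decode_csts(value: int) -> dict:
    """Split an NVMe CSTS register value into its low status fields."""
    return {
        "RDY": bool(value & 0x1),    # bit 0: controller ready
        "CFS": bool(value & 0x2),    # bit 1: controller fatal status
        "SHST": (value >> 2) & 0x3,  # bits 3:2: shutdown status
    }

def csts_from_log(line: str):
    """Return the decoded CSTS from a kernel log line, or None if absent."""
    match = re.search(r"CSTS=0x([0-9a-fA-F]+)", line)
    return decode_csts(int(match.group(1), 16)) if match else None

line = ("Sep 08 01:59:20 archiso kernel: nvme nvme0: "
        "Device not ready; aborting reset, CSTS=0x1")
print(csts_from_log(line))
```

For CSTS=0x1 this reports RDY set with CFS clear, so the controller is not flagging a fatal error; it just never reaches the ready state the driver is waiting for during the reset.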

On the Arch ISO I also ran lsblk before and after resume. lsblk lists /dev/nvme0n1 before suspend, but the NVMe device is missing from its output after resume.
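
The before/after comparison can be automated. This is a rough sketch that assumes the text was captured with `lsblk -dn -o NAME,TYPE` (one device per line, no header). The two sample captures are illustrative and not my actual output; only the device names sda, sdb and nvme0n1 come from this machine.

```python
def disk_names(lsblk_output: str) -> set:
    """Collect device names whose TYPE column is 'disk'."""
    names = set()
    for row in lsblk_output.splitlines():
        fields = row.split()
        if len(fields) == 2 and fields[1] == "disk":
            names.add(fields[0])
    return names

# Illustrative captures, taken before suspend and after resume.
before = "sda disk\nsdb disk\nnvme0n1 disk\n"
after = "sda disk\nsdb disk\n"

vanished = disk_names(before) - disk_names(after)
print(sorted(vanished))
```

Anything printed here is a disk that was present before suspend and is gone after resume.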

Searching for related issues, I tried adding the following parameters to the GRUB boot options (GRUB_CMDLINE_LINUX_DEFAULT), and none of them worked:

  • pcie_aspm=off

  • nvme_core.default_ps_max_latency_us=200 (and a few other similar values)

  • acpiphp.disable=1
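
To rule out the parameters simply not being applied (for example if grub.cfg was not regenerated after editing /etc/default/grub), the running kernel's command line can be checked. A small sketch: the helper names are made up, the sample string is a hypothetical command line, and on a real system the text would come from /proc/cmdline. It does not handle quoted values containing spaces.

```python
def parse_cmdline(cmdline: str) -> dict:
    """Map each kernel parameter to its value (None for bare flags)."""
    params = {}
    for token in cmdline.split():
        key, sep, value = token.partition("=")
        params[key] = value if sep else None
    return params

def missing_params(cmdline: str, wanted: dict) -> list:
    """List wanted parameters that are absent or set to another value."""
    active = parse_cmdline(cmdline)
    return sorted(k for k, v in wanted.items() if active.get(k) != v)

wanted = {
    "pcie_aspm": "off",
    "nvme_core.default_ps_max_latency_us": "200",
    "acpiphp.disable": "1",
}

# Hypothetical command line; read /proc/cmdline on the real machine.
sample = "BOOT_IMAGE=/vmlinuz-linux-lts rw quiet pcie_aspm=off"
print(missing_params(sample, wanted))
```

An empty list would mean all three parameters reached the kernel as intended.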

My hardware information:

Motherboard: X570 I AORUS PRO WIFI
Kernel: 5.4.61-1-lts
CPU: AMD Ryzen 5 2600X Six-Core Processor
GPU: NVIDIA Corporation GP104 (GeForce GTX 1080)
NVMe Drive: WD BLACK SN750 1TB

Here is the full log after "Starting Suspend" for the arch iso, in case more information is needed.

Sep 08 01:58:50 archiso systemd[1]: Starting Suspend...
Sep 08 01:58:50 archiso systemd-sleep[816]: Suspending system...
Sep 08 01:58:50 archiso kernel: PM: suspend entry (deep)
Sep 08 01:58:50 archiso kernel: Filesystems sync: 0.000 seconds
Sep 08 01:59:12 archiso kernel: Freezing user space processes ... (elapsed 0.001 seconds) done.
Sep 08 01:59:12 archiso kernel: OOM killer disabled.
Sep 08 01:59:12 archiso kernel: Freezing remaining freezable tasks ... (elapsed 0.000 seconds) done.
Sep 08 01:59:12 archiso kernel: printk: Suspending console(s) (use no_console_suspend to debug)
Sep 08 01:59:12 archiso kernel: sd 5:0:0:0: [sdb] Synchronizing SCSI cache
Sep 08 01:59:12 archiso kernel: sd 1:0:0:0: [sda] Synchronizing SCSI cache
Sep 08 01:59:12 archiso kernel: sd 1:0:0:0: [sda] Stopping disk
Sep 08 01:59:12 archiso kernel: sd 5:0:0:0: [sdb] Stopping disk
Sep 08 01:59:12 archiso kernel: ACPI: Preparing to enter system sleep state S3
Sep 08 01:59:12 archiso kernel: PM: Saving platform NVS memory
Sep 08 01:59:12 archiso kernel: Disabling non-boot CPUs ...
Sep 08 01:59:12 archiso kernel: IRQ 124: no longer affine to CPU1
Sep 08 01:59:12 archiso kernel: smpboot: CPU 1 is now offline
Sep 08 01:59:12 archiso kernel: IRQ 125: no longer affine to CPU2
Sep 08 01:59:12 archiso kernel: smpboot: CPU 2 is now offline
Sep 08 01:59:12 archiso kernel: IRQ 126: no longer affine to CPU3
Sep 08 01:59:12 archiso kernel: smpboot: CPU 3 is now offline
Sep 08 01:59:12 archiso kernel: IRQ 127: no longer affine to CPU4
Sep 08 01:59:12 archiso kernel: smpboot: CPU 4 is now offline
Sep 08 01:59:12 archiso kernel: IRQ 128: no longer affine to CPU5
Sep 08 01:59:12 archiso kernel: smpboot: CPU 5 is now offline
Sep 08 01:59:12 archiso kernel: IRQ 129: no longer affine to CPU6
Sep 08 01:59:12 archiso kernel: smpboot: CPU 6 is now offline
Sep 08 01:59:12 archiso kernel: IRQ 130: no longer affine to CPU7
Sep 08 01:59:12 archiso kernel: smpboot: CPU 7 is now offline
Sep 08 01:59:12 archiso kernel: IRQ 131: no longer affine to CPU8
Sep 08 01:59:12 archiso kernel: smpboot: CPU 8 is now offline
Sep 08 01:59:12 archiso kernel: IRQ 132: no longer affine to CPU9
Sep 08 01:59:12 archiso kernel: smpboot: CPU 9 is now offline
Sep 08 01:59:12 archiso kernel: IRQ 133: no longer affine to CPU10
Sep 08 01:59:12 archiso kernel: smpboot: CPU 10 is now offline
Sep 08 01:59:12 archiso kernel: smpboot: CPU 11 is now offline
Sep 08 01:59:12 archiso kernel: ACPI: Low-level resume complete
Sep 08 01:59:12 archiso kernel: PM: Restoring platform NVS memory
Sep 08 01:59:12 archiso kernel: Enabling non-boot CPUs ...
Sep 08 01:59:12 archiso kernel: x86: Booting SMP configuration:
Sep 08 01:59:12 archiso kernel: smpboot: Booting Node 0 Processor 1 APIC 0x2
Sep 08 01:59:12 archiso kernel: microcode: CPU1: patch_level=0x0800820d
Sep 08 01:59:12 archiso kernel: ACPI: \_PR_.C002: Found 2 idle states
Sep 08 01:59:12 archiso kernel: CPU1 is up
Sep 08 01:59:12 archiso kernel: smpboot: Booting Node 0 Processor 2 APIC 0x4
Sep 08 01:59:12 archiso kernel: microcode: CPU2: patch_level=0x0800820d
Sep 08 01:59:12 archiso kernel: ACPI: \_PR_.C004: Found 2 idle states
Sep 08 01:59:12 archiso kernel: CPU2 is up
Sep 08 01:59:12 archiso kernel: smpboot: Booting Node 0 Processor 3 APIC 0x8
Sep 08 01:59:12 archiso kernel: microcode: CPU3: patch_level=0x0800820d
Sep 08 01:59:12 archiso kernel: ACPI: \_PR_.C006: Found 2 idle states
Sep 08 01:59:12 archiso kernel: CPU3 is up
Sep 08 01:59:12 archiso kernel: smpboot: Booting Node 0 Processor 4 APIC 0xa
Sep 08 01:59:12 archiso kernel: microcode: CPU4: patch_level=0x0800820d
Sep 08 01:59:12 archiso kernel: ACPI: \_PR_.C008: Found 2 idle states
Sep 08 01:59:12 archiso kernel: CPU4 is up
Sep 08 01:59:12 archiso kernel: smpboot: Booting Node 0 Processor 5 APIC 0xc
Sep 08 01:59:12 archiso kernel: microcode: CPU5: patch_level=0x0800820d
Sep 08 01:59:12 archiso kernel: ACPI: \_PR_.C00A: Found 2 idle states
Sep 08 01:59:12 archiso kernel: CPU5 is up
Sep 08 01:59:12 archiso kernel: smpboot: Booting Node 0 Processor 6 APIC 0x1
Sep 08 01:59:12 archiso kernel: microcode: CPU6: patch_level=0x0800820d
Sep 08 01:59:12 archiso kernel: ACPI: \_PR_.C001: Found 2 idle states
Sep 08 01:59:12 archiso kernel: CPU6 is up
Sep 08 01:59:12 archiso kernel: smpboot: Booting Node 0 Processor 7 APIC 0x3
Sep 08 01:59:12 archiso kernel: microcode: CPU7: patch_level=0x0800820d
Sep 08 01:59:12 archiso kernel: ACPI: \_PR_.C003: Found 2 idle states
Sep 08 01:59:12 archiso kernel: CPU7 is up
Sep 08 01:59:12 archiso kernel: smpboot: Booting Node 0 Processor 8 APIC 0x5
Sep 08 01:59:12 archiso kernel: microcode: CPU8: patch_level=0x0800820d
Sep 08 01:59:12 archiso kernel: ACPI: \_PR_.C005: Found 2 idle states
Sep 08 01:59:12 archiso kernel: CPU8 is up
Sep 08 01:59:12 archiso kernel: smpboot: Booting Node 0 Processor 9 APIC 0x9
Sep 08 01:59:12 archiso kernel: microcode: CPU9: patch_level=0x0800820d
Sep 08 01:59:12 archiso kernel: ACPI: \_PR_.C007: Found 2 idle states
Sep 08 01:59:12 archiso kernel: CPU9 is up
Sep 08 01:59:12 archiso kernel: smpboot: Booting Node 0 Processor 10 APIC 0xb
Sep 08 01:59:12 archiso kernel: microcode: CPU10: patch_level=0x0800820d
Sep 08 01:59:12 archiso kernel: ACPI: \_PR_.C009: Found 2 idle states
Sep 08 01:59:12 archiso kernel: CPU10 is up
Sep 08 01:59:12 archiso kernel: smpboot: Booting Node 0 Processor 11 APIC 0xd
Sep 08 01:59:12 archiso kernel: microcode: CPU11: patch_level=0x0800820d
Sep 08 01:59:12 archiso kernel: ACPI: \_PR_.C00B: Found 2 idle states
Sep 08 01:59:12 archiso kernel: CPU11 is up
Sep 08 01:59:12 archiso kernel: ACPI: Waking up from system sleep state S3
Sep 08 01:59:12 archiso kernel: usb usb1: root hub lost power or was reset
Sep 08 01:59:12 archiso kernel: usb usb3: root hub lost power or was reset
Sep 08 01:59:12 archiso kernel: usb usb2: root hub lost power or was reset
Sep 08 01:59:12 archiso kernel: usb usb4: root hub lost power or was reset
Sep 08 01:59:12 archiso kernel: sd 1:0:0:0: [sda] Starting disk
Sep 08 01:59:12 archiso kernel: sd 5:0:0:0: [sdb] Starting disk
Sep 08 01:59:12 archiso kernel: logitech-hidpp-device 0003:046D:4082.0009: hidpp20_batterylevel_get_battery_info: received protocol error 0x09
Sep 08 01:59:12 archiso kernel: usb 5-1: reset high-speed USB device number 2 using xhci_hcd
Sep 08 01:59:12 archiso kernel: ata3: SATA link down (SStatus 0 SControl 300)
Sep 08 01:59:12 archiso kernel: ata7: SATA link down (SStatus 0 SControl 300)
Sep 08 01:59:12 archiso kernel: ata1: SATA link down (SStatus 0 SControl 300)
Sep 08 01:59:12 archiso kernel: ata8: SATA link down (SStatus 0 SControl 300)
Sep 08 01:59:12 archiso kernel: usb 1-2: reset full-speed USB device number 3 using xhci_hcd
Sep 08 01:59:12 archiso kernel: usb 3-2: reset high-speed USB device number 2 using xhci_hcd
Sep 08 01:59:12 archiso kernel: ata6: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
Sep 08 01:59:12 archiso kernel: ata2: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
Sep 08 01:59:12 archiso kernel: ata6.00: supports DRM functions and may not be fully accessible
Sep 08 01:59:12 archiso kernel: ata6.00: supports DRM functions and may not be fully accessible
Sep 08 01:59:12 archiso kernel: ata6.00: configured for UDMA/133
Sep 08 01:59:12 archiso kernel: ata6.00: Enabling discard_zeroes_data
Sep 08 01:59:12 archiso kernel: ata2.00: NCQ Send/Recv Log not supported
Sep 08 01:59:12 archiso kernel: ata2.00: NCQ Send/Recv Log not supported
Sep 08 01:59:12 archiso kernel: ata2.00: configured for UDMA/133
Sep 08 01:59:12 archiso kernel: usb 1-1: reset full-speed USB device number 2 using xhci_hcd
Sep 08 01:59:12 archiso kernel: usb 3-6: reset full-speed USB device number 3 using xhci_hcd
Sep 08 01:59:12 archiso kernel: nouveau 0000:09:00.0: DRM: core notifier timeout
Sep 08 01:59:12 archiso kernel: nouveau 0000:09:00.0: disp: outp 01:0006:0f84: training failed
Sep 08 01:59:12 archiso kernel: OOM killer enabled.
Sep 08 01:59:12 archiso kernel: Restarting tasks ... done.
Sep 08 01:59:12 archiso kernel: PM: suspend exit
Sep 08 01:59:12 archiso kernel: Bluetooth: hci0: Firmware revision 0.0 build 188 week 26 2020
Sep 08 01:59:12 archiso kernel: audit: type=1130 audit(1599530352.553:41): pid=1 uid=0 auid=4294967295 ses=4294967295 msg='unit=systemd-suspend comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
Sep 08 01:59:12 archiso kernel: audit: type=1131 audit(1599530352.553:42): pid=1 uid=0 auid=4294967295 ses=4294967295 msg='unit=systemd-suspend comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
Sep 08 01:59:12 archiso audit[1]: SERVICE_START pid=1 uid=0 auid=4294967295 ses=4294967295 msg='unit=systemd-suspend comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
Sep 08 01:59:12 archiso audit[1]: SERVICE_STOP pid=1 uid=0 auid=4294967295 ses=4294967295 msg='unit=systemd-suspend comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
Sep 08 01:59:12 archiso systemd-sleep[816]: System resumed.
Sep 08 01:59:12 archiso systemd[1]: Starting Load/Save RF Kill Switch Status...
Sep 08 01:59:12 archiso systemd[1]: systemd-suspend.service: Succeeded.
Sep 08 01:59:12 archiso systemd[1]: Finished Suspend.
Sep 08 01:59:12 archiso systemd[1]: Stopped target Sleep.
Sep 08 01:59:12 archiso systemd[1]: Reached target Suspend.
Sep 08 01:59:12 archiso systemd-logind[480]: Operation 'sleep' finished.
Sep 08 01:59:12 archiso systemd[1]: Stopped target Suspend.
Sep 08 01:59:12 archiso systemd-networkd[383]: lo: Reset carrier
Sep 08 01:59:12 archiso systemd[1]: Stopped target Bluetooth.
Sep 08 01:59:12 archiso systemd[1]: Started Load/Save RF Kill Switch Status.
Sep 08 01:59:12 archiso audit[1]: SERVICE_START pid=1 uid=0 auid=4294967295 ses=4294967295 msg='unit=systemd-rfkill comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
Sep 08 01:59:12 archiso systemd[1]: Reached target Bluetooth.
Sep 08 01:59:12 archiso systemd[649]: Reached target Bluetooth.
Sep 08 01:59:12 archiso kernel: audit: type=1130 audit(1599530352.563:43): pid=1 uid=0 auid=4294967295 ses=4294967295 msg='unit=systemd-rfkill comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
Sep 08 01:59:14 archiso kernel: nouveau 0000:09:00.0: DRM: core notifier timeout
Sep 08 01:59:16 archiso kernel: nouveau 0000:09:00.0: DRM: core notifier timeout
Sep 08 01:59:17 archiso systemd[1]: systemd-rfkill.service: Succeeded.
Sep 08 01:59:17 archiso audit[1]: SERVICE_STOP pid=1 uid=0 auid=4294967295 ses=4294967295 msg='unit=systemd-rfkill comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
Sep 08 01:59:17 archiso kernel: audit: type=1131 audit(1599530357.566:44): pid=1 uid=0 auid=4294967295 ses=4294967295 msg='unit=systemd-rfkill comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
Sep 08 01:59:18 archiso kernel: nouveau 0000:09:00.0: DRM: base-0: timeout
Sep 08 01:59:20 archiso kernel: nvme nvme0: Device not ready; aborting reset, CSTS=0x1
Sep 08 01:59:20 archiso kernel: nvme nvme0: Removing after probe failure status: -19
Sep 08 01:59:20 archiso kernel: nouveau 0000:09:00.0: DRM: base-1: timeout
Sep 08 01:59:31 archiso kernel: nvme nvme0: Device not ready; aborting reset, CSTS=0x1
Sep 08 01:59:31 archiso kernel: nvme nvme0: failed to set APST feature (-19)
Sep 08 01:59:57 archiso kernel: EXT4-fs (sda1): mounted filesystem with ordered data mode. Opts: (null)
Sep 08 02:00:19 archiso systemd-networkd-wait-online[443]: Event loop failed: Connection timed out
Sep 08 02:00:19 archiso systemd[1]: systemd-networkd-wait-online.service: Main process exited, code=exited, status=1/FAILURE
Sep 08 02:00:19 archiso systemd[1]: systemd-networkd-wait-online.service: Failed with result 'exit-code'.
Sep 08 02:00:19 archiso systemd[1]: Failed to start Wait for Network to be Configured.
Sep 08 02:00:19 archiso audit[1]: SERVICE_START pid=1 uid=0 auid=4294967295 ses=4294967295 msg='unit=systemd-networkd-wait-online comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=failed'
Sep 08 02:00:19 archiso systemd[1]: Reached target Network is Online.
Sep 08 02:00:19 archiso kernel: audit: type=1130 audit(1599530419.449:45): pid=1 uid=0 auid=4294967295 ses=4294967295 msg='unit=systemd-networkd-wait-online comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=failed'
Sep 08 02:00:19 archiso systemd[1]: Starting pacman mirrorlist update...
Sep 08 02:00:20 archiso reflector[980]: error: failed to retrieve mirrorstatus data: URLError: <urlopen error [Errno -3] Temporary failure in name resolution>
Sep 08 02:00:20 archiso systemd[1]: reflector.service: Main process exited, code=exited, status=1/FAILURE
Sep 08 02:00:20 archiso systemd[1]: reflector.service: Failed with result 'exit-code'.
Sep 08 02:00:20 archiso systemd[1]: Failed to start pacman mirrorlist update.
Sep 08 02:00:20 archiso audit[1]: SERVICE_START pid=1 uid=0 auid=4294967295 ses=4294967295 msg='unit=reflector comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=failed'
Sep 08 02:00:20 archiso systemd[1]: Reached target Multi-User System.
Sep 08 02:00:20 archiso systemd[1]: Reached target Graphical Interface.
Sep 08 02:00:20 archiso kernel: audit: type=1130 audit(1599530420.599:46): pid=1 uid=0 auid=4294967295 ses=4294967295 msg='unit=reflector comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=failed'
Sep 08 02:00:20 archiso systemd[1]: Startup finished in 25.563s (firmware) + 5.476s (loader) + 14.219s (kernel) + 1min 46.397s (userspace) = 2min 31.657s.

#2 2020-09-08 08:13:34

angelsl
Member
Registered: 2017-05-28
Posts: 4

Re: NVMe drive not detectable by system after resume from suspend

I have the exact same problem with the exact same drive (WD SN750 1TB).

Kernel: Linux 5.7.19.a-1-hardened
CPU: AMD Ryzen 7 2700X
Motherboard: Gigabyte X470 AORUS GAMING 5 WIFI

The drive seems to lock up until a full power cycle: after a suspend, it no longer responds, even after a reboot.

I turned off XMP but that didn't seem to have any effect, so it doesn't look like a RAM or RAM clock speed issue.

I am going to try:

1. this drive on my laptop
2. my laptop's (different) NVMe drive on my desktop

which should confirm if this is an issue specific to this drive, or this motherboard, or this specific motherboard-drive combination (?!).

Edit: There are a number of other posts elsewhere that report similar issues with AM4 motherboards and this particular drive:

- https://www.reddit.com/r/buildapc/comme … _biosuefi/
- https://forums.tomshardware.com/threads … p.3488581/
- https://linustechtips.com/main/topic/12 … vme-drive/
- https://www.reddit.com/r/ASRock/comment … n750_nvme/

Last edited by angelsl (2020-09-08 17:22:00)

#3 2020-09-08 16:43:41

zyt33
Member
Registered: 2020-09-08
Posts: 3

Re: NVMe drive not detectable by system after resume from suspend

My problem seems to be the same, with one difference: my drive works fine again after a reboot.

I'll try swapping in another drive to see whether the problem persists.

#4 2020-09-08 21:32:29

angelsl
Member
Registered: 2017-05-28
Posts: 4

Re: NVMe drive not detectable by system after resume from suspend

Okay, I tried:

1. The SN750 on a ThinkPad T480s: Seems to work fine. Used the August Arch install ISO, on kernel 5.7.11.
2. My laptop's P34A80 in my desktop: Seems to work fine.

So I guess it is really something to do with this drive and AM4 motherboards.

Refunding this drive on Amazon, and going to try a SK hynix Gold P31.

#5 2020-09-09 03:10:45

zyt33
Member
Registered: 2020-09-08
Posts: 3

Re: NVMe drive not detectable by system after resume from suspend

I tried to move the drive from the front nvme slot to the back nvme slot, and sleeping worked. I think this is a fine workaround, but I'll have to test this setup to see if it's stable.

#6 2020-09-10 15:22:27

angelsl
Member
Registered: 2017-05-28
Posts: 4

Re: NVMe drive not detectable by system after resume from suspend

zyt33 wrote:

I tried to move the drive from the front nvme slot to the back nvme slot, and sleeping worked. I think this is a fine workaround, but I'll have to test this setup to see if it's stable.

You should check if that NVMe slot can support the full bandwidth of the drive. My motherboard has a 2nd NVMe slot, but it's only PCIe 2.0 x4, which has a 2.0 GB/s maximum throughput, so it will be limiting the drive.

But I guess it's fast enough that it doesn't really matter anyway.
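
For reference, the 2.0 GB/s figure follows from the per-lane signalling rate and the line encoding. A quick sketch of the arithmetic: the function name is my own, and the rates and encodings are the usual published ones for PCIe 2.0 (5 GT/s, 8b/10b) and PCIe 3.0 (8 GT/s, 128b/130b). Protocol overhead beyond the line encoding is ignored, so real-world numbers will be lower.

```python
def pcie_throughput_gb_s(gt_per_s: float, payload_bits: int,
                         encoded_bits: int, lanes: int) -> float:
    """Theoretical one-direction throughput in GB/s for a PCIe link."""
    # Transfer rate times encoding efficiency gives usable Gbit/s per lane;
    # dividing by 8 converts to GB/s, then scale by the lane count.
    return gt_per_s * payload_bits / encoded_bits / 8 * lanes

gen2_x4 = pcie_throughput_gb_s(5.0, 8, 10, 4)     # PCIe 2.0 x4
gen3_x4 = pcie_throughput_gb_s(8.0, 128, 130, 4)  # PCIe 3.0 x4

print(f"PCIe 2.0 x4: {gen2_x4:.2f} GB/s")
print(f"PCIe 3.0 x4: {gen3_x4:.2f} GB/s")
```

So a PCIe 2.0 x4 slot tops out at roughly half of what a PCIe 3.0 x4 slot can carry.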
