
#1 2021-10-12 00:42:50

cgarz
Member
Registered: 2020-06-12
Posts: 2

nvme shows in lspci but not lsblk. probe failure -19. slows boot speed

I'm having trouble getting an NVMe drive to be detected.

One day it inexplicably stopped showing up in the laptop it was installed in. It also started causing the boot process to hang, both before POST and while booting the OS.

I have since installed it in a PCIe slot in my desktop using an NVMe to PCIe adapter, and the same issues are present there.

Once the system has finally booted, the drive doesn't show in lsblk or under /dev, but it is visible in lspci:

04:00.0 Non-Volatile memory controller: Intel Corporation SSD 660P Series (rev 03) (prog-if 02 [NVM Express])
        Subsystem: Intel Corporation Device 390d
        Flags: fast devsel, IRQ 38, NUMA node 0, IOMMU group 21
        Memory at fc800000 (64-bit, non-prefetchable) [size=16K]
        Capabilities: [40] Power Management version 3
        Capabilities: [50] MSI: Enable- Count=1/8 Maskable+ 64bit+
        Capabilities: [70] Express Endpoint, MSI 00
        Capabilities: [b0] MSI-X: Enable- Count=16 Masked-
        Capabilities: [100] Advanced Error Reporting
        Capabilities: [158] Secondary PCI Express
        Capabilities: [178] Latency Tolerance Reporting
        Capabilities: [180] L1 PM Substates

The systemd journal shows a probe failure during boot, and the same failure again after doing a PCI remove and rescan:

01:07:24: pci 0000:04:00.0: Removing from iommu group 21
01:07:25: pci 0000:04:00.0: [8086:f1a8] type 00 class 0x010802
01:07:25: pci 0000:04:00.0: reg 0x10: [mem 0xfc800000-0xfc803fff 64bit]
01:07:25: pci 0000:04:00.0: Adding to iommu group 21
01:07:25: pci 0000:04:00.0: BAR 0: assigned [mem 0xfc800000-0xfc803fff 64bit]
01:07:25: nvme nvme1: pci function 0000:04:00.0
01:08:26: nvme nvme1: Device not ready; aborting initialisation, CSTS=0x0
01:08:26: nvme nvme1: Removing after probe failure status: -19
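
For anyone reading the last two log lines: the -19 is a negative errno value, and CSTS is the controller status register. Here is a small Python sketch for decoding both. The bit layout (RDY in bit 0, CFS in bit 1, SHST in bits 3:2) is assumed from the NVMe base specification, and decode_csts is just a name made up for illustration:

```python
import errno
import os

def decode_csts(value):
    """Split an NVMe CSTS register value into its main fields.

    Assumed layout (from the NVMe base specification):
    bit 0 = RDY (controller ready), bit 1 = CFS (controller fatal status),
    bits 3:2 = SHST (shutdown status).
    """
    return {
        "RDY": value & 0x1,
        "CFS": (value >> 1) & 0x1,
        "SHST": (value >> 2) & 0x3,
    }

# The kernel logs errno values negated; look up the symbolic name and message.
status = -19
print(errno.errorcode[-status], "-", os.strerror(-status))

# CSTS=0x0 as reported in the log above.
print(decode_csts(0x0))
```

With CSTS=0x0 every field is zero, so as far as I can tell the controller never set its ready bit during the roughly one minute between the last two timestamps, and it did not flag a fatal status either.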

It's the 1TB model and only just over a year old, with a 5-year warranty. Usage wasn't excessive, and although I don't know the total bytes written (TBW), that should also be well within the warranty limit.

I have tried booting with debug on the kernel command line to get more information. I also tried booting with pcie_aspm=off after finding some posts while searching around, but that didn't help.

Does anyone know how to fix it or how to better debug the issue? I would really like to recover the data if possible.

Last edited by cgarz (2021-10-13 17:49:12)


#2 2021-10-12 16:01:32

Ferdinand
Member
From: Norway
Registered: 2020-01-02
Posts: 331

Re: nvme shows in lspci but not lsblk. probe failure -19. slows boot speed

Others can give more insight and better advice, but I think you have a corrupt file system.

lspci will show you information about PCI buses in the system and devices connected to them, while lsblk will show you information about all available or the specified block devices. In other words, lspci will show you the controller on the PCI bus, whereas lsblk will show you the disk that is connected to the controller.

Your controller is good, but your disk is not.

I think you should see your disk if you run sudo fdisk -l

If you do, then read about ddrescue: https://wiki.archlinux.org/title/Disk_c … g_ddrescue

Basically I think you should try to get a disk image of your disk or make a raw copy to another disk, and then try to save data from that, and leave your faulty disk in peace.
You should not write to the faulty disk - you can copy it and work with the copy.
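
To illustrate the "copy it and work with the copy" idea, here is a minimal Python sketch. It is not ddrescue and does far less (no retries, no map file); it only opens the source read-only and zero-fills chunks that fail to read. The function name and the paths in the usage note are hypothetical:

```python
import os

def image_readonly(src, dst, chunk=1024 * 1024):
    """Copy src to dst chunk by chunk without ever writing to src.

    Chunks that raise a read error are replaced by zeros and counted.
    Returns (bytes_written, bad_chunks).
    """
    written = 0
    bad = 0
    fd = os.open(src, os.O_RDONLY)  # the source is never opened for writing
    try:
        size = os.lseek(fd, 0, os.SEEK_END)
        with open(dst, "wb") as out:
            offset = 0
            while offset < size:
                want = min(chunk, size - offset)
                try:
                    data = os.pread(fd, want, offset)
                except OSError:
                    # Unreadable region: keep the image aligned by zero-filling.
                    data = b"\x00" * want
                    bad += 1
                if not data:
                    break  # unexpected end of input
                out.write(data)
                offset += len(data)
                written += len(data)
    finally:
        os.close(fd)
    return written, bad
```

Something like image_readonly("/dev/nvme1n1", "/mnt/backup/660p.img") would be the intended use, assuming the device node exists at all and the target has enough space. For a real rescue, ddrescue from the wiki link is the better tool.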

Good luck.


#3 2021-10-13 17:48:29

cgarz
Member
Registered: 2020-06-12
Posts: 2

Re: nvme shows in lspci but not lsblk. probe failure -19. slows boot speed

Thanks for the info. I've given fdisk a try now, but unfortunately the drive doesn't show up there either.

I should have been clearer: the drive does not appear in lsblk at all, it's not just that the partitions are missing. Or can bad data make lsblk ignore a disk?
It also has no entry under /dev, unlike the other NVMe drive I have. It should show up as nvme1, which the boot log mentions briefly before giving up on it.

lshw marks it as UNCLAIMED and shows no logical name entry:

  *-nvme                    
       description: NVMe device
       product: Samsung SSD 980 PRO 1TB
       vendor: Samsung Electronics Co Ltd
       physical id: 0
       bus info: pci@0000:03:00.0
       logical name: /dev/nvme0
       version: 2B2QGXA7
       serial: redacted
       width: 64 bits
       clock: 33MHz
       capabilities: nvme pm msi pciexpress msix nvm_express bus_master cap_list
       configuration: driver=nvme latency=0 nqn=nqn.1994-11.com.samsung:nvme:980PRO:M.2:redacted state=live
       resources: irq:36 memory:fc900000-fc903fff
  *-nvme UNCLAIMED
       description: Non-Volatile memory controller
       product: SSD 660P Series
       vendor: Intel Corporation
       physical id: 0
       bus info: pci@0000:04:00.0
       version: 03
       width: 64 bits
       clock: 33MHz
       capabilities: nvme pm msi pciexpress msix nvm_express cap_list
       configuration: latency=0
       resources: memory:fc800000-fc803fff

The Samsung is my main drive, included for comparison.

Good advice about not writing to it. I will make sure to mount it read-only or just dump it raw if it ever becomes readable. Hopefully there is a way.
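
As far as I understand it, UNCLAIMED in lshw just means no kernel driver is bound to the PCI function. The same thing can be read straight from sysfs. A rough Python sketch, assuming the usual /sys/bus/pci/devices layout (the function name is my own, and the sysfs_root parameter exists only so it can be pointed at a test directory):

```python
import os

def nvme_pci_functions(sysfs_root="/sys"):
    """Yield (pci_address, driver_name_or_None) for NVMe-class PCI functions."""
    base = os.path.join(sysfs_root, "bus", "pci", "devices")
    if not os.path.isdir(base):
        return
    for addr in sorted(os.listdir(base)):
        dev = os.path.join(base, addr)
        try:
            with open(os.path.join(dev, "class")) as f:
                pci_class = int(f.read().strip(), 16)
        except (OSError, ValueError):
            continue
        # Class 0x0108xx is mass storage / non-volatile memory controller,
        # e.g. 0x010802 for NVM Express as in the kernel log.
        if pci_class >> 8 != 0x0108:
            continue
        # A bound device has a "driver" symlink pointing at its driver directory.
        link = os.path.join(dev, "driver")
        driver = os.path.basename(os.readlink(link)) if os.path.islink(link) else None
        yield addr, driver

if __name__ == "__main__":
    for addr, driver in nvme_pci_functions():
        print(addr, driver or "UNCLAIMED (no driver bound)")
```

On my system I would expect 0000:03:00.0 to report the nvme driver and 0000:04:00.0 to report no driver, matching the lshw output above.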

Last edited by cgarz (2021-10-13 17:51:55)

