NVMe SSD dead after Secure Erase???

yiff · 2026-02-08 18:19:03

Yesterday I did a clean install of Arch on my gaming laptop (Acer Nitro 5 AN515-54). I wanted to use LUKS on my root partition, so I performed a secure erase with the following command:

nvme format /dev/nvme1n1 -s 2

Before executing it I made sure that my SSD supported it, and it did. The erase was successful.
My SSD model is Micron 2200 MTFDHBA512TCK (firmware P1MA003).
Afterwards I went through all the necessary steps listed in the installation guide and the dm-crypt example setup guide. When I was done I then rebooted only to see a "No Bootable Device" screen. UEFI showed no boot options. When rebooting into the Arch liveiso I saw that lsblk didn't show the partitions of that drive, and only showed the drive itself. Across multiple liveiso reboots, fdisk -l sometimes showed the GPT label and the partitions, and sometimes didn't.
Today I tried reinstalling Arch after securely erasing the ssd again, and ended up having the same issue.
Then I securely erased again and installed CachyOS with its installer, to make sure I didn't mess up something in the manual arch installation. I got the same "No Bootable Device" screen.

(Securely erasing, partitioning, formatting and installing Linux on the drive worked fine, but after rebooting it seems to stop working properly).

Right now, this is what I see:

# blkid | grep nvme1n1
/dev/nvme1n1: PTUUID="fe5db2d4-6894-4c8e-9bc8-1e6d571dcf26" PTTYPE="gpt"

# lsblk | grep nvme1n1
nvme1n1     259:2    0 476.9G  0 disk

# fdisk -l /dev/nvme1n1      
Disk /dev/nvme1n1: 476.94 GiB, 512110190592 bytes, 1000215216 sectors
Disk model: Micron_2200_MTFDHBA512TCK               
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disklabel type: gpt
Disk identifier: FE5DB2D4-6894-4C8E-9BC8-1E6D571DCF26

Device           Start       End   Sectors  Size Type
/dev/nvme1n1p1    2048   8587263   8585216  4.1G EFI System
/dev/nvme1n1p2 8587264 428017663 419430400  200G Linux filesystem

SSD health:

# nvme smart-log -H /dev/nvme1n1
Smart Log for NVME device:nvme1n1 namespace-id:ffffffff
critical_warning			: 0
      Available Spare[0]             : 0
      Temp. Threshold[1]             : 0
      NVM subsystem Reliability[2]   : 0
      Read-only[3]                   : 0
      Volatile mem. backup failed[4] : 0
      Persistent Mem. RO[5]          : 0
temperature				: 26 °C (299 K, 78 °F)
available_spare				: 100%
available_spare_threshold		: 5%
percentage_used				: 26%
endurance group critical warning summary: 0
Data Units Read				: 385042 (197.14 GB)
Data Units Written			: 230883 (118.21 GB)
host_read_commands			: 5777336
host_write_commands			: 6707754
controller_busy_time			: 148
power_cycles				: 840
power_on_hours				: 23
unsafe_shutdowns			: 737
media_errors				: 114
num_err_log_entries			: 36
Warning Temperature Time		: 0
Critical Composite Temperature Time	: 0
Temperature Sensor 1			: 26 °C (299 K, 78 °F)
Temperature Sensor 2			: 28 °C (301 K, 82 °F)
Thermal Management T1 Trans Count	: 0
Thermal Management T2 Trans Count	: 0
Thermal Management T1 Total Time	: 0
Thermal Management T2 Total Time	: 0

Note that the media_errors stat was 6 after the first time I installed Arch (yesterday). It seems to have increased after every secure erase.
Is there a possibility that the format command killed my SSD? Can I still save it?
After taking out my SSD, I saw that it had Opal2 written on the label. Maybe I should have used that instead of nvme format?
What if my SSD was already on the verge of dying and the secure erase is what pushed it over the edge? I've been using this SSD for around 6 years now.
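
To follow how media_errors changes between erases and reboots, the counter can be pulled out of the smart-log output and written to a file with a timestamp. This is only a minimal sketch, assuming nvme-cli prints the "media_errors : N" line in the format shown above; the function names and the log path are made up for illustration.

```shell
#!/bin/sh
# Print the media_errors value from `nvme smart-log` text read on stdin.
media_errors() {
    awk -F: '/^media_errors/ { gsub(/[[:space:]]/, "", $2); print $2 }'
}

# Hypothetical helper: append one timestamped reading to a log file.
# Usage: log_media_errors /dev/nvme1n1 /root/media_errors.log
log_media_errors() {
    dev=$1
    log=$2
    count=$(nvme smart-log "$dev" | media_errors)
    printf '%s %s media_errors=%s\n' "$(date -Is)" "$dev" "$count" >> "$log"
}
```

Running log_media_errors after each erase, install and reboot would show which step the counter jumps on. On the live ISO the root filesystem lives in RAM, so the log would have to go on a USB stick to survive a reboot.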

All help is greatly appreciated.

Last edited by yiff (2026-02-08 18:23:54)

seth · 2026-02-08 20:52:20

When I was done I then rebooted only to see a "No Bootable Device" screen. UEFI showed no boot options.

liveiso I saw that lsblk didn't show the partitions of that drive

/dev/nvme1n1p1    2048   8587263   8585216  4.1G EFI System
/dev/nvme1n1p2 8587264 428017663 419430400  200G Linux filesystem

What is the output of lsblk -f right now?
Can you mount /dev/nvme1n1p1 ?
What are its filesystem and contents?

/dev/nvme1n1p1

Is there no "nvme0n1"?

yiff · 2026-02-08 21:18:48

# lsblk -f
NAME        FSTYPE   FSVER            LABEL       UUID                                 FSAVAIL FSUSE% MOUNTPOINTS
loop0       squashfs 4.0                                                                     0   100% /run/archiso/airootfs
sda                                                                                                   
├─sda1      exfat    1.0              Ventoy      5D42-D2E1                                           
│ ├─ventoy  iso9660  Joliet Extension ARCH_202602 2026-02-01-08-05-47-00                              
│ └─sda1    exfat    1.0              Ventoy      5D42-D2E1                                           
└─sda2      vfat     FAT16            VTOYEFI     626B-4255                                           
nvme0n1                                                                                               
└─nvme0n1p1 btrfs                                 6f4cde95-df1b-46c9-930d-c1a2c4470d87                
nvme1n1

As you can see, it doesn't show the partitions/filesystems right now, but about an hour ago I booted into the liveiso to check whether things were the same, and lsblk did show the partitions. I was even able to mount the ESP and unlock the encrypted partition (I didn't try mounting it, but it probably would have worked).

The filesystem in nvme1n1p1 is FAT32 and the contents are what you would expect for a /boot partition.
On nvme1n1p2 it's btrfs and it's the / partition.
And nvme0n1 is the drive I use to store games.

If I try mounting /dev/nvme1n1p1 right now:

# mount /dev/nvme1n1p1 /mnt
mount: /mnt: fsconfig() failed: /dev/nvme1n1p1: Can't lookup blockdev.
       dmesg(1) may have more information after failed mount system call.

And if I look through dmesg I find this:

[    3.005728]  nvme0n1: p1
[    3.057093] nvme nvme1: 8/0/0 default/read/poll queues
[    3.072258] nvme1n1: Read(0x2) @ LBA 0, 8 blocks, Unrecovered Read Error (sct 0x2 / sc 0x81) 
[    3.072274] critical medium error, dev nvme1n1, sector 0 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 2
[    3.072284] Buffer I/O error on dev nvme1n1, logical block 0, async page read
[    3.080697] nvme1n1: Read(0x2) @ LBA 0, 8 blocks, Unrecovered Read Error (sct 0x2 / sc 0x81) 
[    3.080724] critical medium error, dev nvme1n1, sector 0 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 2
[    3.080734] Buffer I/O error on dev nvme1n1, logical block 0, async page read
[    3.080746] ldm_validate_partition_table(): Disk read failed.
[    3.089151] nvme1n1: Read(0x2) @ LBA 0, 8 blocks, Unrecovered Read Error (sct 0x2 / sc 0x81) 
[    3.089162] critical medium error, dev nvme1n1, sector 0 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 2
[    3.089171] Buffer I/O error on dev nvme1n1, logical block 0, async page read
[    3.097536] nvme1n1: Read(0x2) @ LBA 0, 8 blocks, Unrecovered Read Error (sct 0x2 / sc 0x81) 
[    3.097548] critical medium error, dev nvme1n1, sector 0 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 2
[    3.097557] Buffer I/O error on dev nvme1n1, logical block 0, async page read
[    3.105933] nvme1n1: Read(0x2) @ LBA 0, 8 blocks, Unrecovered Read Error (sct 0x2 / sc 0x81) 
[    3.105944] critical medium error, dev nvme1n1, sector 0 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 2
[    3.105953] Buffer I/O error on dev nvme1n1, logical block 0, async page read
[    3.105966]  nvme1n1: unable to read partition table
[    3.439158] nvme1n1: Read(0x2) @ LBA 0, 8 blocks, Unrecovered Read Error (sct 0x2 / sc 0x81) 
[    3.439177] critical medium error, dev nvme1n1, sector 0 op 0x0:(READ) flags 0x80700 phys_seg 1 prio class 2
[    3.447379] nvme1n1: Read(0x2) @ LBA 0, 8 blocks, Unrecovered Read Error (sct 0x2 / sc 0x81) 
[    3.447399] critical medium error, dev nvme1n1, sector 0 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 2
[    3.447414] Buffer I/O error on dev nvme1n1, logical block 0, async page read
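
To see at a glance which sectors the failed reads hit, the kernel log can be boiled down to a count per device and LBA. A rough sketch, assuming the message format is exactly the one in the excerpt above ("nvmeXnY: Read(0x2) @ LBA N, ..."); the function name is made up.

```shell
#!/bin/sh
# Count NVMe read errors per device and LBA from kernel log text on stdin.
nvme_error_lbas() {
    sed -n 's/.*\(nvme[0-9]*n[0-9]*\): Read(0x2) @ LBA \([0-9]*\),.*/\1 LBA \2/p' |
        sort | uniq -c
}
```

It would be used as "dmesg | nvme_error_lbas". In the excerpt above every failed read is at LBA 0, which is where the protective MBR lives, so that alone would explain why the partition table sometimes can't be read.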

yiff · 2026-02-08 21:46:42

Another thing I just noticed is that media_errors increased to 135. The SSD seems cooked, don't you think?

seth · 2026-02-08 21:50:32

Maybe

available_spare				: 100%
available_spare_threshold		: 5%
percentage_used				: 26%
endurance group critical warning summary: 0
Data Units Read				: 385042 (197.14 GB)
Data Units Written			: 230883 (118.21 GB)

it's not that old, though.
Have you tried to re-seat it in the pci slot?
There's also https://wiki.archlinux.org/title/Solid_ … leshooting (APST/ASPM/IOMMU)

yiff · 2026-02-08 21:58:53

I think there's no way the smart-log is showing accurate stats for it. The power on hours are absurdly low and so are the data units read/written. Maybe the secure erase also erased some of these logs?? Wouldn't make sense for it to do that, but idk.
Also yes - I have reseated the SSD.
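
As a sanity check on the numbers themselves: as far as I understand the NVMe spec, one "data unit" in the SMART log is 1000 blocks of 512 bytes, i.e. 512,000 bytes. A small sketch of that conversion (the function name is made up), just to confirm that the GB figures nvme-cli prints follow from the raw counters:

```shell
#!/bin/sh
# Convert an NVMe SMART "Data Units" counter to decimal gigabytes.
# Assumes one data unit = 1000 * 512 bytes.
data_units_to_gb() {
    awk -v u="$1" 'BEGIN { printf "%.2f\n", u * 512000 / 1e9 }'
}

data_units_to_gb 385042   # Data Units Read from the log above
data_units_to_gb 230883   # Data Units Written from the log above
```

This prints 197.14 and 118.21, matching the smart-log output, so the small totals are what the drive itself reports and not a display problem in the tool.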

These are the logs of my second drive, which I got 3-4 years after I started using the first, and they definitely seem more accurate:

# nvme smart-log /dev/nvme0n1
Smart Log for NVME device:nvme0n1 namespace-id:ffffffff
critical_warning			: 0
temperature				: 30 °C (303 K, 86 °F)
available_spare				: 100%
available_spare_threshold		: 10%
percentage_used				: 1%
endurance group critical warning summary: 0
Data Units Read				: 35385659 (18.12 TB)
Data Units Written			: 16394130 (8.39 TB)
host_read_commands			: 118475088
host_write_commands			: 47102967
controller_busy_time			: 449
power_cycles				: 4176
power_on_hours				: 4593
unsafe_shutdowns			: 458
media_errors				: 0
num_err_log_entries			: 9058
Warning Temperature Time		: 0
Critical Composite Temperature Time	: 0
Temperature Sensor 1			: 30 °C (303 K, 86 °F)
Temperature Sensor 2			: 25 °C (298 K, 77 °F)
Thermal Management T1 Trans Count	: 0
Thermal Management T2 Trans Count	: 0
Thermal Management T1 Total Time	: 0
Thermal Management T2 Total Time	: 0

Last edited by yiff (2026-02-08 21:59:54)

seth · 2026-02-08 22:13:05

In that case the media errors would build up really fast.
Do you have older logs of it?
Have there previously been problems w/ the drive?

yiff · 2026-02-08 22:22:15

I don't have older logs, as I've only discovered the smart log command today.
I never really had problems with this drive, at least none that I'm aware of.

seth · 2026-02-08 22:51:43

Disable APST and ASPM and use the software IOMMU and see whether the IO errors remain and the media errors keep creeping up.

The erasure will have written the entire drive, which might have exposed some problems.
The firmware from 2020 is the latest and has been there before all of this?

yiff · 2026-02-08 23:29:54

I haven't manually updated the SSD firmware, and I couldn't really find any info about the latest firmware when looking for it an hour or so ago. I'd imagine that it's the latest. Also the same firmware version is on the SSD's physical label, so I think it's safe to say that it's always been there.

As for disabling APST and ASPM, and using software IOMMU - how would I do that? Would I have to reinstall Linux and set the kernel parameters in the live iso before rebooting into the newly installed system? Or is it possible to modify the parameters of the live iso (persistently or at runtime)?
Anyhow, I'll get back to this tomorrow, as it's already late for me.
Thanks for the info btw.

seth · 2026-02-09 10:28:12

possible to modify the parameters of the live iso

Yes, absolutely. https://wiki.archlinux.org/title/Kernel_parameters

nvme_core.default_ps_max_latency_us=0 pcie_aspm=off iommu=soft
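
After booting with edited parameters it's worth confirming they actually made it onto the kernel command line. A minimal sketch, assuming the three parameters are passed exactly as spelled here (the ASPM one is pcie_aspm in the kernel documentation); the function name is made up.

```shell
#!/bin/sh
# Print each of the suggested parameters that is missing from the given
# kernel command line; print nothing if all three are present.
missing_params() {
    cmdline=$1
    for p in nvme_core.default_ps_max_latency_us=0 pcie_aspm=off iommu=soft; do
        case " $cmdline " in
            *" $p "*) ;;                  # parameter found
            *) printf '%s\n' "$p" ;;      # parameter missing
        esac
    done
}
```

On the running system this would be called as missing_params "$(cat /proc/cmdline)". The APST setting can also be read back from /sys/module/nvme_core/parameters/default_ps_max_latency_us, which should show 0.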

yiff · 2026-02-09 21:19:56

I set those parameters (and pcie_port_pm=off for good measure too).
It didn't stop the media errors from occurring. However, across multiple boots into the live iso with these parameters, it booted without media errors twice. I don't think that's connected to the parameters, because yesterday I managed to boot into the live iso without media errors and without setting any parameters. I guess it's random.

seth · 2026-02-10 09:17:25

Please post your complete system journal for a boot with those errors:

sudo journalctl -b | curl -F 'file=@-' 0x0.st

Running that from the Install iso is fine.

yiff · 2026-02-10 18:19:41

without setting kernel params - http://0x0.st/PAjE.txt
with kernel params - http://0x0.st/PAeb.txt

seth · 2026-02-10 19:55:48

Feb 10 18:16:03 archiso kernel: system 00:07: [mem 0xfed10000-0xfed17fff] has been reserved
Feb 10 18:16:03 archiso kernel: system 00:07: [mem 0xfed18000-0xfed18fff] has been reserved
Feb 10 18:16:03 archiso kernel: system 00:07: [mem 0xfed19000-0xfed19fff] has been reserved
Feb 10 18:16:03 archiso kernel: system 00:07: [mem 0xe0000000-0xefffffff] has been reserved
Feb 10 18:16:03 archiso kernel: system 00:07: [mem 0xfed20000-0xfed3ffff] has been reserved
Feb 10 18:16:03 archiso kernel: system 00:07: [mem 0xfed90000-0xfed93fff] could not be reserved
Feb 10 18:16:03 archiso kernel: system 00:07: [mem 0xfed45000-0xfed8ffff] could not be reserved
Feb 10 18:16:03 archiso kernel: system 00:07: [mem 0xfee00000-0xfeefffff] could not be reserved

That's the nvme, can you swap the nvme0 and nvme1 slots (and does the problem remain w/ the pci slot rather than the nvme)?

yiff · 2026-02-10 20:33:35

This is after removing my functional NVMe and plugging the borked one into its slot: https://0x0.st/PA_q.txt
Nothing changed as far as I can tell.

seth · 2026-02-10 20:36:51

Feb 10 21:16:58 arch kernel: pcieport 0000:00:1d.0: AER: Correctable error message received from 0000:07:00.0
Feb 10 21:16:58 arch kernel: alx 0000:07:00.0: PCIe Bus Error: severity=Correctable, type=Data Link Layer, (Receiver ID)
Feb 10 21:16:58 arch kernel: alx 0000:07:00.0:   device [1969:e0b1] error status/mask=00000040/00002000
Feb 10 21:16:58 arch kernel: alx 0000:07:00.0:    [ 6] BadTLP                
Feb 10 21:16:58 arch kernel: pcieport 0000:00:1d.0: AER: Correctable error message received from 0000:07:00.0
Feb 10 21:16:58 arch kernel: alx 0000:07:00.0: PCIe Bus Error: severity=Correctable, type=Data Link Layer, (Receiver ID)
Feb 10 21:16:58 arch kernel: alx 0000:07:00.0:   device [1969:e0b1] error status/mask=00000040/00002000
Feb 10 21:16:58 arch kernel: alx 0000:07:00.0:    [ 6] BadTLP                
Feb 10 21:16:58 arch kernel: alx 0000:07:00.0 eth0: Qualcomm Atheros AR816x/AR817x Ethernet [08:97:98:ea:d2:45]

But it might be the nvme that's spamming the bus - looks bleak

yiff · 2026-02-10 20:38:58

Oh, I forgot to mention - I installed a SATA SSD from an older laptop here. Maybe that's what's causing the errors.

seth · 2026-02-10 21:17:41

The bus errors are from the ethernet chip, but I don't think they're causing the problems w/ the nvme (rather the other way round)
