I have added an NVMe SSD to my system. It is a backup drive mounted via crypttab and fstab on boot.
The system either boots fine or I get the following errors (journalctl):
nvme nvme0: missing or invalid SUBNQN field.
nvme nvme0: I/O 16 QID 0 timeout, disable controller
nvme nvme0: could not set timestamp (-4)
nvme nvme0: Removing after probe failure status: -4
nvme nvme0: failed to set APST feature (-19)
Searching the topic, I followed the ArchWiki instructions. For
nvme get-feature -f 0x0c -H /dev/nvme[0-9]
I get
Autonomous Power State Transition Enable (APSTE): Enabled
and a few non-zero values next to a lot of zeros. This should be fine from my understanding.
Going further nonetheless, "the total latency of any state (enlat + exlat)" is also NOT "greater than 25000 (25ms)":
ps 0 : mp:9.00W operational enlat:5 exlat:5 rrt:0 rrl:0
rwt:0 rwl:0 idle_power:- active_power:-
ps 1 : mp:4.60W operational enlat:30 exlat:30 rrt:1 rrl:1
rwt:1 rwl:1 idle_power:- active_power:-
ps 2 : mp:3.80W operational enlat:30 exlat:30 rrt:2 rrl:2
rwt:2 rwl:2 idle_power:- active_power:-
ps 3 : mp:0.0500W non-operational enlat:1000 exlat:1000 rrt:3 rrl:3
rwt:3 rwl:3 idle_power:- active_power:-
ps 4 : mp:0.0040W non-operational enlat:6000 exlat:8000 rrt:4 rrl:4
rwt:4 rwl:4 idle_power:- active_power:-
From what I can see, APST should enable just fine latency-wise.
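For reference, here is the quick check I use for the latency rule: sum enlat and exlat per power state and flag anything above 25000 µs. The here-doc lines are copied from the id-ctrl output above; on a real system you would pipe `sudo nvme id-ctrl /dev/nvme0` in instead.

```shell
# Sum enlat+exlat for each power state and flag states over 25 ms.
# Sample lines are from the output above; replace the here-doc with
# `sudo nvme id-ctrl /dev/nvme0` on a live system.
awk '
/^ps/ {
    for (i = 1; i <= NF; i++) {
        if ($i ~ /^enlat:/) { en = substr($i, 7) }
        if ($i ~ /^exlat:/) { ex = substr($i, 7) }
    }
    printf "ps %s: enlat+exlat = %d us%s\n", $2, en + ex,
           (en + ex > 25000 ? "  <-- exceeds 25 ms" : "")
}' <<'EOF'
ps 0 : mp:9.00W operational enlat:5 exlat:5 rrt:0 rrl:0
ps 3 : mp:0.0500W non-operational enlat:1000 exlat:1000 rrt:3 rrl:3
ps 4 : mp:0.0040W non-operational enlat:6000 exlat:8000 rrt:4 rrl:4
EOF
```

For this drive the largest sum is 14000 µs (ps 4), well under the 25 ms threshold, so latency alone shouldn't prevent APST.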
How can I investigate further to solve this issue? The error does not occur on every boot, and when it does, a reboot fixes it.
Thanks!
// EDIT: Kernel is 4.19.13-1-lts and the SSD is a Crucial P1 1TB.
Last edited by revilo.r (2019-01-13 14:29:18)
Offline
nvme nvme0: failed to set APST feature (-19)
-19 = -ENODEV
nvme nvme0: I/O 16 QID 0 timeout, disable controller
....
nvme nvme0: Removing after probe failure status: -4
If that happened before the APST failure then it makes sense that the device could not be found as it had already been disabled.
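In case it helps anyone reading along: the kernel's negative return codes are standard errno values and can be decoded quickly (a sketch assuming python3 is available, since a default install has no dedicated errno tool):

```shell
# Translate the kernel's negative errno codes from the log above.
# Assumes python3 is installed.
python3 -c "import os
for code in (19, 4):
    print(f'-{code} = {os.strerror(code)}')"
# -19 = No such device
# -4 = Interrupted system call
```

So -19 is -ENODEV as noted, and -4 is -EINTR, consistent with the probe being aborted after the controller was disabled.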
Offline
nvme nvme0: failed to set APST feature (-19)
-19 = -ENODEV
nvme nvme0: I/O 16 QID 0 timeout, disable controller
....
nvme nvme0: Removing after probe failure status: -4
If that happened before the APST failure then it makes sense that the device could not be found as it had already been disabled.
Makes sense.
Why does it timeout, though?
nvme nvme0: pci function 0000:10:00.0
comes up before that.
[...]
nvme nvme0: missing or invalid SUBNQN field.
nvme nvme0: I/O 16 QID 0 timeout, disable controller
[...]
Nothing else nvme related found.
Last edited by revilo.r (2019-01-13 16:24:44)
Offline
Hello revilo.r,
I'm not an Arch Linux user (...I know... shame on me...), but it seems you're the only other Linux user on the whole Internet with the same P1 SSD by Crucial (mine is the CT500P1SSD8).
Do you have any solution to this?
I tried many things, from switching kernel versions to adding boot parameters, but nothing seems definitive and reliable.
I have an ASUS N751JK laptop and I'm running Linux Mint 19.1
I think I get the best results when using:
- the latest available kernel (currently 4.20.12)
- the `nvme_core.default_ps_max_latency_us=0` boot parameter
The latter disables the APST feature. Even that seems not to be enough: with time, or randomly, it still crashes.
Since the SSD does not survive a suspend-to-RAM, I also tried the parameter `acpiphp.disable=1`, but to no avail.
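For anyone wanting to make that boot parameter persistent: on a GRUB-based distro like Mint it typically goes in /etc/default/grub (a sketch, assuming the stock GRUB setup; adjust the existing flags to your own):

```
# /etc/default/grub — append the parameter to the existing line
GRUB_CMDLINE_LINUX_DEFAULT="quiet splash nvme_core.default_ps_max_latency_us=0"
```

Then regenerate the config with `sudo update-grub` and, after a reboot, verify it took effect with `cat /proc/cmdline`.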
This is from the current session (still alive); I crashed the previous one trying a suspend-to-RAM. If I don't suspend, it's fairly stable.
~ $ dmesg | grep nvme
[ 0.081169] Kernel command line: BOOT_IMAGE=/boot/vmlinuz-4.20.12-042012-generic root=UUID=0bb0cacf-7ea4-4eb8-96c3-50af4b066fef ro nvme_core.default_ps_max_latency_us=0
[ 1.105254] nvme nvme0: pci function 0000:05:00.0
[ 1.206666] nvme nvme0: missing or invalid SUBNQN field.
[ 1.211508] nvme0n1: p1 p2 p3 p4 p5 p6
[ 3.750658] EXT4-fs (nvme0n1p1): mounted filesystem with ordered data mode. Opts: (null)
[ 4.105468] EXT4-fs (nvme0n1p1): re-mounted. Opts: errors=remount-ro
[ 4.869196] EXT4-fs (nvme0n1p5): mounted filesystem with ordered data mode. Opts: (null)
Last edited by dentex (2019-02-24 20:24:33)
Offline
Message to the air:
- Today I discovered one more detail: it's not the suspend-to-RAM itself that breaks the SSD, but disconnecting the power supply. In fact, suspending the laptop with the cable attached has no effect, while disconnecting the power supply alone, with the PC kept awake, makes the SSD give up and crash just the same.
- After another two days of operation, I can confirm that the parameter `nvme_core.default_ps_max_latency_us=0` is the most stable option.
Offline
Hi dentex,
sorry for not replying. I thought this thread was dead and got no notification.
I tried your parameter "nvme_core.default_ps_max_latency_us=0" without any success regarding the boot issue.
However, it seems a BIOS update for my Gigabyte mainboard has fixed the issue (even though the changelog didn't mention it). Since then my PC has booted fine on every cold boot.
The only nvme related error I can find via journalctl is
nvme nvme0: missing or invalid SUBNQN field.
So far it seems fine. Didn't test the SSD in a notebook.
Offline
Hi dentex,
sorry for not replying. I thought this thread was dead and got no notification.
I tried your parameter "nvme_core.default_ps_max_latency_us=0" without any success regarding the boot issue.
However, it seems a BIOS update for my Gigabyte mainboard has fixed the issue (even though the changelog didn't mention it). Since then my PC has booted fine on every cold boot.
The only nvme related error I can find via journalctl is
nvme nvme0: missing or invalid SUBNQN field.
So far it seems fine. Didn't test the SSD in a notebook.
Hello, sorry for my late reply too.
I ended up removing the Crucial SSD. I RMA'ed it and got a Samsung 970 EVO... which works like a charm.
Offline