Hi,
I installed rasdaemon to monitor hardware errors on my system, and it is reporting errors on my NVMe disk.
$ ras-mc-ctl --errors
55 2020-06-01 14:52:09 +0200 error: dev=0:66304, sector=76952384, nr_sector=256, error='unknown block error', rwbs='RA', cmd='',
56 2020-06-01 15:40:41 +0200 error: dev=0:0, sector=-1, nr_sector=0, error='I/O error', rwbs='N', cmd='',
57 2020-06-01 15:40:41 +0200 error: dev=0:2048, sector=-1, nr_sector=1, error='I/O error', rwbs='N', cmd='',
58 2020-06-01 15:40:41 +0200 error: dev=0:2048, sector=-1, nr_sector=1, error='I/O error', rwbs='N', cmd='',
59 2020-06-01 15:40:41 +0200 error: dev=0:2048, sector=-1, nr_sector=0, error='I/O error', rwbs='N', cmd='',
60 2020-06-01 15:40:41 +0200 error: dev=0:2048, sector=-1, nr_sector=1, error='I/O error', rwbs='N', cmd='',
61 2020-06-01 15:40:41 +0200 error: dev=0:2048, sector=-1, nr_sector=1, error='I/O error', rwbs='N', cmd='',
62 2020-06-01 15:40:41 +0200 error: dev=0:2048, sector=-1, nr_sector=1, error='I/O error', rwbs='N', cmd='',
63 2020-06-01 15:40:41 +0200 error: dev=0:2048, sector=-1, nr_sector=0, error='I/O error', rwbs='N', cmd='',
64 2020-06-01 15:40:41 +0200 error: dev=0:2048, sector=-1, nr_sector=0, error='I/O error', rwbs='N', cmd='',
65 2020-06-01 15:40:41 +0200 error: dev=0:2048, sector=-1, nr_sector=0, error='I/O error', rwbs='N', cmd='',
66 2020-06-01 15:40:47 +0200 error: dev=0:2048, sector=-1, nr_sector=0, error='I/O error', rwbs='N', cmd='',
67 2020-06-01 15:40:47 +0200 error: dev=0:2048, sector=-1, nr_sector=0, error='I/O error', rwbs='N', cmd='',
68 2020-06-01 15:45:15 +0200 error: dev=0:2048, sector=-1, nr_sector=0, error='I/O error', rwbs='N', cmd='',
69 2020-06-01 15:45:15 +0200 error: dev=0:2048, sector=-1, nr_sector=1, error='I/O error', rwbs='N', cmd='',
70 2020-06-01 15:45:15 +0200 error: dev=0:2048, sector=-1, nr_sector=1, error='I/O error', rwbs='N', cmd='',
71 2020-06-01 15:45:15 +0200 error: dev=0:2048, sector=-1, nr_sector=1, error='I/O error', rwbs='N', cmd='',
72 2020-06-01 15:45:15 +0200 error: dev=0:2048, sector=-1, nr_sector=0, error='I/O error', rwbs='N', cmd='',
73 2020-06-01 15:55:15 +0200 error: dev=0:2048, sector=-1, nr_sector=0, error='I/O error', rwbs='N', cmd='',
74 2020-06-01 15:55:15 +0200 error: dev=0:2048, sector=-1, nr_sector=1, error='I/O error', rwbs='N', cmd='',
75 2020-06-01 15:55:15 +0200 error: dev=0:2048, sector=-1, nr_sector=1, error='I/O error', rwbs='N', cmd='',
76 2020-06-01 15:55:15 +0200 error: dev=0:2048, sector=-1, nr_sector=1, error='I/O error', rwbs='N', cmd='',
77 2020-06-01 15:55:15 +0200 error: dev=0:2048, sector=-1, nr_sector=0, error='I/O error', rwbs='N', cmd='',
78 2020-06-01 16:05:15 +0200 error: dev=0:2048, sector=-1, nr_sector=0, error='I/O error', rwbs='N', cmd='',
79 2020-06-01 16:15:15 +0200 error: dev=0:2048, sector=-1, nr_sector=0, error='I/O error', rwbs='N', cmd='',
80 2020-06-01 16:25:15 +0200 error: dev=0:2048, sector=-1, nr_sector=0, error='I/O error', rwbs='N', cmd='',
81 2020-06-01 16:35:00 +0200 error: dev=0:66304, sector=84908160, nr_sector=56, error='unknown block error', rwbs='RA', cmd='',
82 2020-06-01 16:35:00 +0200 error: dev=0:66304, sector=84908128, nr_sector=32, error='unknown block error', rwbs='RA', cmd='',
83 2020-06-01 16:35:00 +0200 error: dev=0:66304, sector=76952384, nr_sector=256, error='unknown block error', rwbs='RA', cmd='',
84 2020-06-01 16:35:04 +0200 error: dev=0:2048, sector=-1, nr_sector=1, error='I/O error', rwbs='N', cmd='',
85 2020-06-01 16:35:04 +0200 error: dev=0:2048, sector=-1, nr_sector=0, error='I/O error', rwbs='N', cmd='',
86 2020-06-01 16:35:04 +0200 error: dev=0:2048, sector=-1, nr_sector=0, error='I/O error', rwbs='N', cmd='',
87 2020-06-01 16:35:04 +0200 error: dev=0:2048, sector=-1, nr_sector=1, error='I/O error', rwbs='N', cmd='',
88 2020-06-01 16:35:04 +0200 error: dev=0:2048, sector=-1, nr_sector=0, error='I/O error', rwbs='N', cmd='',
89 2020-06-01 16:35:04 +0200 error: dev=0:2048, sector=-1, nr_sector=0, error='I/O error', rwbs='N', cmd='',
90 2020-06-01 16:35:04 +0200 error: dev=0:2048, sector=-1, nr_sector=1, error='I/O error', rwbs='N', cmd='',
91 2020-06-01 16:35:04 +0200 error: dev=0:2048, sector=-1, nr_sector=0, error='I/O error', rwbs='N', cmd='',
92 2020-06-01 16:35:04 +0200 error: dev=0:2048, sector=-1, nr_sector=1, error='I/O error', rwbs='N', cmd='',
93 2020-06-01 16:35:04 +0200 error: dev=0:2048, sector=-1, nr_sector=1, error='I/O error', rwbs='N', cmd='',
94 2020-06-01 16:35:04 +0200 error: dev=0:2048, sector=-1, nr_sector=0, error='I/O error', rwbs='N', cmd='',
95 2020-06-01 16:45:04 +0200 error: dev=0:2048, sector=-1, nr_sector=0, error='I/O error', rwbs='N', cmd='',
The model is not fully detected:
$ lspci -v
0c:00.0 Non-Volatile memory controller: Samsung Electronics Co Ltd NVMe SSD Controller SM981/PM981/PM983 (prog-if 02 [NVM Express])
Subsystem: Samsung Electronics Co Ltd NVMe SSD Controller SM981/PM981/PM983
Flags: bus master, fast devsel, latency 0, IRQ 45, NUMA node 0
Memory at fcf00000 (64-bit, non-prefetchable) [size=16K]
Capabilities: [40] Power Management version 3
Capabilities: [50] MSI: Enable- Count=1/1 Maskable- 64bit+
Capabilities: [70] Express Endpoint, MSI 00
Capabilities: [b0] MSI-X: Enable+ Count=33 Masked-
Capabilities: [100] Advanced Error Reporting
Capabilities: [148] Device Serial Number 00-00-00-00-00-00-00-00
Capabilities: [158] Power Budgeting <?>
Capabilities: [168] Secondary PCI Express
Capabilities: [188] Latency Tolerance Reporting
Capabilities: [190] L1 PM Substates
Kernel driver in use: nvme
dmesg:
$ dmesg |grep nvme
[ 0.000000] Command line: BOOT_IMAGE=/boot/vmlinuz-linux root=UUID=a469c5e9-2ff0-4023-bd15-9fcc3a9632b2 rw amdgpu.ppfeaturemask=0xffffffff nvme_core.default_ps_max_latency_us=5500 loglevel=3 quiet
[ 0.000000] Kernel command line: BOOT_IMAGE=/boot/vmlinuz-linux root=UUID=a469c5e9-2ff0-4023-bd15-9fcc3a9632b2 rw amdgpu.ppfeaturemask=0xffffffff nvme_core.default_ps_max_latency_us=5500 loglevel=3 quiet
[ 1.140489] nvme nvme0: pci function 0000:0c:00.0
[ 1.353858] nvme nvme0: missing or invalid SUBNQN field.
[ 1.353872] nvme nvme0: Shutdown timeout set to 8 seconds
[ 1.386275] nvme nvme0: 32/0/0 default/read/poll queues
[ 1.394714] nvme0n1: p1 p2 p3
[ 4.713560] EXT4-fs (nvme0n1p2): mounted filesystem with ordered data mode. Opts: (null)
[ 4.931254] EXT4-fs (nvme0n1p2): re-mounted. Opts: (null)
[ 5.343443] Adding 9227464k swap on /dev/nvme0n1p3. Priority:-2 extents:1 across:9227464k SSFS
$ dmesg |grep ACPI
[ 0.000000] BIOS-e820: [mem 0x000000000a200000-0x000000000a20afff] ACPI NVS
[ 0.000000] BIOS-e820: [mem 0x00000000dabb0000-0x00000000dadcbfff] ACPI data
[ 0.000000] BIOS-e820: [mem 0x00000000dadcc000-0x00000000db282fff] ACPI NVS
[ 0.000000] efi: ACPI 2.0=0xdadab000 ACPI=0xdadab000 SMBIOS=0xdc20e000 SMBIOS 3.0=0xdc20d000 ESRT=0xd5bdb518 MEMATTR=0xd595c018
[ 0.000000] ACPI: Early table checksum verification disabled
[ 0.000000] ACPI: RSDP 0x00000000DADAB000 000024 (v02 ALASKA)
[ 0.000000] ACPI: XSDT 0x00000000DADAB0A0 0000B4 (v01 ALASKA A M I 01072009 AMI 00010013)
[ 0.000000] ACPI: FACP 0x00000000DADBA490 000114 (v06 ALASKA A M I 01072009 AMI 00010013)
[ 0.000000] ACPI: DSDT 0x00000000DADAB1E8 00F2A6 (v02 ALASKA A M I 01072009 INTL 20120913)
[ 0.000000] ACPI: FACS 0x00000000DB26AD80 000040
[ 0.000000] ACPI: APIC 0x00000000DADBA5A8 00015E (v03 ALASKA A M I 01072009 AMI 00010013)
[ 0.000000] ACPI: FPDT 0x00000000DADBA708 000044 (v01 ALASKA A M I 01072009 AMI 00010013)
[ 0.000000] ACPI: FIDT 0x00000000DADBA750 00009C (v01 ALASKA A M I 01072009 AMI 00010013)
[ 0.000000] ACPI: SSDT 0x00000000DADBA7F0 0000FC (v02 ALASKA CPUSSDT 01072009 AMI 01072009)
[ 0.000000] ACPI: SSDT 0x00000000DADBA8F0 008C98 (v02 AMD AMD ALIB 00000002 MSFT 04000000)
[ 0.000000] ACPI: SSDT 0x00000000DADC3588 00368A (v01 AMD AMD AOD 00000001 INTL 20120913)
[ 0.000000] ACPI: MCFG 0x00000000DADC6C18 00003C (v01 ALASKA A M I 01072009 MSFT 00010013)
[ 0.000000] ACPI: SSDT 0x00000000DADCBE48 0000BF (v01 AMD AMD PT 00001000 INTL 20120913)
[ 0.000000] ACPI: HPET 0x00000000DADC6CB0 000038 (v01 ALASKA A M I 01072009 AMI 00000005)
[ 0.000000] ACPI: SSDT 0x00000000DADC6CE8 000024 (v01 AMDFCH FCHZP 00001000 INTL 20120913)
[ 0.000000] ACPI: UEFI 0x00000000DADC6D10 000042 (v01 ALASKA A M I 00000002 01000013)
[ 0.000000] ACPI: BGRT 0x00000000DADC6D58 000038 (v01 ALASKA A M I 01072009 AMI 00010013)
[ 0.000000] ACPI: IVRS 0x00000000DADC6D90 0000D0 (v02 AMD AMD IVRS 00000001 AMD 00000000)
[ 0.000000] ACPI: SSDT 0x00000000DADC6E60 002314 (v01 AMD AMD CPU 00000001 AMD 00000001)
[ 0.000000] ACPI: CRAT 0x00000000DADC9178 000F50 (v01 AMD AMD CRAT 00000001 AMD 00000001)
[ 0.000000] ACPI: CDIT 0x00000000DADCA0C8 000029 (v01 AMD AMD CDIT 00000001 AMD 00000001)
[ 0.000000] ACPI: SSDT 0x00000000DADCA0F8 001D4A (v01 AMD AmdTable 00000001 INTL 20120913)
[ 0.000000] ACPI: Local APIC address 0xfee00000
[ 0.000000] ACPI: PM-Timer IO Port: 0x808
[ 0.000000] ACPI: Local APIC address 0xfee00000
[ 0.000000] ACPI: LAPIC_NMI (acpi_id[0xff] high edge lint[0x1])
[ 0.000000] ACPI: INT_SRC_OVR (bus 0 bus_irq 0 global_irq 2 dfl dfl)
[ 0.000000] ACPI: INT_SRC_OVR (bus 0 bus_irq 9 global_irq 9 low level)
[ 0.000000] ACPI: IRQ0 used by override.
[ 0.000000] ACPI: IRQ9 used by override.
[ 0.000000] Using ACPI (MADT) for SMP configuration information
[ 0.000000] ACPI: HPET id: 0x10228201 base: 0xfed00000
[ 0.000000] ACPI: Core revision 20200110
[ 0.774341] PM: Registering ACPI NVS region [mem 0x0a200000-0x0a20afff] (45056 bytes)
[ 0.774341] PM: Registering ACPI NVS region [mem 0xdadcc000-0xdb282fff] (4943872 bytes)
[ 0.774341] ACPI: bus type PCI registered
[ 0.774341] acpiphp: ACPI Hot Plug PCI Controller Driver version: 0.5
[ 0.780110] ACPI: Added _OSI(Module Device)
[ 0.780111] ACPI: Added _OSI(Processor Device)
[ 0.780111] ACPI: Added _OSI(3.0 _SCP Extensions)
[ 0.780112] ACPI: Added _OSI(Processor Aggregator Device)
[ 0.780112] ACPI: Added _OSI(Linux-Dell-Video)
[ 0.780113] ACPI: Added _OSI(Linux-Lenovo-NV-HDMI-Audio)
[ 0.780114] ACPI: Added _OSI(Linux-HPI-Hybrid-Graphics)
[ 0.792290] ACPI: 8 ACPI AML tables successfully acquired and loaded
[ 0.794012] ACPI: [Firmware Bug]: BIOS _OSI(Linux) query ignored
[ 0.797316] ACPI: EC: EC started
[ 0.797316] ACPI: EC: interrupt blocked
[ 0.797405] ACPI: EC: EC_CMD/EC_SC=0x66, EC_DATA=0x62
[ 0.797407] ACPI: \_SB_.PCI0.SBRG.EC0_: Boot DSDT EC used to handle transactions
[ 0.797407] ACPI: Interpreter enabled
[ 0.797420] ACPI: (supports S0 S3 S4 S5)
[ 0.797421] ACPI: Using IOAPIC for interrupt routing
[ 0.797863] PCI: Using host bridge windows from ACPI; if necessary, use "pci=nocrs" and report a bug
[ 0.798182] ACPI: Enabled 3 GPEs in block 00 to 1F
[ 0.808235] ACPI: PCI Root Bridge [PCI0] (domain 0000 [bus 00-ff])
[ 0.816531] ACPI: PCI Interrupt Link [LNKA] (IRQs 4 5 7 10 11 14 15) *0
[ 0.816594] ACPI: PCI Interrupt Link [LNKB] (IRQs 4 5 7 10 11 14 15) *0
[ 0.816652] ACPI: PCI Interrupt Link [LNKC] (IRQs 4 5 7 10 11 14 15) *0
[ 0.816723] ACPI: PCI Interrupt Link [LNKD] (IRQs 4 5 7 10 11 14 15) *0
[ 0.816786] ACPI: PCI Interrupt Link [LNKE] (IRQs 4 5 7 10 11 14 15) *0
[ 0.816839] ACPI: PCI Interrupt Link [LNKF] (IRQs 4 5 7 10 11 14 15) *0
[ 0.816886] ACPI: PCI Interrupt Link [LNKG] (IRQs 4 5 7 10 11 14 15) *0
[ 0.816934] ACPI: PCI Interrupt Link [LNKH] (IRQs 4 5 7 10 11 14 15) *0
[ 0.817447] ACPI: EC: interrupt unblocked
[ 0.817464] ACPI: EC: event unblocked
[ 0.817472] ACPI: EC: EC_CMD/EC_SC=0x66, EC_DATA=0x62
[ 0.817472] ACPI: EC: GPE=0x2
[ 0.817473] ACPI: \_SB_.PCI0.SBRG.EC0_: Boot DSDT EC used to handle transactions and events
[ 0.817517] ACPI: bus type USB registered
[ 0.820052] PCI: Using ACPI for IRQ routing
[ 0.832256] pnp: PnP ACPI init
[ 0.832389] system 00:00: Plug and Play ACPI device, IDs PNP0c01 (active)
[ 0.832468] pnp 00:01: Plug and Play ACPI device, IDs PNP0b00 (active)
[ 0.832644] system 00:02: Plug and Play ACPI device, IDs PNP0c02 (active)
[ 0.832922] system 00:03: Plug and Play ACPI device, IDs PNP0c02 (active)
[ 0.833371] pnp: PnP ACPI: found 4 devices
[ 1.131742] ACPI: Power Button [PWRB]
[ 1.133659] ACPI: Power Button [PWRF]
[ 1.133740] ACPI: \_PR_.C000: Found 2 idle states
[ 1.133865] ACPI: \_PR_.C002: Found 2 idle states
[ 1.133928] ACPI: \_PR_.C004: Found 2 idle states
[ 1.134003] ACPI: \_PR_.C006: Found 2 idle states
[ 1.134094] ACPI: \_PR_.C008: Found 2 idle states
[ 1.137735] ACPI: \_PR_.C00A: Found 2 idle states
[ 1.137850] ACPI: \_PR_.C00C: Found 2 idle states
[ 1.137939] ACPI: \_PR_.C00E: Found 2 idle states
[ 1.138006] ACPI: \_PR_.C001: Found 2 idle states
[ 1.138121] ACPI: \_PR_.C003: Found 2 idle states
[ 1.138219] ACPI: \_PR_.C005: Found 2 idle states
[ 1.138314] ACPI: \_PR_.C007: Found 2 idle states
[ 1.138454] ACPI: \_PR_.C009: Found 2 idle states
[ 1.138608] ACPI: \_PR_.C00B: Found 2 idle states
[ 1.138721] ACPI: \_PR_.C00D: Found 2 idle states
[ 1.138822] ACPI: \_PR_.C00F: Found 2 idle states
[ 3.811983] Ignoring ACPI CRAT on non-APU system
Is NVMe not working well on Linux, or am I missing something?
Greetings.
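For anyone reading the long listing above: a quick tally makes it easier to see which device and error type dominate. This is only a minimal sketch, assuming the line format shown in the `ras-mc-ctl --errors` output above; the function name `tally` and the `sample` list are made up for illustration.

```python
import re
from collections import Counter

# Picks out the dev=..., error='...' and rwbs='...' fields of one log line.
LINE_RE = re.compile(r"dev=(\d+:\d+),.*?error='([^']*)', rwbs='([^']*)'")

def tally(lines):
    """Count log lines per (dev, error, rwbs) combination."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.search(line)
        if match:
            counts[match.groups()] += 1
    return counts

# A few lines copied from the listing above (hypothetical sample input).
sample = [
    "55 2020-06-01 14:52:09 +0200 error: dev=0:66304, sector=76952384, nr_sector=256, error='unknown block error', rwbs='RA', cmd='',",
    "56 2020-06-01 15:40:41 +0200 error: dev=0:0, sector=-1, nr_sector=0, error='I/O error', rwbs='N', cmd='',",
    "57 2020-06-01 15:40:41 +0200 error: dev=0:2048, sector=-1, nr_sector=1, error='I/O error', rwbs='N', cmd='',",
    "58 2020-06-01 15:40:41 +0200 error: dev=0:2048, sector=-1, nr_sector=1, error='I/O error', rwbs='N', cmd='',",
]

for (dev, error, rwbs), count in tally(sample).most_common():
    print(f"{count:4d}  dev={dev}  rwbs={rwbs}  {error}")
```

Passing `sys.stdin` instead of `sample` should let it read the real output through a pipe.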
Everyone has those errors. It's also happening with SATA drives, not just NVMe.
The disk error feature of rasdaemon is experimental. It's probably just buggy and not working right. I disabled it on my system by removing the "--enable-diskerror" line from the PKGBUILD.
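On the SATA point, the odd `dev=0:66304` and `dev=0:2048` values in the first post may be worth decoding. The sketch below rests on an unconfirmed assumption: that rasdaemon split the kernel's device number (major in the bits above bit 20) with glibc's `major()`/`minor()` macros, which use a different bit layout. The function name `decode_dev` is made up for illustration.

```python
def decode_dev(shown):
    """Turn a 'major:minor' pair as printed by rasdaemon back into the
    kernel's (major, minor), ASSUMING the pair came from glibc-style
    major()/minor() applied to a 32-bit kernel device number."""
    g_major, g_minor = (int(part) for part in shown.split(":"))
    # Rebuild the raw value the way glibc's makedev() lays out the low 32 bits:
    # minor bits 0-7 stay put, major fills bits 8-19, the rest of minor goes above.
    raw = ((g_major & 0xFFF) << 8) | (g_minor & 0xFF) | ((g_minor >> 8) << 20)
    # Kernel layout: major in the bits above 20, minor in the low 20 bits.
    return raw >> 20, raw & 0xFFFFF

for shown in ("0:66304", "0:2048", "0:0"):
    major, minor = decode_dev(shown)
    print(f"dev={shown} -> {major}:{minor}")
```

Under that assumption, 0:66304 comes out as 259:0 and 0:2048 as 8:0. Major 259 is the block extended major that NVMe namespaces use and major 8 is SCSI/SATA disks, so most of the 'I/O error' lines might not belong to the NVMe drive at all. Comparing against the MAJ:MIN column of `lsblk` would confirm or refute this.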
Hi,
I disabled APST (the APSTE feature) with the boot parameter nvme_core.default_ps_max_latency_us=0, and no more errors are raised.
$ nvme get-feature -f 0x0c -H /dev/nvme0
Autonomous Power State Transition Enable (APSTE): Disabled
It's fine for me; temperatures are OK.
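The value the kernel is actually using can also be read back from sysfs, without nvme-cli. A minimal sketch, assuming the `nvme_core` module exposes its parameter at the usual `/sys/module/.../parameters/` path; the helper name `apst_status` is made up for illustration.

```python
from pathlib import Path

# Assumed sysfs location of the nvme_core module parameter set on the kernel command line.
PARAM = Path("/sys/module/nvme_core/parameters/default_ps_max_latency_us")

def apst_status(text):
    """Describe what a default_ps_max_latency_us value means for APST."""
    value = int(text.strip())
    if value == 0:
        return "APST disabled (max latency 0 us)"
    return f"APST allowed for power states with latency up to {value} us"

if PARAM.exists():
    print(apst_status(PARAM.read_text()))
else:
    print("nvme_core parameter not found on this system")
```

With the original command line from the first post (5500), this would report APST as allowed; with 0 it should agree with the `nvme get-feature` output above.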