Hey there, I'm running into an issue where my system randomly locks up: I can't interact with terminals, open new windows, etc., although I can still move the mouse around in my DE.
I'm seeing an error in dmesg that seems to coincide with the lock-ups. It looks like this:
[35997.592789] ata2.00: exception Emask 0x0 SAct 0x3f000000 SErr 0xd0000 action 0x6 frozen
[35997.592828] ata2: SError: { PHYRdyChg CommWake 10B8B }
[35997.592850] ata2.00: failed command: WRITE FPDMA QUEUED
[35997.592870] ata2.00: cmd 61/08:c0:b0:73:85/00:00:08:00:00/40 tag 24 ncq dma 4096 out
res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
[35997.592927] ata2.00: status: { DRDY }
[35997.592943] ata2.00: failed command: WRITE FPDMA QUEUED
[35997.592963] ata2.00: cmd 61/08:c8:e8:75:85/00:00:08:00:00/40 tag 25 ncq dma 4096 out
res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
[35997.593028] ata2.00: status: { DRDY }
[35997.593047] ata2.00: failed command: WRITE FPDMA QUEUED
[35997.593067] ata2.00: cmd 61/08:d0:68:76:85/00:00:08:00:00/40 tag 26 ncq dma 4096 out
res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
[35997.593126] ata2.00: status: { DRDY }
[35997.593142] ata2.00: failed command: WRITE FPDMA QUEUED
[35997.593162] ata2.00: cmd 61/08:d8:28:7b:85/00:00:08:00:00/40 tag 27 ncq dma 4096 out
res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
[35997.593218] ata2.00: status: { DRDY }
[35997.593231] ata2.00: failed command: WRITE FPDMA QUEUED
[35997.593247] ata2.00: cmd 61/08:e0:78:7b:85/00:00:08:00:00/40 tag 28 ncq dma 4096 out
res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
[35997.593290] ata2.00: status: { DRDY }
[35997.593303] ata2.00: failed command: WRITE FPDMA QUEUED
[35997.593320] ata2.00: cmd 61/08:e8:d8:7b:85/00:00:08:00:00/40 tag 29 ncq dma 4096 out
res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
[35997.593363] ata2.00: status: { DRDY }
[35997.593376] ata2: hard resetting link
[35998.062790] ata2: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
[35998.068032] ata2.00: configured for UDMA/133
[35998.068040] ahci 0000:0e:00.0: port does not support device sleep
[35998.068087] ata2: EH complete
[39587.879366] ata2.00: exception Emask 0x10 SAct 0xf0000 SErr 0x4050000 action 0xe frozen
[39587.879409] ata2.00: irq_stat 0x00000040, connection status changed
[39587.879433] ata2: SError: { PHYRdyChg CommWake DevExch }
[39587.879454] ata2.00: failed command: WRITE FPDMA QUEUED
[39587.879474] ata2.00: cmd 61/08:80:78:7b:85/00:00:08:00:00/40 tag 16 ncq dma 4096 out
res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x10 (ATA bus error)
[39587.879532] ata2.00: status: { DRDY }
[39587.879548] ata2.00: failed command: WRITE FPDMA QUEUED
[39587.879568] ata2.00: cmd 61/08:88:d8:7b:85/00:00:08:00:00/40 tag 17 ncq dma 4096 out
res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x10 (ATA bus error)
[39587.879625] ata2.00: status: { DRDY }
[39587.879641] ata2.00: failed command: WRITE FPDMA QUEUED
[39587.879660] ata2.00: cmd 61/08:90:00:7c:85/00:00:08:00:00/40 tag 18 ncq dma 4096 out
res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x10 (ATA bus error)
[39587.879718] ata2.00: status: { DRDY }
[39587.879734] ata2.00: failed command: WRITE FPDMA QUEUED
[39587.879753] ata2.00: cmd 61/08:98:80:7d:85/00:00:08:00:00/40 tag 19 ncq dma 4096 out
res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x10 (ATA bus error)
[39587.879811] ata2.00: status: { DRDY }
[39587.879833] ata2: hard resetting link
[39588.776254] ata2: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
[39588.781462] ata2.00: configured for UDMA/133
[39588.781468] ahci 0000:0e:00.0: port does not support device sleep
[39588.781514] ata2: EH complete
As far as I can tell, the disk in question is `/dev/sdb`, which is a btrfs-formatted drive with subvolumes for `/` and `~/`.
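One way to double-check which disk sits behind `ata2` is to resolve the sysfs symlinks under `/sys/block` and look for the port name in the resulting path. This is only a rough sketch: the function name `disks_on_ata_port` and its optional second argument are made up for illustration, and it assumes the disk shows up as an `sd*` device.

```shell
#!/bin/sh
# Print the block devices whose sysfs path runs through a given ATA port.
# Usage: disks_on_ata_port ata2 [sysfs_block_dir]
# The second argument is hypothetical and only exists so the function can
# be pointed at a test directory; it defaults to the real /sys/block.
disks_on_ata_port() {
    port=$1
    block_dir=${2:-/sys/block}
    for dev in "$block_dir"/sd*; do
        [ -e "$dev" ] || continue        # the glob matched nothing
        target=$(readlink -f "$dev")     # resolve the sysfs symlink
        case "$target" in
            # Match the port as a whole path component, so ata2 != ata20.
            */"$port"/*) echo "/dev/${dev##*/}" ;;
        esac
    done
}
```

Running `disks_on_ata_port ata2` should then list the device(s) attached to that port, which can be compared against the `ata2.00` lines in the log.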
Fstab looks like this:
# /dev/sdb1 LABEL=arch
UUID=d9dfa3d3-a12a-44d2-b48c-c603473700e8 / btrfs rw,noatime,autodefrag,compress=zstd,commit=120,discard=async,subvol=/_active/rootvol 0 0
# /dev/sdc1
UUID=7EBB-117E /boot vfat rw,noatime,fmask=0022,dmask=0022,codepage=437,iocharset=ascii,shortname=mixed,utf8,errors=remount-ro 0 2
# /dev/sdb1 LABEL=arch
UUID=d9dfa3d3-a12a-44d2-b48c-c603473700e8 /home btrfs rw,noatime,autodefrag,compress=zstd,commit=120,discard=async,subvol=/_active/homevol 0 0
# /dev/sdb1 LABEL=arch
UUID=d9dfa3d3-a12a-44d2-b48c-c603473700e8 /mnt/defvol btrfs rw,noatime,autodefrag,compress=zstd,commit=120,discard=async,subvol=/ 0 0
Some smartctl output:
smartctl 7.4 2023-08-01 r5530 [x86_64-linux-6.11.7-arch1-1] (local build)
Copyright (C) 2002-23, Bruce Allen, Christian Franke, www.smartmontools.org
=== START OF READ SMART DATA SECTION ===
General SMART Values:
Offline data collection status: (0x00) Offline data collection activity
was never started.
Auto Offline Data Collection: Disabled.
Self-test execution status: ( 0) The previous self-test routine completed
without error or no self-test has ever
been run.
Total time to complete Offline
data collection: ( 0) seconds.
Offline data collection
capabilities: (0x11) SMART execute Offline immediate.
No Auto Offline data collection support.
Suspend Offline collection upon new
command.
No Offline surface scan supported.
Self-test supported.
No Conveyance Self-test supported.
No Selective Self-test supported.
SMART capabilities: (0x0003) Saves SMART data before entering
power-saving mode.
Supports SMART auto save timer.
Error logging capability: (0x01) Error logging supported.
General Purpose Logging supported.
Short self-test routine
recommended polling time: ( 2) minutes.
Extended self-test routine
recommended polling time: ( 10) minutes.
After waiting 10 minutes for `sudo smartctl -t long /dev/sdb` to finish, `sudo smartctl -H /dev/sdb` reports:
smartctl 7.4 2023-08-01 r5530 [x86_64-linux-6.11.7-arch1-1] (local build)
Copyright (C) 2002-23, Bruce Allen, Christian Franke, www.smartmontools.org
=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED
Does anyone have troubleshooting steps I can try? My understanding of the smartctl output is that the drive itself doesn't have any issues, but I'm close to just ordering a new one to see if that fixes it.
I have the same issue here, and I have no idea how to track down the culprit. I have found, though, that it does not happen on the linux-lts kernel.
Not that there is any noticeable difference, but I will include my log and fstab as well.
ata2.00: exception Emask 0x10 SAct 0x200000 SErr 0x4050000 action 0xe frozen
ata2.00: irq_stat 0x00000040, connection status changed
ata2: SError: { PHYRdyChg CommWake DevExch }
ata2.00: failed command: WRITE FPDMA QUEUED
ata2.00: cmd 61/08:a8:28:71:a9/00:00:55:00:00/40 tag 21 ncq dma 4096 out
res 40/00:01:00:4f:c2/00:00:00:00:00/00 Emask 0x10 (ATA bus error)
ata2.00: status: { DRDY }
UUID=f3bae795-c6a0-4c64-9d9b-3e20df34cf0e /home btrfs subvol=/@home,defaults,noatime,compress=zstd 0 0
UUID=f3bae795-c6a0-4c64-9d9b-3e20df34cf0e /var/cache btrfs subvol=/@cache,defaults,noatime,compress=zstd 0 0
UUID=f3bae795-c6a0-4c64-9d9b-3e20df34cf0e /var/log btrfs subvol=/@log,defaults,noatime,compress=zstd 0 0
UUID=E67B-896F /boot vfat defaults,noatime 0 2
UUID=0a0f530d-110c-46ae-bb4f-8b68db01cb34 swap swap defaults 0 0
tmpfs /tmp tmpfs defaults,noatime,mode=1777 0 0
UUID=f3bae795-c6a0-4c64-9d9b-3e20df34cf0e / btrfs subvol=/@,defaults,noatime,compress=zstd 0 0
UUID=e05873fe-92ce-41a0-9429-d1a1f3c02ecb /mnt/jason/ssd2 btrfs nosuid,nodev,nofail,x-gvfs-show 0 0
`smartctl -H` is borderline useless. If you want output that actually tells you something, `smartctl -a` is the minimum (newer smartctl versions also suggest `-x` for more info).
As for the ATA issues, the default power saving mode was changed somewhat recently. Check whether explicitly going with max_performance helps: https://wiki.archlinux.org/title/Power_ … Management -- if you're using TLP or similar, it will configure and change this itself, so check the configuration there.
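To make the max_performance suggestion concrete, here is a minimal sketch that reads and sets the SATA link power management policy through sysfs. It assumes the policy files live at `/sys/class/scsi_host/host*/link_power_management_policy`; the function names and the optional directory argument are invented for illustration. Writing to these files needs root, and the change is not expected to survive a reboot, so treat it as a quick test rather than a permanent fix.

```shell
#!/bin/sh
# Print the current link power management policy of every SCSI/SATA host.
# Usage: show_lpm_policies [scsi_host_dir]
# The optional argument is hypothetical and only exists so the function
# can be tried against a test directory.
show_lpm_policies() {
    host_dir=${1:-/sys/class/scsi_host}
    for f in "$host_dir"/host*/link_power_management_policy; do
        [ -r "$f" ] || continue          # skip hosts without the file
        host=${f%/link_power_management_policy}
        echo "${host##*/}: $(cat "$f")"
    done
}

# Set every host to max_performance (run as root on a real system).
# Usage: set_lpm_max_performance [scsi_host_dir]
set_lpm_max_performance() {
    host_dir=${1:-/sys/class/scsi_host}
    for f in "$host_dir"/host*/link_power_management_policy; do
        [ -w "$f" ] || continue          # not writable: not root, or no LPM
        echo max_performance > "$f"
    done
}
```

Calling `show_lpm_policies` before and after `set_lpm_max_performance` shows whether the setting took effect. If the lock-ups stop, the setting still has to be made persistent, and if TLP or a similar tool is installed it may overwrite the value again.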
Fixing this for everyone would mean adding a quirk for your drive model to the Linux kernel, so that the new default is no longer applied to your device.