
#1 2020-06-28 13:09:13

Al.Piotrowicz
Member
Registered: 2017-08-07
Posts: 116

[SOLVED] Kernel 5.7.6 weird sata problem

Although I've posted about similar issues in the past, this time I'd like an experienced user to help me narrow down what is (or most likely may be) the cause of my system's behaviour. The SATA port randomly resets and writes this to the log:

cze 28 13:36:39 rebro kernel: ata1: softreset failed (device not ready)
cze 28 13:36:49 rebro kernel: ata1: softreset failed (device not ready)
cze 28 13:36:59 rebro kernel: ata1: link is slow to respond, please be patient (ready=0)
cze 28 13:37:24 rebro kernel: ata1: softreset failed (device not ready)
cze 28 13:37:24 rebro kernel: ata1: limiting SATA link speed to 1.5 Gbps
cze 28 13:37:29 rebro udisksd[2810]: Error performing housekeeping for drive /org/freedesktop/UDisks2/drives/WDC_WD20EARS_00S8B1_WD_WCAVY5975648: Error upd>
                                       0000: 70 00 05 00  00 00 00 0a  00 40 00 00  21 04 00 00    p........@..!...
                                       0010: 00 00 00 00  00 00 00 00  00 00 00 00  00 00 00 00    ................
                                        (g-io-error-quark, 0)
cze 28 13:37:29 rebro kernel: ata1: softreset failed (device not ready)
cze 28 13:37:29 rebro kernel: ata1: reset failed, giving up
cze 28 13:37:29 rebro kernel: ata1.00: disabled
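To get an overview of how often each libata message shows up per port, the kernel lines can be tallied. This is only a rough sketch: the function name and the regular expression are my own, and it assumes the journal format shown above (`... kernel: ataN: message`).

```python
import re
from collections import Counter

# Matches "kernel: ata1: ..." as well as "kernel: ata1.00: ..."
ATA_LINE = re.compile(r"kernel: (ata\d+(?:\.\d+)?): (.+)$")

def summarize_ata_events(journal_text):
    """Count identical libata messages per port/device in journal text."""
    counts = Counter()
    for line in journal_text.splitlines():
        match = ATA_LINE.search(line)
        if match:
            counts[(match.group(1), match.group(2))] += 1
    return counts

# Kernel lines copied from the log excerpt above.
journal = """\
cze 28 13:36:39 rebro kernel: ata1: softreset failed (device not ready)
cze 28 13:36:49 rebro kernel: ata1: softreset failed (device not ready)
cze 28 13:36:59 rebro kernel: ata1: link is slow to respond, please be patient (ready=0)
cze 28 13:37:24 rebro kernel: ata1: softreset failed (device not ready)
cze 28 13:37:24 rebro kernel: ata1: limiting SATA link speed to 1.5 Gbps
cze 28 13:37:29 rebro kernel: ata1: softreset failed (device not ready)
cze 28 13:37:29 rebro kernel: ata1: reset failed, giving up
cze 28 13:37:29 rebro kernel: ata1.00: disabled
"""

counts = summarize_ata_events(journal)
for (port, message), n in counts.most_common():
    print(f"{n:3d}  {port}: {message}")
```

On a live system the same function could be fed the output of `journalctl -k`.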

As expected, errors from the XFS filesystem on an LVM volume on that disk then follow:

cze 28 13:38:51 rebro kernel: sd 0:0:0:0: [sda] tag#31 FAILED Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK cmd_age=0s
cze 28 13:38:51 rebro kernel: sd 0:0:0:0: [sda] tag#31 CDB: Read(10) 28 00 3e b5 fa c0 00 00 08 00
cze 28 13:38:51 rebro kernel: blk_update_request: I/O error, dev sda, sector 1052113600 op 0x0:(READ) flags 0x81700 phys_seg 1 prio class 0
cze 28 13:38:51 rebro kernel: sd 0:0:0:0: [sda] tag#0 FAILED Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK cmd_age=0s
cze 28 13:38:51 rebro kernel: sd 0:0:0:0: [sda] tag#0 CDB: Read(10) 28 00 3e b5 fa c0 00 00 08 00
cze 28 13:38:51 rebro kernel: blk_update_request: I/O error, dev sda, sector 1052113600 op 0x0:(READ) flags 0x1000 phys_seg 1 prio class 0
cze 28 13:38:51 rebro kernel: XFS (dm-9): metadata I/O error in "xfs_btree_read_buf_block.constprop.0+0xbc/0x100 [xfs]" at daddr 0x3eb5bac0 len 8 error 5
cze 28 13:38:51 rebro kernel: sd 0:0:0:0: [sda] tag#1 FAILED Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK cmd_age=0s
cze 28 13:38:51 rebro kernel: sd 0:0:0:0: [sda] tag#1 CDB: Read(10) 28 00 3e b5 fa c0 00 00 08 00
cze 28 13:38:51 rebro kernel: blk_update_request: I/O error, dev sda, sector 1052113600 op 0x0:(READ) flags 0x81700 phys_seg 1 prio class 0
cze 28 13:38:51 rebro kernel: sd 0:0:0:0: [sda] tag#2 FAILED Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK cmd_age=0s
cze 28 13:38:51 rebro kernel: sd 0:0:0:0: [sda] tag#2 CDB: Read(10) 28 00 3e b5 fa c0 00 00 08 00
cze 28 13:38:51 rebro kernel: blk_update_request: I/O error, dev sda, sector 1052113600 op 0x0:(READ) flags 0x1000 phys_seg 1 prio class 0
cze 28 13:38:51 rebro kernel: XFS (dm-9): metadata I/O error in "xfs_btree_read_buf_block.constprop.0+0xbc/0x100 [xfs]" at daddr 0x3eb5bac0 len 8 error 5
cze 28 13:38:51 rebro kernel: sd 0:0:0:0: [sda] tag#3 FAILED Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK cmd_age=0s
cze 28 13:38:51 rebro kernel: sd 0:0:0:0: [sda] tag#3 CDB: Read(10) 28 00 3e b5 fa c0 00 00 08 00
cze 28 13:38:51 rebro kernel: blk_update_request: I/O error, dev sda, sector 1052113600 op 0x0:(READ) flags 0x81700 phys_seg 1 prio class 0
cze 28 13:38:51 rebro kernel: sd 0:0:0:0: [sda] tag#4 FAILED Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK cmd_age=0s
cze 28 13:38:51 rebro kernel: sd 0:0:0:0: [sda] tag#4 CDB: Read(10) 28 00 3e b5 fa c0 00 00 08 00
cze 28 13:38:51 rebro kernel: blk_update_request: I/O error, dev sda, sector 1052113600 op 0x0:(READ) flags 0x1000 phys_seg 1 prio class 0
cze 28 13:38:51 rebro kernel: XFS (dm-9): metadata I/O error in "xfs_btree_read_buf_block.constprop.0+0xbc/0x100 [xfs]" at daddr 0x3eb5bac0 len 8 error 5
cze 28 13:38:51 rebro kernel: sd 0:0:0:0: [sda] tag#5 FAILED Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK cmd_age=0s
cze 28 13:38:51 rebro kernel: sd 0:0:0:0: [sda] tag#5 CDB: Read(10) 28 00 3e b5 fa c0 00 00 08 00
cze 28 13:38:51 rebro kernel: blk_update_request: I/O error, dev sda, sector 1052113600 op 0x0:(READ) flags 0x81700 phys_seg 1 prio class 0
cze 28 13:38:51 rebro kernel: sd 0:0:0:0: [sda] tag#6 FAILED Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK cmd_age=0s
cze 28 13:38:51 rebro kernel: sd 0:0:0:0: [sda] tag#6 CDB: Read(10) 28 00 3e b5 fa c0 00 00 08 00
cze 28 13:38:51 rebro kernel: blk_update_request: I/O error, dev sda, sector 1052113600 op 0x0:(READ) flags 0x1000 phys_seg 1 prio class 0
cze 28 13:38:51 rebro kernel: XFS (dm-9): metadata I/O error in "xfs_btree_read_buf_block.constprop.0+0xbc/0x100 [xfs]" at daddr 0x3eb5bac0 len 8 error 5
cze 28 13:38:51 rebro kernel: sd 0:0:0:0: [sda] tag#7 FAILED Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK cmd_age=0s
cze 28 13:38:51 rebro kernel: sd 0:0:0:0: [sda] tag#7 CDB: Read(10) 28 00 3e b5 fa c0 00 00 08 00
cze 28 13:38:51 rebro kernel: blk_update_request: I/O error, dev sda, sector 1052113600 op 0x0:(READ) flags 0x81700 phys_seg 1 prio class 0
cze 28 13:38:51 rebro kernel: sd 0:0:0:0: [sda] tag#12 FAILED Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK cmd_age=0s
cze 28 13:38:51 rebro kernel: sd 0:0:0:0: [sda] tag#12 CDB: Read(10) 28 00 3e b5 fa c0 00 00 08 00
cze 28 13:38:51 rebro kernel: blk_update_request: I/O error, dev sda, sector 1052113600 op 0x0:(READ) flags 0x1000 phys_seg 1 prio class 0
cze 28 13:38:51 rebro kernel: XFS (dm-9): metadata I/O error in "xfs_btree_read_buf_block.constprop.0+0xbc/0x100 [xfs]" at daddr 0x3eb5bac0 len 8 error 5
cze 28 13:38:51 rebro kernel: XFS (dm-9): metadata I/O error in "xfs_btree_read_buf_block.constprop.0+0xbc/0x100 [xfs]" at daddr 0x3eb5bac0 len 8 error 5
cze 28 13:38:51 rebro kernel: XFS (dm-9): metadata I/O error in "xfs_btree_read_buf_block.constprop.0+0xbc/0x100 [xfs]" at daddr 0x3eb5bac0 len 8 error 5
cze 28 13:38:51 rebro kernel: XFS (dm-9): metadata I/O error in "xfs_btree_read_buf_block.constprop.0+0xbc/0x100 [xfs]" at daddr 0x3eb5bac0 len 8 error 5
cze 28 13:38:51 rebro kernel: XFS (dm-9): metadata I/O error in "xfs_btree_read_buf_block.constprop.0+0xbc/0x100 [xfs]" at daddr 0x3eb5bac0 len 8 error 5
cze 28 13:38:51 rebro kernel: XFS (dm-9): metadata I/O error in "xfs_btree_read_buf_block.constprop.0+0xbc/0x100 [xfs]" at daddr 0x3eb5bac0 len 8 error 5
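As a sanity check on the lines above, the Read(10) CDB can be decoded by hand: bytes 2 to 5 hold the big-endian LBA and bytes 7 to 8 the transfer length. A minimal sketch (the function name is made up for illustration):

```python
def decode_read10(cdb_hex):
    """Return (lba, blocks) from a Read(10) CDB given as space-separated hex bytes."""
    cdb = bytes.fromhex(cdb_hex)
    if len(cdb) != 10 or cdb[0] != 0x28:
        raise ValueError("not a Read(10) CDB")
    lba = int.from_bytes(cdb[2:6], "big")      # logical block address
    blocks = int.from_bytes(cdb[7:9], "big")   # transfer length in blocks
    return lba, blocks

lba, blocks = decode_read10("28 00 3e b5 fa c0 00 00 08 00")
print(lba, blocks)

# Difference between the sda sector and the XFS daddr reported for dm-9.
offset = lba - 0x3eb5bac0
print(offset)
```

The decoded LBA is 1052113600, the same sector that blk_update_request reports, and the length of 8 blocks matches the "len 8" in the XFS message. The XFS daddr is relative to dm-9, so the difference of 16384 sectors is presumably where the logical volume starts on sda, assuming a plain linear mapping. Since every request hits the same sector and fails with DID_BAD_TARGET right after ata1.00 was disabled, this looks like the device being gone rather than a bad sector.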

Finally, it shuts down the filesystem, I assume:

cze 28 13:38:58 rebro kernel: scsi_io_completion_action: 32786 callbacks suppressed
cze 28 13:38:58 rebro kernel: sd 0:0:0:0: [sda] tag#21 FAILED Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK cmd_age=0s
cze 28 13:38:58 rebro kernel: sd 0:0:0:0: [sda] tag#21 CDB: Read(10) 28 00 3e b5 fa c0 00 00 08 00
cze 28 13:38:58 rebro kernel: print_req_error: 32786 callbacks suppressed
cze 28 13:38:58 rebro kernel: blk_update_request: I/O error, dev sda, sector 1052113600 op 0x0:(READ) flags 0x81700 phys_seg 1 prio class 0
cze 28 13:38:58 rebro kernel: sd 0:0:0:0: [sda] tag#22 FAILED Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK cmd_age=0s
cze 28 13:38:58 rebro kernel: sd 0:0:0:0: [sda] tag#22 CDB: Read(10) 28 00 3e b5 fa c0 00 00 08 00
cze 28 13:38:58 rebro kernel: blk_update_request: I/O error, dev sda, sector 1052113600 op 0x0:(READ) flags 0x1000 phys_seg 1 prio class 0
cze 28 13:38:58 rebro kernel: xfs_buf_ioerror_alert: 16388 callbacks suppressed
cze 28 13:38:58 rebro kernel: XFS (dm-9): metadata I/O error in "xfs_btree_read_buf_block.constprop.0+0xbc/0x100 [xfs]" at daddr 0x3eb5bac0 len 8 error 5
cze 28 13:38:58 rebro kernel: sd 0:0:0:0: [sda] tag#23 FAILED Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK cmd_age=0s
cze 28 13:38:58 rebro kernel: sd 0:0:0:0: [sda] tag#23 CDB: Read(10) 28 00 3e b5 fa c0 00 00 08 00
cze 28 13:38:58 rebro kernel: blk_update_request: I/O error, dev sda, sector 1052113600 op 0x0:(READ) flags 0x81700 phys_seg 1 prio class 0
cze 28 13:38:58 rebro kernel: sd 0:0:0:0: [sda] tag#31 FAILED Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK cmd_age=0s
cze 28 13:38:58 rebro kernel: sd 0:0:0:0: [sda] tag#31 CDB: Read(10) 28 00 3e b5 fa c0 00 00 08 00
cze 28 13:38:58 rebro kernel: blk_update_request: I/O error, dev sda, sector 1052113600 op 0x0:(READ) flags 0x1000 phys_seg 1 prio class 0
cze 28 13:38:58 rebro kernel: XFS (dm-9): metadata I/O error in "xfs_btree_read_buf_block.constprop.0+0xbc/0x100 [xfs]" at daddr 0x3eb5bac0 len 8 error 5
cze 28 13:38:58 rebro kernel: sd 0:0:0:0: [sda] tag#0 FAILED Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK cmd_age=0s
cze 28 13:38:58 rebro kernel: sd 0:0:0:0: [sda] tag#0 CDB: Read(10) 28 00 3e b5 fa c0 00 00 08 00
cze 28 13:38:58 rebro kernel: blk_update_request: I/O error, dev sda, sector 1052113600 op 0x0:(READ) flags 0x81700 phys_seg 1 prio class 0
cze 28 13:38:58 rebro kernel: sd 0:0:0:0: [sda] tag#1 FAILED Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK cmd_age=0s
cze 28 13:38:58 rebro kernel: sd 0:0:0:0: [sda] tag#1 CDB: Read(10) 28 00 3e b5 fa c0 00 00 08 00
cze 28 13:38:58 rebro kernel: blk_update_request: I/O error, dev sda, sector 1052113600 op 0x0:(READ) flags 0x1000 phys_seg 1 prio class 0
cze 28 13:38:58 rebro kernel: XFS (dm-9): metadata I/O error in "xfs_btree_read_buf_block.constprop.0+0xbc/0x100 [xfs]" at daddr 0x3eb5bac0 len 8 error 5
cze 28 13:38:58 rebro kernel: sd 0:0:0:0: [sda] tag#2 FAILED Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK cmd_age=0s
cze 28 13:38:58 rebro kernel: sd 0:0:0:0: [sda] tag#2 CDB: Read(10) 28 00 3e b5 fa c0 00 00 08 00
cze 28 13:38:58 rebro kernel: blk_update_request: I/O error, dev sda, sector 1052113600 op 0x0:(READ) flags 0x81700 phys_seg 1 prio class 0
cze 28 13:38:58 rebro kernel: sd 0:0:0:0: [sda] tag#3 FAILED Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK cmd_age=0s
cze 28 13:38:58 rebro kernel: sd 0:0:0:0: [sda] tag#3 CDB: Read(10) 28 00 3e b5 fa c0 00 00 08 00
cze 28 13:38:58 rebro kernel: blk_update_request: I/O error, dev sda, sector 1052113600 op 0x0:(READ) flags 0x1000 phys_seg 1 prio class 0
cze 28 13:38:58 rebro kernel: XFS (dm-9): metadata I/O error in "xfs_btree_read_buf_block.constprop.0+0xbc/0x100 [xfs]" at daddr 0x3eb5bac0 len 8 error 5
cze 28 13:38:58 rebro kernel: sd 0:0:0:0: [sda] tag#4 FAILED Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK cmd_age=0s
cze 28 13:38:58 rebro kernel: sd 0:0:0:0: [sda] tag#4 CDB: Read(10) 28 00 3e b5 fa c0 00 00 08 00
cze 28 13:38:58 rebro kernel: blk_update_request: I/O error, dev sda, sector 1052113600 op 0x0:(READ) flags 0x81700 phys_seg 1 prio class 0
cze 28 13:38:58 rebro kernel: sd 0:0:0:0: [sda] tag#5 FAILED Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK cmd_age=0s
cze 28 13:38:58 rebro kernel: sd 0:0:0:0: [sda] tag#5 CDB: Read(10) 28 00 3e b5 fa c0 00 00 08 00
cze 28 13:38:58 rebro kernel: blk_update_request: I/O error, dev sda, sector 1052113600 op 0x0:(READ) flags 0x1000 phys_seg 1 prio class 0
cze 28 13:38:58 rebro kernel: XFS (dm-9): metadata I/O error in "xfs_btree_read_buf_block.constprop.0+0xbc/0x100 [xfs]" at daddr 0x3eb5bac0 len 8 error 5
cze 28 13:38:58 rebro kernel: XFS (dm-9): metadata I/O error in "xfs_btree_read_buf_block.constprop.0+0xbc/0x100 [xfs]" at daddr 0x3eb5bac0 len 8 error 5
cze 28 13:38:58 rebro kernel: XFS (dm-9): metadata I/O error in "xfs_btree_read_buf_block.constprop.0+0xbc/0x100 [xfs]" at daddr 0x3eb5bac0 len 8 error 5
cze 28 13:38:58 rebro kernel: XFS (dm-9): metadata I/O error in "xfs_btree_read_buf_block.constprop.0+0xbc/0x100 [xfs]" at daddr 0x3eb5bac0 len 8 error 5
cze 28 13:38:58 rebro kernel: XFS (dm-9): metadata I/O error in "xfs_btree_read_buf_block.constprop.0+0xbc/0x100 [xfs]" at daddr 0x3eb5bac0 len 8 error 5
cze 28 13:38:59 rebro kernel: XFS (dm-9): metadata I/O error in "xfs_btree_read_buf_block.constprop.0+0xbc/0x100 [xfs]" at daddr 0x3eb5bac0 len 8 error 5
cze 28 13:39:01 rebro kernel: XFS (dm-9): log I/O error -5
cze 28 13:39:01 rebro kernel: XFS (dm-9): xfs_do_force_shutdown(0x2) called from line 1196 of file fs/xfs/xfs_log.c. Return address = 00000000878a19ac
cze 28 13:39:01 rebro kernel: XFS (dm-9): Log I/O Error Detected. Shutting down filesystem
cze 28 13:39:01 rebro kernel: XFS (dm-9): Please unmount the filesystem and rectify the problem(s)

After a soft reboot the disk wasn't recognized by the system BIOS until I powered the machine off and back on. After that the disk initialized gracefully and I'm able to work with it without a problem. SMART isn't giving me any clues (no errors, no bad sectors, nothing).

I'd really appreciate any suggestions about what could trigger this behaviour. Thank you.

Last edited by Al.Piotrowicz (2020-08-18 09:37:00)

Offline

#2 2020-06-28 13:47:12

EdeWolf
Member
Registered: 2016-01-06
Posts: 79

Re: [SOLVED] Kernel 5.7.6 weird sata problem

I've had similar errors due to a bad SATA cable. In fact, more than once.

However, if the problem goes away once you switch back to a 5.6 kernel, then this is of course not the solution.

I suspect that

smartctl -l error /dev/sda

shows no errors, and that the error count does not go up if you run it again after a while?
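For cable trouble specifically, the attribute table from `smartctl -A` can be more telling than the error log: UDMA_CRC_Error_Count (ID 199) counts CRC errors on the interface. Below is a small sketch for pulling a raw value out of that output so two readings can be compared. The helper name is hypothetical and the sample row is made up, not taken from the drive in this thread.

```python
def smart_raw_value(smartctl_output, attribute):
    """Return the integer RAW_VALUE of a named attribute from `smartctl -A` text, or None."""
    for line in smartctl_output.splitlines():
        fields = line.split()
        # Attribute rows start with a numeric ID and have at least 10 columns:
        # ID# NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
        if len(fields) >= 10 and fields[0].isdigit() and fields[1] == attribute:
            return int(fields[9])
    return None

# Illustrative sample row only (not real data from this machine).
sample = "199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       0"
print(smart_raw_value(sample, "UDMA_CRC_Error_Count"))

# On a live system the text could come from, for example:
#   subprocess.run(["smartctl", "-A", "/dev/sda"], capture_output=True, text=True).stdout
```

If the raw value keeps climbing between two readings, the cable or port is a likely suspect. Note that this simple parser only handles attributes whose raw value is a plain integer.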

Offline

#3 2020-06-28 13:52:46

Al.Piotrowicz
Member
Registered: 2017-08-07
Posts: 116

Re: [SOLVED] Kernel 5.7.6 weird sata problem

I'm now on the official LTS kernel, waiting for it to trigger:

[user@rebro ~]$ sudo smartctl -l error /dev/sda
[sudo] password for user: 
smartctl 7.1 2019-12-30 r5022 [x86_64-linux-5.4.49-1-lts] (local build)
Copyright (C) 2002-19, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF READ SMART DATA SECTION ===
SMART Error Log Version: 1
No Errors Logged

Offline

#4 2020-06-30 13:43:55

EdeWolf
Member
Registered: 2016-01-06
Posts: 79

Re: [SOLVED] Kernel 5.7.6 weird sata problem

So the drive seems to be fine. All the more reason, if the error occurs again with the LTS kernel, to change the SATA cable and, if that does not help, the SATA port as well. No guarantee, of course, but I've had this more than once.
As a last resort you could try adding libata.force=noncq to your kernel command line. But as far as I remember, NCQ errors produced different error messages.
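To confirm that the option actually reached the running kernel after editing the boot loader entry, /proc/cmdline can be checked. A small sketch follows; the helper name and the sample command lines are hypothetical. libata.force takes a comma-separated list of `[ID:]VAL` entries, so the values are split on commas.

```python
def libata_force_values(cmdline):
    """Return all values passed via libata.force= on a kernel command line."""
    values = []
    for token in cmdline.split():
        key, sep, value = token.partition("=")
        if key == "libata.force" and sep:
            values.extend(value.split(","))
    return values

# Hypothetical command lines for illustration.
plain = "root=/dev/mapper/vg-root rw libata.force=noncq"
combined = "root=/dev/mapper/vg-root rw libata.force=1:1.5Gbps,noncq"
print(libata_force_values(plain))
print(libata_force_values(combined))

# On a live system:
#   with open("/proc/cmdline") as f:
#       print(libata_force_values(f.read()))
```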

Offline

#5 2020-07-01 10:31:47

rep_movsd
Member
Registered: 2013-08-24
Posts: 133

Re: [SOLVED] Kernel 5.7.6 weird sata problem

Recently I too started facing some random freezes - the system just locks up for 10 to 15 seconds - mouse and keyboard and display frozen, audio continues to play - when it recovers all the things I did with the mouse happen.

I have 3 disks in my laptop - 2 NVMe and 1 SATA
I was running RAID 0 with the root partition across all 3

I was getting some error related to ata4 DRDY or something
I forgot that ata refers to my SATA disk, and since 1 of the NVMe disks showed 6 media errors in SMART, I jumped to the conclusion that it was the reason...
The other 2 disks had no errors.

So I migrated the install onto the SATA drive on a single non-RAID partition
Still was getting crashes - most of them on Chromium
Then I noticed in journalctl that Xorg was crashing with a coredump in some nvidia gl core binary

I rolled back all my pacman packages to June 15th using the rollback machine
Still no use

Then I realized the ata4 error was for my SATA drive, and remigrated the whole install again to the error-free NVMe drive (PHEW)
After this I got neither of these errors in dmesg or journalctl, but still occasional freezing happened - this time I saw chromium doing the core dump in dmesg

Then I read on some random forum about nvidia 440.82 having issues when the GPU clock tries to ramp up. I have 440.100, but it seemed a likely explanation.
Setting the PowerMizer settings to full performance seems to cure the problem!

I also switched to linux-ck in between, but the freezes were not solved by the different kernel

Waiting now to see if I get that issue again

Last edited by rep_movsd (2020-07-01 10:34:29)

Offline

#6 2020-07-01 15:10:57

rep_movsd
Member
Registered: 2013-08-24
Posts: 133

Re: [SOLVED] Kernel 5.7.6 weird sata problem

Follow up

Enabling kernel mode setting seems to be the culprit - not only does chromium cause freezes with it, but the cursor doesn't render correctly either - sometimes it isn't drawn transparently
I tried disabling the GPU rendering options in chromium to no avail - it doesn't disable all GPU usage

After removing the nvidia-drm.modeset=1 kernel parameter it seems to be OK

Offline

#7 2020-07-03 15:36:25

Al.Piotrowicz
Member
Registered: 2017-08-07
Posts: 116

Re: [SOLVED] Kernel 5.7.6 weird sata problem

Hey rep_movsd, thanks for your feedback, but your problem doesn't really seem to be related, because I don't see any libata status expansion { DRDY } errors. Take a look here. For me it's a SATA port reset, like I mentioned in the OP. I did some investigation in the meantime and found out that the problem is most probably connected to a kernel feature called SATA port multiplier support. I'm probably affected by the issue here, because I have these SATA controllers:

00:11.0 SATA controller: Advanced Micro Devices, Inc. [AMD/ATI] SB7x0/SB8x0/SB9x0 SATA Controller [AHCI mode] (rev 40)
03:00.0 SATA controller: JMicron Technology Corp. JMB362 SATA Controller (rev 10)

After recompiling the kernel with CONFIG_SATA_PMP=n the issue still occurs, but now it just resets the disk (Power-Off_Retract_Count and Power_Cycle_Count go up). This came up in the journal on the self-compiled 5.7.6-arch:

lip 02 22:19:09 rebro kernel: ata1: link is slow to respond, please be patient (ready=0)
lip 02 22:19:10 rebro kernel: ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
lip 02 22:19:11 rebro kernel: ata1.00: configured for UDMA/133

This time it doesn't affect the mounted filesystem in any way. Still investigating. It could be a hardware issue though, but I'm not that smart.
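For reference, the SStatus value in the "link up" line can be unpacked by hand. This is a sketch, assuming the standard SATA SStatus layout (DET in bits 0-3, SPD in bits 4-7, IPM in bits 8-11) and that libata prints the register in hex. The function and table names are made up.

```python
# Negotiated interface speed by SPD field value.
SPEEDS = {1: "1.5 Gbps", 2: "3.0 Gbps", 3: "6.0 Gbps"}

def decode_sstatus(value):
    """Split a SATA SStatus register value into its DET, SPD and IPM fields."""
    return {
        "det": value & 0xF,         # device detection state
        "spd": (value >> 4) & 0xF,  # negotiated speed generation
        "ipm": (value >> 8) & 0xF,  # interface power management state
    }

fields = decode_sstatus(int("123", 16))
print(fields, SPEEDS.get(fields["spd"], "unknown"))
```

DET=3 means a device is present and communication is established, SPD=2 is the 3.0 Gbps rate printed in the same line, and IPM=1 is the active state. So after this reset the link came back at full speed instead of being limited to 1.5 Gbps as in the first log.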

EDIT: fixed KVER... & typo

Last edited by Al.Piotrowicz (2020-07-03 15:43:18)

Offline

#8 2020-08-18 09:36:41

Al.Piotrowicz
Member
Registered: 2017-08-07
Posts: 116

Re: [SOLVED] Kernel 5.7.6 weird sata problem

Managed to solve the issue by compiling the kernel with CONFIG_SATA_PMP=n. No more problems since then. Marking as solved.
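For anyone repeating this, one way to verify that the rebuilt kernel really has the option off is to read the running kernel's config. This is a sketch, assuming the kernel exposes /proc/config.gz (CONFIG_IKCONFIG_PROC); the helper name and the sample config text are made up. Note that a disabled boolean option shows up in the config file as a comment line, "# CONFIG_SATA_PMP is not set", rather than as "=n".

```python
import gzip

def config_option_state(config_text, option):
    """Return 'n' if the option is explicitly not set, its value if set, else None."""
    for line in config_text.splitlines():
        line = line.strip()
        if line == f"# {option} is not set":
            return "n"
        if line.startswith(option + "="):
            return line.split("=", 1)[1]
    return None

# Hypothetical config fragment for illustration.
sample_config = """\
CONFIG_SATA_AHCI=y
# CONFIG_SATA_PMP is not set
CONFIG_ATA_SFF=y
"""
print(config_option_state(sample_config, "CONFIG_SATA_PMP"))

# On a live system:
#   with gzip.open("/proc/config.gz", "rt") as f:
#       print(config_option_state(f.read(), "CONFIG_SATA_PMP"))
```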

Offline
