
#1 2025-01-19 17:24:59

jzhu
Member
Registered: 2012-06-17
Posts: 30

Resurface issue: mv_sas can't find all hdds (only find 2)

I am reviving an old workstation that has 5 HDD bays. I initially had only two discs; when I recently added more, I found that the kernel successfully detects only two of them, while the BIOS sees all of them. kernel.org already had the same report (kernel.org Bug 214967), which was fixed back in 2021, but the issue seems to have resurfaced. I posted the issue on kernel.org and then realized that I should also report it to Arch, since I don't build the kernel myself.

my kernel:

 6.6.72-1-lts 

journal:

Jan 14 08:33:36 d20-x8664 kernel: drivers/scsi/mvsas/mv_sas.c 1058:phy 0 attach dev info is 0
Jan 14 08:33:36 d20-x8664 kernel: drivers/scsi/mvsas/mv_sas.c 1060:phy 0 attach sas addr is 0
Jan 14 08:33:36 d20-x8664 kernel: drivers/scsi/mvsas/mv_sas.c 1058:phy 1 attach dev info is 0
Jan 14 08:33:36 d20-x8664 kernel: drivers/scsi/mvsas/mv_sas.c 1060:phy 1 attach sas addr is 0
Jan 14 08:33:36 d20-x8664 kernel: drivers/scsi/mvsas/mv_sas.c 1058:phy 2 attach dev info is 0
Jan 14 08:33:36 d20-x8664 kernel: drivers/scsi/mvsas/mv_sas.c 1060:phy 2 attach sas addr is 0
Jan 14 08:33:36 d20-x8664 kernel: drivers/scsi/mvsas/mv_sas.c 1058:phy 3 attach dev info is 0
Jan 14 08:33:36 d20-x8664 kernel: drivers/scsi/mvsas/mv_sas.c 1060:phy 3 attach sas addr is 3
Jan 14 08:33:36 d20-x8664 kernel: drivers/scsi/mvsas/mv_sas.c 1058:phy 4 attach dev info is 0
Jan 14 08:33:36 d20-x8664 kernel: drivers/scsi/mvsas/mv_sas.c 1060:phy 4 attach sas addr is 0
Jan 14 08:33:36 d20-x8664 kernel: drivers/scsi/mvsas/mv_sas.c 1058:phy 5 attach dev info is 0
Jan 14 08:33:36 d20-x8664 kernel: drivers/scsi/mvsas/mv_sas.c 1060:phy 5 attach sas addr is 0
Jan 14 08:33:36 d20-x8664 kernel: drivers/scsi/mvsas/mv_sas.c 1058:phy 6 attach dev info is 0
Jan 14 08:33:36 d20-x8664 kernel: drivers/scsi/mvsas/mv_sas.c 1060:phy 6 attach sas addr is 0
Jan 14 08:33:36 d20-x8664 kernel: drivers/scsi/mvsas/mv_sas.c 1058:phy 7 attach dev info is 0
Jan 14 08:33:36 d20-x8664 kernel: drivers/scsi/mvsas/mv_sas.c 1060:phy 7 attach sas addr is 7
Jan 14 08:33:36 d20-x8664 kernel: scsi host6: mvsas
Jan 14 08:33:36 d20-x8664 kernel: drivers/scsi/mvsas/mv_sas.c 261:phy 0 byte dmaded.
Jan 14 08:33:36 d20-x8664 kernel: drivers/scsi/mvsas/mv_sas.c 261:phy 3 byte dmaded.
Jan 14 08:33:36 d20-x8664 kernel: drivers/scsi/mvsas/mv_sas.c 261:phy 7 byte dmaded.
Jan 14 08:33:36 d20-x8664 kernel: sas: phy-6:0 added to port-6:0, phy_mask:0x1 (0000000000000000)
Jan 14 08:33:36 d20-x8664 kernel: sas: DOING DISCOVERY on port 0, pid:11
Jan 14 08:33:36 d20-x8664 kernel: sas: Enter sas_scsi_recover_host busy: 0 failed: 0
Jan 14 08:33:36 d20-x8664 kernel: sas: ata7: end_device-6:0: dev error handler
Jan 14 08:33:36 d20-x8664 kernel: ata7.00: ATA-8: WDC WD2500AAJS-08L7A0, 03.03E03, max UDMA/100
Jan 14 08:33:36 d20-x8664 kernel: ata7.00: 488397168 sectors, multi 0: LBA48 NCQ (depth 32)
Jan 14 08:33:36 d20-x8664 kernel: ata7.00: configured for UDMA/100
Jan 14 08:33:36 d20-x8664 kernel: sas: --- Exit sas_scsi_recover_host: busy: 0 failed: 0 tries: 1
Jan 14 08:33:36 d20-x8664 kernel: scsi 6:0:0:0: Direct-Access     ATA      WDC WD2500AAJS-0 3E03 PQ: 0 ANSI: 5
Jan 14 08:33:36 d20-x8664 kernel: sas: DONE DISCOVERY on port 0, pid:11, result:0
Jan 14 08:33:36 d20-x8664 kernel: sas: phy-6:3 added to port-6:1, phy_mask:0x8 (0300000000000000)
Jan 14 08:33:36 d20-x8664 kernel: sas: DOING DISCOVERY on port 1, pid:11
Jan 14 08:33:36 d20-x8664 kernel: sas: Enter sas_scsi_recover_host busy: 0 failed: 0
Jan 14 08:33:36 d20-x8664 kernel: sas: ata7: end_device-6:0: dev error handler
Jan 14 08:33:36 d20-x8664 kernel: sas: ata8: end_device-6:1: dev error handler
Jan 14 08:33:36 d20-x8664 kernel: ata8.00: ATA-8: ST31000524NS, 130C, max UDMA/133
Jan 14 08:33:36 d20-x8664 kernel: ata8.00: 1953525168 sectors, multi 0: LBA48 NCQ (depth 32)
Jan 14 08:33:36 d20-x8664 kernel: drivers/scsi/mvsas/mv_sas.c 1585:port 3 slot 0 rx_desc 20000 has error info0000000081000000.
Jan 14 08:33:36 d20-x8664 kernel: drivers/scsi/mvsas/mv_sas.c 1585:port 3 slot 0 rx_desc 20000 has error info0000000081000000.
Jan 14 08:33:36 d20-x8664 kernel: ata8.00: configured for UDMA/133
Jan 14 08:33:36 d20-x8664 kernel: sas: --- Exit sas_scsi_recover_host: busy: 0 failed: 0 tries: 1
Jan 14 08:33:36 d20-x8664 kernel: scsi 6:0:1:0: Direct-Access     ATA      ST31000524NS     130C PQ: 0 ANSI: 5
Jan 14 08:33:36 d20-x8664 kernel: sas: DONE DISCOVERY on port 1, pid:11, result:0
Jan 14 08:33:36 d20-x8664 kernel: sas: phy-6:7 added to port-6:2, phy_mask:0x80 (0700000000000000)
Jan 14 08:33:36 d20-x8664 kernel: sas: DOING DISCOVERY on port 2, pid:11
Jan 14 08:33:36 d20-x8664 kernel: sas: Enter sas_scsi_recover_host busy: 0 failed: 0
Jan 14 08:33:36 d20-x8664 kernel: sas: ata7: end_device-6:0: dev error handler
Jan 14 08:33:36 d20-x8664 kernel: sas: ata8: end_device-6:1: dev error handler
Jan 14 08:33:36 d20-x8664 kernel: sas: ata9: end_device-6:2: dev error handler
Jan 14 08:33:36 d20-x8664 kernel: drivers/scsi/mvsas/mv_sas.c 1585:port 7 slot 0 rx_desc 20000 has error info0000000081000000.
Jan 14 08:33:36 d20-x8664 kernel: sas: --- Exit sas_scsi_recover_host: busy: 0 failed: 0 tries: 1
Jan 14 08:33:36 d20-x8664 kernel: sas: sas_probe_sata: for direct-attached device 0700000000000000 returned -19
Jan 14 08:33:36 d20-x8664 kernel: drivers/scsi/mvsas/mv_sas.c 1229:found dev[2:5] is gone.
Jan 14 08:33:36 d20-x8664 kernel: sas: DONE DISCOVERY on port 2, pid:11, result:0
Jan 14 08:33:36 d20-x8664 kernel: sd 6:0:0:0: [sda] 488397168 512-byte logical blocks: (250 GB/233 GiB)
Jan 14 08:33:36 d20-x8664 kernel: sd 6:0:0:0: [sda] Write Protect is off
Jan 14 08:33:36 d20-x8664 kernel: sd 6:0:0:0: [sda] Mode Sense: 00 3a 00 00
Jan 14 08:33:36 d20-x8664 kernel: sd 6:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
Jan 14 08:33:36 d20-x8664 kernel: sd 6:0:0:0: [sda] Preferred minimum I/O size 512 bytes
Jan 14 08:33:36 d20-x8664 kernel: sd 6:0:1:0: [sdb] 1953525168 512-byte logical blocks: (1.00 TB/932 GiB)
Jan 14 08:33:36 d20-x8664 kernel: sd 6:0:1:0: [sdb] Write Protect is off
Jan 14 08:33:36 d20-x8664 kernel: sd 6:0:1:0: [sdb] Mode Sense: 00 3a 00 00
Jan 14 08:33:36 d20-x8664 kernel: sd 6:0:1:0: [sdb] Write cache: disabled, read cache: enabled, doesn't support DPO or FUA
Jan 14 08:33:36 d20-x8664 kernel: sd 6:0:1:0: [sdb] Preferred minimum I/O size 512 bytes
Jan 14 08:33:36 d20-x8664 kernel:  sda: sda1 sda2
Jan 14 08:33:36 d20-x8664 kernel: sd 6:0:0:0: [sda] Attached SCSI disk
Jan 14 08:33:36 d20-x8664 kernel:  sdb: sdb1 sdb2 sdb3
Jan 14 08:33:36 d20-x8664 kernel: sd 6:0:1:0: [sdb] Attached SCSI disk
Jan 14 08:33:36 d20-x8664 kernel: sas: Enter sas_scsi_recover_host busy: 2 failed: 2
Jan 14 08:33:36 d20-x8664 kernel: sas: ata7: end_device-6:0: cmd error handler
Jan 14 08:33:36 d20-x8664 kernel: sas: ata8: end_device-6:1: cmd error handler
Jan 14 08:33:36 d20-x8664 kernel: sas: ata7: end_device-6:0: dev error handler
Jan 14 08:33:36 d20-x8664 kernel: sas: ata8: end_device-6:1: dev error handler
Jan 14 08:33:36 d20-x8664 kernel: sas: --- Exit sas_scsi_recover_host: busy: 0 failed: 2 tries: 1
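
For anyone comparing their own boot log against the one above, here is a rough sketch of a summary filter. The function name `summarize_mvsas_log` is made up for illustration, and the patterns are simply copied from the messages in this journal, so they may need adjusting on other kernels. It counts the phys that report an attached device ("byte dmaded"), the SATA probes that returned an error (the -19 above is ENODEV), and the disks that were finally attached.

```shell
# Hypothetical helper: reads kernel log lines on stdin and prints three counters.
# Note: "Attached SCSI disk" counts every SCSI disk in the log, not only
# those behind the mvsas controller.
summarize_mvsas_log() {
    awk '
        /mv_sas\.c [0-9]+:phy [0-9]+ byte dmaded/ { attached++ }
        /sas_probe_sata: .* returned -/            { failed++ }
        /Attached SCSI disk/                       { disks++ }
        END {
            printf "attached_phys=%d\n", attached
            printf "failed_probes=%d\n", failed
            printf "scsi_disks=%d\n", disks
        }
    '
}

# Possible usage on a live system (kernel messages of the current boot):
#   journalctl -k -b | summarize_mvsas_log
```

Run against the journal excerpt above, this should report three attached phys, one failed probe and two SCSI disks, which matches the symptom: the device on phy 7 is seen by the controller but never becomes a disk.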

Offline

#2 2025-01-19 17:36:15

gromit
Package Maintainer (PM)
From: Germany
Registered: 2024-02-10
Posts: 908
Website

Re: Resurface issue: mv_sas can't find all hdds (only find 2)

With which version of the kernel did the issue first occur? Also which version are you currently using?
Does switching to the linux-lts package help?

Offline

#3 2025-01-19 20:08:53

jzhu
Member
Registered: 2012-06-17
Posts: 30

Re: Resurface issue: mv_sas can't find all hdds (only find 2)

Thanks, these are good questions; I had not thought about the difference between the regular kernel and the lts one.

1. The 2021 report on kernel.org was against the regular kernel, version 5.15.x.
2. I have used the lts kernel for a long time and only found the problem recently, when I tried to add more discs, so I don't know when the issue emerged.
3. I assumed the lts kernel would not differ from the regular one here: if the issue was fixed back in 5.15, later versions should be clean. Apparently I need to test whether the current regular linux kernel shows the same problem as the lts one. Will report back.

Offline

#4 2025-01-19 22:03:15

jzhu
Member
Registered: 2012-06-17
Posts: 30

Re: Resurface issue: mv_sas can't find all hdds (only find 2)

Update: the regular linux kernel, 6.12.10-arch1-1, shows the same problem as the lts kernel above.

Any comments?

Offline

#5 2025-01-22 21:55:59

jzhu
Member
Registered: 2012-06-17
Posts: 30

Re: Resurface issue: mv_sas can't find all hdds (only find 2)

Update and self-note on the solution, for future reference:
1. The mvsas problem of not detecting all HDDs at boot was first raised in 2012; a complete explanation and patch were posted here: 2012.
2. The issue came back in 2021 and was reported here: 2021.
3. Unfortunately, the 2021 patch seems incomplete (see below): it is missing the following statement, which was in the original 2012 patch:

	memset( SATA_RECEIVED_D2H_FIS(mvi_dev->taskfileset), 0,
			sizeof(struct dev_to_host_fis) );

The author of the 2021 fix mentioned breaking a long statement into two lines. I think this was that statement, but it ended up missing from the patch and from all versions since.
4. Using the bisection method proposed on kernel.org, I found that all the kernel versions I tried have the same issue and the same mvsas driver version, 0.8.16. (If a patch is applied, is the version meant to stay the same?)
5. I ended up compiling a custom kernel based on the stable

 6.12.10 

and inserted the above statement into mv_sas.c at around line 450. I then followed the steps in the wiki to rebuild the kernel. Note: after patching, build with the following command so that makepkg does NOT overwrite the modified source code:

 makepkg -e 

6. Reboot into the new custom kernel and run

 lsblk 

to find your additional discs.
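
To avoid rebuilding when a future kernel already carries the fix, the source check in the steps above can be scripted. This is only a sketch: `has_fis_memset` is a made-up helper name, the source path is the one from my build directory, and the pattern assumes the statement keeps its current spelling.

```shell
# Hypothetical helper (not part of any package): succeeds if the given
# mv_sas.c already contains the memset that clears the received D2H FIS.
has_fis_memset() {
    grep -Eq 'memset\([[:space:]]*SATA_RECEIVED_D2H_FIS\(' "$1"
}

# Possible workflow, assuming a PKGBUILD directory set up as in the wiki:
#   makepkg -o        # download, extract and run prepare(), but do not build
#   has_fis_memset src/linux-6.12.10/drivers/scsi/mvsas/mv_sas.c \
#       || echo "fix missing, patch the source now"
#   makepkg -e        # build from the existing src/ without re-extracting
#
# The driver version of the mvsas module for the running kernel can be read with:
#   modinfo -F version mvsas
```

The check deliberately looks for the memset call itself, because the same macro also appears elsewhere in the driver in unrelated statements.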

Question: is there any way to get this fix into the mainline kernel? I don't want to re-apply the patch every time the kernel is updated; a rebuild takes me several hours.

Cheers.

Offline

#6 2025-01-22 22:17:58

gromit
Package Maintainer (PM)
From: Germany
Registered: 2024-02-10
Posts: 908
Website

Re: Resurface issue: mv_sas can't find all hdds (only find 2)

Could you post the full diff/patch you have applied in the end?

Offline

#7 2025-01-23 00:56:56

jzhu
Member
Registered: 2012-06-17
Posts: 30

Re: Resurface issue: mv_sas can't find all hdds (only find 2)

Here is the patch from diff:

diff -urpN linux/src/linux-6.12.10/drivers/scsi/mvsas/mv_sas.c linux-custom/src/linux-6.12.10/drivers/scsi/mvsas/mv_sas.c
--- linux/src/linux-6.12.10/drivers/scsi/mvsas/mv_sas.c	2025-01-22 18:50:56.693221869 -0500
+++ linux-custom/src/linux-6.12.10/drivers/scsi/mvsas/mv_sas.c	2025-01-22 18:30:27.249943408 -0500
@@ -447,6 +447,8 @@ static int mvs_task_prep_ata(struct mvs_
 			mvi_dev->device_id);
 		return -EBUSY;
 	}
+	memset( SATA_RECEIVED_D2H_FIS(mvi_dev->taskfileset), 0,
+			sizeof(struct dev_to_host_fis) );
 	slot = &mvi->slot_info[tag];
 	slot->tx = mvi->tx_prod;
 	del_q = TXQ_MODE_I | tag |
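
In case someone wants to apply the diff above instead of editing the file by hand, here is a minimal sketch. The function name `apply_mvsas_patch` and the patch file name in the usage comment are made up; the `-p3` strip level is an assumption that follows from the paths in this particular diff, which start with `linux/src/linux-6.12.10/`.

```shell
# Hypothetical helper: apply a saved copy of the diff to an extracted kernel tree.
#   $1 = kernel source root (e.g. src/linux-6.12.10)
#   $2 = patch file
apply_mvsas_patch() {
    tree=$1
    patchfile=$2
    # Strip the three leading path components so that the remaining path
    # (drivers/scsi/mvsas/mv_sas.c) is relative to the source root.
    # Try a dry run first so a non-matching tree is left untouched.
    patch -d "$tree" -p3 --dry-run < "$patchfile" || return 1
    patch -d "$tree" -p3 < "$patchfile"
}

# Possible usage from the PKGBUILD directory, before running makepkg -e:
#   apply_mvsas_patch src/linux-6.12.10 mvsas-fis.patch
```

If the surrounding code changes in a later kernel, the dry run fails and nothing is modified, which is a hint that the hunk has to be redone by hand.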

Offline

#8 2025-01-27 08:34:51

gromit
Package Maintainer (PM)
From: Germany
Registered: 2024-02-10
Posts: 908
Website

Re: Resurface issue: mv_sas can't find all hdds (only find 2)

Do you want to send this patch to the linux kernel developers or at least report the bug? If you want I can help you with this :)

Offline

#9 2025-01-27 16:54:43

jzhu
Member
Registered: 2012-06-17
Posts: 30

Re: Resurface issue: mv_sas can't find all hdds (only find 2)

Hi, gromit:

I definitely want to see this patch integrated into the mainline kernel. I think it could help some users out there; even though users, including myself, can compile a custom kernel, it is too cumbersome and time-consuming to do so (for me it took hours to get the kernel compiled)...

I am not familiar with the official kernel developers or the official bug reporting and fixing process. Although I initially reported this to the kernel bugzilla, I don't think that is the right place for official kernel development. So if you could help me out, that would be great.

Thank you.

Offline
