#1 2015-09-26 15:29:06

matyilona200
Member
Registered: 2012-06-21
Posts: 77

Problems with HDD at boot (errno=-19)

I have an old machine with a few-years-old Western Digital WD2500AAJB HDD. It has a single partition with /home on it. After a blackout I was dropped to a maintenance shell and started running fsck on the disk; then there was another blackout while fsck was running. When rebooting, systemd stopped with "A start job is running for dev-disk-by..." and then dropped me to a maintenance shell. There is nothing in /dev for the HDD. Looking at the output of dmesg, I see that there is a model number mismatch. Here is the interesting part of dmesg:

[    0.887231] ata_piix 0000:00:1f.1: version 2.13
[    0.887244] ata_piix 0000:00:1f.1: enabling device (0005 -> 0007)
[    0.893415] scsi host0: ata_piix
[    0.899380] scsi host1: ata_piix
[    0.899521] ata1: PATA max UDMA/100 cmd 0x1f0 ctl 0x3f6 bmdma 0x14c0 irq 14
[    0.899526] ata2: PATA max UDMA/100 cmd 0x170 ctl 0x376 bmdma 0x14c8 irq 15
[    0.899910] ata_piix 0000:00:1f.2: MAP [ P0 -- P1 -- ]
[    0.971586] input: AT Translated Set 2 keyboard as /devices/platform/i8042/serio0/input/input0
[    1.050982] scsi host2: ata_piix
[    1.051321] scsi host3: ata_piix
[    1.051431] ata3: SATA max UDMA/133 cmd 0x14f8 ctl 0x1810 bmdma 0x14d0 irq 18
[    1.051435] ata4: SATA max UDMA/133 cmd 0x1800 ctl 0x1814 bmdma 0x14d8 irq 18
[    1.067451] ata2.00: ATA-8: WDC WD2500AAJB-00WGA0, 00.02C01, max UDMA/100
[    1.067458] ata2.00: 488397168 sectors, multi 16: LBA48 
[    1.080345] ata2.00: configured for UDMA/100
[    1.110285] ata1.00: ATA-6: ST340014A, 3.06, max UDMA/100
[    1.110289] ata1.00: 78165360 sectors, multi 16: LBA 
[    1.123496] ata1.00: configured for UDMA/100
[    1.123667] scsi 0:0:0:0: Direct-Access     ATA      ST340014A        3.06 PQ: 0 ANSI: 5
[    1.124491] scsi 1:0:0:0: Direct-Access     ATA      WDC WD2500AAJB-0 2C01 PQ: 0 ANSI: 5
[    1.190041] usb usb1-port7: over-current condition
[    1.245244] sd 0:0:0:0: [sda] 78165360 512-byte logical blocks: (40.0 GB/37.2 GiB)
[    1.245332] sd 0:0:0:0: [sda] Write Protect is off
[    1.245337] sd 0:0:0:0: [sda] Mode Sense: 00 3a 00 00
[    1.245375] sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
[    1.245982] sd 1:0:0:0: [sdb] 488397168 512-byte logical blocks: (250 GB/232 GiB)
[    1.246065] sd 1:0:0:0: [sdb] Write Protect is off
[    1.246070] sd 1:0:0:0: [sdb] Mode Sense: 00 3a 00 00
[    1.246106] sd 1:0:0:0: [sdb] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
[    1.247489]  sda: sda1 sda2
[    1.248052] sd 0:0:0:0: [sda] Attached SCSI disk
[    1.263381] ata2.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6
[    1.263444] ata2.00: BMDMA stat 0x24
[    1.263494] ata2.00: failed command: READ DMA
[    1.263548] ata2.00: cmd c8/00:08:00:00:00/00:00:00:00:00/e0 tag 0 dma 4096 in
                        res 51/84:00:07:00:00/00:00:00:00:00/e0 Emask 0x10 (ATA bus error)
[    1.263641] ata2.00: status: { DRDY ERR }
[    1.263689] ata2.00: error: { ICRC ABRT }
[    1.263784] ata2: soft resetting link
[    1.300018] tsc: Refined TSC clocksource calibration: 2659.999 MHz
[    1.300024] clocksource tsc: mask: 0xffffffffffffffff max_cycles: 0x2657a34898c, max_idle_ns: 440795323804 ns
[    1.437669] ata2.00: configured for UDMA/100
[    1.437737] ata2: EH complete
[    1.453367] ata2.00: limiting speed to UDMA/66:PIO4
[    1.453373] ata2.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6
[    1.453429] ata2.00: BMDMA stat 0x24
[    1.453479] ata2.00: failed command: READ DMA
[    1.453531] ata2.00: cmd c8/00:08:00:00:00/00:00:00:00:00/e0 tag 0 dma 4096 in
                        res 51/84:00:07:00:00/00:00:00:00:00/e0 Emask 0x10 (ATA bus error)
[    1.453625] ata2.00: status: { DRDY ERR }
[    1.453674] ata2.00: error: { ICRC ABRT }
[    1.453749] ata2: soft resetting link
[    1.636822] ata2.00: model number mismatch 'WDC WD2500AAJB-00WGA0' != 'WDC WD2500AAJB-00WGQ0'
[    1.636826] ata2.00: revalidation failed (errno=-19)
[    1.636878] ata2.00: limiting speed to UDMA/66:PIO3
[    2.300066] Switched to clocksource tsc
[    6.606710] ata2: soft resetting link
[    6.786825] ata2.00: model number mismatch 'WDC WD2500AAJB-00WGA0' != 'WDC WD2500AAJB-00WGQ0'
[    6.786829] ata2.00: revalidation failed (errno=-19)
[    6.786884] ata2.00: disabled
[   11.760035] ata2: soft resetting link
[   11.929610] ata2.00: ATA-8: WDC WD2500AAJB-00WGA0, 00.02C01, max UDMA/100
[   11.929615] ata2.00: 488397168 sectors, multi 16: LBA48 
[   11.940150] ata2.00: model number mismatch 'WDC WD2500AAJB-00WGA0' != 'WDC WD2500AAJB-00WGQ0'
[   11.940154] ata2.00: revalidation failed (errno=-19)
[   11.940207] ata2.00: limiting speed to UDMA/100:PIO3
[   16.913367] ata2: soft resetting link
[   17.093491] ata2.00: model number mismatch 'WDC WD2500AAJB-00WGA0' != 'WDC WD2500AAJB-00WGQ0'
[   17.093495] ata2.00: revalidation failed (errno=-19)
[   17.093548] ata2.00: disabled
[   17.093575] sd 1:0:0:0: [sdb] tag#0 UNKNOWN(0x2003) Result: hostbyte=0x00 driverbyte=0x08
[   17.093581] sd 1:0:0:0: [sdb] tag#0 Sense Key : 0xb [current] [descriptor] 
[   17.093585] sd 1:0:0:0: [sdb] tag#0 ASC=0x47 ASCQ=0x0 
[   17.093590] sd 1:0:0:0: [sdb] tag#0 CDB: opcode=0x28 28 00 00 00 00 00 00 00 08 00
[   17.093593] blk_update_request: I/O error, dev sdb, sector 0
[   17.093647] Buffer I/O error on dev sdb, logical block 0, async page read
[   17.093748] sd 1:0:0:0: rejecting I/O to offline device
[   17.093802] sd 1:0:0:0: killing request
[   17.093809] sd 1:0:0:0: rejecting I/O to offline device
[   17.093867] ldm_validate_partition_table(): Disk read failed.
[   17.093877] sd 1:0:0:0: rejecting I/O to offline device
[   17.093936] sd 1:0:0:0: rejecting I/O to offline device
[   17.093996] sd 1:0:0:0: rejecting I/O to offline device
[   17.094049]  sdb: unable to read partition table
[   17.094211] sd 1:0:0:0: [sdb] Attached SCSI disk
[   17.095402] sd 1:0:0:0: rejecting I/O to offline device
[   17.095476] sd 1:0:0:0: rejecting I/O to offline device
[   17.095533] sd 1:0:0:0: rejecting I/O to offline device
[   17.095589] sd 1:0:0:0: [sdb] Read Capacity(16) failed: Result: hostbyte=0x01 driverbyte=0x00
[   17.095594] sd 1:0:0:0: [sdb] Sense not available.
[   17.095600] sd 1:0:0:0: rejecting I/O to offline device
[   17.095656] sd 1:0:0:0: rejecting I/O to offline device
[   17.095711] sd 1:0:0:0: rejecting I/O to offline device
[   17.095766] sd 1:0:0:0: [sdb] Read Capacity(10) failed: Result: hostbyte=0x01 driverbyte=0x00
[   17.095770] sd 1:0:0:0: [sdb] Sense not available.
[   17.095776] sd 1:0:0:0: rejecting I/O to offline device
[   17.095832] sd 1:0:0:0: rejecting I/O to offline device
[   17.095890] sd 1:0:0:0: rejecting I/O to offline device
[   17.095947] sd 1:0:0:0: rejecting I/O to offline device
[   17.096752] ata2: EH complete
[   17.096786] ata2.00: detaching (SCSI 1:0:0:0)
[   17.097092] sd 1:0:0:0: [sdb] Stopping disk
[   17.097134] sd 1:0:0:0: [sdb] Start/Stop Unit failed: Result: hostbyte=0x04 driverbyte=0x00
[   18.157496] EXT4-fs (sda1): mounted filesystem with ordered data mode. Opts: (null)
[   18.363768] random: nonblocking pool is initialized
[   19.227857] ip_tables: (C) 2000-2006 Netfilter Core Team

The whole output is here. I only found a few things about similar issues, like this and this. My situation is similar to the second link in that 'WDC WD2500AAJB-00WGA0' and 'WDC WD2500AAJB-00WGQ0' differ by one bit, literally one bit flipped. The other posts mention that it is, or might be, a kernel bug, so I tried booting with the LTS kernel, with the same results. Since the HDD does not show up in /dev, I can't try the other things recommended, like smartctl. If it's any help, here are the outputs of dmidecode and lshw. Any help is appreciated.
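
The one-bit claim can be checked with a small sketch like the one below. The function name `differing_bits` is made up for illustration; the two strings are the ones from the "model number mismatch" lines in the dmesg output.

```python
def differing_bits(a, b):
    """Compare two equal-length ASCII strings byte by byte.

    Returns a list of (index, char_in_a, char_in_b, flipped_bit_count)
    for every position where the strings differ.
    """
    if len(a) != len(b):
        raise ValueError("strings must have the same length")
    diffs = []
    for i, (x, y) in enumerate(zip(a.encode("ascii"), b.encode("ascii"))):
        xor = x ^ y  # bits set here are the bits that differ
        if xor:
            diffs.append((i, chr(x), chr(y), bin(xor).count("1")))
    return diffs


# Strings copied from the "model number mismatch" lines in dmesg.
first = "WDC WD2500AAJB-00WGA0"
second = "WDC WD2500AAJB-00WGQ0"
for index, x, y, bits in differing_bits(first, second):
    print(f"position {index}: {x!r} (0x{ord(x):02x}) vs {y!r} (0x{ord(y):02x}), {bits} bit(s) differ")
```

'A' is 0x41 and 'Q' is 0x51, so the only difference between the two names is the single bit 0x10.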

Offline

#2 2015-09-26 16:52:09

mich41
Member
Registered: 2012-06-22
Posts: 796

Re: Problems with HDD at boot (errno=-19)

So you are having bit-flip errors on the ATA link. Try a different cable and a different port, make sure that the cable isn't bent tightly, and try to move it away from other signals. If possible, test the disk in another machine.

Offline

#3 2015-09-26 17:56:47

matyilona200
Member
Registered: 2012-06-21
Posts: 77

Re: Problems with HDD at boot (errno=-19)

Unfortunately I don't have spare cables or another machine to try right now, but after multiple reboots the mismatched names are always the same, which suggests it's not some random cable error.

Offline

#4 2015-09-26 18:19:06

mich41
Member
Registered: 2012-06-22
Posts: 796

Re: Problems with HDD at boot (errno=-19)

Then at least try swapping these disks between ports.

If you performed an update before it stopped working, try reverting to an older kernel from /var/cache/pacman/pkg.

You can also attempt running smartctl using the SCSI generic interface:

modprobe sg
smartctl -a /dev/sgN # check dmesg|tail to find N, probably it's 1
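
As an alternative to reading dmesg for N, here is a rough sketch that lists the sg nodes together with the model string each one reports. It assumes the usual sysfs layout (`/sys/class/scsi_generic/sgN/device/model`) and that the sg module is already loaded; `list_sg_devices` is a hypothetical helper name, not part of smartmontools.

```python
import os


def list_sg_devices(sysfs_root="/sys/class/scsi_generic"):
    """Map /dev/sgN nodes to the model string found in sysfs.

    Assumes the layout <sysfs_root>/sgN/device/model; returns an empty
    dict if the directory does not exist (e.g. the sg module is not loaded).
    """
    devices = {}
    if not os.path.isdir(sysfs_root):
        return devices
    for name in sorted(os.listdir(sysfs_root)):
        model_path = os.path.join(sysfs_root, name, "device", "model")
        try:
            with open(model_path) as fh:
                devices["/dev/" + name] = fh.read().strip()
        except OSError:
            # Entry without a readable model file; skip it.
            continue
    return devices


if __name__ == "__main__":
    for node, model in list_sg_devices().items():
        print(node, model)
```

If the disk has already been detached by the kernel, it will most likely be missing from this list as well.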

Last edited by mich41 (2015-09-26 18:21:37)

Offline

#5 2015-09-27 00:48:17

matyilona200
Member
Registered: 2012-06-21
Posts: 77

Re: Problems with HDD at boot (errno=-19)

Swapping cables did not change anything. Changing to the LTS kernel or downgrading to earlier normal kernels (3 or 4) did not change anything either. The output of smartctl:

smartctl 6.4 2015-06-04 r4109 [i686-linux-4.1.6-1-ARCH] (local build)
Copyright (C) 2002-15, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Family:     Seagate Barracuda 7200.7 and 7200.7 Plus
Device Model:     ST340014A
Serial Number:    3JX4R80Y
Firmware Version: 3.06
User Capacity:    40,020,664,320 bytes [40.0 GB]
Sector Size:      512 bytes logical/physical
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   ATA/ATAPI-6 T13/1410D revision 2
Local Time is:    Sat Sep 26 20:58:32 2015 UTC
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x82)	Offline data collection activity
					was completed without error.
					Auto Offline Data Collection: Enabled.
Self-test execution status:      (   0)	The previous self-test routine completed
					without error or no self-test has ever 
					been run.
Total time to complete Offline 
data collection: 		(  430) seconds.
Offline data collection
capabilities: 			 (0x5b) SMART execute Offline immediate.
					Auto Offline data collection on/off support.
					Suspend Offline collection upon new
					command.
					Offline surface scan supported.
					Self-test supported.
					No Conveyance Self-test supported.
					Selective Self-test supported.
SMART capabilities:            (0x0003)	Saves SMART data before entering
					power-saving mode.
					Supports SMART auto save timer.
Error logging capability:        (0x01)	Error logging supported.
					No General Purpose Logging support.
Short self-test routine 
recommended polling time: 	 (   1) minutes.
Extended self-test routine
recommended polling time: 	 (  31) minutes.

SMART Attributes Data Structure revision number: 10
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000f   060   052   006    Pre-fail  Always       -       72701212
  3 Spin_Up_Time            0x0003   098   098   000    Pre-fail  Always       -       0
  4 Start_Stop_Count        0x0032   100   100   020    Old_age   Always       -       1
  5 Reallocated_Sector_Ct   0x0033   100   100   036    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x000f   087   060   030    Pre-fail  Always       -       614763870
  9 Power_On_Hours          0x0032   064   064   000    Old_age   Always       -       31990
 10 Spin_Retry_Count        0x0013   100   100   097    Pre-fail  Always       -       0
 12 Power_Cycle_Count       0x0032   099   099   020    Old_age   Always       -       1986
194 Temperature_Celsius     0x0022   033   054   000    Old_age   Always       -       33
195 Hardware_ECC_Recovered  0x001a   060   052   000    Old_age   Always       -       72701212
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0010   100   100   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x003e   200   200   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x0000   100   253   000    Old_age   Offline      -       0
202 Data_Address_Mark_Errs  0x0032   100   253   000    Old_age   Always       -       0

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Extended offline    Completed without error       00%         0         -

SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.

It seems OK to me.

Offline

#6 2015-09-27 06:10:45

mich41
Member
Registered: 2012-06-22
Posts: 796

Re: Problems with HDD at boot (errno=-19)

Model Family:     Seagate Barracuda 7200.7 and 7200.7 Plus
Device Model:     ST340014A

Really? ;)

Offline

#7 2015-09-27 11:50:40

R00KIE
Forum Fellow
From: Between a computer and a chair
Registered: 2008-09-14
Posts: 4,734

Re: Problems with HDD at boot (errno=-19)

How old is the power supply in that machine? Do you have any mods/addon lights/water pumps inside the case?

This [1] seems to point to either a bad cable or a bad power supply as the most common problems. Try swapping cables if you have more than one IDE cable, or swap the drives' configuration between primary/secondary. If you can, try to test the WD drive in another machine.

[1] https://ata.wiki.kernel.org/index.php/L … _expansion
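
To make the failed-command lines in the dmesg output easier to read, here is a minimal sketch that decodes the status/error pair at the start of a `res` field. The bit names are limited to the ones libata prints in its `status: { ... }` and `error: { ... }` lines; the helper names `decode` and `decode_res` are invented for this example.

```python
# Status register bits as printed by libata (other bits are not reported).
STATUS_BITS = {0x80: "BUSY", 0x40: "DRDY", 0x20: "DF", 0x08: "DRQ", 0x01: "ERR"}
# Error register bits as printed by libata.
ERROR_BITS = {0x80: "ICRC", 0x40: "UNC", 0x10: "IDNF", 0x04: "ABRT", 0x01: "AMNF"}


def decode(value, names):
    """Return the names of all bits in `names` that are set in `value`."""
    return [name for mask, name in names.items() if value & mask]


def decode_res(res):
    """Decode the leading 'status/error' hex pair of a libata 'res' field."""
    status_hex, error_hex = res.split(":")[0].split("/")
    return decode(int(status_hex, 16), STATUS_BITS), decode(int(error_hex, 16), ERROR_BITS)


# 'res' value copied from the dmesg output in the first post.
status, error = decode_res("51/84:00:07:00:00/00:00:00:00:00/e0")
print("status:", status)
print("error:", error)
```

For the line in the first post this gives DRDY and ERR for the status and ICRC and ABRT for the error, matching what the kernel printed. ICRC is the interface CRC error, which is why the cable is the usual first suspect.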


R00KIE
Tm90aGluZyB0byBzZWUgaGVyZSwgbW92ZSBhbG9uZy4K

Offline

#8 2015-09-27 16:18:19

matyilona200
Member
Registered: 2012-06-21
Posts: 77

Re: Problems with HDD at boot (errno=-19)

Oh, sorry, it seems I was very tired: the HDD in the smartctl output is the other one, mounted at /, old but working properly. The WD drive doesn't show up with sg either. The power supply is 8-10 years old, like the whole machine, except the WD drive, which is just 4-5 years old. I dug up another old PC and switched around the cables and drives, but the error remained in every setup.

Offline

#9 2015-09-27 17:38:43

mich41
Member
Registered: 2012-06-22
Posts: 796

Re: Problems with HDD at boot (errno=-19)

Well, failure to work with known-good kernels and in another machine with another PSU suggests a malfunction of the disk electronics.

Depending on the exact nature of the problem, you may be able to rescue your data by reducing communication speed. See libata.force in kernel-parameters.txt. I'd start with libata.force=udma/33, maybe udma/16 if this thing really exists. Or the PIO modes, why not.

If this fails, you will need to replace the disk's PCB or send it to a data recovery company.
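
For the libata.force suggestion above, here is a small sketch that assembles the kernel command-line parameter. It assumes the `[ID:]VAL` format described in kernel-parameters.txt, where ID is the `PORT[.DEVICE]` string libata prints (here `2.00`, from `ata2.00` in dmesg) and the transfer mode is one of pio[0-7], mwdma[0-4], udma[0-7] or the udma/NN notation. As far as I can tell from that file, udma/16 is listed there. The function `libata_force` is a hypothetical helper; it only builds the string and you still have to add it to the boot loader entry yourself.

```python
import re

# Transfer-mode values as I read them from kernel-parameters.txt (assumption:
# check the file shipped with your kernel before relying on this list).
_MODE = re.compile(r"pio[0-7]|mwdma[0-4]|udma[0-7]|udma/?(?:16|25|33|44|66|100|133)")
# PORT[.DEVICE], e.g. "2" or "2.00".
_ID = re.compile(r"\d+(?:\.\d+)?")


def libata_force(mode, port_device=None):
    """Build a 'libata.force=[ID:]VAL' kernel parameter for a transfer mode."""
    if not _MODE.fullmatch(mode):
        raise ValueError(f"unrecognised transfer mode: {mode!r}")
    if port_device is None:
        return f"libata.force={mode}"  # applies to every port
    if not _ID.fullmatch(port_device):
        raise ValueError(f"unrecognised port/device id: {port_device!r}")
    return f"libata.force={port_device}:{mode}"


# Limit only the failing drive (ata2.00 in the dmesg output) to UDMA/33.
print(libata_force("udma/33", "2.00"))
```

Restricting the limit to 2.00 leaves the working Seagate on ata1.00 at full speed while the WD drive is being tested.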

Offline

#10 2015-09-27 18:31:50

R00KIE
Forum Fellow
From: Between a computer and a chair
Registered: 2008-09-14
Posts: 4,734

Re: Problems with HDD at boot (errno=-19)

Does the WD disk show up during the BIOS POST? It should show up on the first screen when drives are detected. It might be as mich41 says and the disk just gave up the ghost. I have a 30GB WD disk that died while unplugged: it spins, but it's as if it isn't connected to the IDE cable. The same might have happened to yours during the second power failure.


R00KIE
Tm90aGluZyB0byBzZWUgaGVyZSwgbW92ZSBhbG9uZy4K

Offline

#11 2015-09-28 20:16:08

matyilona200
Member
Registered: 2012-06-21
Posts: 77

Re: Problems with HDD at boot (errno=-19)

Since the drive had no crucial information, I have given up on it. Thanks for all the help.

Offline
