#1 2013-12-29 08:12:47

ttheodor
Member
Registered: 2013-12-29
Posts: 31

HDD errors during boot

Hello,

The following errors first appeared a few weeks ago, and I now get them every time I boot the computer.
Sometimes all goes well; other times I have to run fsck on one or more LVM logical volumes before booting can continue (e.g. home and boot, which are on separate logical volumes). I don't know whether this is relevant, but only one of the three drives belongs to an LVM volume group.

Any suggestions?

Thanks,
Theodor

[   11.379855] ata4: exception Emask 0x10 SAct 0x0 SErr 0x4010000 action 0xe frozen
[   11.379912] ata4: irq_stat 0x00400040, connection status changed
[   11.379934] ata4: SError: { PHYRdyChg DevExch }
[   11.379954] ata4: hard resetting link
[   11.386347] ata2: exception Emask 0x10 SAct 0x0 SErr 0x4010000 action 0xe frozen
[   11.386403] ata2: irq_stat 0x00400040, connection status changed
[   11.386447] ata2: SError: { PHYRdyChg DevExch }
[   11.386483] ata2: hard resetting link
[   11.386553] ata1: exception Emask 0x10 SAct 0x0 SErr 0x4010000 action 0xe frozen
[   11.386609] ata1: irq_stat 0x00400040, connection status changed
[   11.386653] ata1: SError: { PHYRdyChg DevExch }
[   11.386688] ata1: hard resetting link
[   18.508080] ata2: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
[   18.508121] ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
[   18.529130] ata2.00: configured for UDMA/133
[   18.529136] ata2: EH complete
[   18.535800] ata1.00: configured for UDMA/133
[   18.535804] ata1: EH complete
[   18.577859] ata2: exception Emask 0x10 SAct 0x0 SErr 0x4010000 action 0xe frozen
[   18.577928] ata2: irq_stat 0x00400040, connection status changed
[   18.577971] ata2: SError: { PHYRdyChg DevExch }
[   18.577991] ata2: hard resetting link
[   18.604794] ata1.00: exception Emask 0x10 SAct 0x1 SErr 0x4010000 action 0xe frozen
[   18.604849] ata1.00: irq_stat 0x00400040, connection status changed
[   18.604872] ata1: SError: { PHYRdyChg DevExch }
[   18.604890] ata1.00: failed command: WRITE FPDMA QUEUED
[   18.604911] ata1.00: cmd 61/10:00:80:57:24/00:00:07:00:00/40 tag 0 ncq 8192 out
         res 40/00:04:80:57:24/00:00:07:00:00/40 Emask 0x10 (ATA bus error)
[   18.604960] ata1.00: status: { DRDY }
[   18.604976] ata1: hard resetting link
[   21.373511] ata4: softreset failed (1st FIS failed)
[   21.373551] ata4: hard resetting link
[   24.178843] ata4: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
[   24.363030] ata4.00: configured for UDMA/133
[   24.363036] ata4: EH complete
[   25.701517] ata2: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
[   25.728161] ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
[   25.728741] ata2.00: configured for UDMA/133
[   25.728745] ata2: EH complete
[   25.750370] ata1.00: configured for UDMA/133
[   25.750386] ata1: EH complete
[   25.792429] ata4: exception Emask 0x10 SAct 0x0 SErr 0x4010000 action 0xe frozen
[   25.792485] ata4: irq_stat 0x00400040, connection status changed
[   25.792528] ata4: SError: { PHYRdyChg DevExch }
[   25.792548] ata4: hard resetting link
[   25.813052] ata2: exception Emask 0x10 SAct 0x0 SErr 0x4010000 action 0xe frozen
[   25.813088] ata2: irq_stat 0x00400040, connection status changed
[   25.813111] ata2: SError: { PHYRdyChg DevExch }
[   25.813129] ata2: hard resetting link
[   25.813407] ata1.00: exception Emask 0x10 SAct 0x1 SErr 0x4010000 action 0xe frozen
[   25.813463] ata1.00: irq_stat 0x00400040, connection status changed
[   25.813486] ata1: SError: { PHYRdyChg DevExch }
[   25.813503] ata1.00: failed command: WRITE FPDMA QUEUED
[   25.813524] ata1.00: cmd 61/10:00:c0:15:27/00:00:07:00:00/40 tag 0 ncq 8192 out
         res 40/00:04:c0:15:27/00:00:07:00:00/40 Emask 0x10 (ATA bus error)
[   25.813573] ata1.00: status: { DRDY }
[   25.813588] ata1: hard resetting link
[   32.881735] ata2: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
[   32.904750] ata2.00: configured for UDMA/133
[   32.904752] ata2: EH complete
[   32.934970] ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
[   32.960871] ata1.00: configured for UDMA/133
[   32.960886] ata1: EH complete
[   33.011746] ata2: limiting SATA link speed to 1.5 Gbps
[   33.011751] ata2: exception Emask 0x10 SAct 0x0 SErr 0x4010000 action 0xe frozen
[   33.011792] ata2: irq_stat 0x00400040, connection status changed
[   33.011815] ata2: SError: { PHYRdyChg DevExch }
[   33.011833] ata2: hard resetting link
[   33.016467] ata1: limiting SATA link speed to 1.5 Gbps
[   33.016472] ata1.00: exception Emask 0x10 SAct 0x1 SErr 0x4010000 action 0xe frozen
[   33.016511] ata1.00: irq_stat 0x00400040, connection status changed
[   33.016534] ata1: SError: { PHYRdyChg DevExch }
[   33.016551] ata1.00: failed command: WRITE FPDMA QUEUED
[   33.016572] ata1.00: cmd 61/10:00:e0:06:27/00:00:07:00:00/40 tag 0 ncq 8192 out
         res 40/00:04:e0:06:27/00:00:07:00:00/40 Emask 0x10 (ATA bus error)
[   33.016620] ata1.00: status: { DRDY }
[   33.016635] ata1: hard resetting link
[   35.787058] ata4: softreset failed (1st FIS failed)
[   35.787098] ata4: hard resetting link
[   38.912211] ata4: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
[   39.098022] ata4.00: configured for UDMA/133
[   39.098028] ata4: EH complete
[   40.081811] ata2: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
[   40.110026] ata2.00: configured for UDMA/133
[   40.110032] ata2: EH complete
[   40.138453] ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
[   40.165709] ata1.00: configured for UDMA/133
[   40.165725] ata1: EH complete
[   40.224806] ata4: exception Emask 0x10 SAct 0x0 SErr 0x4010000 action 0xe frozen
[   40.224862] ata4: irq_stat 0x00400040, connection status changed
[   40.224896] ata4: SError: { PHYRdyChg DevExch }
[   40.224919] ata4: hard resetting link
[   40.253699] ata1.00: exception Emask 0x10 SAct 0x1 SErr 0x4010000 action 0xe frozen
[   40.253755] ata1.00: irq_stat 0x00400040, connection status changed
[   40.253783] ata1: SError: { PHYRdyChg DevExch }
[   40.253801] ata1.00: failed command: WRITE FPDMA QUEUED
[   40.253826] ata1.00: cmd 61/10:00:20:0c:27/00:00:07:00:00/40 tag 0 ncq 8192 out
         res 40/00:04:20:0c:27/00:00:07:00:00/40 Emask 0x10 (ATA bus error)
[   40.253875] ata1.00: status: { DRDY }
[   40.253891] ata1: hard resetting link
[   40.253971] ata2: exception Emask 0x10 SAct 0x0 SErr 0x4010000 action 0xe frozen
[   40.254027] ata2: irq_stat 0x00400040, connection status changed
[   40.254071] ata2: SError: { PHYRdyChg DevExch }
[   40.254107] ata2: hard resetting link
[   46.122319] ata4: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
[   46.302501] ata4.00: configured for UDMA/133
[   46.302506] ata4: EH complete
[   47.321777] ata2: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
[   47.348350] ata2.00: configured for UDMA/133
[   47.348355] ata2: EH complete
[   47.375081] ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
[   47.402082] ata1.00: configured for UDMA/133
[   47.402099] ata1: EH complete
...
[  113.650271] ata4: limiting SATA link speed to 1.5 Gbps
[  113.650276] ata4: exception Emask 0x10 SAct 0x0 SErr 0x4010000 action 0xe frozen
[  113.650280] ata4: irq_stat 0x00400040, connection status changed
[  113.650282] ata4: SError: { PHYRdyChg DevExch }
[  113.650288] ata4: hard resetting link
[  119.497457] ata4: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
[  119.674349] ata4.00: configured for UDMA/133
[  119.674356] ata4: EH complete
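
For anyone reading along, a quick way to summarise a log like the one above is to count events per port. This is only a minimal Python sketch: the function name `tally` is made up for illustration, and `LOG` holds a handful of lines copied from the excerpt rather than the full output.

```python
import re
from collections import Counter

# A few lines copied from the log above; paste the full dmesg output here instead.
LOG = """\
[   11.379855] ata4: exception Emask 0x10 SAct 0x0 SErr 0x4010000 action 0xe frozen
[   11.379954] ata4: hard resetting link
[   11.386483] ata2: hard resetting link
[   11.386688] ata1: hard resetting link
[   18.604890] ata1.00: failed command: WRITE FPDMA QUEUED
[   33.011746] ata2: limiting SATA link speed to 1.5 Gbps
[   33.016467] ata1: limiting SATA link speed to 1.5 Gbps
[  113.650271] ata4: limiting SATA link speed to 1.5 Gbps
"""

# "[  timestamp] ataN:" or "[  timestamp] ataN.MM:" followed by the message.
LINE = re.compile(
    r"^\[\s*(?P<time>\d+\.\d+)\]\s+(?P<port>ata\d+)(?:\.\d+)?:\s+(?P<msg>.*)$"
)


def tally(log_text):
    """Count hard resets, failed commands and speed downgrades per ATA port."""
    counts = {"resets": Counter(), "failed": Counter(), "downgrades": Counter()}
    for line in log_text.splitlines():
        m = LINE.match(line)
        if not m:
            continue  # continuation lines such as "res 40/00:..." carry no timestamp
        port, msg = m["port"], m["msg"]
        if msg.startswith("hard resetting link"):
            counts["resets"][port] += 1
        elif msg.startswith("failed command"):
            counts["failed"][port] += 1
        elif msg.startswith("limiting SATA link speed"):
            counts["downgrades"][port] += 1
    return counts


result = tally(LOG)
for kind, per_port in result.items():
    print(kind, dict(sorted(per_port.items())))
```

Pasting the full `dmesg` output into `LOG` should show how often each of ata1, ata2 and ata4 is reset and which port reports the failed writes.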


#2 2014-01-05 11:01:09

user1234
Member
Registered: 2014-01-05
Posts: 1

Re: HDD errors during boot

Hello,
Can you describe your hardware configuration?
Mainboard, SATA drives, installed PCI/PCIe cards, ...
thanks,
Peter


#3 2014-01-05 11:38:59

Spider.007
Member
Registered: 2004-06-20
Posts: 1,175

Re: HDD errors during boot

Is it a Samsung drive, and do you have an NVIDIA card? You might want to read http://ubuntuforums.org/archive/index.p … 38608.html

Also, whenever this sort of error pops up, it might be smart to run an extended self-test on the drives using smartctl (smartctl -t long) to make sure the hardware is okay.


#4 2014-01-05 17:11:38

ttheodor
Member
Registered: 2013-12-29
Posts: 31

Re: HDD errors during boot

user1234 wrote:

Hello,
Can you describe your hardware configuration?
Mainboard, SATA drives, installed PCI/PCIe cards, ...
thanks,
Peter

Motherboard: Gigabyte X58A-UD3R

SATA drives: 2x WDC Black 640GB (WD6401AALS), 1x Seagate ST3500320AS

PCI/PCIe: none except for the GPU: 5870



Spider.007 wrote:

Hello,
Is it a Samsung drive, and do you have an NVIDIA card? You might want to read http://ubuntuforums.org/archive/index.p … 38608.html

Also, whenever this sort of error pops up, it might be smart to run an extended self-test on the drives using smartctl (smartctl -t long) to make sure the hardware is okay.

ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f   200   200   051    Pre-fail  Always       -       6
  3 Spin_Up_Time            0x0027   201   160   021    Pre-fail  Always       -       2950
  4 Start_Stop_Count        0x0032   095   095   000    Old_age   Always       -       5644
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x002e   200   200   000    Old_age   Always       -       0
  9 Power_On_Hours          0x0032   078   078   000    Old_age   Always       -       16415
 10 Spin_Retry_Count        0x0032   100   100   000    Old_age   Always       -       0
 11 Calibration_Retry_Count 0x0032   100   100   000    Old_age   Always       -       0
 12 Power_Cycle_Count       0x0032   095   095   000    Old_age   Always       -       5068
192 Power-Off_Retract_Count 0x0032   199   199   000    Old_age   Always       -       1028
193 Load_Cycle_Count        0x0032   199   199   000    Old_age   Always       -       5642
194 Temperature_Celsius     0x0022   105   094   000    Old_age   Always       -       42
196 Reallocated_Event_Count 0x0032   200   200   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       1
198 Offline_Uncorrectable   0x0030   200   200   000    Old_age   Offline      -       9
199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       4
200 Multi_Zone_Error_Rate   0x0008   200   200   000    Old_age   Offline      -       2

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Extended offline    Completed: read failure       90%     16413         141458831
# 2  Extended offline    Completed: read failure       90%     16125         123811039
# 3  Short offline       Completed: read failure       90%     16080         123811039
# 4  Short offline       Completed: read failure       90%     16080         123811039
# 5  Short offline       Completed without error       00%     10771         -
# 6  Short offline       Completed without error       00%     10769         -
# 7  Short offline       Completed without error       00%     10766         -
# 8  Short offline       Completed without error       00%     10765         -

ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f   200   200   051    Pre-fail  Always       -       3
  3 Spin_Up_Time            0x0027   202   152   021    Pre-fail  Always       -       2891
  4 Start_Stop_Count        0x0032   095   095   000    Old_age   Always       -       5641
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x002e   100   253   000    Old_age   Always       -       0
  9 Power_On_Hours          0x0032   077   077   000    Old_age   Always       -       16812
 10 Spin_Retry_Count        0x0032   100   100   000    Old_age   Always       -       0
 11 Calibration_Retry_Count 0x0032   100   100   000    Old_age   Always       -       0
 12 Power_Cycle_Count       0x0032   095   095   000    Old_age   Always       -       5413
192 Power-Off_Retract_Count 0x0032   199   199   000    Old_age   Always       -       1356
193 Load_Cycle_Count        0x0032   199   199   000    Old_age   Always       -       5641
194 Temperature_Celsius     0x0022   106   095   000    Old_age   Always       -       41
196 Reallocated_Event_Count 0x0032   200   200   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       2
198 Offline_Uncorrectable   0x0030   200   200   000    Old_age   Offline      -       3
199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x0008   200   200   000    Old_age   Offline      -       2

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Extended offline    Completed: read failure       90%     16809         120018891
# 2  Extended offline    Completed without error       00%     16525         -
# 3  Short offline       Completed without error       00%     16479         -
# 4  Short offline       Completed without error       00%     16479         -
# 5  Short offline       Completed without error       00%     10771         -
# 6  Short offline       Completed without error       00%     10765         -

ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000f   117   099   006    Pre-fail  Always       -       152860992
  3 Spin_Up_Time            0x0003   097   093   000    Pre-fail  Always       -       0
  4 Start_Stop_Count        0x0032   095   095   020    Old_age   Always       -       5833
  5 Reallocated_Sector_Ct   0x0033   100   100   036    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x000f   075   060   030    Pre-fail  Always       -       38796352
  9 Power_On_Hours          0x0032   089   089   000    Old_age   Always       -       10184
 10 Spin_Retry_Count        0x0013   100   098   097    Pre-fail  Always       -       667
 12 Power_Cycle_Count       0x0032   097   037   020    Old_age   Always       -       3550
184 End-to-End_Error        0x0032   100   100   099    Old_age   Always       -       0
187 Reported_Uncorrect      0x0032   100   100   000    Old_age   Always       -       0
188 Command_Timeout         0x0032   100   097   000    Old_age   Always       -       412323151969
189 High_Fly_Writes         0x003a   100   100   000    Old_age   Always       -       0
190 Airflow_Temperature_Cel 0x0022   067   057   045    Old_age   Always       -       33 (Min/Max 23/34)
194 Temperature_Celsius     0x0022   033   043   000    Old_age   Always       -       33 (0 13 0 0 0)
195 Hardware_ECC_Recovered  0x001a   021   015   000    Old_age   Always       -       152860992
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0010   100   100   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x003e   200   200   000    Old_age   Always       -       0

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Extended offline    Completed without error       00%     10183         -
# 2  Extended offline    Completed without error       00%      9901         -
# 3  Short offline       Completed without error       00%      4189         -
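
To pick out the attributes that matter from tables like these, something along the following lines could help. It is a rough Python sketch, not a smartmontools feature: the function name `flag_attributes` and the choice of watched IDs (5, 197, 198, 199) are my own assumptions, and `SAMPLE` is four rows copied from the first table above.

```python
# Attribute IDs that usually deserve a look when media or link problems are
# suspected. This selection is an assumption made for this sketch.
WATCHED = {
    5: "Reallocated_Sector_Ct",
    197: "Current_Pending_Sector",
    198: "Offline_Uncorrectable",
    199: "UDMA_CRC_Error_Count",
}

# Rows copied from the first smartctl table in this post.
SAMPLE = """\
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       1
198 Offline_Uncorrectable   0x0030   200   200   000    Old_age   Offline      -       9
199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       4
"""


def flag_attributes(table_text):
    """Return {attribute_name: raw_value} for watched attributes with raw > 0."""
    flagged = {}
    for line in table_text.splitlines():
        fields = line.split()
        # Expected columns: ID NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW
        if len(fields) < 10 or not fields[0].isdigit():
            continue  # header, blank line or anything else that is not a data row
        attr_id, name, raw = int(fields[0]), fields[1], fields[9]
        if attr_id in WATCHED and raw.isdigit() and int(raw) > 0:
            flagged[name] = int(raw)
    return flagged


flagged = flag_attributes(SAMPLE)
for name, raw in flagged.items():
    print(f"{name}: raw value {raw}")
```

Only the raw value is checked here; the normalised VALUE/WORST/THRESH columns are ignored to keep the sketch short.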

Last edited by ttheodor (2014-01-06 14:17:07)


#5 2014-01-06 15:56:38

R00KIE
Forum Fellow
From: Between a computer and a chair
Registered: 2008-09-14
Posts: 4,734

Re: HDD errors during boot

You have pending sectors on two drives. I'd say back up everything, completely overwrite the drives with pending sectors (1), then check that no pending sectors remain and, more importantly, that there are no reallocated sectors. After that, restore the backups.

(1) You could go by the SMART logs and try to overwrite only the sectors that can't be read, and also try to figure out which files use those sectors, but a complete overwrite is easier :P

To completely overwrite the disk, I would recommend the four-pass write test done by badblocks (badblocks -w, which destroys all data on the disk). It's a two-in-one: you overwrite the whole disk and check for any problematic sectors.

Alternatively, you could use the ATA Secure Erase feature (for example through hdparm) or dd to overwrite the whole disk with zeros.
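
For the sector-by-sector route mentioned in (1), the arithmetic can be sketched as below. This is an illustration, not a tested recipe. It assumes 512-byte logical sectors and that the LBA_of_first_error values from the self-test logs count from the start of the whole disk. `/dev/sdX` is a placeholder and the function names are hypothetical. The script only builds the dd command and does not run it; running it would destroy whatever is stored in that sector.

```python
SECTOR_SIZE = 512  # assumption: 512-byte logical sectors


def byte_offset(lba, sector_size=SECTOR_SIZE):
    """Byte offset of a logical block address from the start of the disk."""
    if lba < 0:
        raise ValueError("LBA must be non-negative")
    return lba * sector_size


def sector_overwrite_command(device, lba, sector_size=SECTOR_SIZE):
    """Build (but do not run) a dd command that zeroes exactly one sector."""
    if lba < 0:
        raise ValueError("LBA must be non-negative")
    return [
        "dd",
        "if=/dev/zero",
        f"of={device}",        # placeholder device, e.g. /dev/sdX
        f"bs={sector_size}",   # one block == one logical sector
        "count=1",             # write a single block only
        f"seek={lba}",         # skip this many blocks on the output device
        "oflag=direct",        # bypass the page cache
    ]


# LBAs reported in the self-test logs earlier in this thread.
for lba in (141458831, 123811039, 120018891):
    print(lba, byte_offset(lba), " ".join(sector_overwrite_command("/dev/sdX", lba)))
```

Printing the command instead of executing it leaves room to double-check the device name and the LBA before anything is written.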


R00KIE
Tm90aGluZyB0byBzZWUgaGVyZSwgbW92ZSBhbG9uZy4K


#6 2014-01-07 09:05:42

ttheodor
Member
Registered: 2013-12-29
Posts: 31

Re: HDD errors during boot

I've run the four-pass write test on one of the drives and got:
16020 bad blocks found. (28/0/15992 errors)

Is it time for an RMA?
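
For reference, here is a small Python sketch for splitting that summary line into its parts. As far as I know, the three numbers in parentheses are badblocks' read, write and corruption (comparison) error counts, but treat that reading and the function name `parse_badblocks_summary` as assumptions.

```python
import re

# Matches the summary line badblocks prints after a pass.
SUMMARY = re.compile(r"(\d+) bad blocks found\. \((\d+)/(\d+)/(\d+) errors\)")


def parse_badblocks_summary(line):
    """Split a badblocks summary line into total and per-type error counts."""
    m = SUMMARY.search(line)
    if m is None:
        raise ValueError("not a badblocks summary line")
    total, read, write, corruption = map(int, m.groups())
    return {
        "bad_blocks": total,
        "read": read,              # assumed: read errors
        "write": write,            # assumed: write errors
        "corruption": corruption,  # assumed: data read back differed from the pattern
    }


summary = parse_badblocks_summary("16020 bad blocks found. (28/0/15992 errors)")
print(summary)
```

Under that reading, the three counts add up to the total number of bad blocks reported.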


#7 2014-01-07 14:42:04

R00KIE
Forum Fellow
From: Between a computer and a chair
Registered: 2008-09-14
Posts: 4,734

Re: HDD errors during boot

I'd say yes; that test should pass without errors. Before submitting an RMA, though, run the manufacturer's tool. It is usually a live CD that runs their own utility and reports an error code, which might be one of the things asked for in the RMA form.


R00KIE
Tm90aGluZyB0byBzZWUgaGVyZSwgbW92ZSBhbG9uZy4K
