
#1 2011-07-10 10:19:02

fixje
Member
From: Germany
Registered: 2010-02-13
Posts: 9

New WD Hard Drive ATA Link Errors

Hi there,

I just bought a new hard drive for my Lenovo ThinkPad R60e. It is the following model:

# hdparm -I
ATA device, with non-removable media
    Model Number:       WDC WD5000BUDT-63G8FY0                  
    Firmware Revision:  01.01A01
    Transport:          Serial, SATA 1.0a, SATA II Extensions, SATA Rev 2.5, SATA Rev 2.6
Standards:
    Supported: 8 7 6 5 
    Likely used: 8
Configuration:
    Logical        max    current
    cylinders    16383    16383
    heads        16    16
    sectors/track    63    63
    --
    CHS current addressable sectors:   16514064
    LBA    user addressable sectors:  268435455
    LBA48  user addressable sectors:  976773168
    Logical  Sector size:                   512 bytes
    Physical Sector size:                  4096 bytes
    Logical Sector-0 offset:                  0 bytes
    device size with M = 1024*1024:      476940 MBytes
    device size with M = 1000*1000:      500107 MBytes (500 GB)
    cache/buffer size  = unknown
    Nominal Media Rotation Rate: 5400
Capabilities:
    LBA, IORDY(can be disabled)
    Queue depth: 32
    Standby timer values: spec'd by Standard, with device specific minimum
    R/W multiple sector transfer: Max = 16    Current = 0
    Advanced power management level: 128
    Recommended acoustic management value: 128, current value: 254
    DMA: mdma0 mdma1 mdma2 udma0 udma1 udma2 udma3 udma4 *udma5 udma6 
         Cycle time: min=120ns recommended=120ns
    PIO: pio0 pio1 pio2 pio3 pio4 
         Cycle time: no flow control=120ns  IORDY flow control=120ns
Commands/features:
    Enabled    Supported:
       *    SMART feature set
            Security Mode feature set
       *    Power Management feature set
       *    Write cache
       *    Look-ahead
       *    Host Protected Area feature set
       *    WRITE_BUFFER command
       *    READ_BUFFER command
       *    NOP cmd
       *    DOWNLOAD_MICROCODE
       *    Advanced Power Management feature set
            Power-Up In Standby feature set
       *    SET_FEATURES required to spinup after power up
            SET_MAX security extension
       *    Automatic Acoustic Management feature set
       *    48-bit Address feature set
       *    Device Configuration Overlay feature set
       *    Mandatory FLUSH_CACHE
       *    FLUSH_CACHE_EXT
       *    SMART error logging
       *    SMART self-test
            Media Card Pass-Through
       *    General Purpose Logging feature set
       *    64-bit World wide name
       *    URG for READ_STREAM[_DMA]_EXT
       *    URG for WRITE_STREAM[_DMA]_EXT
       *    IDLE_IMMEDIATE with UNLOAD
       *    {READ,WRITE}_DMA_EXT_GPL commands
       *    Segmented DOWNLOAD_MICROCODE
       *    Gen1 signaling speed (1.5Gb/s)
       *    Gen2 signaling speed (3.0Gb/s)
       *    Native Command Queueing (NCQ)
       *    Host-initiated interface power management
       *    Phy event counters
       *    Idle-Unload when NCQ is active
       *    NCQ priority information
            DMA Setup Auto-Activate optimization
            Device-initiated interface power management
       *    Software settings preservation
       *    SMART Command Transport (SCT) feature set
       *    SCT Features Control (AC4)
       *    SCT Data Tables (AC5)
            unknown 206[12] (vendor specific)
            unknown 206[13] (vendor specific)
            unknown 206[14] (vendor specific)
Security: 
    Master password revision code = 65534
        supported
    not    enabled
    not    locked
    not    frozen
    not    expired: security count
        supported: enhanced erase
    138min for SECURITY ERASE UNIT. 138min for ENHANCED SECURITY ERASE UNIT.
Logical Unit WWN Device Identifier: 50014ee60132e99d
    NAA        : 5
    IEEE OUI    : 0014ee
    Unique ID    : 60132e99d
Checksum: correct

Since it uses Advanced Format (4096-byte physical sectors), I partitioned it with aligned start sectors, as you can see here:

# fdisk -u -l
Disk /dev/sda: 500.1 GB, 500107862016 bytes
255 heads, 63 sectors/track, 60801 cylinders, total 976773168 sectors
Units = sectors of 1 * 512 = 512 bytes
Disk identifier: 0x04801fe0

   Device Boot      Start         End      Blocks   Id  System
/dev/sda1            2048      224909      111431   83  Linux
/dev/sda2          224912    44275139    22025114   83  Linux
/dev/sda3        44275144    48484169     2104513   83  Linux
/dev/sda4        48484176   976768064   464141944+  83  Linux
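
As a quick sanity check of the alignment, the start sectors from the table above can be tested against the 4096-byte physical sector size that hdparm reports. This is only a minimal sketch; the helper name `is_aligned` and the `starts` dictionary are made up for illustration.

```python
# Start sectors copied from the fdisk output above (512-byte logical sectors).
LOGICAL_SECTOR = 512    # bytes, "Logical  Sector size" from hdparm
PHYSICAL_SECTOR = 4096  # bytes, "Physical Sector size" from hdparm

starts = {
    "/dev/sda1": 2048,
    "/dev/sda2": 224912,
    "/dev/sda3": 44275144,
    "/dev/sda4": 48484176,
}

def is_aligned(start_sector: int) -> bool:
    """True if the partition's byte offset is a multiple of the physical sector size."""
    return (start_sector * LOGICAL_SECTOR) % PHYSICAL_SECTOR == 0

alignment = {dev: is_aligned(start) for dev, start in starts.items()}
for dev, ok in alignment.items():
    print(dev, "aligned" if ok else "NOT aligned")
```

All four start sectors are multiples of 8, so every partition begins on a 4 KiB boundary. The old DOS default start of sector 63 would not pass this check.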

After that I tried to copy my old system and data from the old hard drive (the new WD drive was installed in the laptop and the old drive was connected via a SATA2USB adaptor). After a few minutes the HDD started to hang and the following messages occurred in "dmesg":

[  934.624568] ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
[  934.624575] ata1.00: BMDMA stat 0x25
[  934.624583] ata1.00: failed command: WRITE DMA EXT
[  934.624596] ata1.00: cmd 35/00:80:68:2b:13/00:01:2d:00:00/e0 tag 0 dma 196608 out
[  934.624599]          res 61/04:10:f0:1c:13/04:00:2d:00:00/e0 Emask 0x1 (device error)
[  934.624606] ata1.00: status: { DRDY DF ERR }
[  934.624611] ata1.00: error: { ABRT }
[  934.631465] ata1.00: failed to read native max address (err_mask=0x1)
[  934.631472] ata1.00: HPA support seems broken, skipping HPA handling
[  934.646871] ata1.00: configured for UDMA/133 (device error ignored)
[  934.646890] ata1: EH complete
....
[ 1234.019934] ata1: link is slow to respond, please be patient (ready=0)
[ 1239.033266] ata1: device not ready (errno=-16), forcing hardreset
[ 1239.033279] ata1: soft resetting link
[ 1244.199930] ata1: link is slow to respond, please be patient (ready=0)
[ 1249.053269] ata1: SRST failed (errno=-16)
[ 1249.053282] ata1: soft resetting link
[ 1254.219921] ata1: link is slow to respond, please be patient (ready=0)
[ 1259.073267] ata1: SRST failed (errno=-16)
[ 1259.073279] ata1: soft resetting link
[ 1264.239938] ata1: link is slow to respond, please be patient (ready=0)
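
To double-check how to read the `cmd` and `res` lines above, here is a rough Python sketch that decodes them. It assumes the usual libata field order (command or status, feature or error, nsect, lbal, lbam, lbah, then the high-order bytes, then device), and the bit tables only list the subset of flags I am sure about. The helper names `decode` and `parse_fields` are made up for this example.

```python
# Subset of the ATA status and error register bits (names as printed by libata).
STATUS_BITS = {0x80: "BSY", 0x40: "DRDY", 0x20: "DF", 0x08: "DRQ", 0x01: "ERR"}
ERROR_BITS = {0x80: "ICRC", 0x40: "UNC", 0x10: "IDNF", 0x04: "ABRT"}

def decode(value, names):
    """Return the names of all known bits set in value, highest bit first."""
    return [name for bit, name in sorted(names.items(), reverse=True) if value & bit]

def parse_fields(dump):
    """Split a register dump such as '61/04:10:...' into a flat list of integers."""
    return [int(b, 16) for group in dump.split("/") for b in group.split(":")]

# Register dumps copied from the dmesg lines above.
cmd = parse_fields("35/00:80:68:2b:13/00:01:2d:00:00/e0")
res = parse_fields("61/04:10:f0:1c:13/04:00:2d:00:00/e0")

status = decode(res[0], STATUS_BITS)  # first byte of 'res' is the status register
error = decode(res[1], ERROR_BITS)    # second byte of 'res' is the error register

# Sector count of the failed write: high byte (hob_nsect) and low byte (nsect).
sectors = (cmd[7] << 8) | cmd[2]
# 48-bit start LBA: hob_lbah, hob_lbam, hob_lbal, lbah, lbam, lbal.
lba = int.from_bytes(bytes([cmd[10], cmd[9], cmd[8], cmd[5], cmd[4], cmd[3]]), "big")

print(status, error, sectors, sectors * 512, lba)
```

The decoded values agree with what the kernel printed: status DRDY DF ERR, error ABRT, and 384 sectors of 512 bytes, which is the 196608 bytes shown after "dma". DF is the device fault bit, so the drive itself is reporting the failure here.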

At first I thought it was a problem with the SATA connector in my notebook, but that shouldn't be the case, as the HDD sits tight and the old one still works. I also tried changing the SATA setting in the BIOS from "AHCI" to "Compatibility" in case it was a driver fault, but that didn't fix the issue either.
For now I have connected the new hard drive through the SATA2USB adaptor and am writing to it with dd. In this setup the errors do not occur at all.

The disk now gives me errors even when connected via USB, so I assume it is a hardware defect. The Western Digital drive check tools won't boot, but I think these errors are more or less unambiguous:

[  268.018935] Buffer I/O error on device sdb, logical block 77649
[  268.018938] lost page write due to I/O error on sdb
[  268.226335] sd 2:0:0:0: [sdb] Unhandled sense code
[  268.226340] sd 2:0:0:0: [sdb]  Result: hostbyte=0x00 driverbyte=0x08
[  268.226345] sd 2:0:0:0: [sdb]  Sense Key : 0x3 [current] 
[  268.226350] sd 2:0:0:0: [sdb]  ASC=0xc ASCQ=0x0
[  268.226355] sd 2:0:0:0: [sdb] CDB: cdb[0]=0x2a: 2a 00 00 09 7b 30 00 00 f0 00
[  268.226366] end_request: I/O error, dev sdb, sector 621360
[  268.434722] sd 2:0:0:0: [sdb] Unhandled sense code
[  268.434732] sd 2:0:0:0: [sdb]  Result: hostbyte=0x00 driverbyte=0x08
[  268.434740] sd 2:0:0:0: [sdb]  Sense Key : 0x3 [current] 
[  268.434748] sd 2:0:0:0: [sdb]  ASC=0xc ASCQ=0x0
[  268.434756] sd 2:0:0:0: [sdb] CDB: cdb[0]=0x2a: 2a 00 00 09 7c 20 00 00 f0 00
[  268.434775] end_request: I/O error, dev sdb, sector 621600
[  268.642711] sd 2:0:0:0: [sdb] Unhandled sense code
[  268.642718] sd 2:0:0:0: [sdb]  Result: hostbyte=0x00 driverbyte=0x08
[  268.642723] sd 2:0:0:0: [sdb]  Sense Key : 0x3 [current] 
[  268.642728] sd 2:0:0:0: [sdb]  ASC=0xc ASCQ=0x0
[  268.642733] sd 2:0:0:0: [sdb] CDB: cdb[0]=0x2a: 2a 00 00 09 7d 10 00 00 f0 00
[  268.642744] end_request: I/O error, dev sdb, sector 621840
[  268.849950] sd 2:0:0:0: [sdb] Unhandled sense code
[  268.849956] sd 2:0:0:0: [sdb]  Result: hostbyte=0x00 driverbyte=0x08
[  268.849961] sd 2:0:0:0: [sdb]  Sense Key : 0x3 [current] 
[  268.849967] sd 2:0:0:0: [sdb]  ASC=0xc ASCQ=0x0
[  268.849971] sd 2:0:0:0: [sdb] CDB: cdb[0]=0x2a: 2a 00 00 09 7e 00 00 00 f0 00
[  268.850004] end_request: I/O error, dev sdb, sector 622080
[  269.057592] sd 2:0:0:0: [sdb] Unhandled sense code
[  269.057598] sd 2:0:0:0: [sdb]  Result: hostbyte=0x00 driverbyte=0x08
[  269.057603] sd 2:0:0:0: [sdb]  Sense Key : 0x3 [current] 
[  269.057608] sd 2:0:0:0: [sdb]  ASC=0xc ASCQ=0x0
[  269.057612] sd 2:0:0:0: [sdb] CDB: cdb[0]=0x2a: 2a 00 00 09 7e f0 00 00 f0 00
[  269.057624] end_request: I/O error, dev sdb, sector 622320
[  299.466711] usb 1-1: reset high speed USB device number 3 using ehci_hcd
[  329.760049] usb 1-1: reset high speed USB device number 3 using ehci_hcd
[  354.454537] usb 1-1: USB disconnect, device number 3
[  354.458395] sd 2:0:0:0: [sdb] Unhandled error code
[  354.458404] sd 2:0:0:0: [sdb]  Result: hostbyte=0x01 driverbyte=0x00
[  354.458412] sd 2:0:0:0: [sdb] CDB: cdb[0]=0x2a: 2a 00 00 09 7f e0 00 00 f0 00
[  354.458431] end_request: I/O error, dev sdb, sector 622560
[  354.458446] quiet_error: 170 callbacks suppressed
[  354.458451] Buffer I/O error on device sdb, logical block 77820
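
For reference, the CDB and sense values in this log can be decoded with a few lines of Python. This is a small sketch, not a general SCSI decoder: the lookup tables only contain the one sense key and the one ASC/ASCQ pair that appear above, and the function name `decode_write10` is made up for illustration.

```python
# Decode the WRITE(10) CDBs and the sense data from the USB log lines above.
SENSE_KEYS = {0x3: "MEDIUM ERROR"}        # only the key seen in this log
ASC_ASCQ = {(0x0C, 0x00): "WRITE ERROR"}  # only the code seen in this log

def decode_write10(cdb_hex):
    """Return (start LBA, block count) of a 10-byte WRITE(10) CDB given as hex text."""
    cdb = bytes.fromhex(cdb_hex)
    if len(cdb) != 10 or cdb[0] != 0x2A:
        raise ValueError("not a WRITE(10) CDB")
    lba = int.from_bytes(cdb[2:6], "big")    # bytes 2..5: logical block address
    count = int.from_bytes(cdb[7:9], "big")  # bytes 7..8: transfer length in blocks
    return lba, count

cdbs = [
    "2a 00 00 09 7b 30 00 00 f0 00",
    "2a 00 00 09 7c 20 00 00 f0 00",
    "2a 00 00 09 7d 10 00 00 f0 00",
    "2a 00 00 09 7e 00 00 00 f0 00",
    "2a 00 00 09 7e f0 00 00 f0 00",
    "2a 00 00 09 7f e0 00 00 f0 00",
]
decoded = [decode_write10(c) for c in cdbs]

for lba, count in decoded:
    print(f"WRITE(10) lba={lba} blocks={count}")
print(SENSE_KEYS[0x3], "/", ASC_ASCQ[(0x0C, 0x00)])
```

The decoded LBAs match the "sector" numbers in the end_request lines, each request is 240 blocks long, and consecutive requests fail one after another. Sense key 0x3 with ASC 0x0c / ASCQ 0x00 reads as a medium error on write.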

For the sake of completeness, here are the SMART values:

smartctl 5.40 2010-02-03 r3060 [i686-pc-linux-gnu] (local build)
Copyright (C) 2002-10 by Bruce Allen, http://smartmontools.sourceforge.net

=== START OF INFORMATION SECTION ===
Device Model:     WDC WD5000BUDT-63G8FY0
Firmware Version: 01.01A01
User Capacity:    500,107,862,016 bytes
Device is:        Not in smartctl database [for details use: -P showall]
ATA Version is:   8
ATA Standard is:  Exact ATA specification draft version not indicated
Local Time is:    Sun Jul 10 09:44:26 2011 CEST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x00)    Offline data collection activity
                    was never started.
                    Auto Offline Data Collection: Disabled.
Self-test execution status:      (   0)    The previous self-test routine completed
                    without error or no self-test has ever 
                    been run.
Total time to complete Offline 
data collection:          (13560) seconds.
Offline data collection
capabilities:              (0x7b) SMART execute Offline immediate.
                    Auto Offline data collection on/off support.
                    Suspend Offline collection upon new
                    command.
                    Offline surface scan supported.
                    Self-test supported.
                    Conveyance Self-test supported.
                    Selective Self-test supported.
SMART capabilities:            (0x0003)    Saves SMART data before entering
                    power-saving mode.
                    Supports SMART auto save timer.
Error logging capability:        (0x01)    Error logging supported.
                    General Purpose Logging supported.
Short self-test routine 
recommended polling time:      (   2) minutes.
Extended self-test routine
recommended polling time:      ( 158) minutes.
Conveyance self-test routine
recommended polling time:      (   5) minutes.
SCT capabilities:            (0x7031)    SCT Status supported.
                    SCT Feature Control supported.
                    SCT Data Table supported.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f   200   200   051    Pre-fail  Always       -       0
  3 Spin_Up_Time            0x0027   189   185   021    Pre-fail  Always       -       1541
  4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always       -       54
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x002e   200   200   000    Old_age   Always       -       0
  9 Power_On_Hours          0x0032   100   100   000    Old_age   Always       -       16
 10 Spin_Retry_Count        0x0032   100   253   000    Old_age   Always       -       0
 11 Calibration_Retry_Count 0x0032   100   253   000    Old_age   Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       50
192 Power-Off_Retract_Count 0x0032   200   200   000    Old_age   Always       -       36
193 Load_Cycle_Count        0x0032   200   200   000    Old_age   Always       -       164
194 Temperature_Celsius     0x0022   111   105   000    Old_age   Always       -       36
196 Reallocated_Event_Count 0x0032   200   200   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0030   100   253   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x0008   100   253   000    Old_age   Offline      -       0

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
No self-tests have been logged.  [To run self-tests, use: smartctl -t]


SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.
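
To make sure I am not overlooking anything in the attribute table, here is a minimal Python sketch that parses a few of the rows above and flags any attribute whose normalized value has dropped to its threshold or whose raw counter is non-zero. Which attributes to look at is my own choice, and the names `parse_row` and `suspicious` are made up for this example.

```python
# Four rows copied verbatim from the smartctl attribute table above.
rows = """\
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0030   100   253   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       0
"""

def parse_row(line):
    """Parse one whitespace-separated smartctl attribute row into a dict."""
    f = line.split()
    return {
        "id": int(f[0]),
        "name": f[1],
        "value": int(f[3]),
        "worst": int(f[4]),
        "thresh": int(f[5]),
        "raw": int(f[9]),
    }

attrs = [parse_row(line) for line in rows.splitlines() if line.strip()]

# Flag attributes at or below their threshold, or with a non-zero raw counter.
suspicious = [a["name"] for a in attrs if a["value"] <= a["thresh"] or a["raw"] != 0]

print("suspicious attributes:", suspicious or "none")
```

None of these attributes is flagged, which fits the "PASSED" result. The counters only reflect what the drive has logged so far, so a clean table by itself does not rule out a defect.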

It doesn't matter which kernel version I use (the latest Arch 2.6.39 or 2.6.23 from a grml live system); the error persists everywhere.

I will just send it back under warranty. In fact, it's more proof of today's hard disk "quality".

EDITED: 1. forgot some detail in dmesg; 2. Finally got it

Last edited by fixje (2011-07-10 11:35:06)
