WRITE SAME failed messages on RAIDs

domanov · 2013-04-09 15:53:46

Hi all,

Since the last kernel upgrade on a headless server

[2013-04-04 09:56] upgraded linux (3.8.4-1 -> 3.8.5-1)

I get loads of these errors:

3w-xxxx: scsi2: Unknown scsi opcode: 0x41
[ 4356.826902] sd 2:0:0:0: [sda] Unhandled error code
[ 4356.826905] sd 2:0:0:0: [sda]  
[ 4356.826907] Result: hostbyte=0x04 driverbyte=0x00
[ 4356.826909] sd 2:0:0:0: [sda] CDB: 
[ 4356.826911] cdb[0]=0x41: 41 00 00 93 28 ee 00 00 08 00
[ 4356.826919] end_request: I/O error, dev sda, sector 9644270
[ 4356.827002] sda3: WRITE SAME failed. Manually zeroing.

sda3 is an ext4 / partition on a 3ware hardware controller (RAID1). The awful tw_cli tool says everything is OK:

Unit  UnitType  Status         %RCmpl  %V/I/M  Stripe  Size(GB)  Cache  AVrfy
------------------------------------------------------------------------------
u0    RAID-1    OK             -       -       -       69.2482   ON     -      

Port   Status           Unit   Size        Blocks        Serial
---------------------------------------------------------------
p0     OK               u0     69.25 GB    145226112     WD-WMANS1886705     
p1     OK               u0     69.25 GB    145226112     WD-WMANS1886623

I did not find relevant clues by googling.
Do any of you guys know whether this issue is harmful or can be ignored? Since otherwise everything seems fine...

Cheers,

domanov

code tags as opposed to quote tags please -- Inxsible

Last edited by Inxsible (2013-04-09 16:49:22)

alphaniner · 2013-04-09 16:39:07

Did you try a manual verify?

Inxsible · 2013-04-09 16:50:39

domanov, use code tags, not quote tags. Quote tags are reserved for quoting another person's post or comment. Code tags provide a better monospaced font, which makes it easier to distinguish between your post text and snippets. They also provide scrollbars in case the snippets are long.

domanov · 2013-04-09 17:07:05

Inxsible wrote:

domanov, use code tags. not quote tags. Quote tags are reserved for quoting another person's post or comment.

Sorry mate, thanks for pointing that out and for the correction in the OP.

alphaniner wrote:

Did you try a manual verify?

It's running right now, but it does not seem to be the right tw_cli client (I repeat: awful support from 3ware):

[root@server ~] ./tw_cli /c2/u0 show all
/c2/u0 show all
/c2/u0 status = VERIFYING
/c2/u0 is not rebuilding, its current state is VERIFYING
/c2/u0 is verifying with percent completion = 54%
/c2/u0 is not initializing. Its current state is VERIFYING
/c2/u0 Write Cache = on
Error: (CLI:149) Feature not supported.

I did not find any clue about that error yet, not even in the manual.
Cheers,
domanov

Last edited by domanov (2013-04-09 17:11:06)

alphaniner · 2013-04-09 17:30:40

Apparently, WRITE SAME is an optional drive feature that allows writing the same data to consecutive sectors of the drive. A search for "WRITE SAME 3ware" turned up an offsite (i.e. not hosted by 3ware) 3ware manual that suggests it's only or primarily used during RAID initialization to speed things up.

WRITE SAME seems to be part of the "SMART Command Transport" feature set, so a good first step would be to make sure SMART is enabled on your drives. You could also try checking out results for "WD740ADFD sct" (surprised I figured out your drive model?) to see if there's anything related to your particular drive. If you're feeling particularly froggy (and you have backups), you can apparently send WRITE SAME commands manually using the sg_write_same command, which is in the sg3_utils package.
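
Side note: the CDB bytes in the kernel log from the first post can be decoded by hand. The snippet below is only a minimal sketch, assuming the standard 10-byte WRITE SAME(10) layout (opcode 0x41, a flags byte, a 32-bit big-endian LBA, a group number, a 16-bit block count, a control byte). The function name decode_write_same10 is made up for illustration.

```python
import struct


def decode_write_same10(cdb_hex: str) -> dict:
    """Decode a WRITE SAME(10) CDB given as space-separated hex bytes."""
    cdb = bytes.fromhex(cdb_hex)
    if len(cdb) != 10:
        raise ValueError("expected a 10-byte CDB")
    # Big-endian fields: opcode, flags, LBA (32 bit), group number,
    # number of blocks (16 bit), control byte.
    opcode, _flags, lba, _group, blocks, _control = struct.unpack(">BBIBHB", cdb)
    if opcode != 0x41:
        raise ValueError("not a WRITE SAME(10) CDB")
    return {"opcode": opcode, "lba": lba, "blocks": blocks}


# CDB copied from the "cdb[0]=0x41: ..." line in the first post.
print(decode_write_same10("41 00 00 93 28 ee 00 00 08 00"))
```

The decoded LBA comes out as 9644270, the same sector the end_request line complains about, and the block count is 8. So the command the controller rejected was a request to write the same data to eight consecutive sectors starting there.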

domanov · 2013-04-10 09:39:00

alphaniner wrote:

Apparently, WRITE SAME is an optional drive feature that allows writing the same data to consecutive sectors of the drive. A search for "WRITE SAME 3ware" turned up an offsite (i.e. not hosted by 3ware) 3ware manual that suggests it's only or primarily used during RAID initialization to speed things up.
WRITE SAME seems to be part of the "SMART Command Transport" feature set, so a good first step would be to make sure SMART is enabled on your drives.

Thanks alphaniner for pointing me there. This is the output of smartctl on one of the two disks in RAID1 (the other gave no error whatsoever):

[root@server ~]# smartctl --all -T permissive --device=3ware,0 /dev/twe0
smartctl 6.1 2013-03-16 r3800 [x86_64-linux-3.8.6-1-ARCH] (local build)
Copyright (C) 2002-13, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Family:     Western Digital Raptor
Device Model:     WDC WD740ADFD-00NLR5
Serial Number:    WD-WMANS1886705
Firmware Version: 21.07QR5
User Capacity:    74,355,769,344 bytes [74.3 GB]
Sector Size:      512 bytes logical/physical
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   ATA/ATAPI-7 published, ANSI INCITS 397-2005
Local Time is:    Wed Apr 10 10:59:26 2013 CEST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x82) Offline data collection activity
                                        was completed without error.
                                        Auto Offline Data Collection: Enabled.
Self-test execution status:      (   0) The previous self-test routine completed
                                        without error or no self-test has ever 
                                        been run.
Total time to complete Offline 
data collection:                ( 2391) seconds.
Offline data collection
capabilities:                    (0x7b) SMART execute Offline immediate.
                                        Auto Offline data collection on/off support.
                                        Suspend Offline collection upon new
                                        command.
                                        Offline surface scan supported.
                                        Self-test supported.
                                        Conveyance Self-test supported.
                                        Selective Self-test supported.
SMART capabilities:            (0x0003) Saves SMART data before entering
                                        power-saving mode.
                                        Supports SMART auto save timer.
Error logging capability:        (0x01) Error logging supported.
                                        General Purpose Logging supported.
Short self-test routine 
recommended polling time:        (   2) minutes.
Extended self-test routine
recommended polling time:        (  39) minutes.
Conveyance self-test routine
recommended polling time:        (   5) minutes.
SCT capabilities:              (0x103f) SCT Status supported.
                                        SCT Error Recovery Control supported.
                                        SCT Feature Control supported.
                                        SCT Data Table supported.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000b   200   196   051    Pre-fail  Always       -       1
  3 Spin_Up_Time            0x0007   169   169   021    Pre-fail  Always       -       2566
  4 Start_Stop_Count        0x0032   100   100   040    Old_age   Always       -       67
  5 Reallocated_Sector_Ct   0x0033   199   199   140    Pre-fail  Always       -       13
  7 Seek_Error_Rate         0x000a   200   200   051    Old_age   Always       -       0
  9 Power_On_Hours          0x0032   045   045   000    Old_age   Always       -       40164
 10 Spin_Retry_Count        0x0012   100   253   051    Old_age   Always       -       0
 11 Calibration_Retry_Count 0x0012   100   253   051    Old_age   Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       62
194 Temperature_Celsius     0x0022   116   106   000    Old_age   Always       -       27
196 Reallocated_Event_Count 0x0032   187   187   000    Old_age   Always       -       13
197 Current_Pending_Sector  0x0012   191   188   000    Old_age   Always       -       113
198 Offline_Uncorrectable   0x0012   200   197   000    Old_age   Always       -       3
199 UDMA_CRC_Error_Count    0x000a   200   253   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x0008   198   193   051    Old_age   Offline      -       60

SMART Error Log Version: 1
ATA Error Count: 24 (device log contains only the most recent five errors)
        CR = Command Register [HEX]
        FR = Features Register [HEX]
        SC = Sector Count Register [HEX]
        SN = Sector Number Register [HEX]
        CL = Cylinder Low Register [HEX]
        CH = Cylinder High Register [HEX]
        DH = Device/Head Register [HEX]
        DC = Device Command Register [HEX]
        ER = Error register [HEX]
        ST = Status register [HEX]
Powered_Up_Time is measured from power on, and printed as
DDd+hh:mm:SS.sss where DD=days, hh=hours, mm=minutes,
SS=sec, and sss=millisec. It "wraps" after 49.710 days.

Error 24 occurred at disk power-on lifetime: 40148 hours (1672 days + 20 hours)
  When the command that caused the error occurred, the device was doing SMART Offline or Self-test.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  40 51 40 df 87 c3 e0  Error: UNC 64 sectors at LBA = 0x00c387df = 12814303

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  25 00 40 c0 87 c3 41 00   2d+04:24:43.728  READ DMA EXT
  25 00 40 80 87 c3 41 00   2d+04:24:43.728  READ DMA EXT
  25 00 40 40 87 c3 41 00   2d+04:24:43.728  READ DMA EXT
  25 00 40 00 87 c3 41 00   2d+04:24:43.728  READ DMA EXT
  25 00 40 c0 86 c3 41 00   2d+04:24:43.728  READ DMA EXT

Error 23 occurred at disk power-on lifetime: 40148 hours (1672 days + 20 hours)
[...]

All five errors in the SMART log are "only" UNC errors on read. If I understand correctly from googling around, this could be a bad cable connection rather than the drive really failing (a couple of days ago someone worked on the rack where this server is hosted...).
Anyway, the other drive seems fine, and since this 2-disk RAID1 unit holds just the / partition (it's a plain Arch installation, the server is a "number cruncher"; all actual data is on a separate 4-disk RAID5 unit), I just backed up /etc out of an excess of zeal.
For now I think I'll wait for the weekend to run long SMART tests, badblocks and such.
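
Side note: the LBA that smartctl prints for Error 24 can be rebuilt from the register dump above it. This is a rough sketch, assuming the classic 28-bit layout (low nibble of DH, then CH, CL, SN); the helper name lba28_from_registers is hypothetical and not part of smartctl.

```python
def lba28_from_registers(sn: int, cl: int, ch: int, dh: int) -> int:
    """Combine ATA task-file registers into a 28-bit LBA."""
    # Only the low nibble of Device/Head carries LBA bits 24..27.
    return ((dh & 0x0F) << 24) | (ch << 16) | (cl << 8) | sn


# Register values from the "40 51 40 df 87 c3 e0" line of Error 24.
lba = lba28_from_registers(sn=0xDF, cl=0x87, ch=0xC3, dh=0xE0)
print(hex(lba), lba)
```

With the values from the log this gives 0xc387df, i.e. 12814303, which is the LBA shown in the "Error: UNC 64 sectors" line.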

Any further hint or comment is still appreciated! Do you think it's time to replace the drive?
Thanks,
domanov

R00KIE · 2013-04-11 15:19:54

I'd say replace that disk. You already have reallocated sectors and more are probably waiting to be reallocated; that drive _may_ be starting to fail.

domanov wrote:

  5 Reallocated_Sector_Ct   0x0033   199   199   140    Pre-fail  Always       -       13
196 Reallocated_Event_Count 0x0032   187   187   000    Old_age   Always       -       13
197 Current_Pending_Sector  0x0012   191   188   000    Old_age   Always       -       113
198 Offline_Uncorrectable   0x0012   200   197   000    Old_age   Always       -       3
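
The check on those four attributes can be scripted. The following is only a sketch, assuming the smartctl attribute table layout pasted above (ten whitespace-separated columns with RAW_VALUE last); the names WATCHED and nonzero_watched_attributes are made up for illustration, and which attributes to watch is a judgment call, not an official list.

```python
# Attribute IDs quoted above: reallocated sectors/events, pending, uncorrectable.
WATCHED = {5, 196, 197, 198}


def nonzero_watched_attributes(table: str) -> dict:
    """Return {attribute name: raw value} for watched attributes with raw > 0."""
    found = {}
    for line in table.splitlines():
        fields = line.split()
        # Attribute rows have 10 columns and start with a numeric ID.
        if len(fields) < 10 or not fields[0].isdigit():
            continue
        raw = fields[9]
        if int(fields[0]) in WATCHED and raw.isdigit() and int(raw) > 0:
            found[fields[1]] = int(raw)
    return found


# Rows copied from the smartctl output earlier in the thread.
SAMPLE = """\
  5 Reallocated_Sector_Ct   0x0033   199   199   140    Pre-fail  Always       -       13
  7 Seek_Error_Rate         0x000a   200   200   051    Old_age   Always       -       0
196 Reallocated_Event_Count 0x0032   187   187   000    Old_age   Always       -       13
197 Current_Pending_Sector  0x0012   191   188   000    Old_age   Always       -       113
198 Offline_Uncorrectable   0x0012   200   197   000    Old_age   Always       -       3
"""

for name, raw in nonzero_watched_attributes(SAMPLE).items():
    print(f"{name}: {raw}")
```

Any non-empty result is a reason to look closer at the drive. Running the same check again a few days later and comparing the raw values shows whether the counts are still growing.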

domanov · 2013-04-12 09:14:42

R00KIE wrote:

I'd say replace that disk. You already have reallocated sectors and more are probably waiting to be reallocated; that drive _may_ be starting to fail.

Yes, thank you, I agree.

  5 Reallocated_Sector_Ct   0x0033   199   199   140    Pre-fail  Always       -       13
196 Reallocated_Event_Count 0x0032   187   187   000    Old_age   Always       -       13
197 Current_Pending_Sector  0x0012   191   188   000    Old_age   Always       -       113
198 Offline_Uncorrectable   0x0012   200   197   000    Old_age   Always       -       3

Right now the Current_Pending_Sector is at 114. The disk IS going to fail. Can I just wait till it happens (since it is a 2-disk RAID-1 and the other disk is fine), or could a failed disk cause downtime on the server? Any experience?

Cheers,
domanov

R00KIE · 2013-04-12 14:52:20

In theory RAID1 should keep working even when n-1 disks fail. However, rebuilding the array after adding the new disk will stress the remaining disks, which might lead them to fail as well (you can find this warning in almost every corner of the internet that talks about RAID).

I also suspect that you may get some performance degradation when using a disk that is not working well. The drive will take some time before reporting an error when asked to read one of those Current_Pending_Sector/Offline_Uncorrectable sectors, and after that the data has to be retrieved from the other disk. How much performance degradation you get depends on how often you read from those sectors, and on how often new sectors become unreadable and turn into pending sectors.

Arch Linux
