#1 2015-12-03 20:23:34

UglyBob
Member
Registered: 2015-11-29
Posts: 54

MD Raid problems

Hi!

I have had so much trouble with my raid since installing Arch Linux (I had Ubuntu before). As there is a lot of new stuff to learn, it could be that I configured something wrong, so any hints are useful.

First problem:
Every time I reboot, my raid is marked as unclean and a resync is started that takes about 24h... *sigh* You can see the result below after a reboot:

[    7.407652] md: bind<sde1>
[    7.414657] md: bind<sdf1>
[    7.424487] md: bind<sdd1>
[    7.447411] md: bind<sdc1>
[    7.448879] input: HDA Digital PCBeep as /devices/pci0000:00/0000:00:1b.0/sound/card0/input6
[    7.450930] input: HDA Intel Front Mic as /devices/pci0000:00/0000:00:1b.0/sound/card0/input7
[    7.451106] input: HDA Intel Rear Mic as /devices/pci0000:00/0000:00:1b.0/sound/card0/input8
[    7.451272] input: HDA Intel Line as /devices/pci0000:00/0000:00:1b.0/sound/card0/input9
[    7.451457] input: HDA Intel Line Out as /devices/pci0000:00/0000:00:1b.0/sound/card0/input10
[    7.451630] input: HDA Intel Front Headphone as /devices/pci0000:00/0000:00:1b.0/sound/card0/input11
[    7.527207] raid6: sse2x1   gen()    99 MB/s
[    7.530478] Adding 8286204k swap on /dev/sda2.  Priority:-1 extents:1 across:8286204k SSFS
[    7.583386] raid6: sse2x1   xor()   653 MB/s
[    7.589432] EXT4-fs (sda1): mounted filesystem with ordered data mode. Opts: data=ordered
[    7.640268] raid6: sse2x2   gen()   213 MB/s
[    7.662956] iTCO_vendor_support: vendor-support=0
[    7.665988] iTCO_wdt: Intel TCO WatchDog Timer Driver v1.11
[    7.666089] iTCO_wdt: Found a NM10 TCO device (Version=2, TCOBASE=0x0860)
[    7.667395] iTCO_wdt: initialized. heartbeat=30 sec (nowayout=0)
[    7.668185] gpio_ich: GPIO from 462 to 511 on gpio_ich
[    7.678691] r8169 0000:03:0b.0 enp3s11: renamed from eth0
[    7.684950] ath: phy0: Enable LNA combining
[    7.686523] ath: EEPROM regdomain: 0x60
[    7.686525] ath: EEPROM indicates we should expect a direct regpair map
[    7.686529] ath: Country alpha2 being used: 00
[    7.686531] ath: Regpair used: 0x60
[    7.696729] raid6: sse2x2   xor()   986 MB/s
[    7.705282] ieee80211 phy0: Selected rate control algorithm 'minstrel_ht'
[    7.706846] ieee80211 phy0: Atheros AR9285 Rev:2 mem=0xffffc90000760000, irq=19
[    7.750896] ath9k 0000:02:00.0 wls35: renamed from wlan0
[    7.753396] raid6: sse2x4   gen()   396 MB/s
[    7.810104] raid6: sse2x4   xor()   481 MB/s
[    7.810108] raid6: using algorithm sse2x4 gen() 396 MB/s
[    7.810110] raid6: .... xor() 481 MB/s, rmw enabled
[    7.810112] raid6: using ssse3x2 recovery algorithm
[    7.811440] async_tx: api initialized (async)
[    7.812914] xor: measuring software checksum speed
[    7.843383]    prefetch64-sse:  5108.400 MB/sec
[    7.876682]    generic_sse:  5277.600 MB/sec
[    7.876686] xor: using function: generic_sse (5277.600 MB/sec)
[    7.887637] md: raid6 personality registered for level 6
[    7.887640] md: raid5 personality registered for level 5
[    7.887641] md: raid4 personality registered for level 4
[    7.888619] md/raid:md0: not clean -- starting background reconstruction
[    7.888653] md/raid:md0: device sdc1 operational as raid disk 0
[    7.888656] md/raid:md0: device sdd1 operational as raid disk 2
[    7.888659] md/raid:md0: device sdf1 operational as raid disk 3
[    7.888661] md/raid:md0: device sde1 operational as raid disk 1
[    7.889604] md/raid:md0: allocated 4366kB
[    7.889739] md/raid:md0: raid level 5 active with 4 out of 4 devices, algorithm 2
[    7.889740] RAID conf printout:
[    7.889742]  --- level:5 rd:4 wd:4
[    7.889745]  disk 0, o:1, dev:sdc1
[    7.889748]  disk 1, o:1, dev:sde1
[    7.889750]  disk 2, o:1, dev:sdd1
[    7.889752]  disk 3, o:1, dev:sdf1
[    7.889794] md0: Warning: Device sdd1 is misaligned
[    7.889900] md0: detected capacity change from 0 to 9001778479104
[    7.890155] md: resync of RAID array md0
[    7.890158] md: minimum _guaranteed_  speed: 1000 KB/sec/disk.
[    7.890161] md: using maximum available idle IO bandwidth (but not more than 200000 KB/sec) for resync.
[    7.890170] md: using 128k window, over a total of 2930266432k.

2nd problem:
It takes forever to reboot. It could be related to the first problem, so I'm adding it here as well. I haven't figured out if there is a way to debug this without having a screen connected to the server; I don't have one right now and it's a bit hard to fix that. Is there any way I could make it save the last log somehow?
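
The closest thing I've found so far is making systemd's journal persistent, so the previous boot's log survives a reboot. A sketch of that (untested on my box, assuming the default journald setup on Arch):

sudo mkdir -p /var/log/journal        # journald stores logs persistently once this directory exists
sudo systemctl restart systemd-journald
journalctl -b -1 -p warning           # after the next reboot: warnings and errors from the previous boot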

3rd problem:
Sometimes when copying large files I get a lot of call traces like this:

[315991.664637] ------------[ cut here ]------------
[315991.664667] WARNING: CPU: 2 PID: 257 at drivers/md/raid5.c:4244 break_stripe_batch_list+0x1b9/0x260 [raid456]()
[315991.664672] Modules linked in: iptable_filter cfg80211 rfkill raid456 async_raid6_recov async_memcpy async_pq async_xor xor async_tx iTCO_wdt evdev gpio_ich iTCO_vendor_support mac_hid raid6_pq coretemp i915 pcspkr md_mod psmouse serio_raw snd_hda_codec_realtek snd_hda_codec_generic i2c_i801 snd_hda_intel r8169 mii lpc_ich video snd_hda_codec drm_kms_helper snd_hda_core snd_hwdep snd_pcm drm snd_timer snd intel_agp i2c_algo_bit shpchp intel_gtt soundcore button acpi_cpufreq processor sch_fq_codel nfsd nfs auth_rpcgss oid_registry nfs_acl lockd grace sunrpc fscache ip_tables x_tables ext4 crc16 mbcache jbd2 sd_mod ata_generic pata_acpi atkbd libps2 ahci libahci ata_piix pata_jmicron uhci_hcd usbcore libata usb_common scsi_mod i8042 serio
[315991.664823] CPU: 2 PID: 257 Comm: md0_raid5 Not tainted 4.2.5-1-ARCH #1
[315991.664829] Hardware name:      /NM10, BIOS 080016  07/19/2011
[315991.664834]  0000000000000000 000000008bdf44d3 ffff8800b667ba98 ffffffff81570d0a
[315991.664844]  0000000000000000 0000000000000000 ffff8800b667bad8 ffffffff810748a6
[315991.664852]  ffff8800b667bb38 ffff8800461599a0 ffff8800b66e5348 ffff880082b74ce0
[315991.664860] Call Trace:
[315991.664876]  [<ffffffff81570d0a>] dump_stack+0x4c/0x6e
[315991.664887]  [<ffffffff810748a6>] warn_slowpath_common+0x86/0xc0
[315991.664895]  [<ffffffff810749da>] warn_slowpath_null+0x1a/0x20
[315991.664907]  [<ffffffffa05fcf39>] break_stripe_batch_list+0x1b9/0x260 [raid456]
[315991.664919]  [<ffffffffa060424c>] handle_stripe+0x9bc/0x2560 [raid456]
[315991.664930]  [<ffffffff812b4c6f>] ? cpumask_next_and+0x2f/0x40
[315991.664943]  [<ffffffffa0605f73>] handle_active_stripes.isra.22+0x183/0x4c0 [raid456]
[315991.664957]  [<ffffffffa060687f>] raid5d+0x49f/0x670 [raid456]
[315991.664973]  [<ffffffffa04550a0>] md_thread+0x130/0x140 [md_mod]
[315991.664984]  [<ffffffff810b4c80>] ? wake_atomic_t_function+0x60/0x60
[315991.664996]  [<ffffffffa0454f70>] ? md_wait_for_blocked_rdev+0x130/0x130 [md_mod]
[315991.665007]  [<ffffffff81092578>] kthread+0xd8/0xf0
[315991.665017]  [<ffffffff810924a0>] ? kthread_worker_fn+0x170/0x170
[315991.665028]  [<ffffffff8157665f>] ret_from_fork+0x3f/0x70
[315991.665038]  [<ffffffff810924a0>] ? kthread_worker_fn+0x170/0x170
[315991.665046] ---[ end trace 6558f4ff075bfa8c ]---

It has also happened that the copy just hung forever (until I rebooted). I have no idea what's going on.

Here is some info about my raid:

/dev/md0:
        Version : 0.90
  Creation Time : Fri Feb 13 22:16:46 2009
     Raid Level : raid5
     Array Size : 8790799296 (8383.56 GiB 9001.78 GB)
  Used Dev Size : -1
   Raid Devices : 4
  Total Devices : 4
Preferred Minor : 0
    Persistence : Superblock is persistent

    Update Time : Thu Dec  3 21:17:12 2015
          State : active, resyncing
 Active Devices : 4
Working Devices : 4
 Failed Devices : 0
  Spare Devices : 0

         Layout : left-symmetric
     Chunk Size : 64K

  Resync Status : 0% complete

           UUID : 4a529d12:38740437:6b618a7b:b902ea28
         Events : 0.142545

    Number   Major   Minor   RaidDevice State
       0       8       33        0      active sync   /dev/sdc1
       1       8       65        1      active sync   /dev/sde1
       2       8       49        2      active sync   /dev/sdd1
       3       8       81        3      active sync   /dev/sdf1

My mdadm.conf:

# mdadm configuration file
#
# mdadm will function properly without the use of a configuration file,
# but this file is useful for keeping track of arrays and member disks.
# In general, a mdadm.conf file is created, and updated, after arrays
# are created. This is the opposite behavior of /etc/raidtab which is
# created prior to array construction.
#
#
# the config file takes two types of lines:
#
#       DEVICE lines specify a list of devices of where to look for
#         potential member disks
#
#       ARRAY lines specify information about how to identify arrays so
#         so that they can be activated
#


# You can have more than one device line and use wild cards. The first
# example includes SCSI the first partition of SCSI disks /dev/sdb,
# /dev/sdc, /dev/sdd, /dev/sdj, /dev/sdk, and /dev/sdl. The second
# line looks for array slices on IDE disks.
#
#DEVICE /dev/sd[bcdjkl]1
#DEVICE /dev/hda1 /dev/hdb1
#
# The designation "partitions" will scan all partitions found in
# /proc/partitions
DEVICE partitions


# ARRAY lines specify an array to assemble and a method of identification.
# Arrays can currently be identified by using a UUID, superblock minor number,
# or a listing of devices.
#
#       super-minor is usually the minor number of the metadevice
#       UUID is the Universally Unique Identifier for the array
# Each can be obtained using
#
#       mdadm -D <md>
#
# To capture the UUIDs for all your RAID arrays to this file, run these:
#    to get a list of running arrays:
#    # mdadm -D --scan >>/etc/mdadm.conf
#    to get a list from superblocks:
#    # mdadm -E --scan >>/etc/mdadm.conf
#
#ARRAY /dev/md0 UUID=3aaa0122:29827cfa:5331ad66:ca767371
#ARRAY /dev/md1 super-minor=1
#ARRAY /dev/md2 devices=/dev/hda1,/dev/hdb1
#
# ARRAY lines can also specify a "spare-group" for each array.  mdadm --monitor
# will then move a spare between arrays in a spare-group if one array has a
# failed drive but no spare
#ARRAY /dev/md4 uuid=b23f3c6d:aec43a9f:fd65db85:369432df spare-group=group1
#ARRAY /dev/md5 uuid=19464854:03f71b1b:e0df2edd:246cc977 spare-group=group1
#


# When used in --follow (aka --monitor) mode, mdadm needs a
# mail address and/or a program.  To start mdadm's monitor mode, enable
# mdadm.service in systemd.
#
# If the lines are not found, mdadm will exit quietly
MAILADDR mattias.mansson@gmail.com
#PROGRAM /usr/sbin/handle-mdadm-events
ARRAY /dev/md0 metadata=0.90 UUID=4a529d12:38740437:6b618a7b:b902ea28

Kernel version:

Linux uglybob 4.2.5-1-ARCH #1 SMP PREEMPT Tue Oct 27 08:13:28 CET 2015 x86_64 GNU/Linux

Last edited by UglyBob (2015-12-04 08:06:26)

#2 2015-12-03 20:30:43

jasonwryan
Anarchist
From: .nz
Registered: 2009-05-09
Posts: 30,424

Re: MD Raid problems

Please edit your post to use code tags; it makes it much easier to read:
https://wiki.archlinux.org/index.php/Fo … s_and_code


Arch + dwm   •   Mercurial repos  •   Surfraw

Registered Linux User #482438

#3 2015-12-04 08:55:04

mich41
Member
Registered: 2012-06-22
Posts: 796

Re: MD Raid problems

So you are getting kernel warnings and constant resyncs. I think it's entirely possible that the raid driver is somehow broken in this particular kernel version, and that you are really pushing your luck by booting this thing again and again.

The first thing I'd do is mount this array on a known-good kernel, let it resync, and check that the data is still there. Then maybe try linux-lts.
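
If the unclean shutdowns keep happening, a write-intent bitmap would at least shrink the resyncs to the regions that were actually dirty. A one-liner sketch (run against the assembled array; removable again with --bitmap=none):

sudo mdadm --grow --bitmap=internal /dev/md0    # adds an internal write-intent bitmap to md0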

BTW, I'm not sure what to think about this:

[    7.889794] md0: Warning: Device sdd1 is misaligned

Why only sdd1? Aren't these disks identical?

#4 2015-12-04 09:20:09

UglyBob
Member
Registered: 2015-11-29
Posts: 54

Re: MD Raid problems

mich41 wrote:

So you are getting kernel warnings and constant resyncs. I think it's entirely possible that the raid driver is somehow broken in this particular kernel version, and that you are really pushing your luck by booting this thing again and again.

The first thing I'd do is mount this array on a known-good kernel, let it resync, and check that the data is still there. Then maybe try linux-lts.

BTW, I'm not sure what to think about this:

[    7.889794] md0: Warning: Device sdd1 is misaligned

Why only sdd1? Aren't these disks identical?

Ok, I will look into switching to linux-lts.

Funny you should comment on that: all these problems made me look into my disks, and I discovered that even though I bought all four drives at the same time and they are the exact same model, they are NOT identical. WD are really annoying and have released different subversions of the same drives without writing anything about it on the labels. So I have three identical drives and one different one. Here you can see this:

https://wikidevi.com/wiki/Western_Digital_Green

I assume this gives me worse performance, but I never had any other problems before with Ubuntu (and an older kernel, of course).

Last edited by UglyBob (2015-12-04 10:12:53)

#5 2015-12-04 11:38:51

TheChickenMan
Member
From: United States
Registered: 2015-07-25
Posts: 354

Re: MD Raid problems

I haven't noticed any issues with md raid 5 / raid 6 on the current kernel. Also, there is no real reason why you need to be using the same disks, or even disks of the same size. If you have the partitions set up correctly it shouldn't matter at all. The raid will always work at the best possible performance level of the slowest disk; you don't lose anything just by having them be different. Trying the LTS kernel is probably a good idea though.

Also, why is your raid metadata version so old? The current default is 1.2 now.

ARRAY /dev/md0 metadata=0.90 UUID=4a529d12:38740437:6b618a7b:b902ea28

Last edited by TheChickenMan (2015-12-04 11:40:01)


If quantum mechanics hasn't profoundly shocked you, you haven't understood it yet.
Niels Bohr

#6 2015-12-04 11:56:10

UglyBob
Member
Registered: 2015-11-29
Posts: 54

Re: MD Raid problems

TheChickenMan wrote:

I haven't noticed any issues with md raid 5 / raid 6 on the current kernel. Also, there is no real reason why you need to be using the same disks, or even disks of the same size. If you have the partitions set up correctly it shouldn't matter at all. The raid will always work at the best possible performance level of the slowest disk; you don't lose anything just by having them be different. Trying the LTS kernel is probably a good idea though.

Also, why is your raid metadata version so old? The current default is 1.2 now.

ARRAY /dev/md0 metadata=0.90 UUID=4a529d12:38740437:6b618a7b:b902ea28

Yeah, I know, but it's still annoying that I have a misaligned disk because of that. I could probably fix it, I guess, but I haven't had the time or the will to do it yet.

The raid is pretty old, so I guess that's why? I just moved it from Ubuntu to Arch. Is it a problem, and can I upgrade it then?

#7 2015-12-04 12:32:34

frostschutz
Member
Registered: 2013-11-15
Posts: 1,418

Re: MD Raid problems

While 0.90 metadata is old, it should not cause problems (if it did not cause a problem in the first place, when it was new). 0.90 metadata sits at the end of the device and has no real record of the device name or size, so there can be confusion when the partition goes to the very end of the disk: the metadata might refer either to the last partition or to the entire disk.

Can you show your partitions (parted /dev/disk unit s print free), and mdadm --examine /dev/sd*? Checking smartctl -a for each disk is also never a bad idea...
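
For example, something along these lines (a sketch; adjust the device letters to match your setup):

for d in /dev/sd[cdef]; do sudo parted "$d" unit s print free; done
sudo mdadm --examine /dev/sd*
for d in /dev/sd[cdef]; do sudo smartctl -a "$d"; done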

It's possible to upgrade 0.90 metadata to 1.0 metadata (on assemble, --update=metadata) but it should not be necessary. There may also be side effects (mdadm.conf needs an update and such).
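
If you ever do go that route, it would look roughly like this (a sketch; the array must be stopped and clean, and you should have backups):

sudo mdadm --stop /dev/md0
sudo mdadm --assemble /dev/md0 --update=metadata /dev/sd[cdef]1    # converts 0.90 metadata to 1.0 on assembly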

Last edited by frostschutz (2015-12-04 12:34:55)

#8 2015-12-04 13:23:35

UglyBob
Member
Registered: 2015-11-29
Posts: 54

Re: MD Raid problems

[mattias@uglybob ~]$ sudo parted /dev/sdc unit s print free
Model: ATA WDC WD30EZRX-00M (scsi)
Disk /dev/sdc: 5860533168s
Sector size (logical/physical): 512B/512B
Partition Table: gpt
Disk Flags: 

Number  Start  End          Size         File system  Name        Flags
 1      34s    5860533134s  5860533101s  ext3         Linux RAID  raid

[mattias@uglybob ~]$ sudo parted /dev/sdd unit s print free
Model: ATA WDC WD30EZRX-00M (scsi)
Disk /dev/sdd: 5860533168s
Sector size (logical/physical): 512B/4096B
Partition Table: gpt
Disk Flags: 

Number  Start  End          Size         File system  Name        Flags
 1      34s    5860533134s  5860533101s               Linux RAID  raid

[mattias@uglybob ~]$ sudo parted /dev/sde unit s print free
Model: ATA WDC WD30EZRX-00D (scsi)
Disk /dev/sde: 5860533168s
Sector size (logical/physical): 512B/4096B
Partition Table: gpt
Disk Flags: 

Number  Start  End          Size         File system  Name        Flags
 1      34s    5860533134s  5860533101s               Linux RAID  raid

[mattias@uglybob ~]$ sudo parted /dev/sdf unit s print free
Model: ATA WDC WD30EZRX-00M (scsi)
Disk /dev/sdf: 5860533168s
Sector size (logical/physical): 512B/4096B
Partition Table: gpt
Disk Flags: 

Number  Start  End          Size         File system  Name        Flags
 1      34s    5860533134s  5860533101s  ext3         Linux RAID  raid

I listed all partitions as you said; sda* and sdb* are not part of the raid though...

/dev/sda:
   MBR Magic : aa55
Partition[0] :       202752 sectors at         2048 (type 83)
Partition[1] :     16572416 sectors at       204800 (type 82)
Partition[2] :    471619584 sectors at     16777216 (type 83)
mdadm: No md superblock detected on /dev/sda1.
mdadm: No md superblock detected on /dev/sda2.
mdadm: No md superblock detected on /dev/sda3.
/dev/sdb:
   MBR Magic : aa55
Partition[0] :   1953521664 sectors at         2048 (type 83)
mdadm: No md superblock detected on /dev/sdb1.
/dev/sdc:
   MBR Magic : aa55
Partition[0] :   4294967295 sectors at            1 (type ee)
/dev/sdc1:
          Magic : a92b4efc
        Version : 0.90.00
           UUID : 4a529d12:38740437:6b618a7b:b902ea28
  Creation Time : Fri Feb 13 22:16:46 2009
     Raid Level : raid5
  Used Dev Size : -1364700864 (2794.52 GiB 3000.59 GB)
     Array Size : 8790799296 (8383.56 GiB 9001.78 GB)
   Raid Devices : 4
  Total Devices : 4
Preferred Minor : 0

    Update Time : Fri Dec  4 14:12:11 2015
          State : active
 Active Devices : 4
Working Devices : 4
 Failed Devices : 0
  Spare Devices : 0
       Checksum : 9ef959f7 - correct
         Events : 142747

         Layout : left-symmetric
     Chunk Size : 64K

      Number   Major   Minor   RaidDevice State
this     0       8       33        0      active sync   /dev/sdc1

   0     0       8       33        0      active sync   /dev/sdc1
   1     1       8       65        1      active sync   /dev/sde1
   2     2       8       49        2      active sync   /dev/sdd1
   3     3       8       81        3      active sync   /dev/sdf1
/dev/sdd:
   MBR Magic : aa55
Partition[0] :   4294967295 sectors at            1 (type ee)
/dev/sdd1:
          Magic : a92b4efc
        Version : 0.90.00
           UUID : 4a529d12:38740437:6b618a7b:b902ea28
  Creation Time : Fri Feb 13 22:16:46 2009
     Raid Level : raid5
  Used Dev Size : -1364700864 (2794.52 GiB 3000.59 GB)
     Array Size : 8790799296 (8383.56 GiB 9001.78 GB)
   Raid Devices : 4
  Total Devices : 4
Preferred Minor : 0

    Update Time : Fri Dec  4 14:12:11 2015
          State : active
 Active Devices : 4
Working Devices : 4
 Failed Devices : 0
  Spare Devices : 0
       Checksum : 9ef95a0b - correct
         Events : 142747

         Layout : left-symmetric
     Chunk Size : 64K

      Number   Major   Minor   RaidDevice State
this     2       8       49        2      active sync   /dev/sdd1

   0     0       8       33        0      active sync   /dev/sdc1
   1     1       8       65        1      active sync   /dev/sde1
   2     2       8       49        2      active sync   /dev/sdd1
   3     3       8       81        3      active sync   /dev/sdf1
/dev/sde:
   MBR Magic : aa55
Partition[0] :   4294967295 sectors at            1 (type ee)
/dev/sde1:
          Magic : a92b4efc
        Version : 0.90.00
           UUID : 4a529d12:38740437:6b618a7b:b902ea28
  Creation Time : Fri Feb 13 22:16:46 2009
     Raid Level : raid5
  Used Dev Size : -1364700864 (2794.52 GiB 3000.59 GB)
     Array Size : 8790799296 (8383.56 GiB 9001.78 GB)
   Raid Devices : 4
  Total Devices : 4
Preferred Minor : 0

    Update Time : Fri Dec  4 14:12:11 2015
          State : active
 Active Devices : 4
Working Devices : 4
 Failed Devices : 0
  Spare Devices : 0
       Checksum : 9ef95a19 - correct
         Events : 142747

         Layout : left-symmetric
     Chunk Size : 64K

      Number   Major   Minor   RaidDevice State
this     1       8       65        1      active sync   /dev/sde1

   0     0       8       33        0      active sync   /dev/sdc1
   1     1       8       65        1      active sync   /dev/sde1
   2     2       8       49        2      active sync   /dev/sdd1
   3     3       8       81        3      active sync   /dev/sdf1
/dev/sdf:
   MBR Magic : aa55
Partition[0] :   4294967295 sectors at            1 (type ee)
/dev/sdf1:
          Magic : a92b4efc
        Version : 0.90.00
           UUID : 4a529d12:38740437:6b618a7b:b902ea28
  Creation Time : Fri Feb 13 22:16:46 2009
     Raid Level : raid5
  Used Dev Size : -1364700864 (2794.52 GiB 3000.59 GB)
     Array Size : 8790799296 (8383.56 GiB 9001.78 GB)
   Raid Devices : 4
  Total Devices : 4
Preferred Minor : 0

    Update Time : Fri Dec  4 14:12:11 2015
          State : active
 Active Devices : 4
Working Devices : 4
 Failed Devices : 0
  Spare Devices : 0
       Checksum : 9ef95a2d - correct
         Events : 142747

         Layout : left-symmetric
     Chunk Size : 64K

      Number   Major   Minor   RaidDevice State
this     3       8       81        3      active sync   /dev/sdf1

   0     0       8       33        0      active sync   /dev/sdc1
   1     1       8       65        1      active sync   /dev/sde1
   2     2       8       49        2      active sync   /dev/sdd1
   3     3       8       81        3      active sync   /dev/sdf1

The smartctl info. Note that the raid is currently doing a resync though; I don't know if that affects some of the results...

[mattias@uglybob ~]$ smartctl -a /dev/sdc
smartctl 6.4 2015-06-04 r4109 [x86_64-linux-4.2.5-1-ARCH] (local build)
Copyright (C) 2002-15, Bruce Allen, Christian Franke, www.smartmontools.org

Smartctl open device: /dev/sdc failed: Permission denied
[mattias@uglybob ~]$ sudo smartctl -a /dev/sdc
smartctl 6.4 2015-06-04 r4109 [x86_64-linux-4.2.5-1-ARCH] (local build)
Copyright (C) 2002-15, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Family:     Western Digital Green
Device Model:     WDC WD30EZRX-00MMMB0
Serial Number:    WD-WCAWZ2994924
Firmware Version: 09570115
User Capacity:    3,000,592,982,016 bytes [3.00 TB]
Sector Size:      512 bytes logical/physical
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   ATA/ATAPI-7 (minor revision not indicated)
Local Time is:    Fri Dec  4 14:20:41 2015 CET
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x00)	Offline data collection activity
					was never started.
					Auto Offline Data Collection: Disabled.
Total time to complete Offline 
data collection: 		(    0) seconds.
Offline data collection
capabilities: 			 (0x00) 	Offline data collection not supported.
SMART capabilities:            (0x0000)	Automatic saving of SMART data
					is not implemented.
Error logging capability:        (0x00)	Error logging NOT supported.
					No General Purpose Logging support.

SMART Error Log not supported

SMART Self-test Log not supported

Selective Self-tests/Logging not supported

[mattias@uglybob ~]$ sudo smartctl -a /dev/sdd
smartctl 6.4 2015-06-04 r4109 [x86_64-linux-4.2.5-1-ARCH] (local build)
Copyright (C) 2002-15, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Family:     Western Digital Green
Device Model:     WDC WD30EZRX-00MMMB0
Serial Number:    WD-WCAWZ2972066
LU WWN Device Id: 5 0014ee 2075ec9c7
Firmware Version: 80.00A80
User Capacity:    3,000,592,982,016 bytes [3.00 TB]
Sector Sizes:     512 bytes logical, 4096 bytes physical
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   ATA8-ACS (minor revision not indicated)
SATA Version is:  SATA 3.0, 6.0 Gb/s (current: 1.5 Gb/s)
Local Time is:    Fri Dec  4 14:21:04 2015 CET
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x82)	Offline data collection activity
					was completed without error.
					Auto Offline Data Collection: Enabled.
Self-test execution status:      (   0)	The previous self-test routine completed
					without error or no self-test has ever 
					been run.
Total time to complete Offline 
data collection: 		(50280) seconds.
Offline data collection
capabilities: 			 (0x7b) SMART execute Offline immediate.
					Auto Offline data collection on/off support.
					Suspend Offline collection upon new
					command.
					Offline surface scan supported.
					Self-test supported.
					Conveyance Self-test supported.
					Selective Self-test supported.
SMART capabilities:            (0x0003)	Saves SMART data before entering
					power-saving mode.
					Supports SMART auto save timer.
Error logging capability:        (0x01)	Error logging supported.
					General Purpose Logging supported.
Short self-test routine 
recommended polling time: 	 (   2) minutes.
Extended self-test routine
recommended polling time: 	 ( 483) minutes.
Conveyance self-test routine
recommended polling time: 	 (   5) minutes.
SCT capabilities: 	       (0x3035)	SCT Status supported.
					SCT Feature Control supported.
					SCT Data Table supported.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f   200   200   051    Pre-fail  Always       -       0
  3 Spin_Up_Time            0x0027   178   148   021    Pre-fail  Always       -       8083
  4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always       -       49
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x002e   200   200   000    Old_age   Always       -       0
  9 Power_On_Hours          0x0032   066   066   000    Old_age   Always       -       24974
 10 Spin_Retry_Count        0x0032   100   253   000    Old_age   Always       -       0
 11 Calibration_Retry_Count 0x0032   100   253   000    Old_age   Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       49
192 Power-Off_Retract_Count 0x0032   200   200   000    Old_age   Always       -       39
193 Load_Cycle_Count        0x0032   117   117   000    Old_age   Always       -       251760
194 Temperature_Celsius     0x0022   116   107   000    Old_age   Always       -       36
196 Reallocated_Event_Count 0x0032   200   200   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0030   200   200   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x0008   200   200   000    Old_age   Offline      -       0

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
No self-tests have been logged.  [To run self-tests, use: smartctl -t]

SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.

[mattias@uglybob ~]$ sudo smartctl -a /dev/sde
smartctl 6.4 2015-06-04 r4109 [x86_64-linux-4.2.5-1-ARCH] (local build)
Copyright (C) 2002-15, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Family:     Western Digital Green
Device Model:     WDC WD30EZRX-00DC0B0
Serial Number:    WD-WMC1T0731635
LU WWN Device Id: 5 0014ee 0ae1b2ce5
Firmware Version: 80.00A80
User Capacity:    3,000,592,982,016 bytes [3.00 TB]
Sector Sizes:     512 bytes logical, 4096 bytes physical
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   ACS-2 (minor revision not indicated)
SATA Version is:  SATA 3.0, 6.0 Gb/s (current: 1.5 Gb/s)
Local Time is:    Fri Dec  4 14:21:07 2015 CET
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x82)	Offline data collection activity
					was completed without error.
					Auto Offline Data Collection: Enabled.
Self-test execution status:      (   0)	The previous self-test routine completed
					without error or no self-test has ever 
					been run.
Total time to complete Offline 
data collection: 		(41520) seconds.
Offline data collection
capabilities: 			 (0x7b) SMART execute Offline immediate.
					Auto Offline data collection on/off support.
					Suspend Offline collection upon new
					command.
					Offline surface scan supported.
					Self-test supported.
					Conveyance Self-test supported.
					Selective Self-test supported.
SMART capabilities:            (0x0003)	Saves SMART data before entering
					power-saving mode.
					Supports SMART auto save timer.
Error logging capability:        (0x01)	Error logging supported.
					General Purpose Logging supported.
Short self-test routine 
recommended polling time: 	 (   2) minutes.
Extended self-test routine
recommended polling time: 	 ( 417) minutes.
Conveyance self-test routine
recommended polling time: 	 (   5) minutes.
SCT capabilities: 	       (0x70b5)	SCT Status supported.
					SCT Feature Control supported.
					SCT Data Table supported.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f   200   200   051    Pre-fail  Always       -       0
  3 Spin_Up_Time            0x0027   189   179   021    Pre-fail  Always       -       5516
  4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always       -       52
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x002e   200   200   000    Old_age   Always       -       0
  9 Power_On_Hours          0x0032   066   066   000    Old_age   Always       -       25028
 10 Spin_Retry_Count        0x0032   100   253   000    Old_age   Always       -       0
 11 Calibration_Retry_Count 0x0032   100   253   000    Old_age   Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       52
192 Power-Off_Retract_Count 0x0032   200   200   000    Old_age   Always       -       40
193 Load_Cycle_Count        0x0032   115   115   000    Old_age   Always       -       256224
194 Temperature_Celsius     0x0022   116   108   000    Old_age   Always       -       34
196 Reallocated_Event_Count 0x0032   200   200   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0030   200   200   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x0008   200   200   000    Old_age   Offline      -       0

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
No self-tests have been logged.  [To run self-tests, use: smartctl -t]

SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.

[mattias@uglybob ~]$ sudo smartctl -a /dev/sdf
smartctl 6.4 2015-06-04 r4109 [x86_64-linux-4.2.5-1-ARCH] (local build)
Copyright (C) 2002-15, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Family:     Western Digital Green
Device Model:     WDC WD30EZRX-00MMMB0
Serial Number:    WD-WCAWZ2989894
LU WWN Device Id: 5 0014ee 2b209ae22
Firmware Version: 80.00A80
User Capacity:    3,000,592,982,016 bytes [3.00 TB]
Sector Sizes:     512 bytes logical, 4096 bytes physical
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   ATA8-ACS (minor revision not indicated)
SATA Version is:  SATA 3.0, 6.0 Gb/s (current: 1.5 Gb/s)
Local Time is:    Fri Dec  4 14:21:09 2015 CET
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x82)	Offline data collection activity
					was completed without error.
					Auto Offline Data Collection: Enabled.
Self-test execution status:      (   0)	The previous self-test routine completed
					without error or no self-test has ever 
					been run.
Total time to complete Offline 
data collection: 		(49380) seconds.
Offline data collection
capabilities: 			 (0x7b) SMART execute Offline immediate.
					Auto Offline data collection on/off support.
					Suspend Offline collection upon new
					command.
					Offline surface scan supported.
					Self-test supported.
					Conveyance Self-test supported.
					Selective Self-test supported.
SMART capabilities:            (0x0003)	Saves SMART data before entering
					power-saving mode.
					Supports SMART auto save timer.
Error logging capability:        (0x01)	Error logging supported.
					General Purpose Logging supported.
Short self-test routine 
recommended polling time: 	 (   2) minutes.
Extended self-test routine
recommended polling time: 	 ( 475) minutes.
Conveyance self-test routine
recommended polling time: 	 (   5) minutes.
SCT capabilities: 	       (0x3035)	SCT Status supported.
					SCT Feature Control supported.
					SCT Data Table supported.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f   200   200   051    Pre-fail  Always       -       0
  3 Spin_Up_Time            0x0027   174   147   021    Pre-fail  Always       -       8275
  4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always       -       51
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x002e   200   200   000    Old_age   Always       -       0
  9 Power_On_Hours          0x0032   066   066   000    Old_age   Always       -       25016
 10 Spin_Retry_Count        0x0032   100   253   000    Old_age   Always       -       0
 11 Calibration_Retry_Count 0x0032   100   253   000    Old_age   Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       51
192 Power-Off_Retract_Count 0x0032   200   200   000    Old_age   Always       -       39
193 Load_Cycle_Count        0x0032   116   116   000    Old_age   Always       -       252178
194 Temperature_Celsius     0x0022   119   110   000    Old_age   Always       -       33
196 Reallocated_Event_Count 0x0032   200   200   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0030   200   200   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x0008   200   200   000    Old_age   Offline      -       0

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
No self-tests have been logged.  [To run self-tests, use: smartctl -t]

SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.

Last edited by UglyBob (2015-12-04 13:25:29)

#9 2015-12-04 13:33:49

frostschutz
Member
Registered: 2013-11-15
Posts: 1,418

Re: MD Raid problems

0.90 metadata officially only supports 2TB devices; you're using it with 3TB devices, and the mdadm --examine output looks strange (negative used dev size). And your partitions aren't aligned on all drives.

All in all this will give you a lot of headaches, and it may already be the cause of the problems you're describing.

I would re-do this setup with MiB-aligned partitions and 1.2 metadata.
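
Per disk, that would look roughly like this once the data is safe elsewhere (a destructive sketch; replace sdX with the real device):

sudo parted --script /dev/sdX mklabel gpt mkpart primary 1MiB 100%    # starting at 1MiB keeps the partition MiB-aligned
sudo parted --script /dev/sdX set 1 raid on
sudo mdadm --create /dev/md0 --level=5 --raid-devices=4 --metadata=1.2 /dev/sd[cdef]1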

#10 2015-12-04 14:00:11

UglyBob
Member
Registered: 2015-11-29
Posts: 54

Re: MD Raid problems

Ouch, I see that now, yes... So I will have to re-partition every drive and rebuild the raid for each one then, I guess? And should I do that before the metadata update or after?

#11 2015-12-04 14:46:50

alphaniner
Member
From: Ancapistan
Registered: 2010-07-12
Posts: 2,810

Re: MD Raid problems

WD are really annoying and have released different subversions of the same drives without writing anything about it on the labels.

My experience with WD drives is very different, but it's based on the 'Enterprise' line of drives. That said, my company once looked into using WD Greens with a low-end Linux-based SAN and it was a complete and utter disaster. I wasn't involved in the testing, unfortunately, so I don't know the details.

Anyway, back to your situation. Comparing the 'INFORMATION SECTION' of the smartctl -a output, there's no denying these drives are quite different. Of particular note are the ATA and SATA versions. sdc doesn't even report a SATA version, and it has a weird firmware version that doesn't match the current WD format (##.##XX##)*. It makes me wonder if it's an OEM drive; did you happen to buy it from eBay or somewhere like that? It also uses the legacy sector size (512B) whereas the others are AF (4K). And then there are the remaining three, which report a SATA version but are only connecting at 1.5Gb/s.

Honestly, if I were you I'd look into selling the drives and replacing them with a different model line...

* I googled this f/w and found it was used on the WD20EARS, a 2TB Caviar Green drive from the previous generation (SATA 2). hmm

Last edited by alphaniner (2015-12-04 14:54:52)


But whether the Constitution really be one thing, or another, this much is certain - that it has either authorized such a government as we have had, or has been powerless to prevent it. In either case, it is unfit to exist.
-Lysander Spooner

#12 2015-12-04 14:54:24

UglyBob
Member
Registered: 2015-11-29
Posts: 54

Re: MD Raid problems

Yeah, I don't like it either. But I have used this raid for several years at least without losing any data, which has been good enough for me. And no, the disks were bought at the same time from one of Sweden's biggest computer stores, so I think I was just unlucky and got three from the new batch and one from the old one (or vice versa), because I don't think drives lie on the shelf for that long there. I have talked to WD support and they say it's not really a problem: the drives report different sector sizes, but I can just partition and format them as I like:

"Regarding the AF drives, changed the physical sector size of the disks without changing the model number, that´s right. But there is a definite mention of “Advanced Format drive” on the label of the hard drive.

And the model number you have: WD30EZRX is definitely an Advanced Format drive. In fact, all WD Green drives are AF drives. So basically your drive has been wrongly formatted."

I know now that I will never buy WD Green disks again, though. I just thought it would be nice to save some power on disks that are usually idle. But if I buy WD next time, it will definitely be Red disks or Black ones; those should be identical, it seems.

Last edited by UglyBob (2015-12-04 14:56:49)

#13 2015-12-04 15:01:30

frostschutz
Member
Registered: 2013-11-15
Posts: 1,418

Re: MD Raid problems

I've had only good experiences with WD Green drives. Neither partitioning nor metadata versions are the drives' fault. You can run a long SMART self-test to be sure, but none of the drives are reporting reallocated/pending/uncorrectable sectors either, so I have no reason to believe the drives are involved.
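
A long self-test would be something like this, per disk (a sketch; the result shows up in smartctl -a once it finishes):

sudo smartctl -t long /dev/sdc    # starts an extended offline self-test; takes several hours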

Do you have free space elsewhere that you can copy your data to, so you can start from scratch with these disks?

Last edited by frostschutz (2015-12-04 15:02:00)

#14 2015-12-04 15:01:53

UglyBob
Member
Registered: 2015-11-29
Posts: 54

Re: MD Raid problems

I know these are really noob questions, but as I have a lot of data on this raid, I wonder if you could guide me through the steps (or point me to a good guide) for re-partitioning the drives without moving all the data. I know it's possible, but that you have to rebuild the array for every disk; I just want to minimize the risk of losing data (or of spending weeks without a working raid). It is mostly movies etc., so I won't die if I lose it, and everything important is backed up, but still...

Very grateful for your help!

#15 2015-12-04 15:06:03

UglyBob
Member
Registered: 2015-11-29
Posts: 54

Re: MD Raid problems

frostschutz wrote:

I've had only good experiences with WD Green drives. Neither partitioning nor metadata versions are the drives' fault. You can run a long SMART self-test to be sure, but none of the drives are reporting reallocated/pending/uncorrectable sectors either, so I have no reason to believe the drives are involved.

Do you have free space elsewhere that you can copy your data to, so you can start from scratch with these disks?

No, not really... part of the data, I guess, but definitely not all of it... But I should be able to fail one disk at a time and re-partition it? It will take time, I know...

#16 2015-12-04 15:14:14

TheChickenMan
Member
From: United States
Registered: 2015-07-25
Posts: 354

Re: MD Raid problems

UglyBob wrote:
frostschutz wrote:

I've had only good experiences with WD Green drives. Neither partitioning nor metadata versions are the drives' fault. You can run a long SMART self-test to be sure, but none of the drives are reporting reallocated/pending/uncorrectable sectors either, so I have no reason to believe the drives are involved.

Do you have free space elsewhere that you can copy your data to, so you can start from scratch with these disks?

No, not really... part of the data, I guess, but definitely not all of it... But I should be able to fail one disk at a time and re-partition it? It will take time, I know...

Failing the drives out one at a time still wouldn't allow you to change the metadata version to allow for the larger drive sizes, unless you're thinking of partitioning them so that they only use the first 2TB. I'm not sure if there's a way to "upgrade" the metadata version in place; you might want to do some research on that.

Here are some good places to look for information about md raid on linux:
https://raid.wiki.kernel.org/index.php/Linux_Raid
https://raid.wiki.kernel.org/index.php/RAID_setup
http://www.ducea.com/2009/03/08/mdadm-cheat-sheet/


If quantum mechanics hasn't profoundly shocked you, you haven't understood it yet.
Niels Bohr

#17 2015-12-05 13:04:27

UglyBob
Member
Registered: 2015-11-29
Posts: 54

Re: MD Raid problems

Ok, I have done some research and you're right: there seems to be no safe way to upgrade the metadata above 1.0, so I guess I will back up what I want to save and redo the raid from scratch.

One detail I'm wondering about, though: if all the drives are AF drives as WD tells me, should I format them with 512B sectors or 4K sectors?

I'm also not 100% sure about the partitioning. In some examples of raid creation, mdadm --create is applied to the disk devices (/dev/sdX), and in some examples to partitions (/dev/sdXn). What difference does it make if I want to use the whole disks for the raid?

#18 2015-12-05 13:07:52

frostschutz
Member
Registered: 2013-11-15
Posts: 1,418

Re: MD Raid problems

Failing one disk at a time would work, if your raid is in sync (or you fail the problem drive first?) and if you're willing to sacrifice redundancy during the process...
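
The per-disk cycle would look roughly like this (a sketch; it keeps the 0.90 metadata, and you should wait for each rebuild to finish before touching the next disk):

sudo mdadm /dev/md0 --fail /dev/sdd1
sudo mdadm /dev/md0 --remove /dev/sdd1
# re-partition /dev/sdd with MiB alignment here, then:
sudo mdadm /dev/md0 --add /dev/sdd1
cat /proc/mdstat    # watch the rebuild progress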


should I format them with 512B sectors or 4K sectors then?

4K sectors, but for partitions and such you should really go for MiB alignment; that's what works for all devices for the foreseeable future.
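
You can verify the alignment of an existing partition with parted, for example (a sketch):

sudo parted /dev/sdd align-check optimal 1    # prints "1 aligned" if partition 1 is optimally aligned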

What difference does it make if I want to use the whole disks for the raid?

A partition table is the most commonly understood way to declare that a disk is in use, and for what.

If you mdadm the disk directly, some programs may not understand/recognize the md metadata, consider the disk free, and offer it as the default choice for formatting something else.

#19 2015-12-05 15:04:39

UglyBob
Member
Registered: 2015-11-29
Posts: 54

Re: MD Raid problems

Ok, I understand! Thanks a lot for all the help!

#20 2015-12-19 22:12:47

UglyBob
Member
Registered: 2015-11-29
Posts: 54

Re: MD Raid problems

Hi again!

I've been away on a trip and have been backing up during that time, and now I'm ready to start. There is, however, a major question mark I can't seem to find any information on. The drive I have (WDC WD30EZRX-00MMMB0) that is different from the other three reports, as I think I said before, 512B logical AND physical sectors. According to WD this is not true (it reports 512B for backwards compatibility) and can be changed by formatting it in Windows, but I really don't want to take it out of my server and put it in a Windows machine if I don't have to. Is there any trick in Linux for these AF drives to change the reported physical sector size from 512B to 4K? I'm also not sure if it would affect performance if I just change the logical sector size to 4K when the other three devices have 4K physical sectors. I really don't want to do this wrong, as it would be too much work to rebuild the raid again...
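
The most I've found so far is how to check what the drive firmware actually reports from Linux (a sketch; hdparm is in the repos):

sudo hdparm -I /dev/sdc | grep -i 'sector size'    # shows the logical and physical sector sizes the drive reports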
