Resize of RAID5 array frozen

while · 2015-06-07 12:39:43

I recently added a new disk to my RAID5 array and started growing it. Absent-minded as I am, I rebooted the server during the reshape because another program had hung and was blocking some ports. Thinking about it now, that program may have hung because the array hung, but I cannot be sure.

After the reboot, the reshape is frozen at `28%`. I can no longer mount the array or stop it; it just seems to have frozen up.

Here is some info on the array:

    # mdadm -D /dev/md0
 
    /dev/md0:
            Version : 1.2
      Creation Time : Sat Mar 28 17:31:15 2015
         Raid Level : raid5
         Array Size : 5860063744 (5588.59 GiB 6000.71 GB)
      Used Dev Size : 2930031872 (2794.30 GiB 3000.35 GB)
       Raid Devices : 4
      Total Devices : 4
        Persistence : Superblock is persistent
    
      Intent Bitmap : Internal
    
        Update Time : Sun Jun  7 11:04:28 2015
              State : clean, reshaping 
     Active Devices : 4
    Working Devices : 4
     Failed Devices : 0
      Spare Devices : 0
    
             Layout : left-symmetric
         Chunk Size : 256K
    
     Reshape Status : 28% complete
      Delta Devices : 1, (3->4)
    
               Name : ocular:0  (local to host ocular)
               UUID : e1f7a83b:2e43c552:84d09d04:b1416cb2
             Events : 344582
    
        Number   Major   Minor   RaidDevice State
           4       8       17        0      active sync   /dev/sdb1
           1       8       49        1      active sync   /dev/sdd1
           3       8       65        2      active sync   /dev/sde1
           5       8       33        3      active sync   /dev/sdc1

and /proc/mdstat shows

    Personalities : [raid6] [raid5] [raid4]
    md0 : active raid5 sdb1[4] sdc1[5] sde1[3] sdd1[1]
          5860063744 blocks super 1.2 level 5, 256k chunk, algorithm 2 [4/4] [UUUU]
          [=====>...............]  reshape = 28.6% (840259584/2930031872) finish=524064.9min speed=66K/sec
          bitmap: 3/22 pages [12KB], 65536KB chunk
    
    unused devices: <none>
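
To put the `finish=` estimate in perspective, here is a minimal sketch that parses the progress line above and converts it to days. It assumes the two counters in parentheses are per-device 1 KiB blocks, as `/proc/mdstat` normally reports them; the variable names are illustrative only.

```python
import re

# The reshape progress line copied from the /proc/mdstat output above.
line = ("[=====>...............]  reshape = 28.6% (840259584/2930031872) "
        "finish=524064.9min speed=66K/sec")

m = re.search(r"\((\d+)/(\d+)\) finish=([\d.]+)min speed=(\d+)K/sec", line)
done_kib, total_kib = int(m.group(1)), int(m.group(2))
finish_min, speed_kib_s = float(m.group(3)), int(m.group(4))

# Work left on each member device (assumed unit: KiB).
remaining_kib = total_kib - done_kib

# The kernel's own estimate, converted from minutes to days.
finish_days = finish_min / (60 * 24)

# The same estimate recomputed from the rounded speed; it differs slightly
# because mdstat prints the speed as a whole number.
estimate_days = remaining_kib / speed_kib_s / 86400

print(f"remaining per device: {remaining_kib} KiB")
print(f"kernel estimate:      {finish_days:.0f} days")
print(f"from rounded speed:   {estimate_days:.0f} days")
```

Both figures come out at roughly a year, which is another way of saying that the reshape is effectively not moving.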

Trying to mount the array just hangs:

    # mount /dev/md0 /mnt/storage/

The same happens if I try to stop the array:

    # mdadm -S /dev/md0

I have also tried shrinking it back to 3 devices, but it is busy with the ongoing reshape:

    # mdadm --grow /dev/md0 --raid-devices=3
    mdadm: /dev/md0 is performing resync/recovery and cannot be reshaped

I tried to mark the new drive as faulty to see if the reshape would stop, but to no avail. Marking it as failed works, but nothing else happens.

I also tried to run a check instead of the reshape (I read somewhere that this fixed a similar problem), but the device is busy:

    # echo check>/sys/block/md0/md/sync_action
    -bash: echo: write error: Device or resource busy

What does this mean? I'm on really thin ice here with no idea what to do so any help is greatly appreciated.

I'm now pretty sure the reboot was not the cause of the problem. Something in the reshape itself seems to make the array hang. I get these errors in dmesg:

    [  360.625322] INFO: task md0_reshape:126 blocked for more than 120 seconds.
    [  360.625351]       Not tainted 4.0.4-2-ARCH #1
    [  360.625367] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
    [  360.625394] md0_reshape     D ffff88040af57a58     0   126      2 0x00000000
    [  360.625397]  ffff88040af57a58 ffff88040cf58000 ffff8800da535b20 00000001642a9888
    [  360.625399]  ffff88040af57fd8 ffff8800da429000 ffff8800da429008 ffff8800da429208
    [  360.625401]  0000000096400e00 ffff88040af57a78 ffffffff81576707 ffff8800da429000
    [  360.625403] Call Trace:
    [  360.625410]  [<ffffffff81576707>] schedule+0x37/0x90
    [  360.625428]  [<ffffffffa0120de9>] get_active_stripe+0x5c9/0x760 [raid456]
    [  360.625432]  [<ffffffff810b6c70>] ? wake_atomic_t_function+0x60/0x60
    [  360.625436]  [<ffffffffa01246e0>] reshape_request+0x5b0/0x980 [raid456]
    [  360.625439]  [<ffffffff81579053>] ? schedule_timeout+0x123/0x250
    [  360.625443]  [<ffffffffa011743f>] sync_request+0x28f/0x400 [raid456]
    [  360.625449]  [<ffffffffa00da486>] ? is_mddev_idle+0x136/0x170 [md_mod]
    [  360.625454]  [<ffffffffa00de4ba>] md_do_sync+0x8ba/0xe70 [md_mod]
    [  360.625457]  [<ffffffff81576002>] ? __schedule+0x362/0xa30
    [  360.625462]  [<ffffffffa00d9e54>] md_thread+0x144/0x150 [md_mod]
    [  360.625464]  [<ffffffff810b6c70>] ? wake_atomic_t_function+0x60/0x60
    [  360.625468]  [<ffffffffa00d9d10>] ? md_start_sync+0xf0/0xf0 [md_mod]
    [  360.625471]  [<ffffffff81093418>] kthread+0xd8/0xf0
    [  360.625473]  [<ffffffff81093340>] ? kthread_worker_fn+0x170/0x170
    [  360.625476]  [<ffffffff8157a398>] ret_from_fork+0x58/0x90
    [  360.625478]  [<ffffffff81093340>] ? kthread_worker_fn+0x170/0x170

Also, looking at CPU usage md0_raid5 seems to be having problems:

     PID USER      PR  NI    VIRT    RES  %CPU %MEM     TIME+ S COMMAND
     125 root      20   0    0.0m   0.0m 100.0  0.0  35:57.44 R  `- md0_raid5
     126 root      20   0    0.0m   0.0m   0.0  0.0   0:00.06 D  `- md0_reshape

Could this be why the reshape has stopped?

Is it possible to revert to using 3 drives again without losing data?

frostschutz · 2015-06-07 12:53:36

Which kernel are you using? `uname -a`

Can you show `mdadm --examine /dev/sd*` if that does not hang?

> Is it possible to revert to using 3 drives again without losing data?

Yes, but it's a pain.

Since you already seem to be in a deadlock, you might not be able to avoid a reboot. Maybe reboot with a different kernel just in case, or with something like SystemRescueCD; it may be able to resume the operation normally.

That said, it may be better not to resume if you failed one of your drives. In that case you're looking at a grow operation without redundancy, and if it proceeds that way you lose the ability to revert. You can then only hope that it runs through, that none of the remaining disks are bad, and that you're able to restore redundancy afterwards.

while · 2015-06-07 13:20:29

Sure. Here is the info:

    Linux ocular 4.0.4-2-ARCH #1 SMP PREEMPT Fri May 22 03:05:23 UTC 2015 x86_64 GNU/Linux

And here is the mdadm examine:

/dev/sdb1:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x4d
     Array UUID : e1f7a83b:2e43c552:84d09d04:b1416cb2
           Name : ocular:0  (local to host ocular)
  Creation Time : Sat Mar 28 17:31:15 2015
     Raid Level : raid5
   Raid Devices : 4

 Avail Dev Size : 5860268943 (2794.39 GiB 3000.46 GB)
     Array Size : 8790095616 (8382.89 GiB 9001.06 GB)
  Used Dev Size : 5860063744 (2794.30 GiB 3000.35 GB)
    Data Offset : 262144 sectors
     New Offset : 260608 sectors
   Super Offset : 8 sectors
          State : active
    Device UUID : b0b8a3c9:9a11499b:65e16567:6d40095e

Internal Bitmap : 8 sectors from superblock
  Reshape pos'n : 2520575232 (2403.81 GiB 2581.07 GB)
  Delta Devices : 1 (3->4)

    Update Time : Sun Jun  7 14:07:18 2015
  Bad Block Log : 512 entries available at offset 72 sectors - bad blocks present.
       Checksum : 47269a16 - correct
         Events : 344588

         Layout : left-symmetric
     Chunk Size : 256K

   Device Role : Active device 0
   Array State : AAAA ('A' == active, '.' == missing, 'R' == replacing)
/dev/sdc1:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x45
     Array UUID : e1f7a83b:2e43c552:84d09d04:b1416cb2
           Name : ocular:0  (local to host ocular)
  Creation Time : Sat Mar 28 17:31:15 2015
     Raid Level : raid5
   Raid Devices : 4

 Avail Dev Size : 5860268943 (2794.39 GiB 3000.46 GB)
     Array Size : 8790095616 (8382.89 GiB 9001.06 GB)
  Used Dev Size : 5860063744 (2794.30 GiB 3000.35 GB)
    Data Offset : 262144 sectors
     New Offset : 260608 sectors
   Super Offset : 8 sectors
          State : active
    Device UUID : b42d1b98:db70d93e:a60cd400:03de5f0e

Internal Bitmap : 8 sectors from superblock
  Reshape pos'n : 2520575232 (2403.81 GiB 2581.07 GB)
  Delta Devices : 1 (3->4)

    Update Time : Sun Jun  7 14:07:18 2015
  Bad Block Log : 512 entries available at offset 72 sectors
       Checksum : 2f3372a - correct
         Events : 344588

         Layout : left-symmetric
     Chunk Size : 256K

   Device Role : Active device 3
   Array State : AAAA ('A' == active, '.' == missing, 'R' == replacing)
/dev/sdd1:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x45
     Array UUID : e1f7a83b:2e43c552:84d09d04:b1416cb2
           Name : ocular:0  (local to host ocular)
  Creation Time : Sat Mar 28 17:31:15 2015
     Raid Level : raid5
   Raid Devices : 4

 Avail Dev Size : 5860064143 (2794.30 GiB 3000.35 GB)
     Array Size : 8790095616 (8382.89 GiB 9001.06 GB)
  Used Dev Size : 5860063744 (2794.30 GiB 3000.35 GB)
    Data Offset : 262144 sectors
     New Offset : 260608 sectors
   Super Offset : 8 sectors
          State : active
    Device UUID : 081357be:722dfa8f:81dbfbea:981e0740

Internal Bitmap : 8 sectors from superblock
  Reshape pos'n : 2520575232 (2403.81 GiB 2581.07 GB)
  Delta Devices : 1 (3->4)

    Update Time : Sun Jun  7 14:07:18 2015
  Bad Block Log : 512 entries available at offset 72 sectors
       Checksum : 961bc882 - correct
         Events : 344588

         Layout : left-symmetric
     Chunk Size : 256K

   Device Role : Active device 1
   Array State : AAAA ('A' == active, '.' == missing, 'R' == replacing)
/dev/sde1:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x4d
     Array UUID : e1f7a83b:2e43c552:84d09d04:b1416cb2
           Name : ocular:0  (local to host ocular)
  Creation Time : Sat Mar 28 17:31:15 2015
     Raid Level : raid5
   Raid Devices : 4

 Avail Dev Size : 5860064143 (2794.30 GiB 3000.35 GB)
     Array Size : 8790095616 (8382.89 GiB 9001.06 GB)
  Used Dev Size : 5860063744 (2794.30 GiB 3000.35 GB)
    Data Offset : 262144 sectors
     New Offset : 260608 sectors
   Super Offset : 8 sectors
          State : active
    Device UUID : be585918:766c2ef1:b67c083b:de9cac20

Internal Bitmap : 8 sectors from superblock
  Reshape pos'n : 2520575232 (2403.81 GiB 2581.07 GB)
  Delta Devices : 1 (3->4)

    Update Time : Sun Jun  7 14:07:18 2015
  Bad Block Log : 512 entries available at offset 72 sectors - bad blocks present.
       Checksum : 82046cc0 - correct
         Events : 344588

         Layout : left-symmetric
     Chunk Size : 256K

   Device Role : Active device 2
   Array State : AAAA ('A' == active, '.' == missing, 'R' == replacing)
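
The sizes reported by `mdadm -D` and `mdadm --examine` can be cross-checked with a little arithmetic. This is a sketch under the assumption that `Used Dev Size` (from `mdadm -D`), `Array Size` and `Reshape pos'n` are all in KiB, which matches the GiB figures printed next to them; the helper name is made up for illustration.

```python
def raid5_capacity_kib(devices: int, per_device_kib: int) -> int:
    """Usable RAID5 capacity: one device's worth of space holds parity."""
    return (devices - 1) * per_device_kib

used_dev_size_kib = 2930031872   # "Used Dev Size" from mdadm -D
reshape_pos_kib = 2520575232     # "Reshape pos'n" from mdadm --examine

before = raid5_capacity_kib(3, used_dev_size_kib)  # array size with 3 devices
after = raid5_capacity_kib(4, used_dev_size_kib)   # array size with 4 devices

# Reshape position as a fraction of the new (4-device) array size.
progress = reshape_pos_kib / after

print(f"3 devices: {before} KiB")
print(f"4 devices: {after} KiB")
print(f"reshape position: {progress:.1%} of the new array size")
```

The 3-device figure equals the `Array Size` shown by `mdadm -D` (5860063744), and the 4-device figure equals the `Array Size` in the superblocks (8790095616). The resulting fraction is close to the 28% that both tools report, so the numbers are at least consistent with each other.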

I already tried rebooting again, and the reshape resumes from the same position, with the disk I had marked as failed present again.

Last edited by while (2015-06-07 14:02:26)

frostschutz · 2015-06-07 13:36:27

It says "bad blocks present" for /dev/sdb1 and /dev/sde1. Can you check the SMART data of all disks to see if there's any issue? Also run `mdadm --examine-badblocks /dev/???`

Otherwise it does not look like you failed a disk; so, barring any bad block issues, I'd try to reboot and hope it will resume the grow operation normally.

If you actually do have bad blocks it may be the source of your issue. The bad block list is a rather new feature...
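
The `Feature Map` values in the `--examine` output (`0x4d` on sdb1 and sde1, `0x45` on sdc1 and sdd1) tell the same story. The sketch below decodes them with a partial table of version-1.x superblock feature bits. The bit meanings are written from memory of the kernel's `md_p.h` and should be treated as an assumption to verify against the kernel source; newer kernels define further bits that are not listed here.

```python
# Partial table of md v1.x superblock feature bits (assumed from md_p.h).
FEATURE_BITS = {
    0x01: "bitmap offset",
    0x02: "recovery offset",
    0x04: "reshape active",
    0x08: "bad blocks",
    0x10: "replacement",
    0x20: "reshape backwards",
    0x40: "new offset",
}

def decode_feature_map(value: int) -> list[str]:
    """Return the names of all known feature bits set in value."""
    return [name for bit, name in FEATURE_BITS.items() if value & bit]

with_log = decode_feature_map(0x4D)      # /dev/sdb1 and /dev/sde1
without_log = decode_feature_map(0x45)   # /dev/sdc1 and /dev/sdd1

print("0x4d:", with_log)
print("0x45:", without_log)
print("difference:", set(with_log) - set(without_log))
```

Under that assumption, the only difference between the two pairs of members is the bad-blocks bit, which matches the "bad blocks present" note on sdb1 and sde1.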

Last edited by frostschutz (2015-06-07 13:38:26)

while · 2015-06-07 14:18:26

Seems to be bad blocks, as you said. Here is the output:

Bad-blocks on /dev/sdb1:
          1680781608 for 408 sectors
          1683782656 for 8 sectors
          1688501712 for 512 sectors
          1688502224 for 240 sectors
          4315347104 for 128 sectors
          5729057784 for 128 sectors
          5729063200 for 32 sectors
          5729065008 for 8 sectors
          5729066808 for 128 sectors
          5729074040 for 8 sectors
          5729075840 for 8 sectors
          5729077648 for 128 sectors
          5729081376 for 8 sectors
          5729083176 for 312 sectors
          5729084984 for 32 sectors
          5729086784 for 512 sectors
          5729087296 for 64 sectors
          5729088592 for 512 sectors
          5729089104 for 80 sectors
          5729090400 for 88 sectors
          5729092200 for 16 sectors
          5729094008 for 8 sectors
          5729095808 for 512 sectors
          5729096320 for 512 sectors
          5729096832 for 72 sectors
          5729097616 for 384 sectors
          5729099424 for 8 sectors
          5729101224 for 128 sectors
          5729103032 for 16 sectors
          5729104840 for 512 sectors
          5729105352 for 512 sectors
          5729105864 for 184 sectors
          5729106648 for 8 sectors
          5729110376 for 512 sectors
          5729110888 for 504 sectors
          5729112184 for 16 sectors
          5729113992 for 112 sectors
          5729115792 for 128 sectors
          5729117600 for 128 sectors
          5729119408 for 512 sectors
          5729119920 for 112 sectors
          5729121208 for 128 sectors
          5730507008 for 8 sectors
          5730508816 for 128 sectors
          5730510616 for 40 sectors
          5730512424 for 8 sectors
          5730514224 for 128 sectors
          5730517968 for 112 sectors
          5730518792 for 512 sectors
          5730519304 for 512 sectors
          5730519816 for 320 sectors
          5730520600 for 8 sectors
          5730521584 for 176 sectors
          5730536016 for 8 sectors
          5730537824 for 128 sectors
          5730539624 for 8 sectors
          5730543240 for 8 sectors
          5730550584 for 8 sectors
          5730552384 for 128 sectors
          5730553208 for 8 sectors
          5730554192 for 16 sectors
          5730555016 for 8 sectors
          5730556816 for 128 sectors
          5730557800 for 8 sectors
          5730558624 for 72 sectors
          5730559608 for 440 sectors
          5730561408 for 144 sectors
          5730562360 for 8 sectors
          5730563216 for 512 sectors
          5730563728 for 8 sectors
          5730598592 for 16 sectors
          5730600392 for 512 sectors
          5730600904 for 512 sectors
          5730601416 for 88 sectors
          5730604008 for 8 sectors
          5730605800 for 336 sectors
          5730607608 for 128 sectors
          5730608592 for 512 sectors
          5730609104 for 32 sectors
          5730609408 for 128 sectors
          5730610400 for 32 sectors
          5730611216 for 128 sectors
          5730613024 for 24 sectors
          5730614008 for 8 sectors
          5730614824 for 128 sectors
          5730615816 for 32 sectors
          5730616632 for 512 sectors
          5730617144 for 248 sectors
          5730617616 for 8 sectors
          5730618440 for 72 sectors
          5730620248 for 512 sectors
          5730620760 for 128 sectors
          5730621224 for 128 sectors
          5730623976 for 128 sectors
          5730625784 for 512 sectors
          5730626296 for 160 sectors
          5730627584 for 512 sectors
          5730628096 for 512 sectors
          5730628608 for 48 sectors
          5730629392 for 24 sectors
          5730631200 for 144 sectors
          5730633008 for 128 sectors
          5730634808 for 256 sectors
          5730636616 for 384 sectors
          5730638424 for 120 sectors
          5730640224 for 128 sectors
          5730642032 for 184 sectors
          5730643840 for 512 sectors
          5730644352 for 48 sectors
          5730649280 for 8 sectors
          5730652992 for 8 sectors
          5730653016 for 104 sectors
          5730685624 for 8 sectors

Bad-blocks on /dev/sde1:
          1680781608 for 408 sectors
          1683782656 for 8 sectors
          1688501712 for 512 sectors
          1688502224 for 240 sectors
          4315347104 for 128 sectors
          5729057784 for 128 sectors
          5729063200 for 32 sectors
          5729065008 for 8 sectors
          5729066808 for 128 sectors
          5729074040 for 8 sectors
          5729075840 for 8 sectors
          5729077648 for 128 sectors
          5729081376 for 8 sectors
          5729083176 for 312 sectors
          5729084984 for 32 sectors
          5729086784 for 512 sectors
          5729087296 for 64 sectors
          5729088592 for 512 sectors
          5729089104 for 80 sectors
          5729090400 for 88 sectors
          5729092200 for 16 sectors
          5729094008 for 8 sectors
          5729095808 for 512 sectors
          5729096320 for 512 sectors
          5729096832 for 72 sectors
          5729097616 for 384 sectors
          5729099424 for 8 sectors
          5729101224 for 128 sectors
          5729103032 for 16 sectors
          5729104840 for 512 sectors
          5729105352 for 512 sectors
          5729105864 for 184 sectors
          5729106648 for 8 sectors
          5729110376 for 512 sectors
          5729110888 for 504 sectors
          5729112184 for 16 sectors
          5729113992 for 112 sectors
          5729115792 for 128 sectors
          5729117600 for 128 sectors
          5729119408 for 512 sectors
          5729119920 for 112 sectors
          5729121208 for 128 sectors
          5730507008 for 8 sectors
          5730508816 for 128 sectors
          5730510616 for 40 sectors
          5730512424 for 8 sectors
          5730514224 for 128 sectors
          5730517968 for 112 sectors
          5730518792 for 512 sectors
          5730519304 for 512 sectors
          5730519816 for 320 sectors
          5730520600 for 8 sectors
          5730521584 for 176 sectors
          5730536016 for 8 sectors
          5730537824 for 128 sectors
          5730539624 for 8 sectors
          5730543240 for 8 sectors
          5730550584 for 8 sectors
          5730552384 for 128 sectors
          5730553208 for 8 sectors
          5730554192 for 16 sectors
          5730555016 for 8 sectors
          5730556816 for 128 sectors
          5730557800 for 8 sectors
          5730558624 for 72 sectors
          5730559608 for 440 sectors
          5730561408 for 144 sectors
          5730562360 for 8 sectors
          5730563216 for 512 sectors
          5730563728 for 8 sectors
          5730598592 for 16 sectors
          5730600392 for 512 sectors
          5730600904 for 512 sectors
          5730601416 for 88 sectors
          5730604008 for 8 sectors
          5730605800 for 336 sectors
          5730607608 for 128 sectors
          5730608592 for 512 sectors
          5730609104 for 32 sectors
          5730609408 for 128 sectors
          5730610400 for 32 sectors
          5730611216 for 128 sectors
          5730613024 for 24 sectors
          5730614008 for 8 sectors
          5730614824 for 128 sectors
          5730615816 for 32 sectors
          5730616632 for 512 sectors
          5730617144 for 248 sectors
          5730617616 for 8 sectors
          5730618440 for 72 sectors
          5730620248 for 512 sectors
          5730620760 for 128 sectors
          5730621224 for 128 sectors
          5730623976 for 128 sectors
          5730625784 for 512 sectors
          5730626296 for 160 sectors
          5730627584 for 512 sectors
          5730628096 for 512 sectors
          5730628608 for 48 sectors
          5730629392 for 24 sectors
          5730631200 for 144 sectors
          5730633008 for 128 sectors
          5730634808 for 256 sectors
          5730636616 for 384 sectors
          5730638424 for 120 sectors
          5730640224 for 128 sectors
          5730642032 for 184 sectors
          5730643840 for 512 sectors
          5730644352 for 48 sectors
          5730649280 for 8 sectors
          5730652992 for 8 sectors
          5730653016 for 104 sectors
          5730685624 for 8 sectors

The other two seem to have no bad blocks.

As mentioned, I already tried rebooting, and the reshape is still stuck at the same position. The disk I had marked as failed was active again after the reboot, which is why it shows up in the array.

What does this imply? Are the disks in bad shape? One of them is new and the other is older (only a few months, though), and they are of different makes, so it seems strange to me that both would be broken.

Last edited by while (2015-06-07 14:19:06)

frostschutz · 2015-06-07 14:23:13

The bad block lists look identical for both disks, so I think the list got cloned for some reason or other.
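
Comparing two long lists by eye is error-prone, so here is a minimal sketch of how the `--examine-badblocks` output of two members could be compared. Only the first three entries of each list are pasted in to keep the example short; in practice you would feed in the full output. The function name is illustrative.

```python
import re

def parse_badblocks(text: str) -> list[tuple[int, int]]:
    """Extract (first_sector, length_in_sectors) pairs from the output."""
    pairs = re.findall(r"^[ \t]*(\d+) for (\d+) sectors", text, re.MULTILINE)
    return [(int(start), int(length)) for start, length in pairs]

# Excerpts copied from the output above (first three entries only).
sdb1_output = """Bad-blocks on /dev/sdb1:
          1680781608 for 408 sectors
          1683782656 for 8 sectors
          1688501712 for 512 sectors
"""
sde1_output = """Bad-blocks on /dev/sde1:
          1680781608 for 408 sectors
          1683782656 for 8 sectors
          1688501712 for 512 sectors
"""

sdb1 = parse_badblocks(sdb1_output)
sde1 = parse_badblocks(sde1_output)

print("identical lists:", sdb1 == sde1)
print("entries only on sdb1:", sorted(set(sdb1) - set(sde1)))
print("sectors marked bad in this excerpt:", sum(n for _, n in sdb1))
```

Two independent drives developing exactly the same defects at exactly the same offsets would be very unlikely, which is why identical lists point at the log itself rather than at the hardware.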

I don't have much experience with the bad block log, but I think it's likely that you hit a bug of some kind. Taking these bad blocks at face value, your RAID would be dead [for those sectors]: with two disks having bad blocks in the same positions and only one disk of redundancy, that data could not be reconstructed.

Did you check SMART? Run `smartctl -a` for all disks, and if you haven't run a self-test recently, do a `smartctl -t long` and see if the disks pass.
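
One way to test the suspicion that the bad block log is involved is to compare the position where the reshape stopped with the first entry in the list. The sketch below does that arithmetic with numbers copied from this thread. It rests on several assumptions that are not confirmed here: that the mdstat counter is a per-device position in KiB, that bad-block addresses are 512-byte sectors counted from the start of the member device (so they include the data offset), and that either the old or the new data offset may apply at the reshape position.

```python
SECTORS_PER_KIB = 2                      # 512-byte sectors

reshape_done_kib = 840259584             # per-device progress from /proc/mdstat
data_offset = 262144                     # "Data Offset" in sectors
new_offset = 260608                      # "New Offset" in sectors
first_bad_sector = 1680781608            # first entry of the bad block list
chunk_sectors = 256 * SECTORS_PER_KIB    # 256K chunk size

# Position within the data area of each member, in sectors.
position = reshape_done_kib * SECTORS_PER_KIB

# Distance from the reshape position to the first bad range, for both offsets.
gap_old = first_bad_sector - (position + data_offset)
gap_new = first_bad_sector - (position + new_offset)

print(f"gap with old offset: {gap_old} sectors ({gap_old / chunk_sectors:.2f} chunks)")
print(f"gap with new offset: {gap_new} sectors ({gap_new / chunk_sectors:.2f} chunks)")
```

Under these assumptions the first listed bad range begins less than four chunks past the point where the reshape stopped, whichever offset is used. That is consistent with the reshape stalling when it reaches the first logged bad block, though it does not prove it.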

while · 2015-06-07 14:37:27

No, I didn't even have it installed. I'm running the tests now on both disks. I'll post the results as soon as they're done.

while · 2015-06-07 20:17:24

The extended tests take a long time to run. Here are the results:

sdb

smartctl 6.4 2015-06-04 r4109 [x86_64-linux-4.0.4-2-ARCH] (local build)
Copyright (C) 2002-15, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Family:     Seagate Barracuda 7200.14 (AF)
Device Model:     ST3000DM001-1ER166
Serial Number:    Z500TQ59
LU WWN Device Id: 5 000c50 07a00147c
Firmware Version: CC25
User Capacity:    3,000,592,982,016 bytes [3.00 TB]
Sector Sizes:     512 bytes logical, 4096 bytes physical
Rotation Rate:    7200 rpm
Form Factor:      3.5 inches
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   ACS-2, ACS-3 T13/2161-D revision 3b
SATA Version is:  SATA 3.1, 6.0 Gb/s (current: 3.0 Gb/s)
Local Time is:    Sun Jun  7 22:14:26 2015 CEST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED
See vendor-specific Attribute list for marginal Attributes.

General SMART Values:
Offline data collection status:  (0x00)	Offline data collection activity
					was never started.
					Auto Offline Data Collection: Disabled.
Self-test execution status:      (  25)	The self-test routine was aborted by
					the host.
Total time to complete Offline 
data collection: 		(   89) seconds.
Offline data collection
capabilities: 			 (0x73) SMART execute Offline immediate.
					Auto Offline data collection on/off support.
					Suspend Offline collection upon new
					command.
					No Offline surface scan supported.
					Self-test supported.
					Conveyance Self-test supported.
					Selective Self-test supported.
SMART capabilities:            (0x0003)	Saves SMART data before entering
					power-saving mode.
					Supports SMART auto save timer.
Error logging capability:        (0x01)	Error logging supported.
					General Purpose Logging supported.
Short self-test routine 
recommended polling time: 	 (   1) minutes.
Extended self-test routine
recommended polling time: 	 ( 342) minutes.
Conveyance self-test routine
recommended polling time: 	 (   2) minutes.
SCT capabilities: 	       (0x1085)	SCT Status supported.

SMART Attributes Data Structure revision number: 10
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000f   119   099   006    Pre-fail  Always       -       216517664
  3 Spin_Up_Time            0x0003   094   094   000    Pre-fail  Always       -       0
  4 Start_Stop_Count        0x0032   100   100   020    Old_age   Always       -       15
  5 Reallocated_Sector_Ct   0x0033   100   100   010    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x000f   100   253   030    Pre-fail  Always       -       538608
  9 Power_On_Hours          0x0032   100   100   000    Old_age   Always       -       37
 10 Spin_Retry_Count        0x0013   100   100   097    Pre-fail  Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   020    Old_age   Always       -       15
183 Runtime_Bad_Block       0x0032   100   100   000    Old_age   Always       -       0
184 End-to-End_Error        0x0032   100   100   099    Old_age   Always       -       0
187 Reported_Uncorrect      0x0032   100   100   000    Old_age   Always       -       0
188 Command_Timeout         0x0032   100   100   000    Old_age   Always       -       0 0 0
189 High_Fly_Writes         0x003a   100   100   000    Old_age   Always       -       0
190 Airflow_Temperature_Cel 0x0022   052   044   045    Old_age   Always   In_the_past 48 (Min/Max 43/50 #240)
191 G-Sense_Error_Rate      0x0032   100   100   000    Old_age   Always       -       0
192 Power-Off_Retract_Count 0x0032   100   100   000    Old_age   Always       -       1
193 Load_Cycle_Count        0x0032   100   100   000    Old_age   Always       -       68
194 Temperature_Celsius     0x0022   048   056   000    Old_age   Always       -       48 (0 27 0 0 0)
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0010   100   100   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x003e   200   200   000    Old_age   Always       -       0
240 Head_Flying_Hours       0x0000   100   253   000    Old_age   Offline      -       31h+24m+45.066s
241 Total_LBAs_Written      0x0000   100   253   000    Old_age   Offline      -       7774670316
242 Total_LBAs_Read         0x0000   100   253   000    Old_age   Offline      -       2541026610

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Extended offline    Completed without error       00%        37         -

SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.

sdc

smartctl 6.4 2015-06-04 r4109 [x86_64-linux-4.0.4-2-ARCH] (local build)
Copyright (C) 2002-15, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Family:     Seagate Barracuda 7200.14 (AF)
Device Model:     ST3000DM001-1ER166
Serial Number:    Z500TPSE
LU WWN Device Id: 5 000c50 079ffde12
Firmware Version: CC25
User Capacity:    3,000,592,982,016 bytes [3.00 TB]
Sector Sizes:     512 bytes logical, 4096 bytes physical
Rotation Rate:    7200 rpm
Form Factor:      3.5 inches
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   ACS-2, ACS-3 T13/2161-D revision 3b
SATA Version is:  SATA 3.1, 6.0 Gb/s (current: 3.0 Gb/s)
Local Time is:    Mon Jun  8 07:47:08 2015 CEST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x00)	Offline data collection activity
					was never started.
					Auto Offline Data Collection: Disabled.
Self-test execution status:      (   0)	The previous self-test routine completed
					without error or no self-test has ever 
					been run.
Total time to complete Offline 
data collection: 		(   89) seconds.
Offline data collection
capabilities: 			 (0x73) SMART execute Offline immediate.
					Auto Offline data collection on/off support.
					Suspend Offline collection upon new
					command.
					No Offline surface scan supported.
					Self-test supported.
					Conveyance Self-test supported.
					Selective Self-test supported.
SMART capabilities:            (0x0003)	Saves SMART data before entering
					power-saving mode.
					Supports SMART auto save timer.
Error logging capability:        (0x01)	Error logging supported.
					General Purpose Logging supported.
Short self-test routine 
recommended polling time: 	 (   1) minutes.
Extended self-test routine
recommended polling time: 	 ( 333) minutes.
Conveyance self-test routine
recommended polling time: 	 (   2) minutes.
SCT capabilities: 	       (0x1085)	SCT Status supported.

SMART Attributes Data Structure revision number: 10
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000f   105   099   006    Pre-fail  Always       -       9204656
  3 Spin_Up_Time            0x0003   094   094   000    Pre-fail  Always       -       0
  4 Start_Stop_Count        0x0032   100   100   020    Old_age   Always       -       14
  5 Reallocated_Sector_Ct   0x0033   100   100   010    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x000f   100   253   030    Pre-fail  Always       -       125014
  9 Power_On_Hours          0x0032   100   100   000    Old_age   Always       -       46
 10 Spin_Retry_Count        0x0013   100   100   097    Pre-fail  Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   020    Old_age   Always       -       14
183 Runtime_Bad_Block       0x0032   100   100   000    Old_age   Always       -       0
184 End-to-End_Error        0x0032   100   100   099    Old_age   Always       -       0
187 Reported_Uncorrect      0x0032   100   100   000    Old_age   Always       -       0
188 Command_Timeout         0x0032   100   100   000    Old_age   Always       -       0 0 0
189 High_Fly_Writes         0x003a   098   098   000    Old_age   Always       -       2
190 Airflow_Temperature_Cel 0x0022   052   046   045    Old_age   Always       -       48 (Min/Max 45/54)
191 G-Sense_Error_Rate      0x0032   100   100   000    Old_age   Always       -       0
192 Power-Off_Retract_Count 0x0032   100   100   000    Old_age   Always       -       0
193 Load_Cycle_Count        0x0032   100   100   000    Old_age   Always       -       124
194 Temperature_Celsius     0x0022   048   054   000    Old_age   Always       -       48 (0 26 0 0 0)
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0010   100   100   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x003e   200   200   000    Old_age   Always       -       0
240 Head_Flying_Hours       0x0000   100   253   000    Old_age   Offline      -       30h+24m+05.235s
241 Total_LBAs_Written      0x0000   100   253   000    Old_age   Offline      -       1682280669
242 Total_LBAs_Read         0x0000   100   253   000    Old_age   Offline      -       1017214

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Extended offline    Completed without error       00%        42         -

SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.

sdd

smartctl 6.4 2015-06-04 r4109 [x86_64-linux-4.0.4-2-ARCH] (local build)
Copyright (C) 2002-15, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Family:     Western Digital Red
Device Model:     WDC WD30EFRX-68EUZN0
Serial Number:    WD-WMC4N0E8VXW6
LU WWN Device Id: 5 0014ee 6afab7ec7
Firmware Version: 82.00A82
User Capacity:    3,000,592,982,016 bytes [3.00 TB]
Sector Sizes:     512 bytes logical, 4096 bytes physical
Rotation Rate:    5400 rpm
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   ACS-2 (minor revision not indicated)
SATA Version is:  SATA 3.0, 6.0 Gb/s (current: 3.0 Gb/s)
Local Time is:    Mon Jun  8 07:49:46 2015 CEST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x00)	Offline data collection activity
					was never started.
					Auto Offline Data Collection: Disabled.
Self-test execution status:      (   0)	The previous self-test routine completed
					without error or no self-test has ever 
					been run.
Total time to complete Offline 
data collection: 		(39120) seconds.
Offline data collection
capabilities: 			 (0x7b) SMART execute Offline immediate.
					Auto Offline data collection on/off support.
					Suspend Offline collection upon new
					command.
					Offline surface scan supported.
					Self-test supported.
					Conveyance Self-test supported.
					Selective Self-test supported.
SMART capabilities:            (0x0003)	Saves SMART data before entering
					power-saving mode.
					Supports SMART auto save timer.
Error logging capability:        (0x01)	Error logging supported.
					General Purpose Logging supported.
Short self-test routine 
recommended polling time: 	 (   2) minutes.
Extended self-test routine
recommended polling time: 	 ( 393) minutes.
Conveyance self-test routine
recommended polling time: 	 (   5) minutes.
SCT capabilities: 	       (0x703d)	SCT Status supported.
					SCT Error Recovery Control supported.
					SCT Feature Control supported.
					SCT Data Table supported.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f   200   200   051    Pre-fail  Always       -       0
  3 Spin_Up_Time            0x0027   186   182   021    Pre-fail  Always       -       5691
  4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always       -       22
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x002e   200   200   000    Old_age   Always       -       0
  9 Power_On_Hours          0x0032   098   098   000    Old_age   Always       -       1681
 10 Spin_Retry_Count        0x0032   100   253   000    Old_age   Always       -       0
 11 Calibration_Retry_Count 0x0032   100   253   000    Old_age   Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       22
192 Power-Off_Retract_Count 0x0032   200   200   000    Old_age   Always       -       13
193 Load_Cycle_Count        0x0032   200   200   000    Old_age   Always       -       556
194 Temperature_Celsius     0x0022   116   095   000    Old_age   Always       -       34
196 Reallocated_Event_Count 0x0032   200   200   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0030   100   253   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x0008   200   200   000    Old_age   Offline      -       0

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Extended offline    Completed without error       00%      1679         -

SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.

sde

smartctl 6.4 2015-06-04 r4109 [x86_64-linux-4.0.4-2-ARCH] (local build)
Copyright (C) 2002-15, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Family:     Western Digital Red
Device Model:     WDC WD30EFRX-68EUZN0
Serial Number:    WD-WMC4N0F55REK
LU WWN Device Id: 5 0014ee 6afacdc98
Firmware Version: 82.00A82
User Capacity:    3,000,592,982,016 bytes [3.00 TB]
Sector Sizes:     512 bytes logical, 4096 bytes physical
Rotation Rate:    5400 rpm
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   ACS-2 (minor revision not indicated)
SATA Version is:  SATA 3.0, 6.0 Gb/s (current: 3.0 Gb/s)
Local Time is:    Mon Jun  8 07:51:08 2015 CEST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x00)	Offline data collection activity
					was never started.
					Auto Offline Data Collection: Disabled.
Self-test execution status:      (   0)	The previous self-test routine completed
					without error or no self-test has ever 
					been run.
Total time to complete Offline 
data collection: 		(39540) seconds.
Offline data collection
capabilities: 			 (0x7b) SMART execute Offline immediate.
					Auto Offline data collection on/off support.
					Suspend Offline collection upon new
					command.
					Offline surface scan supported.
					Self-test supported.
					Conveyance Self-test supported.
					Selective Self-test supported.
SMART capabilities:            (0x0003)	Saves SMART data before entering
					power-saving mode.
					Supports SMART auto save timer.
Error logging capability:        (0x01)	Error logging supported.
					General Purpose Logging supported.
Short self-test routine 
recommended polling time: 	 (   2) minutes.
Extended self-test routine
recommended polling time: 	 ( 397) minutes.
Conveyance self-test routine
recommended polling time: 	 (   5) minutes.
SCT capabilities: 	       (0x703d)	SCT Status supported.
					SCT Error Recovery Control supported.
					SCT Feature Control supported.
					SCT Data Table supported.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f   200   200   051    Pre-fail  Always       -       0
  3 Spin_Up_Time            0x0027   182   177   021    Pre-fail  Always       -       5900
  4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always       -       22
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x002e   200   200   000    Old_age   Always       -       0
  9 Power_On_Hours          0x0032   098   098   000    Old_age   Always       -       1680
 10 Spin_Retry_Count        0x0032   100   253   000    Old_age   Always       -       0
 11 Calibration_Retry_Count 0x0032   100   253   000    Old_age   Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       22
192 Power-Off_Retract_Count 0x0032   200   200   000    Old_age   Always       -       13
193 Load_Cycle_Count        0x0032   200   200   000    Old_age   Always       -       556
194 Temperature_Celsius     0x0022   116   091   000    Old_age   Always       -       34
196 Reallocated_Event_Count 0x0032   200   200   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0030   100   253   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x0008   200   200   000    Old_age   Offline      -       0

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Extended offline    Completed without error       00%      1672         -

SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.

All seems to be fine as far as I can see.

Last edited by while (2015-06-08 05:53:55)

frostschutz · 2015-06-08 19:01:38

Right. Not sure what else can be done here, if the grow does not resume in either ArchLinux or SystemRescueCD.

The painful method: https://raid.wiki.kernel.org/index.php/ … erlay_file

Use that to create two independent sets of overlays. Like snapshots they will let you play with data without actually changing anything on the disks (if you're careful to use the overlays exclusively).
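For orientation, the wiki method comes down to one copy-on-write snapshot per member disk. What follows is a rough sketch from memory, not a copy of the wiki: the helper name `snapshot_table`, the file names and the 50G size are made up for illustration. `P 8` asks the device-mapper snapshot target for a persistent snapshot with 8-sector chunks.

```shell
# snapshot_table SIZE_SECTORS ORIGIN_DEV COW_DEV
# Prints a device-mapper table line for a copy-on-write snapshot of ORIGIN_DEV.
# Reads go to ORIGIN_DEV, every write lands in COW_DEV instead.
snapshot_table() {
    printf '0 %s snapshot %s %s P 8\n' "$1" "$2" "$3"
}

# Intended use as root (hypothetical names, one overlay per member disk per set):
#   truncate -s 50G /tmp/overlayA_sdb1.cow            # sparse file that absorbs the writes
#   loop=$(losetup -f --show /tmp/overlayA_sdb1.cow)  # attach it to a free loop device
#   snapshot_table "$(blockdev --getsz /dev/sdb1)" /dev/sdb1 "$loop" \
#       | dmsetup create overlayA_sdb1                # appears as /dev/mapper/overlayA_sdb1

# Harmless demonstration with made-up values: only prints the table line.
snapshot_table 1000 /dev/sdb1 /dev/loop0
```

Repeat per disk and per set (A and B), then pass the resulting /dev/mapper names to mdadm rather than the real partitions.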

One set will represent your RAID after the grow, and on it you will find the first 28% (or whatever) of your data; the other will represent the RAID before the grow and it holds the remaining 72%. There will be a certain overlap zone that should be identical on both, since the RAID-after-grow has more capacity (spread over 4 disks instead of 3) so if it already holds 28% of the data, it won't have overwritten 28% of the old representation. So you might also be able to merge at 25%/75%. Don't take these values literally, I'm not 100% sure how to interpret the grow progress bar or reshape pos'n.
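To put rough numbers on that overlap zone, here is a back-of-the-envelope sketch. It assumes the position in /proc/mdstat (840259584 of 2930031872) is KiB per member disk, and it ignores the small difference in data offset between the two layouts and any stripes in flight when the reshape stopped, so treat the results as estimates only.

```shell
# Value copied from the /proc/mdstat line in the first post (assumed: KiB per member disk)
reshape_per_dev_kib=840259584
old_data_disks=2   # 3-disk RAID5: 2 data chunks + 1 parity chunk per stripe
new_data_disks=3   # 4-disk RAID5: 3 data chunks + 1 parity chunk per stripe

# Array offset below which the new 4-disk layout should already hold the data
new_valid_below_kib=$((reshape_per_dev_kib * new_data_disks))
# Array offset above which the old 3-disk layout should not have been overwritten yet
old_valid_above_kib=$((reshape_per_dev_kib * old_data_disks))

echo "new layout usable below about $((new_valid_below_kib / 1024 / 1024)) GiB"
echo "old layout usable above about $((old_valid_above_kib / 1024 / 1024)) GiB"

# A candidate cutoff should fall between the two values
cutoff_kib=$((2400 * 1024 * 1024))
if [ "$cutoff_kib" -gt "$old_valid_above_kib" ] && [ "$cutoff_kib" -lt "$new_valid_below_kib" ]; then
    cutoff_ok=yes
else
    cutoff_ok=no
fi
echo "2400 GiB cutoff inside the overlap zone: $cutoff_ok"
```

The first figure is in the same range as the reshape position quoted further down, which suggests that position is an offset into the array rather than into each disk.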

Commands to create those two arrays should roughly look like this (you might have to adapt these):

mdadm --create /dev/md44 --assume-clean --metadata=1.2 --level=5 --chunk=256 --layout=ls --data-offset=260608 --raid-devices=4 /disk/overlayA_sdb1 /disk/overlayA_sdd1 /disk/overlayA_sde1 /disk/overlayA_sdc1
mdadm --create /dev/md33 --assume-clean --metadata=1.2 --level=5 --chunk=256 --layout=ls --data-offset=262144 --raid-devices=3 /disk/overlayB_sdb1 /disk/overlayB_sdd1 /disk/overlayB_sde1

You have to get the settings (metadata version, offset, chunk size, disk order, etc.) entirely correct, so please check whether I read the examine output correctly. Was sdc1 the disk you added in the grow operation?

With this /dev/md33 should represent your old RAID with 3 disks and /dev/md44 is the new one with 4 disks.

If that theory is correct there should be identical data to be found around the reshape position (2403.81GiB):

dd if=/dev/md44 bs=1M skip=$((2400*1024)) count=1 | hexdump -C
dd if=/dev/md44 bs=1M skip=$((2400*1024)) count=1 | md5sum
dd if=/dev/md33 bs=1M skip=$((2400*1024)) count=1 | md5sum

The hexdump is to check that it's not zero but more distinct/random looking data. Zeroes you can find anywhere so that's not good to go by.
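If you want to probe several offsets without eyeballing hexdumps, the same check can be wrapped in a few small functions. This is only a sketch and the function names are made up. It treats an all-zero block as inconclusive rather than as a match.

```shell
# region_sum DEVICE SKIP_MIB: md5 of the 1 MiB block at offset SKIP_MIB
region_sum() {
    dd if="$1" bs=1M skip="$2" count=1 2>/dev/null | md5sum | cut -d' ' -f1
}

# region_is_zero DEVICE SKIP_MIB: succeeds if that block contains only zero bytes
region_is_zero() {
    nonzero=$(dd if="$1" bs=1M skip="$2" count=1 2>/dev/null | tr -d '\0' | wc -c)
    [ "$nonzero" -eq 0 ]
}

# regions_match DEV_A DEV_B SKIP_MIB
# returns 0 = identical non-zero data, 1 = different data, 2 = all zeroes (proves nothing)
regions_match() {
    if region_is_zero "$1" "$3"; then
        echo "inconclusive: all zeroes at $3 MiB"
        return 2
    fi
    if [ "$(region_sum "$1" "$3")" = "$(region_sum "$2" "$3")" ]; then
        echo "match at $3 MiB"
        return 0
    fi
    echo "MISMATCH at $3 MiB"
    return 1
}
```

For example `regions_match /dev/md44 /dev/md33 $((2400*1024))`, repeated for a few offsets on both sides of the intended cutoff.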

If everything worked out so far you can get the whole device using device mapper linear.

cutoff=$((2400*1024*1024*1024/512))
total=$(blockdev --getsz /dev/md33)
dmsetup create md44md33 --table "0 $cutoff linear 9:44 0
$cutoff $((total - cutoff)) linear 9:33 $cutoff"

And then see if /dev/mapper/md44md33 is mountable and check as many files as you can (large files physically located before and after the cutoff point).
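Each line of a device-mapper table reads start, length, target type, target arguments, so the second segment's length is the total size minus the cutoff. A small helper (the name `linear_table` is made up) can generate the table and catch a cutoff that falls outside the device:

```shell
# linear_table CUTOFF_SECTORS TOTAL_SECTORS FIRST_DEV SECOND_DEV
# Prints a two-segment linear table: sectors [0, cutoff) come from FIRST_DEV,
# sectors [cutoff, total) come from the same offsets on SECOND_DEV.
linear_table() {
    cutoff=$1
    total=$2
    if [ "$cutoff" -le 0 ] || [ "$cutoff" -ge "$total" ]; then
        echo "cutoff must lie strictly inside the device" >&2
        return 1
    fi
    printf '0 %s linear %s 0\n' "$cutoff" "$3"
    printf '%s %s linear %s %s\n' "$cutoff" "$((total - cutoff))" "$4" "$cutoff"
}

# Intended use as root; dmsetup reads the table from stdin when no table is given:
#   linear_table "$((2400*1024*1024*1024/512))" "$(blockdev --getsz /dev/md33)" \
#       /dev/md44 /dev/md33 | dmsetup create md44md33
```

Device paths are used here instead of the 9:44 and 9:33 major:minor pairs; dmsetup accepts either form.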

That's the theory anyways. It might take a few tries to get it right.

while · 2015-06-08 22:38:56

This is awesome. Thank you. I will try it if my current approach is not working.

I got a tip on the linux-raid mailing list that this could be a kernel bug related to badblocks. I applied a patch to fix it and am waiting for the kernel to compile now. I'll report back as soon as I've tried it out.

while · 2015-06-09 07:09:22

Seems like the kernel fix did it. I patched the kernel using the following patch:

--- a/drivers/md/raid5.c        2015-06-08 23:05:02.808214213 +0200
+++ b/drivers/md/raid5.c        2015-06-08 23:05:47.601355604 +0200
@@ -3855,7 +3855,7 @@
         */
        if (s.failed > conf->max_degraded) {
                sh->check_state = 0;
-               sh->reconstruct_state = 0;
+               //sh->reconstruct_state = 0;
                if (s.to_read+s.to_write+s.written)
                        handle_failed_stripe(conf, sh, &s, disks, &s.return_bi);
                if (s.syncing + s.replacing)

After rebooting the reshape is continuing fine.
