
#1 2015-06-15 01:13:51

c3kay
Member
From: New Zealand
Registered: 2015-04-19
Posts: 61

[SOLVED] Missing disk from RAID5 array, but right amount of space

I grew my Arch server's RAID5 array from four to five 2TB disks not long ago. Everything went smoothly and the server has been running with no problems since. Today, while doing some system maintenance, I happened to run lsblk and found that the newest disk is missing. That was news to me, because I still have all the available space.

That's all I know at this stage. The server is running as it should, and if I hadn't been doing this maintenance I wouldn't have noticed. I can't remember the details, but I assume I did the rebuild correctly, because I remember seeing the new drive in lsblk and in the array. Right now it's nowhere to be seen unless I open up the case. And yet, strangely, I still have all the space I should have with 5x2TB disks.

Something has gone wrong. I swear that after the rebuild I ran:

sudo mdadm --detail --scan >> /etc/mdadm.conf

Content of mdadm.conf is

# mdadm configuration file
#
# mdadm will function properly without the use of a configuration file,
# but this file is useful for keeping track of arrays and member disks.
# In general, a mdadm.conf file is created, and updated, after arrays
# are created. This is the opposite behavior of /etc/raidtab which is
# created prior to array construction.
#
#
# the config file takes two types of lines:
#
#       DEVICE lines specify a list of devices of where to look for
#         potential member disks
#
#       ARRAY lines specify information about how to identify arrays so
#         so that they can be activated
#


# You can have more than one device line and use wild cards. The first
# example includes SCSI the first partition of SCSI disks /dev/sdb,
# /dev/sdc, /dev/sdd, /dev/sdj, /dev/sdk, and /dev/sdl. The second
# line looks for array slices on IDE disks.
#
#DEVICE /dev/sd[bcdjkl]1
#DEVICE /dev/hda1 /dev/hdb1
#
# The designation "partitions" will scan all partitions found in
# /proc/partitions
DEVICE partitions


# ARRAY lines specify an array to assemble and a method of identification.
# Arrays can currently be identified by using a UUID, superblock minor number,
# or a listing of devices.
#
#       super-minor is usually the minor number of the metadevice
#       UUID is the Universally Unique Identifier for the array
# Each can be obtained using
#
#       mdadm -D <md>
#
# To capture the UUIDs for all your RAID arrays to this file, run these:
#    to get a list of running arrays:
#    # mdadm -D --scan >>/etc/mdadm.conf
#    to get a list from superblocks:
#    # mdadm -E --scan >>/etc/mdadm.conf
#
#ARRAY /dev/md0 UUID=3aaa0122:29827cfa:5331ad66:ca767371
#ARRAY /dev/md1 super-minor=1
#ARRAY /dev/md2 devices=/dev/hda1,/dev/hdb1
#
# ARRAY lines can also specify a "spare-group" for each array.  mdadm --monitor
# will then move a spare between arrays in a spare-group if one array has a
# failed drive but no spare
#ARRAY /dev/md4 uuid=b23f3c6d:aec43a9f:fd65db85:369432df spare-group=group1
#ARRAY /dev/md5 uuid=19464854:03f71b1b:e0df2edd:246cc977 spare-group=group1
#


# When used in --follow (aka --monitor) mode, mdadm needs a
# mail address and/or a program.  To start mdadm's monitor mode, enable
# mdadm.service in systemd.
#
# If the lines are not found, mdadm will exit quietly
#MAILADDR root@mydomain.tld
#PROGRAM /usr/sbin/handle-mdadm-events

But I don't think that's the problem, because /dev/sdf just doesn't exist.

The last time the system saw /dev/sdf was May 26th. Output of journalctl -x | grep sdf:

May 26 10:45:01 server kernel: sd 5:0:0:0: [sdf] 3907029168 512-byte logical blocks: (2.00 TB/1.81 TiB)
May 26 10:45:01 server kernel: sd 5:0:0:0: [sdf] 4096-byte physical blocks
May 26 10:45:01 server kernel: sd 5:0:0:0: [sdf] Write Protect is off
May 26 10:45:01 server kernel: sd 5:0:0:0: [sdf] Mode Sense: 00 3a 00 00
May 26 10:45:01 server kernel: sd 5:0:0:0: [sdf] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
May 26 10:45:01 server kernel:  sdf: sdf1
May 26 10:45:01 server kernel: sd 5:0:0:0: [sdf] Attached SCSI disk
May 26 10:45:01 server kernel: md: bind<sdf1>
May 26 10:45:01 server kernel: md/raid:md127: device sdf1 operational as raid disk 4
May 26 10:45:01 server kernel:  disk 4, o:1, dev:sdf1
May 26 15:02:21 server kernel:  disk 4, o:1, dev:sdf1
May 26 15:02:21 server kernel:  disk 4, o:1, dev:sdf1

lsblk results. sda is my system drive. sdb through sde are the storage drives. md127 is the RAID device. sdf is missing.

NAME      MAJ:MIN RM   SIZE RO TYPE  MOUNTPOINT
sda         8:0    0 596.2G  0 disk  
├─sda1      8:1    0     1G  0 part  /boot
├─sda2      8:2    0   128G  0 part  /
├─sda3      8:3    0     8G  0 part  [SWAP]
└─sda4      8:4    0 459.2G  0 part  /home
sdb         8:16   0   1.8T  0 disk  
└─sdb1      8:17   0   1.8T  0 part  
  └─md127   9:127  0   7.3T  0 raid5 /media/storage
sdc         8:32   0   1.8T  0 disk  
└─sdc1      8:33   0   1.8T  0 part  
  └─md127   9:127  0   7.3T  0 raid5 /media/storage
sdd         8:48   0   1.8T  0 disk  
└─sdd1      8:49   0   1.8T  0 part  
  └─md127   9:127  0   7.3T  0 raid5 /media/storage
sde         8:64   0   1.8T  0 disk  
└─sde1      8:65   0   1.8T  0 part  
  └─md127   9:127  0   7.3T  0 raid5 /media/storage

cat /proc/mdstat

Personalities : [raid6] [raid5] [raid4] 
md127 : active raid5 sde1[5] sdc1[2] sdd1[4] sdb1[0]
      7813527552 blocks super 1.2 level 5, 512k chunk, algorithm 2 [5/4] [U_UUU]
      
unused devices: <none>

smartctl --scan

/dev/sda -d scsi # /dev/sda, SCSI device
/dev/sdb -d scsi # /dev/sdb, SCSI device
/dev/sdc -d scsi # /dev/sdc, SCSI device
/dev/sdd -d scsi # /dev/sdd, SCSI device
/dev/sde -d scsi # /dev/sde, SCSI device

Is it possible I have a failed disk in there and mdadm just isn't telling me for some reason? Surely if the disk had failed, I'd have some indication. It's in there and it's powered on. I have no degraded-array errors from mdadm. It's just... missing.

Please help. :(

Last edited by c3kay (2015-06-15 01:19:37)


#2 2015-06-15 01:19:14

c3kay
Member
From: New Zealand
Registered: 2015-04-19
Posts: 61

Re: [SOLVED] Missing disk from RAID5 array, but right amount of space

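Wait, forget it. I misread mdstat in a panic. The [5/4] [U_UUU] on the status line means only four of the five members are active, so the array is running degraded. Solved.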

Well, sort of. I'm half forgiving myself because it doesn't appear as failed in mdadm --detail. It's just missing. I worked out which physical drive it is, but even though it's plugged in and I can hear it spin up, it's not appearing in the array. My guess is a power or connection issue rather than a drive failure. I just have to root out spare cables to isolate the problem.
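
For anyone who misreads mdstat the same way, here is a rough sketch of a check that flags a degraded array instead of relying on eyeballing the output. It only looks at the "[expected/active] [U_...]" part of the status line. The function name find_degraded and the SAMPLE text (copied from the output above) are just illustrative, and I have only tried it against output shaped like mine, so treat it as a starting point rather than a tested monitoring tool.

```python
import re

# Status part of an mdstat line, e.g. "[5/4] [U_UUU]":
# 5 members expected, 4 active, "_" marks an empty slot.
STATUS_RE = re.compile(r"\[(\d+)/(\d+)\]\s+\[([U_]+)\]")
# Start of an array entry, e.g. "md127 : active raid5 ..."
ARRAY_RE = re.compile(r"^(md\w+)\s*:")


def find_degraded(mdstat_text):
    """Return {array: (expected, active, missing_slots)} for degraded arrays."""
    degraded = {}
    current = None
    for line in mdstat_text.splitlines():
        header = ARRAY_RE.match(line)
        if header:
            current = header.group(1)
            continue
        status = STATUS_RE.search(line)
        if status and current:
            expected = int(status.group(1))
            active = int(status.group(2))
            # Positions of "_" are the slots with no working member.
            missing = [i for i, c in enumerate(status.group(3)) if c == "_"]
            if active < expected or missing:
                degraded[current] = (expected, active, missing)
            current = None
    return degraded


# Sample text copied from the /proc/mdstat output in this thread.
SAMPLE = """\
Personalities : [raid6] [raid5] [raid4]
md127 : active raid5 sde1[5] sdc1[2] sdd1[4] sdb1[0]
      7813527552 blocks super 1.2 level 5, 512k chunk, algorithm 2 [5/4] [U_UUU]

unused devices: <none>
"""

if __name__ == "__main__":
    # On a live system you could pass open("/proc/mdstat").read() instead.
    print(find_degraded(SAMPLE))
```

That also explains the "right amount of space" in the title. A RAID5 array presents the capacity of (members - 1) disks whether or not it is degraded, so a five-disk array with one member missing still shows 4 x 2TB, which matches the 7.3T in lsblk.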

Last edited by c3kay (2015-06-15 02:04:15)

