#1 2021-08-22 03:46:20

scott_fakename
Member
Registered: 2012-08-15
Posts: 92

mdadm confusion [solved]

I'm confused about MDADM behavior.

I was experimenting with RAID just to try to understand it a bit, by setting up RAID on some small loop devices.

I was under the impression that RAID 5 gives you redundancy, so you can recover from a bad drive -- I read on the kernel wiki that if Linux software RAID detects a read error from one of the disks, it will just fetch the data from the other two instead. So I set up a three-"disk" RAID 5 array like so:

[root@localhost ~]# for ((i = 0; i < 3; ++i)); do truncate -s1G "file$i"; losetup "/dev/loop$i" "file$i"; done

[root@localhost ~]# mdadm --create --verbose --level=5 /dev/md/loopraid --raid-devices=3 /dev/loop{0,1,2}
mdadm: layout defaults to left-symmetric
mdadm: layout defaults to left-symmetric
mdadm: chunk size defaults to 512K
mdadm: size set to 1046528K
mdadm: Defaulting to version 1.2 metadata
mdadm: array /dev/md/loopraid started.

[root@localhost ~]# mkfs.xfs /dev/md/loopraid 
log stripe unit (524288 bytes) is too large (maximum is 256KiB)
log stripe unit adjusted to 32KiB
meta-data=/dev/md/loopraid       isize=512    agcount=8, agsize=65408 blks
         =                       sectsz=512   attr=2, projid32bit=1
         =                       crc=1        finobt=1, sparse=1, rmapbt=0
         =                       reflink=1    bigtime=0 inobtcount=0
data     =                       bsize=4096   blocks=523264, imaxpct=25
         =                       sunit=128    swidth=256 blks
naming   =version 2              bsize=4096   ascii-ci=0, ftype=1
log      =internal log           bsize=4096   blocks=2560, version=2
         =                       sectsz=512   sunit=8 blks, lazy-count=1
realtime =none                   extsz=4096   blocks=0, rtextents=0

[root@localhost ~]# mount /dev/md/loopraid /mnt

I then added a bunch of files to /mnt/ and took their checksums. Then, I deliberately corrupted one of the backing raid files.

[root@localhost ~]# cat /proc/mdstat 
Personalities : [raid6] [raid5] [raid4] 
md127 : active raid5 loop2[3] loop1[1] loop0[0]
      2093056 blocks super 1.2 level 5, 512k chunk, algorithm 2 [3/3] [UUU]
      
unused devices: <none>

[root@localhost ~]# truncate -s0 file0 

[root@localhost ~]# truncate -s1G file0 

[root@localhost ~]# cat /proc/mdstat 
Personalities : [raid6] [raid5] [raid4] 
md127 : active raid5 loop2[3] loop1[1] loop0[0]
      2093056 blocks super 1.2 level 5, 512k chunk, algorithm 2 [3/3] [UUU]
      
unused devices: <none>

[root@localhost ~]# echo check >/sys/block/md127/md/sync_action 

[root@localhost ~]# cat /proc/mdstat 
Personalities : [raid6] [raid5] [raid4] 
md127 : active raid5 loop2[3] loop1[1] loop0[0]
      2093056 blocks super 1.2 level 5, 512k chunk, algorithm 2 [3/3] [UUU]
      [=============>.......]  check = 67.5% (707548/1046528) finish=0.0min speed=235849K/sec
      
unused devices: <none>

[root@localhost ~]# cat /proc/mdstat 
Personalities : [raid6] [raid5] [raid4] 
md127 : active raid5 loop2[3] loop1[1] loop0[0]
      2093056 blocks super 1.2 level 5, 512k chunk, algorithm 2 [3/3] [UUU]
      
unused devices: <none>

At this point, /sys/block/md127/md/mismatch_cnt showed a bunch of mismatched blocks, but when I took the files' checksums again, they just confirmed that the files were corrupted.

I tried the same experiment with a few variations. In one case, I thought the reason this wasn't working might be that after I corrupted my loop file, the software RAID layer got garbage data back from the loop device rather than a read error. So I tried putting dm-integrity on each of the loop devices and then making the RAID out of the three dm-integrity block devices, but it behaved the same way -- /proc/mdstat never showed any disk problems until I manually ran mdadm --fail on a disk, at which point it repaired properly.

Is this the expected behavior? Am I doing something wrong? How would it be possible for a corrupted backing device to be detected automatically?

Last edited by scott_fakename (2021-08-22 17:41:13)

#2 2021-08-22 18:33:27

scott_fakename
Member
Registered: 2012-08-15
Posts: 92

Re: mdadm confusion [solved]

It turns out my initial suspicion was right. With plain, un-integrity-checked loop files, the array lets the files stay corrupted, and running "repair" doesn't fix them. With dm-integrity underneath, the corruption is detected and corrected when I run "repair". I still don't understand why, but at least I now understand how to make mdadm recover a corrupted disk automatically: it requires integrity checking on the backing device.
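For reference, the dm-integrity variant can be set up roughly like this (a sketch, assuming root and the `integritysetup` tool from the cryptsetup package; device and mapping names are illustrative, not from the original posts):

```shell
# Wrap each loop device in dm-integrity so silent corruption in the
# backing file surfaces as a read error instead of garbage data.
for i in 0 1 2; do
    integritysetup format "/dev/loop$i" --batch-mode
    integritysetup open "/dev/loop$i" "int$i"
done

# Build the array on the integrity-checked mappings rather than
# the raw loop devices.
mdadm --create --verbose --level=5 /dev/md/loopraid \
      --raid-devices=3 /dev/mapper/int{0,1,2}
```

With this stack, a corrupted sector fails its dm-integrity checksum, the read comes back as an I/O error, and md can fall back to reconstructing that block from the other members.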

#3 2021-08-22 18:59:32

frostschutz
Member
Registered: 2013-11-15
Posts: 1,418

Re: mdadm confusion [solved]

RAID relies on the drives themselves to report read errors rather than return wrong data. When a member silently returns garbage, RAID can detect the mismatch (the parity no longer adds up), but it has no way to tell which member holds the bad data, so it cannot fix the mismatch correctly without human intervention.
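The asymmetry is easy to see with plain XOR, which is what RAID 5 parity amounts to per byte (a toy sketch with one-byte "chunks", not the actual md code):

```shell
# Two data "chunks" and their RAID-5-style parity (bitwise XOR).
a=0xA5; b=0x3C
p=$(( a ^ b ))

# A *reported* failure is recoverable: XOR the survivors to rebuild it.
echo $(( p ^ b ))          # prints 165, i.e. 0xA5 -- 'a' reconstructed

# A *silent* corruption is only detectable: the parity check comes out
# nonzero, but nothing says whether a, b, or p is the bad chunk.
b_bad=$(( b ^ 0x01 ))
echo $(( a ^ b_bad ^ p ))  # prints 1 -- a mismatch, with no culprit
```

That nonzero result is exactly what mismatch_cnt is counting: a "check" pass found stripes whose parity doesn't add up, without any way to know which device to rewrite.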
