[SOLVED] mdadm RAID6 will not re-assemble to open LUKS partition...

SirSkorpan · 2023-10-29 14:11:02

Hi,

I've been running LUKS on top of a software RAID setup, but after a reboot today I get this error when opening the RAID crypt:

Device /dev/md116 does not exist or access denied.

If I run lsblk, the array shows up, but it is listed as having no size:

NAME                 MAJ:MIN RM   SIZE RO TYPE  MOUNTPOINTS
sda                    8:0    0  10.9T  0 disk
└─md116                9:116  0     0B  0 raid6
sdb                    8:16   0  10.9T  0 disk
└─md116                9:116  0     0B  0 raid6
sdc                    8:32   0  10.9T  0 disk
└─md116                9:116  0     0B  0 raid6
sdd                    8:48   0  10.9T  0 disk
└─md116                9:116  0     0B  0 raid6
sde                    8:64   0  10.9T  0 disk
└─md116                9:116  0     0B  0 raid6
sdf                    8:80   0  10.9T  0 disk
└─md116                9:116  0     0B  0 raid6

checking the details with mdadm --detail /dev/md116 gives:

/dev/md116:
           Version : 1.2
     Creation Time : Sat Aug  7 11:45:22 2021
        Raid Level : raid6
     Used Dev Size : 18446744073709551615
      Raid Devices : 6
     Total Devices : 6
       Persistence : Superblock is persistent

       Update Time : Sun Oct 29 12:52:44 2023
             State : active, FAILED, Not Started
    Active Devices : 6
   Working Devices : 6
    Failed Devices : 0
     Spare Devices : 0

            Layout : left-symmetric
        Chunk Size : 512K

Consistency Policy : unknown

              Name : cloud-server:116  (local to host cloud-server)
              UUID : 89494dd3:eaac1f86:f7b6d8b7:1a93fbf0
            Events : 34595

    Number   Major   Minor   RaidDevice State
       -       0        0        0      removed
       -       0        0        1      removed
       -       0        0        2      removed
       -       0        0        3      removed
       -       0        0        4      removed
       -       0        0        5      removed

       -       8       64        2      sync   /dev/sde
       -       8       32        0      sync   /dev/sdc
       -       8        0        5      sync   /dev/sda
       -       8       80        3      sync   /dev/sdf
       -       8       48        1      sync   /dev/sdd
       -       8       16        4      sync   /dev/sdb

The State : active, FAILED, Not Started line and the removed devices look like a problem to me. Checking dmesg | grep -i "md116" shows me:

[  141.063892] md/raid:md116: device sda operational as raid disk 5
[  141.063902] md/raid:md116: device sdb operational as raid disk 4
[  141.063906] md/raid:md116: device sdc operational as raid disk 0
[  141.063909] md/raid:md116: device sdf operational as raid disk 3
[  141.063913] md/raid:md116: device sde operational as raid disk 2
[  141.063916] md/raid:md116: device sdd operational as raid disk 1
[  141.065680] md/raid:md116: raid level 6 active with 6 out of 6 devices, algorithm 2
[  141.066049] md116: invalid bitmap file superblock: bad magic
[  141.066054] md116: failed to create bitmap (-22)

Not sure what it means but I suppose it is no good

I tried to find out more with mdadm --examine /dev/md116 /dev/sda /dev/sdb /dev/sdc /dev/sdd /dev/sde /dev/sdf. All the devices seem to have bad blocks (Bad Block Log : 512 entries available at offset 56 sectors) and non-matching checksums (Checksum : 77a2caa4 - expected 77a2caa3), though both the number of bad block entries and the checksum mismatches are suspiciously similar across the devices:

/dev/sda:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x0
     Array UUID : 89494dd3:eaac1f86:f7b6d8b7:1a93fbf0
           Name : cloud-server:116  (local to host cloud-server)
  Creation Time : Sat Aug  7 11:45:22 2021
     Raid Level : raid6
   Raid Devices : 6

 Avail Dev Size : 23437506560 sectors (10.91 TiB 12.00 TB)
     Array Size : 46875013120 KiB (43.66 TiB 48.00 TB)
    Data Offset : 264192 sectors
   Super Offset : 8 sectors
   Unused Space : before=264112 sectors, after=0 sectors
          State : clean
    Device UUID : 01531429:166963b4:6c2ce9fb:374b2deb

    Update Time : Sun Oct 29 12:52:44 2023
  Bad Block Log : 512 entries available at offset 56 sectors
       Checksum : 77a2caa4 - expected 77a2caa3
         Events : 34595

         Layout : left-symmetric
     Chunk Size : 512K

   Device Role : Active device 5
   Array State : AAAAAA ('A' == active, '.' == missing, 'R' == replacing)
/dev/sdb:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x0
     Array UUID : 89494dd3:eaac1f86:f7b6d8b7:1a93fbf0
           Name : cloud-server:116  (local to host cloud-server)
  Creation Time : Sat Aug  7 11:45:22 2021
     Raid Level : raid6
   Raid Devices : 6

 Avail Dev Size : 23437506560 sectors (10.91 TiB 12.00 TB)
     Array Size : 46875013120 KiB (43.66 TiB 48.00 TB)
    Data Offset : 264192 sectors
   Super Offset : 8 sectors
   Unused Space : before=264112 sectors, after=0 sectors
          State : clean
    Device UUID : 658035c8:5ffb3f1d:255d55a5:f4064c85

    Update Time : Sun Oct 29 12:52:44 2023
  Bad Block Log : 512 entries available at offset 56 sectors
       Checksum : c32b76c7 - expected c32b76c6
         Events : 34595

         Layout : left-symmetric
     Chunk Size : 512K

   Device Role : Active device 4
   Array State : AAAAAA ('A' == active, '.' == missing, 'R' == replacing)
/dev/sdc:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x0
     Array UUID : 89494dd3:eaac1f86:f7b6d8b7:1a93fbf0
           Name : cloud-server:116  (local to host cloud-server)
  Creation Time : Sat Aug  7 11:45:22 2021
     Raid Level : raid6
   Raid Devices : 6

 Avail Dev Size : 23437506560 sectors (10.91 TiB 12.00 TB)
     Array Size : 46875013120 KiB (43.66 TiB 48.00 TB)
    Data Offset : 264192 sectors
   Super Offset : 8 sectors
   Unused Space : before=264112 sectors, after=0 sectors
          State : clean
    Device UUID : a2eab5e8:9f7feb94:44f2e309:620b0ef4

    Update Time : Sun Oct 29 12:52:44 2023
  Bad Block Log : 512 entries available at offset 56 sectors
       Checksum : 2ea7fecd - expected 2ea7fecc
         Events : 34595

         Layout : left-symmetric
     Chunk Size : 512K

   Device Role : Active device 0
   Array State : AAAAAA ('A' == active, '.' == missing, 'R' == replacing)
/dev/sdd:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x0
     Array UUID : 89494dd3:eaac1f86:f7b6d8b7:1a93fbf0
           Name : cloud-server:116  (local to host cloud-server)
  Creation Time : Sat Aug  7 11:45:22 2021
     Raid Level : raid6
   Raid Devices : 6

 Avail Dev Size : 23437506560 sectors (10.91 TiB 12.00 TB)
     Array Size : 46875013120 KiB (43.66 TiB 48.00 TB)
    Data Offset : 264192 sectors
   Super Offset : 8 sectors
   Unused Space : before=264112 sectors, after=0 sectors
          State : clean
    Device UUID : 8df0b2d4:c7aec221:e8160b9b:a213c79a

    Update Time : Sun Oct 29 12:52:44 2023
  Bad Block Log : 512 entries available at offset 56 sectors
       Checksum : df5c60c4 - expected df5c60c3
         Events : 34595

         Layout : left-symmetric
     Chunk Size : 512K

   Device Role : Active device 1
   Array State : AAAAAA ('A' == active, '.' == missing, 'R' == replacing)
/dev/sde:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x0
     Array UUID : 89494dd3:eaac1f86:f7b6d8b7:1a93fbf0
           Name : cloud-server:116  (local to host cloud-server)
  Creation Time : Sat Aug  7 11:45:22 2021
     Raid Level : raid6
   Raid Devices : 6

 Avail Dev Size : 23437506560 sectors (10.91 TiB 12.00 TB)
     Array Size : 46875013120 KiB (43.66 TiB 48.00 TB)
    Data Offset : 264192 sectors
   Super Offset : 8 sectors
   Unused Space : before=264112 sectors, after=0 sectors
          State : clean
    Device UUID : d53091e7:f2b12176:9daf7896:be69bc84

    Update Time : Sun Oct 29 12:52:44 2023
  Bad Block Log : 512 entries available at offset 56 sectors
       Checksum : 2bfc930a - expected 2bfc9309
         Events : 34595

         Layout : left-symmetric
     Chunk Size : 512K

   Device Role : Active device 2
   Array State : AAAAAA ('A' == active, '.' == missing, 'R' == replacing)
/dev/sdf:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x0
     Array UUID : 89494dd3:eaac1f86:f7b6d8b7:1a93fbf0
           Name : cloud-server:116  (local to host cloud-server)
  Creation Time : Sat Aug  7 11:45:22 2021
     Raid Level : raid6
   Raid Devices : 6

 Avail Dev Size : 23437506560 sectors (10.91 TiB 12.00 TB)
     Array Size : 46875013120 KiB (43.66 TiB 48.00 TB)
    Data Offset : 264192 sectors
   Super Offset : 8 sectors
   Unused Space : before=264112 sectors, after=0 sectors
          State : clean
    Device UUID : 04a53073:b7996aa6:268d7a88:de566b07

    Update Time : Sun Oct 29 12:52:44 2023
  Bad Block Log : 512 entries available at offset 56 sectors
       Checksum : 5c95b9a7 - expected 5c95b9a6
         Events : 34595

         Layout : left-symmetric
     Chunk Size : 512K

   Device Role : Active device 3
   Array State : AAAAAA ('A' == active, '.' == missing, 'R' == replacing)
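
The "suspiciously similar" mismatches can be made concrete by subtracting the expected checksum from the stored one for each device. What follows is a minimal sketch in POSIX shell, using the values copied from the --examine output above; checksum_delta is a hypothetical helper name, not part of mdadm.

```shell
#!/bin/sh
# checksum_delta is a hypothetical helper, not an mdadm command.
# $1 = stored checksum, $2 = expected checksum (hex, without 0x prefix)
checksum_delta() {
    echo $(( 0x$1 - 0x$2 ))
}

# Triples copied from the --examine output: device, stored, expected
while read -r dev stored expected; do
    echo "$dev: stored - expected = $(checksum_delta "$stored" "$expected")"
done <<'EOF'
sda 77a2caa4 77a2caa3
sdb c32b76c7 c32b76c6
sdc 2ea7fecd 2ea7fecc
sdd df5c60c4 df5c60c3
sde 2bfc930a 2bfc9309
sdf 5c95b9a7 5c95b9a6
EOF
```

Every device comes out with a difference of exactly 1, which points at one systematic cause rather than six independent corruptions.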

I've seen people with similar problems running mdadm with --update=no-bitmap, and others losing all their data and recreating the array, but I'm hesitant to just run commands without knowing a bit more, since I'd rather not lose the data. That is the information I've been able to gather. The questions now are: what does this mean, and will it be possible to recover my data? I'm not sure where to go from here.

Any help would be greatly appreciated!

Last edited by SirSkorpan (2023-11-05 22:40:43)

frostschutz · 2023-10-29 14:26:50

Does the checksum change if you mdadm --stop the array and then --examine again?

If the checksum is wrong, mdadm will no longer accept it. As for WHY it is wrong, I don't know. mdadm calculates checksums in a weird way; that's how you get these very similar-looking checksums.

You could back up the headers, then hexedit them to make the checksum match. Or (very carefully) use mdadm --create, but you have to get drive order, disk offset, raid layout etc. perfectly right (see https://unix.stackexchange.com/a/131927/30851).

I'm not a fan of using bare drives (without partitions) in a RAID array. If anything tries to read/change/create a partition table, it will harm your mdadm headers. But that should not be what happened here, you triggered some other issue somehow.

Last edited by frostschutz (2023-10-29 14:27:18)

SirSkorpan · 2023-10-29 15:03:01

Thanks for the reply!

frostschutz wrote:

Does the checksum change if you mdadm --stop the array and then --examine again?

No, running:

mdadm --stop /dev/md116
mdadm --examine /dev/md116 /dev/sda /dev/sdb /dev/sdc /dev/sdd /dev/sde /dev/sdf

gives the same output as before, no diff at all. Is that good or bad (considering the situation)?

I think the quote from the stack exchange answer might fit this situation,

There is nothing wrong with --create - if you know what you are doing.
The only problem is: You don't know.

So recreating might not work: I don't have the exact command that I ran to create the array (I should make a note next time...). As for backing up the headers, that might be a way forward. Do I interpret this correctly: the intention would be to change the checksum stored in the header so that it matches the "expected" checksum that mdadm --examine calculates (I suppose it is good, then, that it didn't change when I stopped the array)?

In that case I would need to know where to find that value in a hexeditor, change it and restore the updated headers.

frostschutz wrote:

I'm not a fan of using bare drives (without partitions) in a RAID array. If anything tries to read/change/create a partition table, it will harm your mdadm headers. But that should not be what happened here, you triggered some other issue somehow.

That is a good piece of wisdom, I did not know that, also yes I most probably did do something to cause the issue, but what that was I don't know

Last edited by SirSkorpan (2023-10-29 15:04:21)

frostschutz · 2023-10-29 15:15:29

You get all the info from the examine... device role gives you the drive order (starts from 0), offset, chunksize, layout etc. it's all in the examine. But yes, it's very dangerous - one wrong step, and it starts to sync or write data, and it corrupts everything. That's why you use overlays...

As for the metadata, the mdadm 1.2 header is at a 4K offset. The checksum itself is in there somewhere.

# mdadm --examine /dev/sdx1
Checksum : ecbf426b
# hexdump -C -n 8192 /dev/sdx1
000010d0  ff ff ff ff ff ff ff ff  6b 42 bf ec 80 00 00 00  |........kB......|

So in this case it's offset 0x10d8 (4312 bytes) - note the byte order.
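
As a small illustration of the byte-order remark, the sketch below turns a checksum as printed by --examine into the byte sequence to look for in the hexdump, and derives the byte offset from the hexdump row shown above. The helper name csum_to_disk_bytes is hypothetical, and the values are the ones from this example only.

```shell
#!/bin/sh
# Hypothetical helper: print a checksum in the byte order it has on disk
# (little-endian), e.g. ecbf426b -> "6b 42 bf ec".
csum_to_disk_bytes() {
    printf '%s\n' "$1" | sed 's/^\(..\)\(..\)\(..\)\(..\)$/\4 \3 \2 \1/'
}

csum_to_disk_bytes ecbf426b

# Byte offset of the first checksum byte: row address plus column index.
# The hexdump row starts at 0x10d0 and "6b" is the ninth byte (index 8).
echo $(( 0x10d0 + 8 ))
```

This only reads and prints values; nothing is written to any device.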

Note that doing this is also dangerous... checksums exist for a reason after all, and I'm not sure if it would fix your issue.

Use overlays for this, if you can https://raid.wiki.kernel.org/index.php/ … erlay_file

Last edited by frostschutz (2023-10-29 15:17:25)

SirSkorpan · 2023-10-29 15:56:18

frostschutz wrote:

You get all the info from the examine... device role gives you the drive order (starts from 0), offset, chunksize, layout etc. it's all in the examine. But yes, it's very dangerous - one wrong step, and it starts to sync or write data, and it corrupts everything. That's why you use overlays...

You mean use overlays when recreating or as in not running the array on bare drives?

frostschutz wrote:

Note that doing this is also dangerous... checksums exist for a reason after all, and I'm not sure if it would fix your issue.

I suppose this is true for re-creating too? Or would re-create (given I am able to conjure up the right command, with the right inputs and the right order) definitely fix it (and the danger here is messing up the command)?

frostschutz wrote:

Use overlays for this, if you can https://raid.wiki.kernel.org/index.php/ … erlay_file

I'm not sure how overlays work. Reading the link, it seems we are creating files in memory (loop devices) and basically make the changes in those files rather than on the device. I don't have enough memory to hold the 1% they recommend; can it be put on another disk rather than in memory? Also, how would I make the changes permanent (if they are done on the overlay), or is it a matter of first testing with an overlay and then doing it for real?

SirSkorpan · 2023-10-29 17:46:04

So, having a look at re-creating the array, using your (frostschutz's) answer on stack exchange and the mdadm man page as a guide, I came up with the following command:

mdadm --create /dev/md116 --assume-clean \
    --level=6 --chunk=512K --metadata=1.2 --data-offset=132096K \
    --raid-devices=6 /dev/sdc /dev/sdd /dev/sde /dev/sdf /dev/sdb /dev/sda

Where --level, --raid-devices, --chunk, and --metadata are taken directly from the result of --examine. A couple of things: in the man pages there is no "s" suffix for --data-offset (which you use in the stack exchange example), so I took:

<the number of data offset sectors from the examine>*512/1024

to get the offset in Kilobytes. The 512 is the logical sector size from "fdisk -l | grep -A3 '/dev/sda'".
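
The conversion above can be written out as a short shell calculation. This is a sketch only: sectors_to_kib is a hypothetical helper, and the 512-byte default is an assumption that matches the logical sector size reported for these drives.

```shell
#!/bin/sh
# Hypothetical helper: convert a sector count to KiB for --data-offset.
# $1 = sectors, $2 = logical sector size in bytes (assumed 512 if omitted)
sectors_to_kib() {
    echo $(( $1 * ${2:-512} / 1024 ))
}

# 264192 is the "Data Offset" value from the --examine output
echo "--data-offset=$(sectors_to_kib 264192)K"
```

With the values from this thread it prints --data-offset=132096K, the same number used in the create command.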

And the list of devices has the device with "Device Role : Active device 0" first, ending with "Device Role : Active device 5", with the others in ascending order.
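
Sorting the devices by role by hand is easy to get wrong, so here is a hedged sketch that derives the order from --examine text. devices_by_role is a hypothetical helper; it assumes the output layout shown earlier in the thread (a "/dev/sdX:" header line followed by a "Device Role : Active device N" line).

```shell
#!/bin/sh
# Hypothetical helper: read `mdadm --examine` text on stdin and print the
# member devices sorted by their "Device Role : Active device N" number.
devices_by_role() {
    awk '
        /^\/dev\//      { dev = $1; sub(/:$/, "", dev) }   # header such as /dev/sda:
        /Device Role :/ { print $NF, dev }                 # emits e.g. 5 /dev/sda
    ' | sort -n | awk '{ printf "%s%s", sep, $2; sep = " " } END { print "" }'
}

# Trimmed sample taken from the --examine output earlier in the thread
devices_by_role <<'EOF'
/dev/sda:
   Device Role : Active device 5
/dev/sdb:
   Device Role : Active device 4
/dev/sdc:
   Device Role : Active device 0
/dev/sdd:
   Device Role : Active device 1
/dev/sde:
   Device Role : Active device 2
/dev/sdf:
   Device Role : Active device 3
EOF
```

For this sample it prints /dev/sdc /dev/sdd /dev/sde /dev/sdf /dev/sdb /dev/sda, matching the device list in the create command above.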

Does the name of the array matter (md116 is the name of the one that already exists)?

Does this seem to make sense? If so, I guess I should try to figure out if I can do anything with overlays to get some safety

SirSkorpan · 2023-10-29 22:20:26

After a little bit more reading, my assumption that the overlays were in memory seems to have been flawed, they are on disk, and I have a spare 12T disk that I should be able to use.

For my own learning and possibly someone else's, I'll try to provide what little understanding I have gained if I, or anyone else, need this later. I've never used this feature but, from my understanding, it seems to work like this:

An overlay is another filesystem that is "placed" on top of the first one, which is then "covered" by the second. The rest of the system then interacts with this "union" filesystem (the combination of the overlay and the covered filesystem). I believe the covered filesystem could be called the lower filesystem and the overlay the upper filesystem. When working with the union filesystem, if an object (e.g. a file) exists on both levels, the upper-level object is the one interacted with (changes, etc.) while the lower one is hidden. I'm not sure here, but I believe that if the object doesn't exist it will be created in the upper filesystem, and if it only exists in the lower one (it might always exist on both, not sure) or is unchanged, it will be served from the lower filesystem, with any change happening in the upper filesystem. I've tried to piece this together from the overlayfs docs and the device mapper and snapshots docs from kernel.org.

I followed the guide provided above and, for future reference, this is what I did. I'll leave out the use of parallel from the commands, as it hides the "real" command a bit, so the example only covers /dev/sda, but as per the guide this is done FOR ALL block devices in the raid:

First I created a block device file, to be a loop device that can be linked to a file:

mknod -m 660 /dev/loop1 b 7 1

I'm not sure what the MAJOR (7) and MINOR (1) numbers are in that command. I found this answer on stack exchange (thanks again frostschutz), but it goes over my head a bit. The gist of it is that we now have a device file (loop device) that can be associated with a file and then mounted.

I then created a file to serve as the "write area" for our overlay. This is just a regular file on the filesystem. We can make it as large as we want (even larger than the available space) as long as the filesystem we create it on supports sparse files (from what I understand, this means that it doesn't actually allocate any space on disk until we write something to it). The guide seems to hint that we could make the file smaller as well if the filesystem doesn't support sparse files and we don't have the space.

mkdir /mnt/temp
mount /dev/sdh1 /mnt/temp
pushd /mnt/temp
truncate -s12000G overlay-sda

The "/dev/sdh1" device is my spare 12T drive. We then connect the pieces: we get the size of the physical block device (/dev/sda in the example) and connect the loop device with the "/mnt/temp/overlay-sda" file. I believe it is the losetup -f option that makes the losetup command find the /dev/loop1 device automatically. We then echo some setup options to dmsetup to create the overlay mapping.

size=$(blockdev --getsize /dev/sda);
loop=$(losetup -f --show -- overlay-sda);
echo 0 $size snapshot /dev/sda $loop P 8 | dmsetup create sda

I think that the options we send to dmsetup mean something like this: "start at sector 0 and cover $size sectors of /dev/sda, create a snapshot of /dev/sda with $loop (our linked file) as the place where changes go, and make it persistent (P)". The 8 seems to be the "chunk size", but I'm not sure what that means.
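
To make the fields of that table line easier to read, here is a small sketch that only prints the line instead of piping it to dmsetup. It assumes the device-mapper convention that table values are counted in 512-byte sectors; snapshot_table is a hypothetical helper, and the size is the one that appears in the dmsetup status output in this post.

```shell
#!/bin/sh
# Hypothetical helper: print a dm snapshot table line without applying it.
# $1 = origin size in 512-byte sectors, $2 = origin device, $3 = COW device
snapshot_table() {
    # fields: start length target origin cow-device persistent chunk-size
    echo "0 $1 snapshot $2 $3 P 8"
}

snapshot_table 23437770752 /dev/sda /dev/loop1

# The trailing 8 is the chunk size in 512-byte sectors; in bytes that is:
echo $(( 8 * 512 ))
```

Under that assumption the chunk size works out to 4096 bytes, which is the granularity at which changed data is copied into the overlay file.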

Either way, there is now a "/dev/mapper/sda" device that should redirect any attempted change to the /mnt/temp/overlay-sda file, if I've understood this correctly. An "lsblk" gives:

NAME                 MAJ:MIN RM   SIZE RO TYPE  MOUNTPOINTS
loop1                  7:1    0  11.7T  0 loop
└─sda                254:6    0  10.9T  0 dm
sda                    8:0    0  10.9T  0 disk
└─sda                254:9    0  10.9T  0 dm

and "dmsetup status"

sda: 0 23437770752 snapshot 16/25165824000 16
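
The status line can be split into its parts to keep an eye on how full each overlay is. A minimal sketch, assuming the snapshot status format "<used>/<total> <metadata>" with all values in sectors; snapshot_usage is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: summarise `dmsetup status` lines of snapshot targets.
# Expected form: "<name>: <start> <length> snapshot <used>/<total> <metadata>"
snapshot_usage() {
    awk '$4 == "snapshot" {
        split($5, cow, "/")
        print $1, "used=" cow[1], "total=" cow[2], "metadata=" $6
    }'
}

echo 'sda: 0 23437770752 snapshot 16/25165824000 16' | snapshot_usage
```

Here used and metadata are both 16 sectors, so nothing beyond the snapshot's own metadata has been written yet, and the total of 25165824000 sectors corresponds to the 12000G overlay file. On a live system the input could come from dmsetup status directly.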

I'll hold off on going further for today and come back to this after work tomorrow or Tuesday, mainly to clear my mind a bit, and perhaps there will be some feedback on what I'm trying to do here (I might very well be out to lunch, so please feel free to correct any misunderstandings). I'll put this here and let it simmer a bit.

Last edited by SirSkorpan (2023-10-31 22:05:26)

seth · 2023-10-30 21:45:05

The checksum off-by-one eerily looks like https://bugzilla.redhat.com/show_bug.cgi?id=1966712, which falls in line with the respective error and the feature map being reported as 0x0.
https://bugzilla.redhat.com/show_bug.cgi?id=1966712#c2 has been in https://git.kernel.org/pub/scm/utils/md … r1.c#n2694 since 2021 though, so that bug is unlikely to be the cause of your situation.

But today at a reboot I would

Let's talk about the day before - did you run any updates before that reboot? Hard shutdown? Power outage? Why did you reboot?
Did you try to assemble the raid from a different SW stack (live distro, lts kernel)?

SirSkorpan · 2023-10-30 23:07:37

Ah, yes, I did try to assemble it from a live distro and hadn't booted again since! (I believe I tried both Arch and Ubuntu.) I was having issues with a pair of files, don't remember the exact error but basically I couldn't access/copy/or remove them and it was suggested that I should run fsck. But even after I unmounted the file system it was reported as "in use", so fsck wouldn't run. So I was planning on live-booting and running fsck from there (in hindsight I should have just commented out the mount line in fstab and rebooted, but yeah).

The logs/checks and plan to try to fix it was all done on the original system.

Does this change the course of action I should (can?) take?

Last edited by SirSkorpan (2023-10-30 23:08:51)

seth · 2023-10-31 07:19:38

So you ran an fsck on some part of the array from a different (older) system?
Did you rewrite the superblock or change the bitmap location? Did you assemble the array after that fsck on the live system?
Can you still assemble it from said live system?

Does this change the course of action I should (can?) take?

It's more about gauging whether this could be salvageable at all - any kind of corruption at or beneath LUKS is obviously a huge problem - even if you can re-assemble the array, there's no guarantee that there's a usable crypt left

SirSkorpan · 2023-10-31 08:07:23

seth wrote:

So you ran an fsck on some part of the array from a different (older) system?

No, the only time I tried to run fsck was on the original system. But it would not run, giving me a message to the effect of "filesystem in use", though I had already unmounted it.

I then tried to assemble the array on the live system, but as I remember it didn't work. I believe I got the same result there as I now get on the original system. But since I saw the array name in lsblk I tried to unlock the crypt and that didn't work either (as no crypt could be found). My memory is a little fuzzy on the details but the short of it is that no fsck ran on the disks, as far as I'm aware. (though, thinking about it the Ubuntu live disk runs something like that when it starts up, but I would assume it only does it on the USB? )

seth wrote:

It's more about gauging whether this could be salvageable at all - any kind of corruption at or beneath LUKS is obviously a huge problem - even if you can re-assemble the array, there's no guarantee that there's a usable crypt left

Yeah, fair enough, the thought had crossed my mind. I'll have to bite that bullet if I get that far.

seth · 2023-10-31 19:10:07

So this all started out w/

I was having issues with a pair of files, don't remember the exact error but basically I couldn't access/copy/or remove them and it was suggested that I should run fsck

but that fsck never happened and now all raid members have a clean state but the checksum is off by exactly one after

md116: invalid bitmap file superblock: bad magic

I vote for "-U=no-bitmap" but that's just a gut feeling and I've no idea how to verify that other than trying - you need an overlay FOR EVERY DISK IN THE RAID, not just one!

SirSkorpan · 2023-10-31 21:51:04

seth wrote:

So this all started out w/
I was having issues with a pair of files, don't remember the exact error but basically I couldn't access/copy/or remove them and it was suggested that I should run fsck
but that fsck never happened and now all raid members have a clean state but the checksum is off by exactly one after
md116: invalid bitmap file superblock: bad magic

yes, that's it in a nutshell

seth wrote:

I vote for "-U=no-bitmap" but that's just a gut feeling and I've no idea how to verify that other than trying

It seems like a reasonable way to go as well to get it re-assembled. I read in the manual that "The no-bitmap option can be used when an array has an internal bitmap which is corrupt in some way so that assembling the array normally fails. It will cause any internal bitmap to be ignored.". It doesn't say anything about fixing the bitmap (I must confess that I'm not sure what the bitmap actually is in this context), but will the act of re-assembling the array do this or will I need to use the "no-bitmap" option from here on out?

seth wrote:

you need an overlay FOR EVERY DISK IN THE RAID, not just one!

Yes! I created one for every disk, maybe I should have been even more clear about that (I'll go back and edit that I think).

Two things about the overlay though.

1. Do I get this right that we use the overlay to see if the action we take fixes the assembly and if it does we do the same action again without the overlay? Or will the result here be that I may be able to get the assembly up and running in a readonly state (more or less) and will have to backup anything I want to save onto another drive?

2. In the guide from kernel.org it says:

It will rebuild on the overlay file, so you should pause the rebuild as the overlay file will otherwise eat your disk space:
 echo 0 > /proc/sys/dev/raid/speed_limit_max
 echo 0 > /proc/sys/dev/raid/speed_limit_min

If I do that (pausing rebuild), will I still be able to unlock the LUKS crypt (provided it is not corrupt)?

frostschutz · 2023-11-01 09:47:39

rebuild only happens in specific situations... with --create --assume-clean (or by specifying the redundant drives as 'missing') there is no rebuild. there is also --freeze-reshape option to prevent rebuild/reshape on assemble. and yes, overlay is for experimenting, you can also use it for readonly/backup purposes. once you are sure which course to take for recovery, you can attempt to do it on the real disks.

Regarding your situation, I have no other ideas, also not sure regarding the cause of checksum mismatch. Your create command looks like it should work and there should be no harm trying it with overlays. If there's another way to get around the checksum mismatch in the header (short of hexediting it), I don't know about it but it's not like I deal with this situation much, so... you're on your own , good luck!

SirSkorpan · 2023-11-02 16:48:50

frostschutz wrote:

rebuild only happens in specific situations... with --create --assume-clean (or by specifying the redundant drives as 'missing') there is no rebuild. there is also --freeze-reshape option to prevent rebuild/reshape on assemble. and yes, overlay is for experimenting, you can also use it for readonly/backup purposes. once you are sure which course to take for recovery, you can attempt to do it on the real disks.

Great that clears things up a bit about the process, at least I feel a little bit safer even if I'm still on thin ice

frostschutz wrote:

Your create command looks like it should work and there should be no harm trying it with overlays. If there's another way to get around the checksum mismatch in the header (short of hexediting it), I don't know about it but it's not like I deal with this situation much, so... you're on your own

No worries, you've helped plenty, thanks a lot!

frostschutz wrote:

good luck!

And thanks, I may need all the luck I can get!

SirSkorpan · 2023-11-05 22:39:28

For some closure on this. Long story short, I got the array back in working order, with the crypt intact and my data unharmed. Thanks again frostschutz and seth for great help!

As for the method, I followed pretty much the procedure as described above, here is a summary of the steps:

# stop the non working array
mdadm --stop /dev/md116

# setup a spare drive for the overlay, creating a partition on the spare device and mounting it
fdisk /dev/sdh
mkfs.ext4 /dev/sdh1
mkdir /mnt/temp
mount /dev/sdh1 /mnt/temp
cd /mnt/temp/

# create the overlay 
DEVICES="/dev/sdc /dev/sdd /dev/sde /dev/sdf /dev/sdb /dev/sda"
parallel 'test -e /dev/loop{#} || mknod -m 660 /dev/loop{#} b 7 {#}' ::: $DEVICES
parallel truncate -s12000G overlay-{/} ::: $DEVICES
parallel 'size=$(blockdev --getsize {}); loop=$(losetup -f --show -- overlay-{/}); echo 0 $size snapshot {} $loop P 8 | dmsetup create {/}' ::: $DEVICES
# had to stop the array again as it activated once the overlay mapping  was created
mdadm --stop /dev/md116

# get order (Device Role : Active device X), X starts from 0
# also chunk size (from Chunk Size : Y), raid-devices (Raid Devices : Z), metadata
# (Version : A), level (Raid Level : B) and data offset sectors (Data Offset : C sectors)
mdadm --examine /dev/md116 /dev/sda /dev/sdb /dev/sdc /dev/sdd /dev/sde /dev/sdf | less

# Convert data offset sectors to Kilobytes
# <the number of data offset sectors from the examine>*512/1024 or
# C*512/1024 (from the example above)

# Attempt recreating the array (from stopped status); --assume-clean tells
# mdadm not to start a rebuild.
mdadm --create /dev/md116 --assume-clean \
    --level=6 --chunk=512K --metadata=1.2 --data-offset=132096K \
    --raid-devices=6 /dev/mapper/sdc /dev/mapper/sdd /dev/mapper/sde /dev/mapper/sdf /dev/mapper/sdb /dev/mapper/sda

cryptsetup open --type luks --key-file crypt/key/file /dev/md116 data

# check that lvm is detected
lsblk

# the volume group is called data_vg and the logical volume is called system_data

# run fsck
fsck -y /dev/mapper/data_vg-system_data

mkdir /mnt/rescue
mount /dev/mapper/data_vg-system_data /mnt/rescue

# backed up my files to an external disk

# close down the lvm, crypt and array to prepare to recreate without overlays
umount /mnt/rescue
vgchange -a n data_vg
# close the mapping under the name it was opened with above ("data")
cryptsetup close data
mdadm --stop /dev/md116

# deactivate and remove overlays
pushd /mnt/temp
parallel 'dmsetup remove {/}; rm overlay-{/}' ::: $DEVICES
parallel losetup -d ::: /dev/loop[0-9]*
popd
umount /mnt/temp

# Double check that the mdadm --create command is still valid
mdadm --examine /dev/md116 /dev/sda /dev/sdb /dev/sdc /dev/sdd /dev/sde /dev/sdf | less

mdadm --create /dev/md116 --assume-clean \
    --level=6 --chunk=512K --metadata=1.2 --data-offset=132096K \
    --raid-devices=6 /dev/sdc /dev/sdd /dev/sde /dev/sdf /dev/sdb /dev/sda

# unlock luks, activate lvm volume group
cryptsetup open --type luks --key-file crypt/key/file /dev/md116 data
vgchange -a y data_vg

# run fsck
fsck -y /dev/mapper/data_vg-system_data

mount /dev/mapper/data_vg-system_data /mnt/data

# check that we can see the files
ls -lah /mnt/data

# then reboot the system and make sure all services are up and running, serve the expected things and what not...
reboot
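
For the "double check" step, one option is to reduce the --examine output to just the geometry lines, so that a copy saved before the re-create can be compared with diff afterwards. This is a sketch under assumptions: geometry_fields is a hypothetical helper, and the field names are the ones printed in the --examine output earlier in this thread.

```shell
#!/bin/sh
# Hypothetical helper: keep only the geometry lines of `mdadm --examine`
# output (stdin) so that two runs can be compared with diff.
geometry_fields() {
    grep -E '^/dev/|Raid Level|Raid Devices|Data Offset|Layout|Chunk Size|Device Role' |
        sed 's/^[[:space:]]*//'
}

# Trimmed sample taken from the --examine output earlier in the thread
geometry_fields <<'EOF'
/dev/sda:
     Raid Level : raid6
   Raid Devices : 6
    Data Offset : 264192 sectors
    Update Time : Sun Oct 29 12:52:44 2023
         Layout : left-symmetric
     Chunk Size : 512K
   Device Role : Active device 5
EOF
```

Fields that are expected to change across a re-create, such as the update time, are filtered out on purpose; any difference in the remaining lines would mean the create command does not reproduce the old layout.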

Arch Linux
