
#1 2011-09-03 04:46:53

georgia_tech_swagger
Member
From: Upstate, SC
Registered: 2008-07-02
Posts: 138

How I saved my mdadm RAID setup from Kernel 3.0 and udev

This took me 3.5 hours.   Hopefully I'm saving others the same misery.


Problem: 
When I upgraded to kernel 3.0 and rebooted, it could no longer find my mdadm-driven software RAID volumes.


Solution: 
Step 1)  You will need to download the very latest ArchLinux ISO.   It MUST BE ArchLinux's live CD, as we need mkinitcpio.   Download, burn, and boot the disc.

Step 2)  If you're like me, even the Arch Linux boot CD will epic fail on you, claiming it cannot find /dev/disk/by-label/ARCH_201108.   For me, the by-label folder didn't even exist!  HA!

Step 2-1)  in the ramfs prompt: 

mkdir /dev/disk/by-label

Step 2-2)  in the ramfs prompt: 

ls /dev/disk/by-id/  ; echo "... find your optical or USB drive in here.   For me it was ata-PIONEER-DVD-RW-blahblahblah"

Step 2-3)  in the ramfs prompt: 

ln -s /dev/disk/by-id/YOUR-OPTICAL-OR-USB-DRIVE-HERE /dev/disk/by-label/ARCH_201108

Step 2-4)  in the ramfs prompt: 

exit

   (Arch will now boot up)

Step 3)  Your mdadm RAIDs should have been autostarted beginning with /dev/md125.   See your volumes by running:

ls /dev | grep md

Step 4)  Run: 

mkdir /mnt/temp

Step 5)  Mount your rootfs and any appropriate partitions (boot if you have a separate one for it like me.)    Please VERIFY that what you THOUGHT was rootfs really was by running ls on the mountpoint after mounting.

mount /dev/md127 /mnt/temp ; ls /mnt/temp ; echo "If that wasn't my rootfs, I should run umount /mnt/temp and try a different md12x"
mount /dev/md126 /mnt/temp/boot ; ls /mnt/temp/boot ; echo "Only if you have a separate boot like me, if that wasn't boot I should run umount /mnt/temp/boot and try a different md12x"
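
The "VERIFY" advice in Step 5 can be turned into a quick check. This is only a sketch: looks_like_rootfs is a made-up helper name, and the idea that a root filesystem can be recognised by its etc/fstab is my assumption, not something the steps require.

```shell
# Hypothetical helper: succeed if the directory looks like a root filesystem.
# Assumption: a real rootfs contains etc/fstab; adjust the test to taste.
looks_like_rootfs() {
    [ -f "$1/etc/fstab" ]
}

# Example, run after mounting a candidate array on /mnt/temp:
#   looks_like_rootfs /mnt/temp || umount /mnt/temp
```

If the check fails, unmount and try the next md12x device, exactly as the echo text above says.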

Step 6)  By now you should know which /dev/md(x) is which partition.   Now let's find out their UUIDs ... not to actually use the UUID per se, but because the /dev/disk/by-id entry for the mdadm partition will be md-uuid-(UUID-HERE).

mdadm --detail /dev/md125 

Repeat the above for all md12(x) devices, noting the UUID of each.
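
If you would rather not copy the UUIDs by hand, something like the following may help. It is a sketch, not part of the original procedure: md_by_id_path is a hypothetical helper, and it assumes the mdadm --detail output contains a line of the form "UUID : xxxxxxxx:xxxxxxxx:xxxxxxxx:xxxxxxxx".

```shell
# Hypothetical helper: read `mdadm --detail` output on stdin and print the
# matching /dev/disk/by-id path (md-uuid- followed by the array UUID).
md_by_id_path() {
    awk '$1 == "UUID" && $2 == ":" { print "/dev/disk/by-id/md-uuid-" $3; exit }'
}

# Example:
#   mdadm --detail /dev/md125 | md_by_id_path
```

Check that the printed path actually exists under /dev/disk/by-id before putting it anywhere important.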

Step 7)  Now edit /etc/fstab ... instead of having "UUID=" or "/dev/md(x)" you'll want to instead have for each entry in /etc/fstab the by-id entry, like so:

/dev/disk/by-id/md-uuid-(THE-UUID-YOU-WROTE-DOWN-EARLIER-FOR-THIS-DEVICE)
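
The fstab edit in Step 7 can also be done with a filter instead of an editor. This is a rough sketch under my own assumptions: fstab_replace_device is a made-up name, and the fstab being edited is presumably the one on the mounted root (/mnt/temp/etc/fstab), since you are still booted from the live CD. Note that awk collapses the whitespace on any line it rewrites, which fstab does not mind.

```shell
# Hypothetical helper: replace the device field (first column) of fstab lines.
#   $1 = old device spec, e.g. /dev/md0 or UUID=...
#   $2 = new spec, e.g. /dev/disk/by-id/md-uuid-(UUID)
# Reads fstab on stdin and writes the result to stdout; other lines pass through.
fstab_replace_device() {
    awk -v old="$1" -v new="$2" '$1 == old { $1 = new } { print }'
}

# Example:
#   fstab_replace_device /dev/md0 /dev/disk/by-id/md-uuid-(UUID) \
#       < /mnt/temp/etc/fstab > /tmp/fstab.new
```

Writing to a separate file first lets you diff the result against the original before copying it into place.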

Step 8)  Now edit /boot/grub/menu.lst ... instead of having "root=UUID=" or "root=/dev/mdx" you'll want to have the same thing as you put in fstab:

kernel /vmlinuz-linux root=/dev/disk/by-id/md-uuid-(THE-UUID-YOU-WROTE-DOWN-EARLIER-FOR-THIS-DEVICE)
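
The same idea works for the root= parameter in Step 8. Again this is only a sketch: menu_lst_set_root is a hypothetical helper, and it assumes the by-id path contains no "|" or "&" characters (it should not).

```shell
# Hypothetical helper: rewrite root=... on every "kernel" line of menu.lst.
#   $1 = new root device, e.g. /dev/disk/by-id/md-uuid-(UUID)
# Reads menu.lst on stdin, writes to stdout. GRUB's own "root (hd0,0)" lines
# are left alone because they are not kernel lines.
menu_lst_set_root() {
    sed "/^[[:space:]]*kernel/ s|root=[^[:space:]]*|root=$1|"
}

# Example:
#   menu_lst_set_root /dev/disk/by-id/md-uuid-(UUID) \
#       < /mnt/temp/boot/grub/menu.lst > /tmp/menu.lst.new
```

Note that this changes every kernel entry, including the fallback one, so review the output before copying it over the real menu.lst.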

Step 9)  I don't think this is strictly necessary, but at this point I'm extremely paranoid, so let's exorcise any remaining demons:

mkinitcpio -b /mnt/temp -p linux

Step 10)  Reboot, and your computer is now a non-paperweight again.



Solutions that WILL NOT work:
- Referencing your mdadm volumes by UUID ... they won't show up under /dev/disk/by-uuid
- Referencing your mdadm volumes by /dev/md(x) ... which volume is root or swap or boot changes after every single boot.  Fun huh?
- Moving mdadm before udev in mkinitcpio.conf hooks
- Using the so-called "mdadm_udev" hook in the Bug Tracker and referenced elsewhere on these forums


Thoughts:
WTF is udev doing???   That's a serious horror show.   Why does udev push mdadm and /etc/mdadm.conf out of the way now?   Why does udev think it's a good idea to initialize starting at /dev/md125 now?

Last edited by georgia_tech_swagger (2011-09-03 04:53:23)


Res Publica Non Dominetur

Laptop:  Arch x86 | Thinkpad X220 | Core i5 2410-M | 8 GB DDR3 | Sandy Bridge
Desktop:  Arch x86_64 | Custom | Core i7 920 | 6 GB DDR3 | GeForce 260 GTX


#2 2011-09-05 15:18:19

d_dave
Member
Registered: 2008-04-02
Posts: 18

Re: How I saved my mdadm RAID setup from Kernel 3.0 and udev

Saving my array took a little longer - it originally looked like:

/dev/md0[/dev/sda1 /dev/sdb1] = /
/dev/md1[/dev/sda3 /dev/sdb3] = /home
/dev/sda2 = swap

After updating to kernel 3.0.4 it wouldn't boot. From the live disk, mdadm would see:

/dev/md127[/dev/sdb3 U]
/dev/md126[/dev/sdb1 U]
/dev/md0[/dev/sda _U]

There were no listings in disk/by-uuid.
The listings in disk/by-id were:

[UUID]
[UUID]-part1
[UUID]-part2
[UUID]-part3

After much hair-pulling and angry words, I finally fixed it by stopping md127 and md126, adding sdb to /dev/md0, creating swap space in [UUID]-part2 and enabling it with swapon, and using the appropriate entries from disk/by-id ([UUID]-part*) in fstab and menu.lst (in menu.lst for both root=[UUID]-part1 and resume=[UUID]-part2).

It is not paranoid delusion to worry about updating the image: the initramfs image must have the array correctly identified in it by mdadm.conf.  Part of the problem was that when I updated the image, it was only being written to one disk, the one NOT being booted from.

(my) thoughts:

Udev has targeted mdadm and resistance is futile. Udev is using mdadm in an odd way, so my array is now a single block device.  I really don't need or want swap to be mirrored. I have no idea why it sees /dev/sda as the second disk in /dev/md0 yet booted to /dev/sdb1, nor do I understand why it would not show a by-id entry for /dev/sdb* even though mdadm finds them.

I really don't think we have seen the end of this.


#3 2011-10-23 19:38:22

d_dave
Member
Registered: 2008-04-02
Posts: 18

Re: How I saved my mdadm RAID setup from Kernel 3.0 and udev

Okay, we are up to kernel 3.0.7 and it breaks again.
As it boots you see the message "/dev/md0 assembled",
then it drops you to BusyBox.

Editing grub to set "root=/dev/md0" results in /dev/md0p1, /dev/md0p2 and /dev/md0p3 being created.

But if you edit grub and set "root=/dev/md0p1", it only creates /dev/md0.  If you run "mdassemble" from BusyBox you get a bunch of errors, but it actually does create /dev/md0p1, so you can exit BusyBox and resume booting.


#4 2011-11-06 16:34:44

d_dave
Member
Registered: 2008-04-02
Posts: 18

Re: How I saved my mdadm RAID setup from Kernel 3.0 and udev

Almost forgot to update this.

To fix my RAID I had to chroot in, zero the superblocks and recreate the arrays with my original parameters:

/dev/md0[/dev/sda1 /dev/sdb1] = /
/dev/md1[/dev/sda3 /dev/sdb3] = /home
/dev/sda2 = swap

However, disk/by-id no longer works in either grub or fstab. The only thing that works is referring to the arrays as "/dev/md0" and "/dev/md1". I placed the entries from "mdadm --detail --scan" in /etc/mdadm.conf, rebuilt the mkinitcpio image, and was able to boot and later resume from hibernate, so it's up and working again.
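
For anyone repeating the mdadm.conf step, a cautious way to add the scan output without duplicating entries might look like this. It is a sketch under assumptions: append_arrays is a hypothetical helper, and it only looks at lines beginning with ARRAY.

```shell
# Hypothetical helper: append ARRAY lines read from stdin to a config file,
# skipping any line that is already present verbatim.
#   $1 = path to the config file, e.g. /etc/mdadm.conf
append_arrays() {
    conf=$1
    while IFS= read -r line; do
        case $line in
            ARRAY*)
                grep -Fxq -- "$line" "$conf" 2>/dev/null ||
                    printf '%s\n' "$line" >> "$conf"
                ;;
        esac
    done
}

# Example (as root, inside the chroot), then rebuild the image:
#   mdadm --detail --scan | append_arrays /etc/mdadm.conf
#   mkinitcpio -p linux
```

Running it twice leaves the file unchanged the second time, which is the point of the duplicate check.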

I am once again back in business.  With no data loss.
