
#1 2011-08-11 19:04:32

Baraclese
Member
Registered: 2008-05-28
Posts: 48

mdadm problem after upgrade to linux 3.0

I use a raid 1 of two disks (I don't boot from this raid), configured via /etc/mdadm.conf.
I put mdadm into the HOOKS line of /etc/mkinitcpio.conf (HOOKS="base udev autodetect pata sata mdadm filesystems usbinput").
I mount /dev/md0 in fstab.

This setup has been working for a long time, until now.

It seems that detecting the disks at boot now takes too long, so they are not ready when the filesystem check starts, and the check on /dev/md0 fails with:
fsck.ext4: No such file or directory while trying to open /dev/md0

Therefore I end up at the recovery console.

The solution I now employ is: I manually added a sleep 40 command before the file system check starts in rc.sysinit. Do you have any better ideas?

First I tried sleeping for 10 seconds, but that didn't help. I got the idea of sleeping from a Launchpad thread about a similar problem, see https://bugs.launchpad.net/ubuntu/+sour … bug/278176
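
A fixed sleep either wastes time or, as with the 10-second attempt, turns out too short. A minimal sketch of an alternative, assuming the only thing the fsck step needs is for the /dev/md0 node to exist: poll for it and give up after a timeout. The helper name wait_for_path is made up for this example; it is not part of the Arch initscripts.

```shell
# wait_for_path PATH [TIMEOUT_SECONDS]
# Polls once per second until PATH exists. Returns 0 as soon as it does,
# or 1 if TIMEOUT_SECONDS (default 40) pass without it appearing.
wait_for_path() {
    path=$1
    timeout=${2:-40}
    waited=0
    while [ ! -e "$path" ]; do
        if [ "$waited" -ge "$timeout" ]; then
            return 1
        fi
        sleep 1
        waited=$((waited + 1))
    done
    return 0
}
```

Something like `wait_for_path /dev/md0 40 || echo "md0 did not appear" >&2` just before the filesystem check would then wait only as long as needed, up to the same 40 seconds.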

Last edited by Baraclese (2011-08-11 19:06:31)

Offline

#2 2011-08-11 20:19:04

aprins
Member
Registered: 2011-08-11
Posts: 6

Re: mdadm problem after upgrade to linux 3.0

Hi All,

I have a boot problem after upgrading (linux 3.0 and mdadm) too... I guess it is similar to Baraclese's problem :-(

My server with RAID-1 does not boot because it cannot find /dev/md3
(I have /dev/md1, /dev/md2, /dev/md3 and /dev/md4, each a mirror of the matching partitions on /dev/sda and /dev/sdb, e.g. /dev/md1 = /dev/sda1 + /dev/sdb1).
After grub, /dev/md3 (my rootfs partition) cannot be found.

In the emergency shell I did some checking with mdadm and /proc/mdstat.

The latter shows me that the md devices are: /dev/md124 /dev/md125 /dev/md126 /dev/md127

# mdadm -E /dev/sda1

shows me preferred minor '1', and the corresponding number for the other partitions.

# mdadm -E /dev/md1

cannot open /dev/md1

# mdadm -E /dev/md124

mdadm: No md superblock detected on /dev/md124

# mdadm --detail --scan

ARRAY /dev/md/3_0 metadata=0.90 UUID=.............
ARRAY /dev/md/4_0 metadata=0.90 UUID=.............
ARRAY /dev/md/2_0 metadata=0.90 UUID=.............
ARRAY /dev/md/1_0 metadata=0.90 UUID=.............

Of course my rootfs contains an /etc/mdadm.conf, but I cannot reach it now... ;(
The mdadm.conf in the emergency shell shows:

ARRAY /dev/md1 devices=/dev/sda1,/dev/sdb1
ARRAY /dev/md2 devices=/dev/sda2,/dev/sdb2
ARRAY /dev/md3 devices=/dev/sda3,/dev/sdb3
ARRAY /dev/md4 devices=/dev/sda4,/dev/sdb4

So there seems to be some mismatch in the /dev/md(number) names before and after the upgrade?

Is there a solution where I can be sure I don't lose any data?

Hopefully somebody knows how to get my server online again....

Thanks in advance

P.S. When I set root=/dev/md/3_0 in grub, it boots until Checking Filesystems... and then it looks for /dev/md3 :(

Last edited by aprins (2011-08-15 05:19:02)

Offline

#3 2011-08-11 20:35:24

aprins
Member
Registered: 2011-08-11
Posts: 6

Re: mdadm problem after upgrade to linux 3.0

# mount -t ext4 /dev/md/3_0 /mnt/md3

This works and lets me browse the disk...

Any idea how I can make this work as it should?
Must I change something on my rootfs partition?
Must I change something in grub?
or.....

Last edited by aprins (2011-08-15 05:19:22)

Offline

#4 2011-08-13 05:15:44

thetrivialstuff
Member
Registered: 2006-05-10
Posts: 191

Re: mdadm problem after upgrade to linux 3.0

This is related to this change, I think:

https://bugs.archlinux.org/task/23905

And also these two bug reports:

https://bugs.archlinux.org/task/25132
https://bugs.archlinux.org/task/25158

The problem is that mdadm assembly at boot is now handled by udev, so the contents of mdadm.conf are not directly in play any more. I'm not exactly clear on how mdadm.conf is read, or even *if* it is -- I think udev doesn't read it directly; it just finds devices that look RAID-like and tries to run mdadm -I (incremental assembly) on them, whereupon mdadm looks at mdadm.conf to see if the device name udev gave it is in there and what to do with it.

If you named your device by one of the persistent naming schemes (e.g. /dev/disk/by-path) and udev calls mdadm with the name of the actual device node instead (e.g. /dev/sda5), mdadm goes, "hey, this device isn't in my config file, so it's not part of any arrays" and ignores it. Meanwhile, udev says to itself, "well, I already tried /dev/sda5, so there's no point trying any of the many symlinks to it" and mdadm never gets called with the right name.

At least I *think* that's what's going on...
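
If the name matching really works as described, one way around it might be to identify each array by its UUID in mdadm.conf rather than by member device names; `ARRAY /dev/md3 UUID=...` is valid mdadm.conf syntax. Below is a rough sketch that turns `mdadm --detail --scan` output like the listing earlier in the thread into such lines. The function name uuid_array_lines is invented here, and it assumes the number before the `_0` suffix is the md number you want, which should be checked against the preferred minor reported by `mdadm -E` before relying on it.

```shell
# Reads "mdadm --detail --scan" style lines on stdin and prints mdadm.conf
# ARRAY lines keyed by UUID, turning names like /dev/md/3_0 into /dev/md3.
# Lines without a numeric /dev/md/N name or without a UUID are skipped.
uuid_array_lines() {
    awk '$1 == "ARRAY" {
        name = $2
        sub(/^\/dev\/md\//, "", name)   # /dev/md/3_0 -> 3_0
        sub(/_.*$/, "", name)           # 3_0 -> 3
        uuid = ""
        for (i = 3; i <= NF; i++)
            if ($i ~ /^UUID=/) uuid = $i
        if (name ~ /^[0-9]+$/ && uuid != "")
            print "ARRAY /dev/md" name " " uuid
    }'
}
```

Running `mdadm --detail --scan | uuid_array_lines` and reviewing the result by hand before pasting it into /etc/mdadm.conf seems safer than redirecting it straight into the file.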

~Felix.

Offline

#5 2011-08-13 20:05:21

aprins
Member
Registered: 2011-08-11
Posts: 6

Re: mdadm problem after upgrade to linux 3.0

Hey Felix,

Thanks for the reports, I guess you are right that I'm in the same 'trap'...
It's about udev/mdadm.... so the config file mdadm.conf and the command line are not used anymore.

Now I'm trying to get the system working again; still without success :-(

The devices I get with the new version are different, e.g.:
/dev/md3_0 which was also shown as /dev/md126...

is now (after I mounted it once in the emergency shell) .dev.md126_0.

When I try to boot further with this device, it fails on a filesystem check that waits for /dev/md3.

So my question is:
- How can I force mdadm to use device names like /dev/md3 ?
- Is it safe to do: mdadm -S /dev/md126_0 ?

If I do mdadm -A --scan it says: mdadm: Cannot start array: No such device.

What is the way to get /dev/md1 through /dev/md4 again (using the emergency shell)?

Thanks in advance!

Offline

#6 2011-08-13 20:23:34

thetrivialstuff
Member
Registered: 2006-05-10
Posts: 191

Re: mdadm problem after upgrade to linux 3.0

aprins,

If you're in the emergency shell and you have devices like /dev/md3_0 or /dev/md126 showing up, you can try stopping them:

mdadm --stop /dev/md126

(do this for each of the wrongly named arrays that udev started)

And then try assembling/starting your array:

mdadm --assemble /dev/md3

(mdadm -A --scan might work too; I haven't actually used it.)

The problem is that if an array is already assembled (but under the wrong name) by udev, the devices are in use, so mdadm can't assemble your properly named array until the bad one is stopped.
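
A sketch of the "stop everything that is wrongly named" step, in case there are more than a couple of arrays. It reads the array names from /proc/mdstat and stops every one that is not in a list of names you want to keep. Both function names are made up for this example, and the mdstat path is a parameter only so the parsing can be tried on a saved copy first.

```shell
# Prints the name of every array listed in an mdstat-format file
# (default /proc/mdstat), one per line, e.g. "md126".
active_md_arrays() {
    awk '$1 ~ /^md[0-9_]+$/ && $2 == ":" { print $1 }' "${1:-/proc/mdstat}"
}

# stop_unwanted_arrays WANTED [MDSTAT_FILE]
# Stops every active array whose name is NOT in WANTED, a space-separated
# list such as "md1 md2 md3 md4".
stop_unwanted_arrays() {
    wanted=" $1 "
    for md in $(active_md_arrays "${2:-/proc/mdstat}"); do
        case "$wanted" in
            *" $md "*) ;;                    # correctly named, leave it alone
            *) mdadm --stop "/dev/$md" ;;    # auto-named by udev, stop it
        esac
    done
}
```

After `stop_unwanted_arrays "md1 md2 md3 md4"`, the member partitions should be free again for the `mdadm --assemble` calls above.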

It sounds as though that might not work for you either, though, if you're getting "No such device" -- probably udev has bailed before it got a chance to set up the persistent naming symlinks you use for your arrays. In my emergency shell I can see /dev/disk/by-path and by-id, but not by-uuid or by-label. If you're using either of those for your mdadm.conf DEVICE names, you won't be able to assemble if they're missing. If you can get your system booted some other way, the workaround below will probably solve this as well. If you're stuck at the emergency shell, you can try to assemble your arrays using the /dev/sdX names instead, if you can work out which of your drives is which... I'm going to see if I can figure out how to get udev to finish creating the by-uuid/by-label symlinks, though, so check back here.


The workaround:

Try moving the file /lib/udev/rules.d/64-md-raid.rules to somewhere else (e.g. make a directory /lib/udev/rules.d-disabled and put it there). That should (I think) stop udev from getting involved at all in raid assembly, and then my patched version of /lib/initcpio/hooks/mdadm should be OK.

After you've killed the 64-md-raid.rules file and changed the mdadm hook, re-run mkinitcpio -p linux and try rebooting :)
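
For reference, the move described above could be scripted roughly like this. The function name and the optional ROOT prefix are assumptions added for the example (the prefix lets you try it on a scratch directory tree first); the paths are the ones from this post.

```shell
# disable_md_udev_rule [ROOT]
# Moves 64-md-raid.rules out of udev's rules directory into a sibling
# "rules.d-disabled" directory. ROOT defaults to empty, i.e. the live system.
disable_md_udev_rule() {
    root=${1:-}
    rules_dir=$root/lib/udev/rules.d
    disabled_dir=$root/lib/udev/rules.d-disabled
    rule=64-md-raid.rules

    if [ ! -f "$rules_dir/$rule" ]; then
        echo "$rules_dir/$rule not found; nothing to do" >&2
        return 1
    fi
    mkdir -p "$disabled_dir" && mv "$rules_dir/$rule" "$disabled_dir/"
}
```

On the real system that would be `disable_md_udev_rule && mkinitcpio -p linux`. A later udev package upgrade would presumably put the rules file back, so this is a stopgap at best.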

~Felix.

Last edited by thetrivialstuff (2011-08-13 23:01:17)

Offline

#7 2011-08-13 21:59:54

thetrivialstuff
Member
Registered: 2006-05-10
Posts: 191

Re: mdadm problem after upgrade to linux 3.0

aprins wrote:

if I do mdadm -A --scan it tells: mdadm: Cannot start array: No such device.

Update: Turns out I was wrong about the reason for /dev/disk/by-uuid and by-label not showing up -- udev is running all the way through its rules, but for some reason, it's ignoring the logic that creates the by-uuid symlinks for RAID members.

OK, I'm possibly an idiot -- raid member UUID's are identical across all members, so doing the stuff I had previously put in this post (about changing the udev rules to create uuid links for the un-assembled RAID pieces) would most likely cause problems instead of fixing anything.

On the plus side, I sort of understand udev rules now :P

~Felix.

Last edited by thetrivialstuff (2011-08-13 23:07:15)

Offline

#8 2011-08-14 01:58:39

alexmat
Member
Registered: 2004-12-31
Posts: 100

Re: mdadm problem after upgrade to linux 3.0

Any luck on getting this fixed?  I have a remote box that is now down with this issue. What's the most straightforward way to deal with this?

Offline

#9 2011-08-14 02:11:01

thetrivialstuff
Member
Registered: 2006-05-10
Posts: 191

Re: mdadm problem after upgrade to linux 3.0

alexmat wrote:

Any luck on getting this fixed?  I have a remote box that is now down with this issue. What's the most straightforward way to deal with this?

Try adding these four lines just above the last closing brace in /lib/initcpio/hooks/mdadm:

    echo "Assembling any remaining RAID arrays:"
    for i in `grep ^ARRAY /etc/mdadm.conf | cut -d" " -f2`; do
        mdadm --assemble "$i";
    done
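
For what it's worth, the names that loop will try can be previewed before rebuilding the image. This sketch does the same extraction with awk instead of grep and cut, on the assumption that it copes better with ARRAY lines separated by tabs or several spaces; array_names is a made-up helper name.

```shell
# Prints the array device named on each ARRAY line of an mdadm.conf-style
# file (default /etc/mdadm.conf), one per line.
array_names() {
    awk '$1 == "ARRAY" { print $2 }' "${1:-/etc/mdadm.conf}"
}
```

With the mdadm.conf posted in this thread, `array_names` should list /dev/md1 through /dev/md4. If the list looks right, the loop in the hook should be trying the same names.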

Then re-run mkinitcpio. If you need to do this from the arch boot CD, the procedure is:

- manually assemble the array for your root filesystem and /boot
- mount them somewhere (e.g. mount your root filesystem array in /mnt, then mount boot on /mnt/boot)
- bind mount sys, proc, and dev:

for i in sys proc dev; do mount -o bind /$i /mnt/$i; done

- chroot into /mnt
- then run mkinitcpio -p linux to re-generate the boot images
- leave the chroot, unmount everything, stop the raid arrays, and reboot to the hard drive, and hopefully it's fixed.
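
The steps above, after the manual assembly, can be sketched as a small script. It only prints the commands unless DRY_RUN=0 is set, so the sequence can be read through first. The names run and rebuild_initramfs are invented for this example, and which array holds / and which holds /boot is something you have to fill in yourself.

```shell
# run CMD...: prints the command; executes it only when DRY_RUN=0.
run() {
    echo "+ $*"
    if [ "${DRY_RUN:-1}" = 0 ]; then
        "$@"
    fi
}

# rebuild_initramfs ROOT_MD BOOT_MD
# Assumes both arrays are already assembled (the first step of the list).
rebuild_initramfs() {
    root_md=$1
    boot_md=$2
    run mount "$root_md" /mnt || return 1
    run mount "$boot_md" /mnt/boot || return 1
    for d in sys proc dev; do
        run mount -o bind "/$d" "/mnt/$d" || return 1
    done
    run chroot /mnt mkinitcpio -p linux || return 1
    for d in dev proc sys; do
        run umount "/mnt/$d" || return 1
    done
    run umount /mnt/boot || return 1
    run umount /mnt || return 1
    run mdadm --stop "$boot_md" || return 1
    run mdadm --stop "$root_md"
}
```

For example, `rebuild_initramfs /dev/md3 /dev/md1` prints the plan, and `DRY_RUN=0 rebuild_initramfs /dev/md3 /dev/md1` carries it out (the md1-as-/boot choice is only a guess for illustration).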

~Felix.

Last edited by thetrivialstuff (2011-08-14 02:16:32)

Offline

#10 2011-08-14 09:29:47

aprins
Member
Registered: 2011-08-11
Posts: 6

Re: mdadm problem after upgrade to linux 3.0

Hi Felix,

Just did try the procedure:

mdadm -S /dev/md124
mdadm -S /dev/md125
mdadm -S /dev/md126
mdadm -S /dev/md127

mdadm -A /dev/md1
mdadm -A /dev/md2
mdadm -A /dev/md3
mdadm -A /dev/md4

But it still gives the message:

mdadm: Cannot start array: No such device.

# mdadm -A /dev/md1 -v

shows me the correct information, that /dev/sda1 and /dev/sdb1 belong to /dev/md1 (and likewise for the other partitions).
I wonder why mdadm in the emergency shell says 'No such device'... I checked /etc/mdadm.conf and it only mentions /dev/mdX and /dev/sdX, so no UUID information etc.

So I'm still trapped :(

Last edited by aprins (2011-08-15 05:20:22)

Offline

#11 2011-08-14 17:28:47

thetrivialstuff
Member
Registered: 2006-05-10
Posts: 191

Re: mdadm problem after upgrade to linux 3.0

Can you post your entire /etc/mdadm.conf ?

I assume /dev/sda1 and /dev/sdb1 both exist when you're in the recovery shell? Anything interesting in the dmesg while you're at the recovery shell?

~Felix.

Offline

#12 2011-08-14 19:44:02

aprins
Member
Registered: 2011-08-11
Posts: 6

Re: mdadm problem after upgrade to linux 3.0

Hi Felix,

Here is my entire /etc/mdadm.conf (from the emergency shell; it is the same as my old one).

DEVICE partitions
ARRAY /dev/md1 devices=/dev/sda1,/dev/sdb1
ARRAY /dev/md2 devices=/dev/sda2,/dev/sdb2
ARRAY /dev/md3 devices=/dev/sda3,/dev/sdb3
ARRAY /dev/md4 devices=/dev/sda4,/dev/sdb4

Sure, the devices sda1-4 and sdb1-4 exist in the /dev directory...

dmesg gives some interesting information:

md: bind<sdb3>
md: bind<sdb4>
md: bind<sdb2>
md: bind<sdb1>
md: raid1 personality registered for level 1
bio: create slab <bio-1> at 1
md/raid1:md126: active with 2 out of 2 mirrors
md126: detected capacity change from 0 to 991146147840
md: bind <sda2>
md/raid1:md125: active with 2 out of 2 mirrors
md125: detected capacity change from 0 to 1077411840
 md126: unknown partition table
md: bind <sda3>
md/raid1:md127: active with 2 out of 2 mirrors
md125: detected capacity change from 0 to 7871463424
md: bind<sda1>
 md125: unknown partition table
md/raid1:md124: active with 2 out of 2 mirrors
md124: detected capacity change from 0 to 106823680
 md127: unknown partition table
 md124: unknown partition table

The file /proc/mdstat

md124 : active (auto-read-only) raid1 sda1[0] sdb1[0]
     104320 blocks [2/2] [UU]

md125 : active (auto-read-only) raid1 sda2[0] sdb2[0]
     1052160 blocks [2/2] [UU]

md126 : active (auto-read-only) raid1 sda4[0] sdb4[0]
     967916160 blocks [2/2] [UU]

md127 : active (auto-read-only) raid1 sda3[0] sdb3[0]
     7686976 blocks [2/2] [UU]

And when I do a mdadm -D --scan:

ARRAY /dev/md/127_0 metadata=0.90 UUID=ccd261f1:3ed84616:ebd2de6e:0c338c66
ARRAY /dev/md/126_0 metadata=0.90 UUID=<unique number>
ARRAY /dev/md/2_0 metadata=0.90 UUID=<unique number>
ARRAY /dev/md/1_0 metadata=0.90 UUID=<unique number>

That's all the information I could gather for now.

Thx again!

Last edited by aprins (2011-08-15 05:14:30)

Offline

#13 2011-08-14 19:48:18

bernarcher
Forum Fellow
From: Germany
Registered: 2009-02-17
Posts: 2,281

Re: mdadm problem after upgrade to linux 3.0

Hi aprins,

could you please enclose the listing in [ code ]...[ /code ] tags (without the intervening spaces near the brackets), just to keep it readable.

It is more than just etiquette: https://wiki.archlinux.org/index.php/Fo … s_and_Code


To know or not to know ...
... the questions remain forever.

Offline

#14 2011-08-14 20:56:34

thetrivialstuff
Member
Registered: 2006-05-10
Posts: 191

Re: mdadm problem after upgrade to linux 3.0

aprins wrote:

The file /proc/mdstat
md124 : active (auto-read-only) raid1 sda1[0] sdb1[0]
     104320 blocks [2/2] [UU]

md125 : active (auto-read-only) raid1 sda2[0] sdb2[0]
     1052160 blocks [2/2] [UU]

md126 : active (auto-read-only) raid1 sda4[0] sdb4[0]
     967916160 blocks [2/2] [UU]

md127 : active (auto-read-only) raid1 sda3[0] sdb3[0]
     7686976 blocks [2/2] [UU]

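...well, I guess what you could do in a pinch is symlink each auto-named array to the name your mdadm.conf expects, matching them by the member partitions shown in /proc/mdstat (there, md126 holds sda4/sdb4 and md127 holds sda3/sdb3):

ln -s /dev/md124 /dev/md1
ln -s /dev/md125 /dev/md2
ln -s /dev/md127 /dev/md3
ln -s /dev/md126 /dev/md4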

That'll at least get you booted, and then you can maybe figure out what's wrong from there.
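
Since the md12X numbers do not look stable (in this thread md126 shows up once as md3_0 and once holding sda4/sdb4), it may be safer to derive the links from the member partitions than to type them by hand. A sketch, assuming the layout from aprins's mdadm.conf where mdN is built from sdaN and sdbN; symlink_commands is a made-up name, and it only prints the ln commands so they can be checked before running them.

```shell
# Reads an mdstat-format file (default /proc/mdstat) and prints one
# "ln -s /dev/mdXXX /dev/mdN" command per array, where N is the partition
# number of the array's first listed member (sda3[0] gives 3).
# Arrays that already carry the expected name are skipped.
symlink_commands() {
    awk '$1 ~ /^md[0-9]+$/ && $2 == ":" {
        for (i = 3; i <= NF; i++) {
            if ($i ~ /^sd[a-z]+[0-9]+\[[0-9]+\]/) {
                n = $i
                sub(/\[.*$/, "", n)        # sda3[0] -> sda3
                sub(/^sd[a-z]+/, "", n)    # sda3 -> 3
                if ("md" n != $1)
                    print "ln -s /dev/" $1 " /dev/md" n
                break
            }
        }
    }' "${1:-/proc/mdstat}"
}
```

If the printed commands look right, `symlink_commands | sh` would create the links.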

But it's strange that /dev/sda1 (for instance) is available and working for the automatic /dev/md12X arrays, but then seems to stop working between when you stop that array and try to start the properly named one...

~Felix.

Offline

#15 2011-08-15 02:09:45

archtaku
Member
Registered: 2010-07-02
Posts: 84

Re: mdadm problem after upgrade to linux 3.0

@thetrivialstuff: I'd like to buy you a pint, good sir. I used the Arch Wiki to create my RAID array but am still pretty much an LVM n00b. Would never have happened upon this myself. This got me up and running again. Thanks!

Offline

#16 2011-08-15 15:56:36

aprins
Member
Registered: 2011-08-11
Posts: 6

Re: mdadm problem after upgrade to linux 3.0

Great, the symbolic links make the system boot again...  :-)

But now I would like to find the right solution to prevent future (upgrade) problems...

Thx for this step!

Offline

#17 2011-08-15 18:22:59

thetrivialstuff
Member
Registered: 2006-05-10
Posts: 191

Re: mdadm problem after upgrade to linux 3.0

aprins wrote:

Great, the symbolic links make the system boot again...  :-)

But now I would like to find the right solution to prevent future (upgrade) problems...

Good luck... I've been searching for such a solution for years, but they keep finding ways to change the mdadm hook that break my setup ;)

I suspect the only way to have trouble-free RAID across upgrades in Arch is to have everything assemble with autodetect (i.e. don't use a custom mdadm.conf and just hope that mdadm assembles them correctly from "DEVICE partitions") and use labels or UUID's for your filesystems in fstab so that /dev/md/(random number) won't matter (and hope that you never accidentally leave a USB stick plugged in that happens to have the same label).
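
As a sketch of the fstab half of that idea: blkid can report the UUID of the filesystem on an assembled array, and the fstab entry can then refer to that instead of /dev/mdN. The helper name fstab_line and the "defaults" options are assumptions for the example. Note this is the filesystem's UUID, not the array UUID that mdadm prints.

```shell
# fstab_line DEVICE MOUNTPOINT FSTYPE [PASS]
# Prints an fstab entry that refers to the filesystem on DEVICE by UUID.
# Looks the UUID up with blkid; prints nothing and fails if there is none.
fstab_line() {
    uuid=$(blkid -s UUID -o value "$1") || return 1
    [ -n "$uuid" ] || return 1
    printf 'UUID=%s\t%s\t%s\tdefaults\t0\t%s\n' "$uuid" "$2" "$3" "${4:-2}"
}
```

For Baraclese's data array that might be `fstab_line /dev/md0 /data ext4` (the /data mount point is hypothetical), with the output pasted into /etc/fstab by hand.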

That much automatic detection makes me uneasy, though, since it's error-prone or a security hazard, depending how you look at it, so I'll continue to just set aside an afternoon for pacman -Syu and only do it every couple months :)

Once I've finished university, I'm hoping to use some of my copious free time to really get involved in Arch development & testing, and work on these kinds of things before they "go public"...

~Felix.

Offline

#18 2011-08-19 07:51:32

Baraclese
Member
Registered: 2008-05-28
Posts: 48

Re: mdadm problem after upgrade to linux 3.0

One of the latest kernel updates fixed the problem for me.

Offline

#19 2011-09-18 03:26:50

despotic
Member
Registered: 2010-04-17
Posts: 15

Re: mdadm problem after upgrade to linux 3.0

thetrivialstuff wrote:
alexmat wrote:

Any luck on getting this fixed?  I have a remote box that is now down with this issue. What's the most straightforward way to deal with this?

Try adding these four lines just above the last closing brace in /lib/initcpio/hooks/mdadm:

    echo "Assembling any remaining RAID arrays:"
    for i in `grep ^ARRAY /etc/mdadm.conf | cut -d" " -f2`; do
        mdadm --assemble "$i";
    done

Then re-run mkinitcpio. If you need to do this from the arch boot CD, the procedure is:

- manually assemble the array for your root filesystem and /boot
- mount them somewhere (e.g. mount your root filesystem array in /mnt, then mount boot on /mnt/boot)
- bind mount sys, proc, and dev:

for i in sys proc dev; do mount -o bind /$i /mnt/$i; done

- chroot into /mnt
- then run mkinitcpio -p linux to re-generate the boot images
- leave the chroot, unmount everything, stop the raid arrays, and reboot to the hard drive, and hopefully it's fixed.

~Felix.

Thanks, this helped me out with the same problem

Offline

#20 2011-09-20 21:27:50

Nvveen
Member
Registered: 2010-06-14
Posts: 11

Re: mdadm problem after upgrade to linux 3.0

Same here, confirmed working.

Offline
