
#1 2013-08-25 07:29:36

aweb
Member
Registered: 2010-12-18
Posts: 17

Unpleasant initcpio problem with lvm + "partitionable" raid

Upon upgrading to linux 3.10.9-1 and all other recent packages (from whatever was current a few weeks ago), my system stopped booting today.  The problem is that the initramfs stopped recognizing an lvm pv that was on a "partitionable" raid.  Specifically, my boot command line is:

root=/dev/mapper/<my-vg-name>-root rw md=d0,<my-raid-uuid>

and my mkinitcpio.conf file is:

HOOKS="base udev autodetect modconf block keyboard mdadm lvm2 filesystems fsck"

The problem seems to be that with lvmetad running, /dev/md_d0 just doesn't get scanned for a pv.  Nothing I do makes it show up in "lvm pvs" output when the boot scripts dump me in a shell.  However, if I kill and restart lvmetad, I can then access my lvm partitions.  Things also work if I kill lvmetad and don't restart it.  Attempting to get lvmetad to re-scan partitions without killing it (even with something like udevadm trigger) does not work.

I was able to solve the problem by changing my raid device to /dev/md0 (i.e., boot with md=0,<my-raid-uuid>).  Since partitionable raid is no different from regular raid these days, this seems like an okay workaround.  However, I suspect other people will run into this problem (which is quite annoying when your system doesn't boot).  Hence, I would like to file a useful bug report.  At the very least there is a documentation bug (since mkinitcpio -H mdadm still mentions md=d0).  At worst, there is a race condition that happens not to be triggering for md=0, but might in the future.  Unfortunately, I don't understand the problem well enough to file a useful report, and so am wondering if other people can give me a better understanding of the problem.
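
For anyone wanting to check whether a machine boots with the partitionable form, here is a minimal sketch that scans a kernel command line for md= parameters written as md=dN,...  The function name find_partitionable_md is made up for illustration and is not part of mdadm or mkinitcpio; it only pattern-matches text, and the volume group name and UUID in the example are placeholders.

```shell
#!/bin/sh
# Hypothetical helper (not part of mdadm or mkinitcpio):
# print every md= boot parameter that uses the partitionable form md=dN,...
find_partitionable_md() {
    # Split the command line into one parameter per line, then keep
    # only the parameters that start with "md=d<number>,".
    tr ' ' '\n' | grep -E '^md=d[0-9]+,'
}

# Example with a made-up command line (vg0 and the UUID are placeholders):
echo 'root=/dev/mapper/vg0-root rw md=d0,0123abcd' | find_partitionable_md
```

On a running system the input would be /proc/cmdline, i.e. find_partitionable_md < /proc/cmdline.  If it prints anything, the workaround described above is to change md=d0 to md=0.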

Thanks.


#2 2013-08-25 14:16:01

WonderWoofy
Member
From: Los Gatos, CA
Registered: 2012-05-19
Posts: 8,412

Re: Unpleasant initcpio problem with lvm + "partitionable" raid

Ha, your handle is aweb… there is an awebb around here who is quite funny.  I thought you were him at first.

In any case, I have not experienced this problem, as I don't use lvm or mdadm anymore.  I just use btrfs ftw!  But I did notice that you are using the lvm2 hook (obviously) which is now driven by lvmetad, and therefore udev.  So you also have udev in your initramfs.  So why are you then using the mdadm hook and not mdadm_udev?  mdadm_udev in my experience works far better, and would centralise everything to being handled by udev's autodetection.

I'm not sure that this would make a difference at all, I just thought I would point that out, as I am fairly certain that the udev way is the much preferred method.  This is probably especially true as we prepare to move to systemd within the initramfs.

In the meantime, I guess if nothing else, this post can be a quick bump to the top of the "new" list for you…


#3 2013-08-25 17:35:59

aweb
Member
Registered: 2010-12-18
Posts: 17

Re: Unpleasant initcpio problem with lvm + "partitionable" raid

Well, I can confirm that mdadm_udev works, and is nice because it doesn't require me to add the raid uuid to the boot command line.  I didn't know about it.  However, it sidesteps the particular issue I ran into, because it doesn't use a "partitionable" raid device like /dev/md_d0, either.
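
For reference, the change amounts to swapping one hook name in mkinitcpio.conf.  This is only a sketch based on the HOOKS line from the first post; the preset name in the comment (linux) is an assumption about a stock install.

```shell
# /etc/mkinitcpio.conf (excerpt): mdadm replaced by mdadm_udev
HOOKS="base udev autodetect modconf block keyboard mdadm_udev lvm2 filesystems fsck"

# Afterwards the image has to be regenerated, for example
# (assuming the stock "linux" preset):
#   mkinitcpio -p linux
```

With this hook the md=... parameter can be dropped from the boot command line, leaving only root=/dev/mapper/<my-vg-name>-root rw.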

I still wish I understood what is going wrong.  Plausible workarounds would be for the mkinitcpio mdadm hook to ditch "partitionable" raid, or to issue some warning that mdadm is deprecated and people should use mdadm_udev.  But without understanding the root issue, I'm not sure the bug can't also happen in other situations (e.g., where there are multiple raid arrays to assemble so the one containing the vg with the root file system loses some race condition with lvmetad).

On a related note, is there a reason mdadm_udev (and mdadm for that matter) don't show up in the output of mkinitcpio -L?


#4 2013-08-25 18:47:10

WonderWoofy
Member
From: Los Gatos, CA
Registered: 2012-05-19
Posts: 8,412

Re: Unpleasant initcpio problem with lvm + "partitionable" raid

It shows up for me just fine… as does mkinitcpio -H mdadm_udev.


#5 2013-08-26 02:22:37

aweb
Member
Registered: 2010-12-18
Posts: 17

Re: Unpleasant initcpio problem with lvm + "partitionable" raid

WonderWoofy wrote:

It shows up for me just fine… as does mkinitcpio -H mdadm_udev.

I agree that mkinitcpio -H works, but try running:

LANG=C mkinitcpio -L

Somehow a bunch of hooks are missing from the output.  I find it mildly disturbing that mkinitcpio behaves differently depending on the locale, especially since it uses bsdtar which can be quite finicky with locale stuff.


#6 2013-08-26 02:23:57

WonderWoofy
Member
From: Los Gatos, CA
Registered: 2012-05-19
Posts: 8,412

Re: Unpleasant initcpio problem with lvm + "partitionable" raid

Why are you using LANG=C?  I get hardly anything with LANG=C.  In fact I don't even get mdadm, so I think what you are trying to do there just shouldn't be done.


#7 2013-08-26 03:04:16

aweb
Member
Registered: 2010-12-18
Posts: 17

Re: Unpleasant initcpio problem with lvm + "partitionable" raid

I don't understand.  Are you saying the C locale is officially unsupported by arch?  That's crazy.  I use the C locale because I don't want localization.  So I'm not actually typing LANG=C mkinitcpio, I'm just typing "mkinitcpio".  I just suggested LANG=C to make sure you were in the same locale as I am.


#8 2013-08-26 03:17:41

WonderWoofy
Member
From: Los Gatos, CA
Registered: 2012-05-19
Posts: 8,412

Re: Unpleasant initcpio problem with lvm + "partitionable" raid

I think what you were looking for (in order to not have localization of code output) is LC_COLLATE=C.  If you use this, then you will have all the options available.  I'm not sure why LANG=C doesn't work with mkinitcpio, but I just think that if it breaks it… don't use it.


#9 2013-08-26 03:42:56

aweb
Member
Registered: 2010-12-18
Posts: 17

Re: Unpleasant initcpio problem with lvm + "partitionable" raid

No, I definitely want LANG=C.  LC_COLLATE=C just means stuff gets sorted alphabetically.  I specifically don't want a UTF-8 locale, because I use the high bit to denote meta.  Anyway, I will report this bug, but it's not the original bug I was experiencing.
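
To spell out the difference: under POSIX, each locale category is resolved by looking at LC_ALL first, then the category's own variable, then LANG.  Below is a rough sketch of that lookup.  The function name effective_locale is made up, and it takes the three values as arguments instead of reading the environment, so it changes nothing in the running shell.

```shell
#!/bin/sh
# Hypothetical helper illustrating POSIX locale precedence for one category.
# Arguments: $1 = value of LC_ALL, $2 = value of the category variable
# (e.g. LC_CTYPE), $3 = value of LANG.  Empty strings count as unset.
effective_locale() {
    if [ -n "$1" ]; then
        echo "$1"        # LC_ALL overrides everything
    elif [ -n "$2" ]; then
        echo "$2"        # then the category's own variable
    elif [ -n "$3" ]; then
        echo "$3"        # then LANG
    else
        echo "POSIX"     # nothing set: the default (C/POSIX on glibc)
    fi
}

# LC_CTYPE unset, LANG set to a UTF-8 locale (en_US.UTF-8 is only an example):
effective_locale "" "" "en_US.UTF-8"    # prints en_US.UTF-8
# Same settings, but with LC_ALL=C on top:
effective_locale "C" "" "en_US.UTF-8"   # prints C
```

So LC_COLLATE=C on top of a UTF-8 LANG leaves LC_CTYPE in UTF-8, whereas LANG=C with nothing else set puts every category, LC_CTYPE included, in C.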


#10 2013-08-26 03:45:35

WonderWoofy
Member
From: Los Gatos, CA
Registered: 2012-05-19
Posts: 8,412

Re: Unpleasant initcpio problem with lvm + "partitionable" raid

Hey, at least you've still got a bug to report ;)


#11 2013-08-26 05:27:19

progandy
Member
Registered: 2012-05-17
Posts: 2,143

Re: Unpleasant initcpio problem with lvm + "partitionable" raid

Maybe try LC_ALL=C instead of LANG=C. At least that is what I use if I need stable output in scripts.

Edit: It is the column binary that dislikes the C locale. I don't know why.
Edit: column in the C locale hates the UTF-8 annotation characters used to mark deprecated hooks.
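
A quick way to check whether some command's output contains such characters is to count the bytes outside 7-bit ASCII.  This is only a sketch: the function name count_non_ascii_bytes is made up, and the sample input uses an arbitrary three-byte UTF-8 character, not necessarily the one mkinitcpio prints.

```shell
#!/bin/sh
# Hypothetical helper: print how many bytes on stdin are outside 7-bit ASCII.
# LC_ALL=C makes tr operate on raw bytes instead of decoded characters.
count_non_ascii_bytes() {
    # Delete every ASCII byte (octal 000-177) and count what is left.
    # The arithmetic expansion strips any padding that wc may add.
    echo $(( $(LC_ALL=C tr -d '\000-\177' | wc -c) ))
}

# Plain ASCII sample:
printf 'mdadm\n' | count_non_ascii_bytes
# Sample with one three-byte UTF-8 character (given as octal escapes):
printf 'mdadm \342\200\240\n' | count_non_ascii_bytes
```

Piping the real listing through it (mkinitcpio -L | count_non_ascii_bytes) and getting a nonzero count would suggest the output contains multibyte characters that a tool running in the C locale may mishandle.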

Last edited by progandy (2013-08-26 05:44:39)


#12 2013-08-26 15:12:47

WonderWoofy
Member
From: Los Gatos, CA
Registered: 2012-05-19
Posts: 8,412

Re: Unpleasant initcpio problem with lvm + "partitionable" raid

Just for closure here, aweb filed a bug report here.

As I had thought, setting your locale to C is borked, as falconindy had this to say (amongst some other rather entertaining things):

In the bug report, falconindy wrote:

You really need to have a UTF-8 LC_CTYPE these days, or all sorts of programs are going to behave strangely.


#13 2013-08-26 16:18:46

progandy
Member
Registered: 2012-05-17
Posts: 2,143

Re: Unpleasant initcpio problem with lvm + "partitionable" raid

WonderWoofy wrote:

Just for closure here, aweb filed a bug report here.

As I had thought, setting your locale to C is borked, as falconindy had this to say (amongst some other rather entertaining things):

In the bug report, falconindy wrote:

You really need to have a UTF-8 LC_CTYPE these days, or all sorts of programs are going to behave strangely.

Another good reason to use Xyne's en_XX variant, which supports UTF-8 while trying to conform to ISO and POSIX standards, instead of POSIX or C as the locale.

