#1 2019-06-04 02:15:42

mikebutash
Member
Registered: 2017-05-01
Posts: 32

Broken long-time install after pacman -Syyu, emergency boot

I ran a full upgrade on my long-time (3+ yr) Arch install, and now the OS won't boot; it can't even unlock my LUKS drives. The upgrade went well, with no errors or interruptions, but on boot it can no longer pivot to the root partition. The setup is mdraid (which assembles fine on boot) with two md volumes: one for /boot and a second holding the LUKS/PV/LVM volume where root, var, and others live. If I manually run cryptsetup luksOpen, the PV/VG/LVs all show up fine afterwards, but trying to mount the /dev/mapper/*-root volume comes back with "unknown filesystem 'ext4'". This worked through countless prior upgrades, and there have been no changes to the init configs since the system was built 3-4 years ago.

I've been deathly afraid to upgrade my desktop: for the past year I've been trying to replace Ubuntu with Arch on my laptop, and I've never gotten it to boot with almost the exact same setup. I'd mostly written that off as a kernel/OS bug in Arch that no one sees but me, but now it's affecting my desktop, which had come through countless upgrades perfectly until now.

I don't know if this is a problem with the initramfs build with mkinitcpio, the 5.x kernel (it happened with 4.18-4.19 on my laptop too), or the kernel code itself, but things seem entirely dysfunctional with a disk setup like mine (mdraid+LUKS+LVM). My laptop is just LUKS+LVM, no mdraid (only 1 disk), and it breaks in the same way.

Anyone else seeing this using luks+lvm with modern kernels?

I'm rather SOL without my desktop, and this is exactly why I upgrade infrequently, having been burned by Ubuntu's constantly bad upgrades for 15 years prior. So far Arch has been great, but something seems really broken of late.

Any help would be appreciated.

I'm planning to boot this thing from an ISO, but I've been down that road on the laptop: I tried countless ways of rebuilding the kernel and init modules there and eventually gave up on fixing that Arch install, so I'm not sure I'll get any further here. And unlike my laptop, the desktop doesn't have working Ubuntu and Windows installs to fall back on.

#2 2019-06-04 02:28:51

jasonwryan
Anarchist
From: .nz
Registered: 2009-05-09
Posts: 30,424

Do you mean you hadn't updated in 3+ years, or you have regularly updated over that time and only the last one is troublesome?

The missing file system suggests to me that you have upgraded with an unmounted boot and have a kernel mismatch. I doubt it is package related as my desktop is LVM on LUKS on RAID1 and I have had no issues...

From the chroot, make sure everything is mounted and rerun the upgrade (-Syu). Look in / for evidence of the failed upgrade (the kernel and initrd will have been copied there).
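
One way to test the unmounted-/boot theory before rerunning the upgrade is sketched below. It assumes /boot is a separate partition and that the mountpoint utility from util-linux is available; the helper name boot_is_mounted is made up for illustration.

```shell
# Hypothetical helper: succeeds only if the given directory is a mount point.
# Defaults to /boot; relies on mountpoint(1) from util-linux.
boot_is_mounted() {
    mountpoint -q "${1:-/boot}"
}

# Example (run inside the chroot, before pacman -Syu):
#   boot_is_mounted /boot || echo "/boot is not mounted; mount it first"
```

If the check fails, a kernel and initramfs sitting in the /boot directory of the root filesystem would fit the mismatch described above.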


Arch + dwm   •   Mercurial repos  •   Surfraw

Registered Linux User #482438

#3 2019-06-04 03:19:43

mikebutash
Member
Registered: 2017-05-01
Posts: 32

Sorry, I wasn't quite clear there. I meant that I update infrequently, maybe every 6 months or so, sometimes more, sometimes less. I've never had a problem doing so in 3-4 years.

So I finally got my system to boot. While digging around again in a panic on my laptop, on a whim I tried mounting my root LV at /new_root from the emergency shell, specifically with -t ext4, and oddly, that worked! I could then see my partitions. Progress!

I then tried exec'ing switch_root into /new_root with /sbin/init, but it threw a traceback and died ugly. I ran "exec /usr/bin/switch_root /new_root /sbin/init", but it complained about no systemd as it died...

This led me to some further googling and digging, which said to just exit the emergency shell after getting /new_root to mount, and that got me to the desktop!

Now WTF broke in the boot process with arch?

This got me scratching my head about my laptop, so I tried the same thing on it, and after a year or more of not getting it to boot Arch, the same trick gets it to boot fully without an ISO! Seriously, wtf now?

My /etc/mkinitcpio.conf has not changed in 3-4 years, and it's duplicated on my laptop as a sanity check from when I couldn't get Arch to boot after installing it there. The initramfs doesn't seem to include ext4 for auto-detecting the filesystem, judging by "mount /dev/mapper/*-root /new_root" coming back with "unknown filesystem 'ext4'", and that breaks the boot process on two different systems.

This has always worked seamlessly until this last upgrade, so something broke along the way.

So I can manually bring up the disks, mount root, pivot, and get into the OS, but any ideas on how to fix it properly? Some googling shows I'm not the only poor bastard experiencing this.

Here are my mkinitcpio.conf hooks; this has always been all that was necessary until now... Maybe I need to add filesystems like ext* to MODULES manually?

HOOKS="base udev autodetect modconf block mdadm encrypt lvm2 filesystems shutdown usr keyboard fsck"
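
To see whether the ext4 module actually made it into the image, the file listing from lsinitcpio (shipped with mkinitcpio) could be filtered. This is only a sketch: it assumes the default image path /boot/initramfs-linux.img, and the helper name listing_has_module is hypothetical.

```shell
# Hypothetical helper: reads an initramfs file listing on stdin and succeeds
# if it contains a module file named <module>.ko, optionally compressed
# (for example ext4.ko.xz).
listing_has_module() {
    grep -Eq "/$1\.ko(\.[a-z0-9]+)?\$"
}

# Example on the installed system (image path is the assumed default):
#   lsinitcpio /boot/initramfs-linux.img | listing_has_module ext4 \
#       || echo "ext4 module missing from initramfs"
```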

This seems some sort of nasty pitfall of reality vs. published directions with modern arch updates.

#4 2019-06-04 21:16:11

mikebutash
Member
Registered: 2017-05-01
Posts: 32

So I've been working on my laptop's Arch install, now that I know how to boot around its issues on both of my systems, and so far no method I've tried has kept me from ending up at the emergency shell.

I've been tweaking my mkinitcpio.conf as well as /etc/default/grub to see if anything works; so far, nothing.

On my desktop, I've booted all this time using 'GRUB_CMDLINE_LINUX="cryptdevice=/dev/md1:cryptvolume"' with no issues. My laptop was originally set up the same way per my old build recipe, and I've had no joy with that or with countless attempts at using UUIDs for cryptdevice/root.

On both systems, if I run "mount -t ext4 /dev/mapper/host-root /new_root" and exit, the system boots fine, as it should. I really can't figure out what I'm missing in terms of changed requirements for this to boot properly.
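
For reference, the manual workaround can be written down as a small function. This is a sketch, not a fix: the function names run and unlock_and_mount_root are made up, the device names in the example are the laptop's, and setting DRY_RUN only prints the commands instead of executing them.

```shell
# Runs a command, or just prints it when DRY_RUN is set (for review/testing).
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Hypothetical wrapper around the manual steps from the emergency shell.
#   $1 = LUKS partition, $2 = mapper name, $3 = root LV
unlock_and_mount_root() {
    # Unlock the LUKS container (prompts for the passphrase when run for real).
    run cryptsetup luksOpen "$1" "$2" || return 1
    # Activate all logical volumes; in the initramfs the tools live under "lvm".
    run lvm vgchange -ay || return 1
    # Mount root with the filesystem type forced, since autodetection fails.
    run mount -t ext4 "$3" /new_root
}

# Example with the laptop's names; afterwards "exit" lets the boot continue:
#   unlock_and_mount_root /dev/nvme0n1p6 spv0 /dev/mapper/host--vg0-root1
```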

Laptop looks like this currently with relevant bits (dell xps15, most default flags are for its hardware):

/etc/default/grub:
GRUB_CMDLINE_LINUX_DEFAULT='nomodeset pcie_port_pm=off acpi_backlight=vendor acpi_osi=Linux acpi_osi=! acpi_osi="Windows 2009" rcutree.rcu_idle_gp_delay=1'
GRUB_CMDLINE_LINUX="cryptdevice=UUID=<cryptuuid>:spv0 root=UUID=<rootuuid>"
GRUB_ENABLE_CRYPTODISK=y

blkid:
/dev/nvme0n1p6: UUID="<cryptuuid>" TYPE="crypto_LUKS" PARTLABEL="Linux filesystem" PARTUUID="<cryptpartuuid>"
/dev/mapper/host--vg0-root1: UUID="<rootuuid>" TYPE="ext4"

mkinitcpio:
MODULES="ext2 ext3 ext4 btrfs xfs nvidia nvidia_modeset nvidia_uvm nvidia_drm"
HOOKS="base udev autodetect keyboard modconf block mdadm_udev encrypt lvm2 filesystems shutdown usr fsck"

lsblk:
NAME                    MAJ:MIN RM   SIZE RO TYPE  MOUNTPOINT
nvme0n1                 259:0    0 953.9G  0 disk  
|-nvme0n1p1             259:1    0   499M  0 part  /boot/efi  # system efi
|-nvme0n1p2             259:2    0   128M  0 part  # windoze boot
|-nvme0n1p3             259:3    0 131.5G  0 part  # windoze c:
|-nvme0n1p4             259:4    0   834M  0 part  # dell recovery part
|-nvme0n1p5             259:5    0   499M  0 part  /boot  # linux boot
|-nvme0n1p6             259:6    0 808.1G  0 part  # linux luks/crypt volume
| `-spv0                254:0    0 808.1G  0 crypt 
|   |-host--vg0-root0   254:1    0    17G  0 lvm   # ubuntu root
|   |-host--vg0-swap0   254:2    0     4G  0 lvm   # ubuntu swap
|   |-host--vg0-var0    254:3    0     6G  0 lvm   # ubuntu var
|   |-host--vg0-varlog0 254:4    0     3G  0 lvm   # ubuntu var/log
|   |-host--vg0-home0   254:5    0    90G  0 lvm   # ubuntu home
|   |-host--vg0-ext0    254:6    0   340G  0 lvm   # other storage
|   |-host--vg0-root1   254:7    0    24G  0 lvm   /  # arch root
|   |-host--vg0-var1    254:8    0    10G  0 lvm   /var  # arch var
|   |-host--vg0-varlog1 254:9    0     3G  0 lvm   /var/log  # arch var/log
|   `-host--vg0-home1   254:10   0    10G  0 lvm   /home  # arch home
`-nvme0n1p7             259:7    0    12G  0 part   # dell recovery part

Windows and ubuntu boot just fine still via grub, only arch is broken.

Last edited by mikebutash (2019-06-06 23:16:58)

#5 2019-06-05 06:56:23

V1del
Forum Moderator
Registered: 2012-10-16
Posts: 21,427

A kernel mismatch doesn't sound too far off. What's your output for

pacman -Q linux
uname -a

when booted into the system? Also, please wrap command output in [ code ] [ /code ] tags (without the spaces).

#6 2019-06-05 23:25:04

mikebutash
Member
Registered: 2017-05-01
Posts: 32

It's definitely booting the right kernel, on both systems.  My desktop only has arch on it, so no mixing possible there.

[mb@host]$ pacman -Q linux
linux 5.1.6.arch1-1

[mb@host]$ uname -a
Linux host 5.1.6-arch1-1-ARCH #1 SMP PREEMPT Fri May 31 15:17:53 UTC 2019 x86_64 GNU/Linux
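
A further cross-check, sketched under the assumption that modules live in /usr/lib/modules as on a stock Arch install, is whether a module tree exists for the running kernel release. The helper name modules_present_for is hypothetical.

```shell
# Hypothetical helper: succeeds if a modules tree exists for the given
# kernel release under <root>/usr/lib/modules.
#   $1 = kernel release (default: the running kernel, via uname -r)
#   $2 = root prefix (default: empty, i.e. the live system)
modules_present_for() {
    [ -d "${2:-}/usr/lib/modules/${1:-$(uname -r)}" ]
}

# Example:
#   modules_present_for || echo "no modules for $(uname -r); mismatch likely"
```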

#7 2019-06-10 03:00:40

mikebutash
Member
Registered: 2017-05-01
Posts: 32

I'm still troubleshooting this basketcase of shenanigans across systems, and finding things very different and odd.

My fstab is defined by UUIDs, and the system booted fine that way as originally installed. When installing Arch I made a new /home LV to test with, intending to later replace the new /home with my old one. I've now done so, and the PC won't boot off that UUID. Switching fstab to host--vg0-home0 instead of *-home1 results in a failed boot with "0 2" as the last two fstab fields. If I drop to a VT before the SDDM login for KDE and swap *-home1 for *-home0 at /home by hand, I get the desired result. Something seems entirely broken with automounting now.

Also, any sort of general mounting seems to require forcing a filesystem type with -t ext4 or whatever; then things mount, exactly as with the initial /new_root switch_root pivot.

It seems the kernel can't automatically determine filesystems as it did before. Even with the cryptdevice and root UUIDs defined, it still refuses to mount root and drops me to the emergency shell.

I've tried including all the related modules as pointers to help it mount the filesystem, but it still fails, forcing me to manually unlock the LUKS volume, mount -t ext4 the volume at /new_root, exit, and boot normally from there.

#8 2019-06-10 11:42:01

Lone_Wolf
Member
From: Netherlands, Europe
Registered: 2005-10-04
Posts: 11,868

What is the value of use_lvmetad in your /etc/lvm/lvm.conf ?

If it's 1 , many users have had troubles recently with that.
Try setting it to 0 and reboot to verify if it helps.
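
A quick way to read the active value without opening an editor, as a rough sketch: it assumes the setting sits on its own uncommented line, and the helper name get_use_lvmetad is made up.

```shell
# Hypothetical helper: prints the active (uncommented) use_lvmetad value
# from an lvm.conf-style file; prints nothing if the setting is absent.
#   $1 = config file (default: /etc/lvm/lvm.conf)
get_use_lvmetad() {
    sed -n 's/^[[:space:]]*use_lvmetad[[:space:]]*=[[:space:]]*\([0-9][0-9]*\).*/\1/p' \
        "${1:-/etc/lvm/lvm.conf}" | tail -n 1
}

# Example:
#   [ "$(get_use_lvmetad)" = "1" ] && echo "lvmetad is enabled"
```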


Disliking systemd intensely, but not satisfied with alternatives so focusing on taming systemd.


(A works at time B)  && (time C > time B ) ≠  (A works at time C)

#9 2019-06-10 17:18:11

mikebutash
Member
Registered: 2017-05-01
Posts: 32

It is indeed set to "use_lvmetad = 1". I've had issues with lvmetad when constructing the fstab in Arch, particularly when working from a boot CD where systemd isn't active yet, but my issue seems to occur earlier, in the pre-boot stage where grub hands off to unlock my LUKS/LVM root, before that file can even be read. lvmetad is definitely a problem at the moment, but I'm not sure it's my problem, as I worked around it by building my fstab manually.

If this does have an impact this early in the initrd, I presume I need to rebuild with mkinitcpio; I'm not sure grub cares about this, as the root volume where the file lives isn't unlocked yet at the point where mine crashes to the emergency shell. At this point I'm down to try anything, as both my main PCs are impacted by this bug breaking a normal boot.

I'm going to set use_lvmetad = 0 and reboot a bit later, after work settles down. I've enabled grub debugging as well to see if I can catch why it's not bootstrapping the crypto volume, or at the very least asking me to unlock it as it normally would during boot.

Thanks for this - will update soon.

#10 2019-06-10 22:33:57

mikebutash
Member
Registered: 2017-05-01
Posts: 32

Ok, well that certainly didn't help, and in fact it seems to have left my system even less bootable than before. Now, after grub loads the kernel and it drops to the emergency shell, I can cryptsetup luksOpen the volume, but my LVs aren't being activated automatically. I presume lvmetad normally takes care of this, and setting it to 0 stops it from doing so(?).

I had to manually run "lvm vgchange -ay". pvscan and vgdisplay showed everything present, but lvscan showed all LVs inactive until I ran the vgchange. There was some panic when I couldn't find the normal pvscan, vgchange, etc. commands/binaries... It took some figuring out to realize that in the busybox environment they are all run through lvm...
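
For anyone else who lands in that shell, a small wrapper can paper over the difference between the two environments. It is a sketch with a made-up name (lvmx): it runs the tool directly when it is on PATH and otherwise falls back to the "lvm <subcommand>" form.

```shell
# Hypothetical wrapper: run an LVM tool directly if it is on PATH, otherwise
# fall back to the "lvm <subcommand>" form that works in the initramfs.
lvmx() {
    if command -v "$1" >/dev/null 2>&1; then
        "$@"
    else
        lvm "$@"
    fi
}

# Example sequence matching the steps described above:
#   lvmx pvscan; lvmx vgdisplay; lvmx lvscan; lvmx vgchange -ay; lvmx lvscan
```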

If not using lvmetad, what *should* have occurred here? With lvmetad disabled, is there some other hook I should include in mkinitcpio.conf? I remember grub-mkconfig crapping the bed in the chroot during a normal Arch ISO install, since systemd wasn't running at the time, and the directions seemed to indicate that grub-mkconfig needs adjusting to work without lvmetad running. Even starting the service manually didn't work during the install, forcing me to build my fstab manually. Noobs will love this, I'm sure; I'm certainly not used to having to do so either.

Unless you have a better way of introducing "use_lvmetad = 0" that still activates the LVs at boot, I'm not sure it shouldn't stay set to 1.

I'm all for learning more about the guts of Linux, but I can't imagine anyone doing much but cursing Arch at the moment if they want encrypted disks at boot. I loved it until someone changed something and left me, catastrophically and by surprise, with a non-working computer after an upgrade/install; I love it much less at the moment...

#11 2019-06-11 11:44:23

Lone_Wolf
Member
From: Netherlands, Europe
Registered: 2005-10-04
Posts: 11,868

There have been cases where lvmetad took so long to reply that calling programs assumed the devices were not present and tried to move on.
It appears that's not what causes your problem.

There are many possible causes and somehow they'll need to be narrowed down.
Normally you'd have been asked for the pacman log, but with approx. 6 months of updates at once that list will be way too long.

Have you tried booting with the fallback image and an older kernel like linux-lts ?

Another option is to try a systemd init instead of a busybox init .
See https://wiki.archlinux.org/index.php/Mk … mmon_hooks .


#12 2019-07-31 02:53:48

mikebutash
Member
Registered: 2017-05-01
Posts: 32

So, some general closure on this: I got it working, but there were a few major issues.

First, Arch moving to a later kernel/grub/etc. required a different format for defining the cryptodisk. I had to move my existing system to the GRUB_CMDLINE_LINUX="cryptdevice=*" nomenclature; previously I was mapping directly against a /dev/md/* device.
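
As an illustration of that format only (not necessarily the exact line I ended up with), the UUID form used on the laptop can be generated like this. The helper name crypt_cmdline is made up, and the mapper name and root device must match your own setup.

```shell
# Hypothetical helper: prints a kernel command-line fragment in the
# cryptdevice=UUID=<uuid>:<name> root=<root> form shown for the laptop.
#   $1 = UUID of the LUKS container, $2 = mapper name, $3 = root device
crypt_cmdline() {
    printf 'cryptdevice=UUID=%s:%s root=%s\n' "$1" "$2" "$3"
}

# Example (blkid -s UUID -o value prints just the UUID of a device):
#   crypt_cmdline "$(blkid -s UUID -o value /dev/md1)" cryptvolume /dev/mapper/<vg>-root
```

The printed string would go inside GRUB_CMDLINE_LINUX in /etc/default/grub.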

A routine upgrade left me unable to even build my disks at boot.  Grr.

I'd recommend some validation around this, as older systems like mine doing md+LUKS+LVM+fs require the whole chain, and that starts at grub. Standardizing is great, but remember us little folk who were doing cool things before the standard too.

Second, I was being dumb trying to rebuild my grub config under Arch. I was using your grub-install while thinking like Ubuntu, assuming it knew by default where to write the grub file, and the stdout output made me think it had done so. Coming from Ubuntu, where update-grub knows where to drop the config, I didn't think otherwise. Too much back and forth, and I felt dumb, but once I committed the proper grub.cfg, it found my cryptodisk and loaded.
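
For anyone hitting the same trap: grub-mkconfig prints the generated config to stdout unless it is given -o with an output file. Below is a rough staleness check, assuming the usual /boot/grub/grub.cfg and /etc/default/grub paths; the helper name grub_cfg_is_stale is hypothetical.

```shell
# Hypothetical helper: succeeds if the generated config is missing or older
# than the defaults file, i.e. grub-mkconfig probably needs rerunning.
#   $1 = grub.cfg path, $2 = defaults file
grub_cfg_is_stale() {
    cfg=${1:-/boot/grub/grub.cfg}
    defaults=${2:-/etc/default/grub}
    [ ! -e "$cfg" ] || [ "$defaults" -nt "$cfg" ]
}

# Example (as root; -o makes grub-mkconfig write the file instead of stdout):
#   grub_cfg_is_stale && grub-mkconfig -o /boot/grub/grub.cfg
```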

Still can't figure out how to get a graphical/pretty grub boot on it, but not having to manually build the disks from luks up is a win, even just with a text console boot.

Appreciate the help and pointers along the way.
