
#1 2022-04-07 15:26:22

JATothrim
Member
Registered: 2019-01-31
Posts: 6

systemd boot problems with LVM2

I have a somewhat complex disk setup:

sda                                    8:0    0 111,8G  0 disk
├─sda1                                 8:1    0    16M  0 part
├─sda2                                 8:2    0 111,3G  0 part
└─sda3                                 8:3    0   520M  0 part
sdb                                    8:16   0 111,8G  0 disk
└─sdb1                                 8:17   0 111,8G  0 part
  └─vghyper-Hackery                  254:8    0    40G  0 lvm
sdc                                    8:32   0 931,5G  0 disk
└─sdc1                                 8:33   0   930G  0 part
  └─md127                              9:127  0 929,9G  0 raid1
    ├─vghyper-ArchDataOne            254:1    0   125G  0 lvm   /media/Data
    ├─vghyper-ProjectData_corig      254:4    0   150G  0 lvm
    └─vghyper-ArchDataTwo            254:6    0   400G  0 lvm   /media/DataTwo
sdd                                    8:48   0 931,5G  0 disk
└─sdd1                                 8:49   0   930G  0 part
  └─md127                              9:127  0 929,9G  0 raid1
    ├─vghyper-ArchDataOne            254:1    0   125G  0 lvm   /media/Data
    ├─vghyper-ProjectData_corig      254:4    0   150G  0 lvm
    └─vghyper-ArchDataTwo            254:6    0   400G  0 lvm   /media/DataTwo
sde                                    8:64   0   1,8T  0 disk
├─sde1                                 8:65   0 931,5G  0 part  /media/DataExt
├─sde2                                 8:66   0     1G  0 part
├─sde3                                 8:67   0   100G  0 part
└─sde4                                 8:68   0 830,5G  0 part
nvme0n1                              259:0    0 238,5G  0 disk
├─nvme0n1p1                          259:1    0     2G  0 part  /boot
└─nvme0n1p2                          259:2    0 236,5G  0 part
  ├─vghyper-HyperiorRoot             254:0    0   128G  0 lvm   /
  ├─vghyper-ProjectCache_cpool_cdata 254:2    0    80G  0 lvm
  ├─vghyper-ProjectCache_cpool_cmeta 254:3    0    64M  0 lvm
  └─vghyper-SwapNVMe                 254:7    0    16G  0 lvm

My current setup is causing boot failures where systemd gets stuck during boot-up, waiting 1 min (I reduced it from 1 min 30 s)
on some units/devices that time out. This is a long-standing issue that has recurred from time to time for years.
The system shows that 6 units are stuck while the boot-up hangs.
After these units time out and I do get to the desktop, I can often just do:

$ vgchange -ay vghyper
$ mount -a

And I regain all my mounts. The network might also still be broken, which can be fixed by restarting a few units.
My main suspects:

  • The LVM vghyper/ProjectData volume fails to assemble during boot.

  • There is still a config issue somewhere.

  • My DIY network setup systemd scripts are to blame.

The last point is unlikely, since the problem has vanished at least twice after pacman -Syu, and I have not changed the network setup.
The DIY network setup is provided via a custom package I have built. (The setup is also a bit complex: virtual bridge, dhcpcd, dnscrypt-proxy, firewall.)

I can provide extra information as needed in order to solve this long-standing mystery.

I would also like to talk about how to better set up the NVMe/SSD-cached ProjectData volume and how to organize the volumes in general.
I prefer BTRFS as the filesystem and I feel I'm under-utilizing it. The system is a bit messy after so many years of use. smile


If it ain't broken yet, some time passes and I'm doomed to learn its secrets.
Now it is half broken but twice as fast as before.


#2 2022-04-07 20:29:30

ccm
Member
Registered: 2022-04-07
Posts: 2

Re: systemd boot problems with LVM2

can you share the output of:
$ grep -e "HOOKS" /etc/mkinitcpio.conf | grep -v "#"


#3 2022-04-08 17:31:44

JATothrim
Member
Registered: 2019-01-31
Posts: 6

Re: systemd boot problems with LVM2

/etc/mkinitcpio.conf:

HOOKS=(base systemd autodetect modconf keyboard sd-vconsole block mdadm_udev lvm2 filesystems fsck shutdown)

Here is also the output of systemd-analyze critical-chain:

graphical.target @1min 1.056s
└─multi-user.target @1min 1.056s
  └─netinit-online.target @1min 1.055s
    └─netinit-post.target @1min 1.055s
      └─netinit-dnscrypt.service @1min 1.055s
        └─netinit.target @1min 1.052s
          └─netinit-pre.target @1min 1.052s
            └─netinit-bridge@vbr0.service @1min 779ms +272ms
              └─basic.target @1min 762ms
                └─sockets.target @1min 760ms
                  └─dbus.socket @1min 758ms
                    └─sysinit.target @1min 741ms
                      └─systemd-update-done.service @1min 737ms +2ms
                        └─ldconfig.service @1min 400ms +333ms
                          └─local-fs.target @1min 397ms
                            └─tmp.mount @1min 393ms +2ms
                              └─systemd-journald.socket
                                └─system.slice
                                  └─-.slice

Today the boot hang unexpectedly recurred after a fixup attempt:
I saw that lvm2-monitor.service was taking too long, so I disabled and masked it and regenerated the initramfs with mkinitcpio.
The system rebooted successfully after that change, but after a longer power-off the problem apparently came back.
This is one peculiar symptom of the problem: the system may appear to boot fine a few times until it just doesn't.

On Monday, May 2, 2022, I noticed that after a failed boot some LVM devices are entirely missing.
Here are listings of /dev/mapper and /dev/vghyper:

$ ls -la /dev/vghyper 
total 0
drwxr-xr-x  2 root root   80 May  2 16:41 .
drwxr-xr-x 21 root root 4520 May  2 16:42 ..
lrwxrwxrwx  1 root root    7 May  2 16:41 ArchDataOne -> ../dm-1
lrwxrwxrwx  1 root root    7 May  2 16:41 HyperiorRoot -> ../dm-0
$ ls -la /dev/mapper 
total 0
drwxr-xr-x  2 root root     160 May  2 16:41 .
drwxr-xr-x 21 root root    4520 May  2 16:42 ..
crw-------  1 root root 10, 236 May  2 16:41 control
lrwxrwxrwx  1 root root       7 May  2 16:41 vghyper-ArchDataOne -> ../dm-1
lrwxrwxrwx  1 root root       7 May  2 16:41 vghyper-HyperiorRoot -> ../dm-0
lrwxrwxrwx  1 root root       7 May  2 16:41 vghyper-ProjectCache_cpool_cdata -> ../dm-2
lrwxrwxrwx  1 root root       7 May  2 16:41 vghyper-ProjectCache_cpool_cmeta -> ../dm-3
lrwxrwxrwx  1 root root       7 May  2 16:41 vghyper-ProjectData_corig -> ../dm-4

Here is the same after running:

$ vgchange -ay
$ ls -la /dev/mapper 
total 0
drwxr-xr-x  2 root root     220 May  2 16:50 .
drwxr-xr-x 21 root root    4580 May  2 16:50 ..
crw-------  1 root root 10, 236 May  2 16:41 control
lrwxrwxrwx  1 root root       7 May  2 16:41 vghyper-ArchDataOne -> ../dm-1
lrwxrwxrwx  1 root root       7 May  2 16:50 vghyper-ArchDataTwo -> ../dm-6
lrwxrwxrwx  1 root root       7 May  2 16:50 vghyper-Hackery -> ../dm-8
lrwxrwxrwx  1 root root       7 May  2 16:41 vghyper-HyperiorRoot -> ../dm-0
lrwxrwxrwx  1 root root       7 May  2 16:41 vghyper-ProjectCache_cpool_cdata -> ../dm-2
lrwxrwxrwx  1 root root       7 May  2 16:41 vghyper-ProjectCache_cpool_cmeta -> ../dm-3
lrwxrwxrwx  1 root root       7 May  2 16:41 vghyper-ProjectData_corig -> ../dm-4
lrwxrwxrwx  1 root root       7 May  2 16:50 vghyper-SwapNVMe -> ../dm-7
$ ls -la /dev/vghyper
total 0
drwxr-xr-x  2 root root  140 May  2 16:50 .
drwxr-xr-x 21 root root 4580 May  2 16:50 ..
lrwxrwxrwx  1 root root    7 May  2 16:41 ArchDataOne -> ../dm-1
lrwxrwxrwx  1 root root    7 May  2 16:50 ArchDataTwo -> ../dm-6
lrwxrwxrwx  1 root root    7 May  2 16:50 Hackery -> ../dm-8
lrwxrwxrwx  1 root root    7 May  2 16:41 HyperiorRoot -> ../dm-0
lrwxrwxrwx  1 root root    7 May  2 16:50 SwapNVMe -> ../dm-7

After seeing this, my only explanation is that an LVM process gets killed too early during boot:
vghyper-SwapNVMe is an ordinary logical volume on the same NVMe physical volume as the root device,
yet it still fails to appear.

Last edited by JATothrim (2022-05-02 14:01:57)




#4 2022-05-02 16:16:34

JATothrim
Member
Registered: 2019-01-31
Posts: 6

Re: systemd boot problems with LVM2

I think I finally managed to solve this issue, and for good.
The final clue to what was happening was:

$ journalctl --unit=lvm-activate-vghyper.service
May 02 18:27:57 archlinux systemd[1]: Started /usr/bin/lvm vgchange -aay --autoactivation event vghyper.
May 02 18:27:57 archlinux systemd[1]: Stopping /usr/bin/lvm vgchange -aay --autoactivation event vghyper...
May 02 18:27:57 archlinux lvm[392]:   Interrupted...
May 02 18:27:57 archlinux lvm[392]:   3 logical volume(s) in volume group "vghyper" now active
May 02 18:28:57 HyperiorArchMachine systemd[1]: lvm-activate-vghyper.service: State 'stop-sigterm' timed out. Killing.
May 02 18:28:57 HyperiorArchMachine systemd[1]: lvm-activate-vghyper.service: Killing process 392 (lvm) with signal SIGKILL.
May 02 18:28:57 HyperiorArchMachine systemd[1]: lvm-activate-vghyper.service: Main process exited, code=killed, status=9/KILL
May 02 18:28:57 HyperiorArchMachine systemd[1]: lvm-activate-vghyper.service: Failed with result 'timeout'.
May 02 18:28:57 HyperiorArchMachine systemd[1]: Stopped /usr/bin/lvm vgchange -aay --autoactivation event vghyper.

So, just as I had suspected, systemd kills the LVM process(es) in the initrd stage!
This leaves the LVM volume group in a partially activated state:
as soon as the rootfs is found, the other scans are terminated before initrd-switch-root.target is reached.
What systemd should actually do is retry activating the rest of the logical volumes after initrd-switch-root.target.
To fix this I wrote the following service:

$ nano /etc/systemd/system/lvm-udev-retrigger.service
[Unit]
Description=Late trigger devices for LVM2
Documentation=man:udev(7) man:systemd-udevd.service(8)
DefaultDependencies=no
Before=local-fs-pre.target
Wants=systemd-udevd.service systemd-modules-load.service
After=systemd-udevd-kernel.socket systemd-udevd-control.socket systemd-modules-load.service initrd-switch-root.service
ConditionPathIsReadWrite=/sys

[Service]
Type=oneshot
RemainAfterExit=no
ExecStart=-udevadm trigger --type=subsystems --action=add
ExecStart=-udevadm trigger --type=devices --action=add
ExecStart=-lvm vgchange -ay vghyper
ExecStart=-udevadm settle

[Install]
WantedBy=local-fs-pre.target

The service waits for all kernel modules to finish loading, re-triggers udev events, and then, crucially, attempts the volume activation again.
I hope this can solve similar LVM volume activation problem(s) for other users as well. smile
Just remember to change the volume group name in the unit.

Edit: I updated the systemd service code. The system has since booted successfully every time.
In addition to the service, I had to disable auto-activation of all LVM volumes other than the rootfs in /etc/lvm/lvm.conf.
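For anyone trying to reproduce this: the lvm.conf knob that controls auto-activation is auto_activation_volume_list. A minimal sketch of the activation section (the volume name here is from my setup; adjust it to yours):

```
activation {
    # Only the root LV is auto-activated during early boot; everything
    # else is brought up later by the retrigger service above.
    auto_activation_volume_list = [ "vghyper/HyperiorRoot" ]
}
```

Note that once auto_activation_volume_list is set, only the listed LVs are auto-activated at all, so the retrigger service (or an explicit vgchange -ay) is what brings up the rest.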

Last edited by JATothrim (2022-05-08 20:25:29)



