#1 2024-08-26 09:20:51

why_do_i_need_a_username
Member
Registered: 2024-08-26
Posts: 16

[SOLVED] Arch setup on LVM-on-LUKS won't unlock any other LUKS volumes

So I recently ran into an issue where I couldn't open LUKS(2) partitions as intended on my Arch install, which is itself installed on an LVM-on-LUKS setup (auto-unlocked on startup using TPM2; the initrd is systemd-based). I first experienced this when trying to create a disk setup similar to the aforementioned host system on an external SSD, which I wanted to use as the boot drive for a Raspberry Pi (yes, this is indeed possible if you try hard enough), where

cryptsetup -v luksOpen /dev/sdXX pi5_vg_sdXX_pv

hung indefinitely and never actually completed. I could kill the cryptsetup process manually and still get the expected device for the unlocked partition in /dev/mapper. However, trying to do anything with it failed miserably and gave me the same "hanging indefinitely" issue in most of what I tried, especially when trying to create an LVM VG with a single PV on that device. I recall (I think) being able to, e.g., do

mkfs.fat -F 32 /dev/mapper/the-device-in-question

...but not much more than that. I thought that this was an issue with the USB-to-SATA adapter I was using (I've had similar issues with it in the past, and back then it *was* the culprit), so I switched to another one (this one was USB 2.0 as opposed to the previous one being 3.0/3.1/whatever), and... the same thing happened there as well. So I booted into a live CD, in my case Ubuntu 22.04 LTS, and it opened the volume just fine. I thought that maybe it was an Arch problem, so I booted into an Arch install ISO to verify that, and, again, it worked as intended. So I decided to swap out the drive itself (I *had* basically been using it as a temporary storage/data transfer drive for the 10-ish months I've owned it, so I thought that maybe I had just run out of writes and killed it) for a less used one, shrank the single partition on it, and created a LUKS partition in the newly freed space. The same thing happened: my locally installed Arch setup failed, but everything else didn't.

Afterwards, I tried creating a LUKS partition on a loop device, which failed as well:

[root@t480 ~]# dd if=/dev/zero of=/testfile bs=1M count=1024 status=progress
1024+0 records in
1024+0 records out
1073741824 bytes (1.1 GB, 1.0 GiB) copied, 0.327222 s, 3.3 GB/s
[root@t480 ~]# losetup /dev/loop0 /testfile
[root@t480 ~]# cryptsetup -v luksFormat --type luks2 --cipher aes-xts-plain64 --hash sha512 --key-size 512 /dev/loop0

WARNING!
========
This will overwrite data on /dev/loop0 irrevocably.

Are you sure? (Type 'yes' in capital letters): YES
Enter passphrase for /testfile: 
Verify passphrase: 
Key slot 0 created.
Command successful.
[root@t480 ~]# cryptsetup -v luksOpen /dev/loop0 loop0
No usable token is available.
Enter passphrase for /testfile: 
# it just hangs here indefinitely
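The loop-device test above can be repeated with a time limit, so that a hang shows up as a non-zero exit status instead of a stuck terminal. This is only a minimal sketch: `run_limited` and `luks_loop_test` are names made up for this example, the key file stands in for the interactive passphrase, and it assumes the hung cryptsetup can still be killed (which it could be on this system).

```shell
#!/bin/sh
# Sketch only: needs root for the real run, and OVERWRITES the backing file.

# Run a command with a time limit. GNU coreutils `timeout` exits with
# status 124 when the limit is hit.
run_limited() {
    limit=$1
    shift
    timeout "$limit" "$@"
}

# Hypothetical helper: create a LUKS2 volume on a scratch file and open it.
# $1 = backing file, $2 = mapper name, $3 = key file
luks_loop_test() {
    img=$1 name=$2 keyfile=$3
    dd if=/dev/zero of="$img" bs=1M count=64 status=none || return 1
    dev=$(losetup --find --show "$img") || return 1
    cryptsetup --batch-mode luksFormat --type luks2 \
        --cipher aes-xts-plain64 --hash sha512 --key-size 512 \
        --key-file "$keyfile" "$dev" || return 1
    run_limited 60 cryptsetup -v open --key-file "$keyfile" "$dev" "$name"
    status=$?
    if [ "$status" -eq 0 ]; then
        # Clean up only on success; after a hang the mapping may be left
        # behind in /dev/mapper and the loop device may still be busy.
        cryptsetup close "$name"
        losetup --detach "$dev"
    else
        echo "cryptsetup open failed or timed out (status $status)" >&2
    fi
    return "$status"
}
```

Something like `luks_loop_test /testfile loop0 /root/test.key` (as root, with a throwaway key file) would then either open and close the volume or report the failure after 60 seconds.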

...however, from within an Arch live ISO (archlinux-2024.08.01-x86_64.iso) I could unlock and mount the drive for my Arch install, chroot into it, and open the test volume from there (t480_vg is the VG on which I have Arch installed, or at least my main install of it):

root@archiso ~ # cryptsetup -v luksOpen /dev/nvme0n1p5 t480_vg_nvme0n1p5_pv
Failed to unseal secret using TPM2: Operation not permitted
No usable token is available
Enter passphrase for /dev/nvme0n1p5: 
Key slot 0 unlocked.
Command successful.
cryptsetup -v luksOpen /dev/nvme0n1p5 t480_vg_nvme0n1p5_pv  5.36s user 0.16s system 53% cpu 10.299 total
root@archiso ~ # vgchange -ay t480_vg
  1 logical volume(s) in volume group "t480_vg" now active
root@archiso ~ # mount -o subvol=/@ /dev/t480_vg/system /mnt
root@archiso ~ # mount -o subvol=/@home /dev/t480_vg/system /mnt/home
root@archiso ~ # mount -o subvol=/@log /dev/t480_vg/system /mnt/var/log
root@archiso ~ # mount -o subvol=/@spool /dev/t480_vg/system /mnt/var/spool
root@archiso ~ # mount /dev/nvme0n1p1 /mnt/boot
root@archiso ~ # arch-chroot /mnt
[root@archiso /]# losetup /dev/loop1 /testfile
[root@archiso /]# cryptsetup -v luksOpen /dev/loop1 loop1
No usable token is available.
Enter passphrase for /testfile: 
Key slot 0 unlocked.
Command successful.
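For anyone who wants to repeat the live-ISO steps above, they can be wrapped into one function. This is a sketch under assumptions, not a tested tool: `run` and `mount_install` are hypothetical names, the logical volume is assumed to be called `system` as in the transcript, and `cryptsetup open` is used as the current spelling of `luksOpen`. With `DRY_RUN=1` the commands are only printed.

```shell
#!/bin/sh
# Print the command when DRY_RUN=1, otherwise execute it.
run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# $1 = LUKS partition, $2 = mapper name, $3 = VG, $4 = ESP, $5 = mount point
mount_install() {
    luks=$1 name=$2 vg=$3 esp=$4 target=$5
    run cryptsetup -v open "$luks" "$name" || return 1
    run vgchange -ay "$vg" || return 1
    run mount -o subvol=/@ "/dev/$vg/system" "$target" || return 1
    # Remaining btrfs subvolumes as "subvolume:path below the root" pairs.
    for pair in @home:home @log:var/log @spool:var/spool; do
        run mount -o "subvol=/${pair%%:*}" "/dev/$vg/system" \
            "$target/${pair#*:}" || return 1
    done
    run mount "$esp" "$target/boot"
}
```

Calling `DRY_RUN=1 mount_install /dev/nvme0n1p5 t480_vg_nvme0n1p5_pv t480_vg /dev/nvme0n1p1 /mnt` prints the same sequence of commands as in the transcript, after which `arch-chroot /mnt` follows as before.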

I *would* have attempted to investigate this further until I found out what it was, but after having spent over half a year seemingly just running into the most ridiculous Linux failure modes imaginable *cough* (K)ubuntu 24.04 breaking its network stack after detaching an interface a bit too many times *cough*, I simply am not willing to do that anymore for the time being. I have, however, set up yet another Arch on LVM on LUKS w/ systemd initrd install on the remaining, previously unallocated space on the same drive as my main install (originally left free so I could eventually put encrypted swap there). The only differences between this and the main install are that I'm not doing TPM2 autounlock here (but maybe that's what's causing this?) and that I used "linux-lts" as the kernel package on the main one but "linux" on this one. On this test install it also worked as intended, even inside arch-chroot (with the thing being chrooted into being the install on t480_vg/system), at least with the loop device thing (didn't test USB).

For reference, this is what the kernel parameters (both installs use systemd-boot if that's relevant) for the main (t480_vg) install look like:

rd.luks.name=e80f49e2-f8ea-46a2-a1cb-556db2df5e1b=t480_vg_nvme0n1p5_pv rd.luks.options=e80f49e2-f8ea-46a2-a1cb-556db2df5e1b=tpm2=device=auto root=/dev/t480_vg/system rootfstype=btrfs rootflags=subvol=/@ i915.enable_psr=0 i915.enable_dc=0 i915.enable_fbc=0 i915.enable_guc=3 rw

...here they are for the install I made for testing this:

rd.luks.name=3ccdcaaa-5b9e-457f-a6c1-825240ea58f4=test_vg_nvme0n1p6_pv root=/dev/test_vg/system rootfstype=btrfs rootflags=subvol=/@ rw
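To see at a glance which parameters only the main install sets, the two command lines can be compared by parameter name. A minimal sketch, with `param_names` and `only_in_first` being hypothetical helper names; values (such as the UUIDs) are deliberately ignored.

```shell
#!/bin/sh
# Print the sorted, unique parameter names of a kernel command line,
# i.e. everything before the first "=" of each space-separated word.
param_names() {
    printf '%s\n' "$1" | tr -s ' ' '\n' \
        | sed -e '/^$/d' -e 's/=.*//' | LC_ALL=C sort -u
}

# Print the parameter names that occur in $1 but not in $2.
only_in_first() {
    a=$(mktemp) || return 1
    b=$(mktemp) || { rm -f "$a"; return 1; }
    param_names "$1" >"$a"
    param_names "$2" >"$b"
    LC_ALL=C comm -23 "$a" "$b"
    rm -f "$a" "$b"
}
```

Fed the two lines above (main install first), `only_in_first` prints the four i915.enable_* names and rd.luks.options, so apart from the kernel package, the TPM2 option and the i915 settings are what set the main install apart at boot.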

...inxi -Fxz:

System:
  Kernel: 6.6.47-1-lts arch: x86_64 bits: 64 compiler: gcc v: 14.2.1
  Desktop: wayfire v: 0.8.1-unknown Distro: Arch Linux
Machine:
  Type: Laptop System: LENOVO product: 20L6S5DR0K v: ThinkPad T480
    serial: <superuser required>
  Mobo: LENOVO model: 20L6S5DR0K v: SDK0J40697 WIN
    serial: <superuser required> UEFI: LENOVO v: N24ET76W (1.51 )
    date: 02/27/2024
CPU:
  Info: quad core model: Intel Core i5-8350U bits: 64 type: MT MCP
    arch: Coffee Lake rev: A cache: L1: 256 KiB L2: 1024 KiB L3: 6 MiB
  Speed (MHz): avg: 447 high: 776 min/max: 400/3600 cores: 1: 400 2: 400
    3: 400 4: 400 5: 400 6: 776 7: 400 8: 400 bogomips: 30409
  Flags: avx avx2 ht lm nx pae sse sse2 sse3 sse4_1 sse4_2 ssse3 vmx
Graphics:
  Device-1: Intel UHD Graphics 620 vendor: Lenovo driver: i915 v: kernel
    arch: Gen-9.5 bus-ID: 00:02.0
  Device-2: Bison SunplusIT Integrated Camera driver: uvcvideo type: USB
    bus-ID: 1-8:5
  Display: wayland server: X.org v: 1.21.1.13 with: Xwayland v: 24.1.2
    compositor: wayfire v: 0.8.1-unknown driver: X: loaded: modesetting
    gpu: i915 resolution: 1680x1050~60Hz
  API: EGL v: 1.5 drivers: iris,swrast platforms:
    active: gbm,wayland,x11,surfaceless,device inactive: N/A
  API: OpenGL v: 4.6 compat-v: 4.5 vendor: intel mesa v: 24.1.6-arch1.1
    glx-v: 1.4 direct-render: yes renderer: Mesa Intel UHD Graphics 620 (KBL
    GT2)
Audio:
  Device-1: Intel Sunrise Point-LP HD Audio vendor: Lenovo ThinkPad T480
    driver: snd_hda_intel v: kernel bus-ID: 00:1f.3
  API: ALSA v: k6.6.47-1-lts status: kernel-api
  Server-1: sndiod v: N/A status: off
  Server-2: PipeWire v: 1.2.2 status: active
Network:
  Device-1: Intel Ethernet I219-LM vendor: Lenovo driver: e1000e v: kernel
    port: N/A bus-ID: 00:1f.6
  IF: enp0s31f6 state: up speed: 100 Mbps duplex: full mac: <filter>
  Device-2: Intel Wireless 8265 / 8275 driver: iwlwifi v: kernel
    bus-ID: 03:00.0
  IF: wlp3s0 state: up mac: <filter>
Drives:
  Local Storage: total: 524.75 GiB used: 22 GiB (4.2%)
  ID-1: /dev/nvme0n1 vendor: Samsung model: SSD 980 500GB size: 465.76 GiB
    temp: 32.9 C
  ID-2: /dev/sdb model: N/A size: 58.98 GiB type: USB
Partition:
  ID-1: / size: 80 GiB used: 21.75 GiB (27.2%) fs: btrfs dev: /dev/dm-1
    mapped: t480_vg-system
  ID-2: /boot size: 2 GiB used: 109.1 MiB (5.3%) fs: vfat
    dev: /dev/nvme0n1p1
  ID-3: /home size: 80 GiB used: 21.75 GiB (27.2%) fs: btrfs dev: /dev/dm-1
    mapped: t480_vg-system
  ID-4: /var/log size: 80 GiB used: 21.75 GiB (27.2%) fs: btrfs
    dev: /dev/dm-1 mapped: t480_vg-system
Swap:
  Alert: No swap data was found.
Sensors:
  System Temperatures: cpu: 47.0 C pch: 43.0 C mobo: N/A
  Fan Speeds (rpm): fan-1: 0
Info:
  Memory: total: 16 GiB available: 15.5 GiB used: 1.76 GiB (11.3%)
  Processes: 247 Uptime: 8m Init: systemd
  Packages: 1178 Compilers: clang: 18.1.8 gcc: 14.2.1 Shell: Bash v: 5.2.32
    inxi: 3.3.35

Some possible theories as to what might be going on:
1. I didn't RTFM properly.
2. I have some weird power management settings set in tlp or something else that somehow break something somewhere somewhen.
3. Not enough RAM + no swap (unlikely, as right now it's only using ~1.5/16GB, and that drops to 800MB-1.1GB after closing firefox. Maybe there's some failsafe thing in cryptsetup that goes off when it doesn't see any swap, but again, I very much doubt that, because it worked on that clean Arch install I made, and the RAM usage there, while lower depending on how you measure it (~500MB with Xorg + openbox + picom + 2 xterm windows), wasn't *that* much lower either, so...)
4. I hosed it myself somehow (which happens a bit too frequently for my liking, and usually there's no explanation as to *what* I broke).
5. My ThinkPad T480 is broken in some godforsaken way that broke LUKS on this *specific* Arch setup. Very much doubt this one though.
6. Kernel 6.6 LTS doesn't work as intended when it comes to this type of thing, while current mainline Linux works just fine. I don't think so, as iirc LUKS worked just fine on 6.6 when I ran it at various points during the last half year.
7. Aliens, probably. Or cosmic rays, or reptilians, or the feds and/or Micro$oft trying to mess with people using Linux. Whatever. Very unlikely, but who knows at this point.
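None of the theories above says where the hung cryptsetup is actually waiting. One way to narrow that down would be to list processes in uninterruptible sleep (state D) while the hang is happening. A minimal sketch, not something that was run for this post; it assumes procps `ps`, and `filter_blocked` and `blocked_procs` are names made up for this example.

```shell
#!/bin/sh
# Keep only lines whose second column (process state) starts with "D",
# i.e. processes in uninterruptible sleep. Reads ps-style lines on stdin.
filter_blocked() {
    awk '$2 ~ /^D/ { print }'
}

# List PID, state, kernel wait channel and command name of blocked processes.
# Separate -o options are used so that all column headers are suppressed.
blocked_procs() {
    ps -e -o pid= -o stat= -o wchan= -o comm= | filter_blocked
}
```

Running `blocked_procs` in a second terminal while `cryptsetup` hangs would show whether it is stuck in the kernel at all; if nothing is listed, the wait is more likely happening in userspace.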

Has this happened to anybody here before? Or am I just unlucky somehow? Is this fixable, or should I just reinstall, set up my system as normal and then verify that LUKS works properly after every step? (I have had to do this with both *buntu and Fedora, except the issue there was something other than LUKS, usually either networking or video playback. I have never had to with Arch, and I don't intend to start.) Thanks in advance for any help with this.

Last edited by why_do_i_need_a_username (2024-08-31 07:34:39)
