
#1 2018-10-12 01:20:44

jwhendy
Member
Registered: 2010-04-01
Posts: 621

Repairing borked rEFInd as result of corporate Win 10 upgrade

So, I probably should have known better, as it briefly dawned on me to pull my linux drive before proceeding with a Win 10 1703 upgrade pushed down from IT. I didn't, and now I can't boot into arch. Brief background on setup:

- nvme0n1: windows
- nvme1n1p1: /boot/efi, fat32: refind
- nvme1n1p2: /boot, ext4: vmlinuz* and initrd*
- nvme1n1p3: btrfs on luks/dm-crypt (contains an arch, xenial, and bionic install, each on their own subvols)

Upon first boot, I got a blank rEFInd screen with "Searching for boot loaders; please wait" overlaid on top, which was the first sign of trouble. After a bunch of troubleshooting attempts, I've been unsuccessful. I wrote this post as a help-the-world/note-to-self endeavor, and repeated my bootloader steps without success. This included:

- reformat both nvme1n1p1 and nvme1n1p2 as fat32 and ext4, respectively
- copy back vmlinuz* and initrd*
- re-install refind with:

### this is chrooted into my still-present arch install, and keys are still in /etc/refind.d/keys
# refind-install --shim /usr/share/shim-signed/shimx64.efi --localkeys

You can see the output of the install command and blank rEFInd screen that I get HERE.

I'm kind of at a loss on how to move forward. I had a glimpse of success thinking I'd switch to systemd-boot with PreLoader.efi, but it couldn't find /vmlinuz-linux... I now realize this is due to systemd-boot's limitation of expecting the esp to be at /boot, and mine is a separate partition at /boot/efi. I can't recall why I did that, but there are loads of examples doing it, so who knows what I read ~6mos ago. This is my first UEFI system with Secure Boot, which was quite the learning curve.

So, one possible option is giving up on rEFInd and going with systemd-boot. Maybe a silly question, as I'm pretty sure it's possible... can I combine nvme1n1p1 and nvme1n1p2 into one partition that starts at p1's current start sector and ends at p2's current end sector, and absolutely, positively leave nvme1n1p3 (my encrypted precious data) alone? Yes, I will back up... but still, this will easily morph from an hour of troubleshooting into a full-on weekend of reinstalling and restoring if this is not possible.

So far, that's seeming like my best bet since at least systemd-boot got to the point of looking for my arch menu entry's vmlinuz-linux. I'd merge these partitions and just go with that.
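For reference, the merge I have in mind could be planned as a dry run first. This is only a sketch, assuming sgdisk (from gptfdisk) is the tool used; merge_plan is a hypothetical helper and the sector numbers are placeholders, not my real layout. It only prints commands and never touches the disk:

```shell
#!/bin/sh
# Dry-run sketch: print (do not execute) the commands that would replace
# partitions 1 and 2 with a single ESP spanning the same sectors.
# merge_plan is a hypothetical helper; the sector values in the demo call
# are placeholders. Read the real ones from `sgdisk -p /dev/nvme1n1`.
merge_plan() {
    disk=$1; p1_start=$2; p1_end=$3; p2_start=$4; p2_end=$5
    # Refuse to plan anything if partition 2 does not start after partition 1 ends.
    if [ "$p2_start" -le "$p1_end" ]; then
        echo "partitions overlap or are out of order" >&2
        return 1
    fi
    # Only entries 1 and 2 are deleted; entry 3 is never named in any command.
    echo "sgdisk --delete=1 --delete=2 $disk"
    echo "sgdisk --new=1:${p1_start}:${p2_end} --typecode=1:ef00 $disk"
    echo "mkfs.fat -F 32 ${disk}p1"
}

# Demo with placeholder sectors.
merge_plan /dev/nvme1n1 2048 1050623 1050624 3147775
```

Printing the plan first makes it easy to confirm that partition 3 never appears in it before anything is actually run.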

Thanks for any ideas/suggestions. I can't find a blank rEFInd screen like I'm seeing anywhere at all to identify the same issue...


#2 2018-10-12 01:49:34

eschwartz
Fellow
Registered: 2014-08-08
Posts: 4,097

Re: Repairing borked rEFInd as result of corporate Win 10 upgrade

I don't know what refind might be doing, but I can confirm that you can drop your ext4 partition and keep /boot directly on your btrfs subvolume, inside the encrypted luks device.

grub comes with a cryptomount command that lets it decrypt a luks device. And just like refind, it comes with support for ext4 and btrfs, etc. so you can have your decent filesystem with support for symlinks instead of the horrible vfat "standard".

...

Have you tried checking that nothing happened to the efi or ext4 partitions? Would it help to disable refind scanning and use explicit stanzas?
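By explicit stanzas I mean manual menuentry blocks in refind.conf, combined with restricting scanfor to manual. A rough sketch of generating one follows; refind_stanza is a hypothetical helper, and the volume label, file names, and kernel options are placeholders I made up, not values from your system:

```shell
#!/bin/sh
# Sketch: print a manual rEFInd stanza for pasting into refind.conf.
# refind_stanza is a hypothetical helper; every argument in the demo call
# below is a placeholder and must be replaced with real values.
refind_stanza() {
    title=$1; volume=$2; kernel=$3; initrd=$4; options=$5
    cat <<EOF
menuentry "$title" {
    volume  "$volume"
    loader  $kernel
    initrd  $initrd
    options "$options"
}
EOF
}

# To stop automatic scanning, refind.conf would also carry: scanfor manual
refind_stanza "Arch Linux" "<boot-volume-label>" /vmlinuz-linux /initramfs-linux.img \
    "cryptdevice=UUID=<luks-uuid>:cryptroot root=/dev/mapper/cryptroot rw"
```

If a manual entry boots while the scan still hangs, that would at least narrow the problem down to the scanning step.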



#3 2018-10-12 02:39:11

jwhendy
Member
Registered: 2010-04-01
Posts: 621

Re: Repairing borked rEFInd as result of corporate Win 10 upgrade

Writing from arch. What worked was just mounting my vfat /dev/nvme1n1p1 to /boot and going through the systemd-boot and secure boot shim instructions from scratch.
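Roughly, the systemd-boot side comes down to a loader entry on the ESP. A minimal sketch, assuming the ESP is mounted at /boot and the kernel and initramfs sit at its top level; write_entry is a hypothetical helper, and the file names and options are placeholders rather than my exact config. The demo writes into a temporary directory, not the real ESP:

```shell
#!/bin/sh
# Sketch: write a systemd-boot loader entry under <esp>/loader/entries/.
# write_entry is a hypothetical helper; kernel/initramfs names and the
# options string are placeholders.
write_entry() {
    esp=$1; name=$2; title=$3; options=$4
    mkdir -p "$esp/loader/entries"
    cat > "$esp/loader/entries/$name.conf" <<EOF
title   $title
linux   /vmlinuz-linux
initrd  /initramfs-linux.img
options $options
EOF
}

# Demo against a scratch directory; on the real system the first argument
# would be the ESP mount point (/boot here), after running `bootctl install`.
demo=$(mktemp -d)
write_entry "$demo" arch "Arch Linux" \
    "cryptdevice=UUID=<luks-uuid>:cryptroot root=/dev/mapper/cryptroot rootflags=subvol=<arch-subvol> rw"
cat "$demo/loader/entries/arch.conf"
```

The linux and initrd paths are relative to the ESP root, which is why systemd-boot could not find /vmlinuz-linux while the kernel lived on a separate ext4 partition.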

Regarding this:

eschwartz wrote:

Have you tried checking that nothing happened to the efi or ext4 partitions? Would it help to disable refind scanning and use explicit stanzas?

See the original post. I'm not sure whether anything happened to them: I ran fsck on both (not mentioned) and ultimately just blew them away, reformatting both (which is mentioned). So that should remove any corruption concerns.

Now I'm getting an odd "security violation" when trying to boot Ubuntu Xenial (I actually triple boot arch, xenial, and bionic from separate subvols), despite loading the same vmlinuz and initrd that worked before all of this went awry. The arch kernel was just copied back to /boot, and it works, so I don't really suspect any corrupt stuff with ubuntu's same files. I get the mok manager and tried enrolling the hash, as well as re-enrolling the .cer/.crt that these are signed with and have not had any luck yet.
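One check I can still run is whether the kernel image carries a signature at all and whether the certificate is actually enrolled. A sketch, assuming sbverify (sbsigntools) and mokutil are installed; check_signing is a hypothetical helper and both paths are placeholders:

```shell
#!/bin/sh
# Sketch: inspect an EFI-bootable kernel's signatures and test whether a
# certificate is already enrolled as a MOK.
# check_signing is a hypothetical helper; image and cert paths are
# placeholders. mokutil --test-key expects a DER-encoded certificate.
check_signing() {
    image=$1; cert=$2
    for tool in sbverify mokutil; do
        command -v "$tool" >/dev/null 2>&1 || { echo "missing: $tool"; return 2; }
    done
    # List the signatures embedded in the image; stop if it cannot be read.
    sbverify --list "$image" || return 1
    # Report whether this certificate is already in the MOK list.
    mokutil --test-key "$cert"
}

# Usage (placeholders): check_signing /boot/<xenial-vmlinuz> /etc/refind.d/keys/<cert>.cer
```

This does not explain the violation by itself; it only shows whether the image is signed and whether the key I think is enrolled really is.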

So... about 50% there, but still not fully restored, and even if I was on systemd-boot, it will always bug me if I don't solve why the heck refind is doing what it's doing.
