Hello,
I have completely broken my system and hope that the clever minds here might have a solution.
It is an encrypted setup with Arch Linux. The Arch installation has the partitions /dev/sdc1 (boot) and /dev/sdc2 (encrypted system partition).
I ran a normal pacman -Syu this morning at 8:30 am. First the console froze, then the whole system. After about 15 minutes I powered the machine off.
After that, the boot menu only offered the entry “Reboot into firmware interface”. I had run into this before and tried the following fix:
1. cryptsetup luksOpen /dev/sdc2 cryptvol
2. mount /dev/mapper/cryptvol /mnt
3. mount /dev/sdc1 /mnt/boot
4. arch-chroot /mnt
5. pacman -S linux
The reinstall of the kernel went through without errors, the files are located under /boot (in the chroot):
- amd-ucode.img
- initramfs-linux-fallback.img
- initramfs-linux.img
- vmlinuz-linux
There are also the directories EFI, 'System Volume Information' and loader.
In the loader directory is the correct loader.conf (default arch.conf, timeout 3) and also an entries subdirectory with the arch.conf:
Title Arch Linux
linux /vmlinuz-linux
initrd /initramfs-linux.img
options cryptdevice=PARTUUID=<correct PARTUUID based on blkid>:cryptroot root=/dev/mapper/cryptroot rw
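A minimal sketch for cross-checking the PARTUUID in the loader entry against the partition itself. The helper name entry_partuuid is hypothetical, and the paths in the usage comment assume the mount layout described above (/dev/sdc2, boot partition mounted at /mnt/boot):

```shell
# Print the PARTUUID referenced by cryptdevice= in a systemd-boot entry file.
# entry_partuuid is an illustrative helper, not part of any Arch tool.
entry_partuuid() {
    sed -n 's/^options.*cryptdevice=PARTUUID=\([^: ]*\):.*/\1/p' "$1"
}

# Usage from the live ISO (assumed paths):
#   [ "$(entry_partuuid /mnt/boot/loader/entries/arch.conf)" = \
#     "$(blkid -s PARTUUID -o value /dev/sdc2)" ] && echo "PARTUUID matches"
```

If the two values differ, the initramfs would look for the wrong partition when asking for the passphrase.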
After reinstalling the kernel, I was at least able to boot into Arch again, but ALL services fail directly at startup. See screenshot:
Full boot log (after that, no further booting takes place):
https://ibb.co/ynQWxnk
After that I tried other things, including
- pacman -Syu (whereupon an error was thrown, which I was able to fix by deleting the db.lck)
=> result: Starting full system upgrade - there is nothing to do
- pacman -S $(pacman -qQ) --noconfirm (to reinstall all packages)
=> result: warning xyz is up to date -- reinstalling, some error for AUR packages (e.g. zoom) but this should not be critical for the boot
- mkinitcpio -P
=> result: Initcpio image generation successful (some warning about missing firmware for different modules, but this is normal I think, seen this in the past while upgrading my system)
Then I exited the chroot, unmounted everything, closed the LUKS volume (luksClose) and shut down.
Without success, I still have the same error with all services. Even the Emergency Shell cannot start.
I suspect that the errors displayed are only follow-up errors. The boot log scrolls past during boot; unfortunately I can't scroll up with SHIFT+PGUP, so I suspect I never get to see the root cause.
So I went back into the chroot via archiso and tried to read the last boot log with journalctl -b. However, journalctl -b throws the error “cannot execute binary file: Exec format error”. Reading the journal files with cat does not work either, because they are binary files. The last entry in the syslog is from around 8:30 am (when the system was still working).
My problem is not only that I have no idea how to solve this, but that I don't get a really meaningful error message about the main cause, which makes classic troubleshooting very difficult. I can find a few threads about the exec format error (which is also thrown at boot with the faulty service starts), but this mostly refers to the execution of individual commands.
I would be very grateful for any tips or ideas on how to proceed here.
EDIT:
Another observation is that after the boot and after the entry “triggering uevents” I briefly have a black screen and then the normal “enter passphrase” is promptly displayed on all three monitors. Normally I had no black screen and the prompt was only displayed on my main monitor. I don't know if this is relevant
Last edited by pacmanpenguin (2024-10-12 16:25:39)
Offline
1. You can still chroot into the system?
2. From the chroot, what's the output of
pacman -Qikk systemd
stat /usr/bin/journalctl
file /usr/bin/journalctl
3. for the last journal in the chroot use
journalctl -b -1 # plain -b is the current boot, i.e. the install iso
or w/o chroot
journalctl -D /mnt/var/log/journal -b -1
4. If you can get a journal from a flawed boot, please post it, eg.
journalctl -D /mnt/var/log/journal -b -1 | curl -F 'file=@-' 0x0.st
Offline
1. Yes, this still works.
2. From chroot:
pacman -Qikk systemd:
https://0x0.st/X6cx.txt
Name : systemd
Version : 256.7-1
Description : system and service manager
Architecture : x86_64
URL : https://www.github.com/systemd/systemd
Licenses : LGPL-2.1-or-later CC0-1.0 GPL-2.0-or-later MIT-0
Groups : None
Provides : nss-myhostname systemd-tools=256.7 udev=256.7
Depends On : systemd-libs=256.7 acl libacl.so=1-64 bash cryptsetup libcryptsetup.so=12-64 dbus dbus-units kbd kmod hwdata libcap libcap.so=2-64 libgcrypt libxcrypt libcrypt.so=2-64 libidn2 lz4 pam libelf libseccomp libseccomp.so=2-64 util-linux libblkid.so=1-64 libmount.so=1-64 xz pcre2 audit libaudit.so=1-64 openssl libcrypto.so=3-64 libssl.so=3-64
Optional Deps : libmicrohttpd: systemd-journal-gatewayd and systemd-journal-remote
quota-tools: kernel-level quota management
systemd-sysvcompat: symlink package to provide sysvinit binaries [installed]
systemd-ukify: combine kernel and initrd into a signed Unified Kernel Image
polkit: allow administration as unprivileged user [installed]
curl: systemd-journal-upload, machinectl pull-tar and pull-raw [installed]
gnutls: systemd-journal-gatewayd and systemd-journal-remote [installed]
qrencode: show QR codes [installed]
iptables: firewall features [installed]
libarchive: convert DDIs to tarballs [installed]
libbpf: support BPF programs [installed]
libpwquality: check password quality [installed]
libfido2: unlocking LUKS2 volumes with FIDO2 token [installed]
libp11-kit: support PKCS#11 [installed]
tpm2-tss: unlocking LUKS2 volumes with TPM2 [installed]
Required By : accountsservice android-udev base bolt colord flatpak forticlient-vpn gdm gnome-logs gnome-remote-desktop gnome-session gnome-settings-daemon gnome-system-monitor gnome-user-share gvfs iio-sensor-proxy libcolord libgudev liblogging libpulse mdadm media-player-info mkinitcpio pacman rtkit vte3 vte4 xdg-desktop-portal xdg-user-dirs
Optional For : None
Conflicts With : nss-myhostname systemd-tools udev
Replaces : nss-myhostname systemd-tools udev
Installed Size : 32.46 MiB
Packager : Christian Hesse <eworm@archlinux.org>
Build Date : Tue Oct 8 17:47:49 2024
Install Date : Thu Oct 10 08:44:43 2024
Install Reason : Installed as a dependency for another package
Install Script : Yes
Validated By : Signature
systemd: 1547 total files, 30 altered files
stat /usr/bin/journalctl:
https://0x0.st/X6c3.txt
File: /usr/bin/journalctl
Size: 84672 Blocks: 168 IO Block: 4096 regular file
Device: 254,0 Inode: 2901061 Links: 1
Access: (0755/-rwxr-xr-x) Uid: ( 0/ root) Gid: ( 0/ root)
Access: 2024-10-10 08:44:43.251719789 +0200
Modify: 2024-10-08 17:47:49.000000000 +0200
Change: 2024-10-10 08:44:43.055062280 +0200
Birth: 2024-10-10 08:44:43.055062280 +0200
file /usr/bin/journalctl:
/usr/bin/journalctl: data
3. journalctl -b -1
bash: /usr/bin/journalctl: cannot execute binary file: Exec format error
4. I did a "fresh" boot today, up to the point where no more service starts are attempted. I can read a log from outside the chroot, but the last journal entry is from Oct 09, when the system was still working (I made sure to jump to the end of the log).
Last edited by pacmanpenguin (2024-10-11 07:10:06)
Offline
The systemd package is damaged and journalctl isn't an ELF binary.
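For reference, an ELF binary starts with the four magic bytes 7f 45 4c 46 (0x7f followed by "ELF"), which is why file reporting just "data" is a bad sign. A small sketch of the same check without relying on file; the function name is_elf is made up for illustration:

```shell
# Return success if the file starts with the ELF magic bytes 7f 45 4c 46.
# is_elf is an illustrative helper name.
is_elf() {
    magic=$(head -c 4 -- "$1" 2>/dev/null | od -An -tx1 | tr -d ' \n')
    [ "$magic" = "7f454c46" ]
}

# Usage from the live ISO, without chrooting (assumed mount point /mnt):
#   is_elf /mnt/usr/bin/journalctl || echo "journalctl is not an ELF binary"
```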
LC_ALL=C pacman --root /mnt -Qkk 2> /tmp/howbadisit
cat /tmp/howbadisit | curl -F 'file=@-' 0x0.st
And check https://wiki.archlinux.org/title/SMART to see whether the drive is falling apart ("smartctl -a", the generic "healthy" assertion is meaningless)
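A rough sketch of looking past the generic "PASSED" line: filter the attribute table for a handful of counters that should normally stay at zero. Which attribute IDs to watch (5, 187, 196, 197, 198, 199) is an assumption made for this example, and smart_nonzero is a hypothetical helper name; the column layout is the one smartctl prints for ATA drives, with the raw value in the tenth column.

```shell
# Read a smartctl attribute table on stdin and print NAME=RAW for selected
# error counters whose raw value is not zero. The ID list is an assumption.
smart_nonzero() {
    awk '$1 ~ /^(5|187|196|197|198|199)$/ && $10 + 0 != 0 { print $2 "=" $10 }'
}

# Usage:
#   smartctl -A /dev/sdc | smart_nonzero
```

No output means none of the selected counters has moved; it does not prove the drive is healthy.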
Edit: btw, you don't have to copy the internet around, my 0x0.st works just as well as yours
Last edited by seth (2024-10-11 08:00:21)
Offline
LC_ALL=C pacman --root /mnt -Qkk (I tried it inside and outside of the chroot, same result)
error: failed to initialize alpm library:
(root: oot, dbpath: oot/var/lib/pacman/)
could not find or read directory
I ran a smartctl -a /dev/sdc
smartctl 7.4 2023-08-01 r5530 [x86_64-linux-6.8.2-arch2-1] (local build)
Copyright (C) 2002-23, Bruce Allen, Christian Franke, www.smartmontools.org
=== START OF INFORMATION SECTION ===
Model Family: Crucial/Micron Client SSDs
Device Model: CT500MX500SSD1
Serial Number: 1949E22D774B
LU WWN Device Id: 5 00a075 1e22d774b
Firmware Version: M3CR023
User Capacity: 500,107,862,016 bytes [500 GB]
Sector Size: 512 bytes logical/physical
Rotation Rate: Solid State Device
Form Factor: 2.5 inches
TRIM Command: Available
Device is: In smartctl database 7.3/5528
ATA Version is: ACS-3 T13/2161-D revision 5
SATA Version is: SATA 3.3, 6.0 Gb/s (current: 6.0 Gb/s)
Local Time is: Fri Oct 11 08:08:21 2024 UTC
SMART support is: Available - device has SMART capability.
SMART support is: Enabled
=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED
General SMART Values:
Offline data collection status: (0x80) Offline data collection activity
was never started.
Auto Offline Data Collection: Enabled.
Self-test execution status: ( 0) The previous self-test routine completed
without error or no self-test has ever
been run.
Total time to complete Offline
data collection: ( 0) seconds.
Offline data collection
capabilities: (0x7b) SMART execute Offline immediate.
Auto Offline data collection on/off support.
Suspend Offline collection upon new
command.
Offline surface scan supported.
Self-test supported.
Conveyance Self-test supported.
Selective Self-test supported.
SMART capabilities: (0x0003) Saves SMART data before entering
power-saving mode.
Supports SMART auto save timer.
Error logging capability: (0x01) Error logging supported.
General Purpose Logging supported.
Short self-test routine
recommended polling time: ( 2) minutes.
Extended self-test routine
recommended polling time: ( 30) minutes.
Conveyance self-test routine
recommended polling time: ( 2) minutes.
SCT capabilities: (0x0031) SCT Status supported.
SCT Feature Control supported.
SCT Data Table supported.
SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
1 Raw_Read_Error_Rate 0x002f 100 100 000 Pre-fail Always - 0
5 Reallocate_NAND_Blk_Cnt 0x0032 100 100 010 Old_age Always - 0
9 Power_On_Hours 0x0032 100 100 000 Old_age Always - 19207
12 Power_Cycle_Count 0x0032 100 100 000 Old_age Always - 3229
171 Program_Fail_Count 0x0032 100 100 000 Old_age Always - 0
172 Erase_Fail_Count 0x0032 100 100 000 Old_age Always - 0
173 Ave_Block-Erase_Count 0x0032 063 063 000 Old_age Always - 561
174 Unexpect_Power_Loss_Ct 0x0032 100 100 000 Old_age Always - 370
180 Unused_Reserve_NAND_Blk 0x0033 000 000 000 Pre-fail Always - 45
183 SATA_Interfac_Downshift 0x0032 100 100 000 Old_age Always - 0
184 Error_Correction_Count 0x0032 100 100 000 Old_age Always - 0
187 Reported_Uncorrect 0x0032 100 100 000 Old_age Always - 0
194 Temperature_Celsius 0x0022 065 040 000 Old_age Always - 35 (Min/Max 0/60)
196 Reallocated_Event_Count 0x0032 100 100 000 Old_age Always - 0
197 Current_Pending_ECC_Cnt 0x0032 100 100 000 Old_age Always - 0
198 Offline_Uncorrectable 0x0030 100 100 000 Old_age Offline - 0
199 UDMA_CRC_Error_Count 0x0032 100 100 000 Old_age Always - 0
202 Percent_Lifetime_Remain 0x0030 063 063 001 Old_age Offline - 37
206 Write_Error_Rate 0x000e 100 100 000 Old_age Always - 0
210 Success_RAIN_Recov_Cnt 0x0032 100 100 000 Old_age Always - 0
246 Total_LBAs_Written 0x0032 100 100 000 Old_age Always - 44977207223
247 Host_Program_Page_Count 0x0032 100 100 000 Old_age Always - 990565482
248 FTL_Program_Page_Count 0x0032 100 100 000 Old_age Always - 7714510665
SMART Error Log Version: 1
No Errors Logged
SMART Self-test log structure revision number 1
No self-tests have been logged. [To run self-tests, use: smartctl -t]
SMART Selective self-test log data structure revision number 1
SPAN MIN_LBA MAX_LBA CURRENT_TEST_STATUS
1 0 0 Not_testing
2 0 0 Not_testing
3 0 0 Not_testing
4 0 0 Not_testing
5 0 0 Completed [00% left] (0-65535)
Selective self-test flags (0x0):
After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.
The above only provides legacy SMART information - try 'smartctl -x' for more
Thank you very much for your help!
Last edited by pacmanpenguin (2024-10-11 08:12:11)
Offline
1. not from the chroot (that's what "--root" does, we want to avoid relying on the compromised system)
2. Drive doesn't look too bad, so it's the filesystem (btrfs?) or your re-installation of all packages failed.
Offline
1. I tried both, but same result
2. Filesystem is ext4 for /dev/mapper/cryptvol based on df -T
Not sure how to check this, but based on pacman -Qk everything seems fine: 0 missing files for all packages. No errors were shown while reinstalling.
EDIT: Tried reinstall again, same result as in the initial post. xyz is up to date.
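A single -k only checks that every file owned by a package is present, while -kk also compares file properties against the package's mtree, so "0 missing files" can coexist with damaged contents. A minimal sketch that filters the -Qkk summary lines (the format shown above for systemd: "N total files, M altered files") for packages with at least one altered file; altered_packages is a hypothetical helper name:

```shell
# Read "pacman -Qkk" summary lines on stdin and print the names of packages
# that report one or more altered files. Expects C-locale output.
altered_packages() {
    awk -F': ' '/ total files?, [0-9]+ altered files?$/ {
        split($2, parts, ", ")
        if (parts[2] + 0 > 0) print $1
    }'
}

# Usage from the live ISO, outside the chroot:
#   LC_ALL=C pacman --root /mnt -Qkk 2>/dev/null | altered_packages
```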
Last edited by pacmanpenguin (2024-10-11 08:37:22)
Offline
Ah,
(root: oot, dbpath: oot/var/lib/pacman/)
Seems you dropped a dash and made it "-root /mnt" instead of "--root /mnt"
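How a single dash turns "/mnt" into "oot" can be illustrated with the shell's own getopts, on the assumption that pacman treats -r as the short form of --root taking an argument (which matches the error output quoted above). With only one dash, "-root" is read as option -r with the attached argument "oot". This is not pacman's parser, and parse_root is a hypothetical helper:

```shell
# Parse arguments with a short option -r that takes a value and print the
# value it received. Illustration only.
parse_root() {
    OPTIND=1
    while getopts "r:" opt "$@" 2>/dev/null; do
        case "$opt" in
            r) printf '%s\n' "$OPTARG"; return 0 ;;
        esac
    done
    return 1
}

# Usage:
#   parse_root -root /mnt    # prints the value taken by -r
#   parse_root -r /mnt
```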
Edit
Not sure how to check this, but based on pacman -Qk everything seems fine, 0 missing files for all packages.
"-Qkk" and the systemd package inspection clearly shows that the package is corrupted.
Edit #2:
sigh.
Don't chroot, mount all partitions (don't forget /mnt/boot if you're booting from a dedicated partition) and run
pacman --root /mnt --cachedir /mnt/var/cache/pacman/pkg -Sy --dbonly $(pacman --root /mnt -Qnq)
pacman --root /mnt --cachedir /mnt/var/cache/pacman/pkg -S $(pacman --root /mnt -Qnq)
Don't skip any dashes. This will first sanitize the package database for all packages, then re-install all packages.
Then arch-chroot into the system and run "pacman -Syu"
Last edited by seth (2024-10-11 08:43:30)
Offline
Ah,
(root: oot, dbpath: oot/var/lib/pacman/)
Seems you dropped a dash and made it "-root /mnt" instead of "--root /mnt"
Yes, you're right. Sorry, my fault.
See full log here: http://0x0.st/X6Ta.csv
There are some warnings shown, that are not included in the .csv file.
Offline
pacman --root /mnt --cachedir /mnt/var/cache/pacman/pkg -Sy --dbonly $(pacman --root /mnt -Qnq)
All packages seem to have been reinstalled; at the end an “unknown trust” error (invalid or corrupted package (PGP signature)) was thrown for a few packages.
However, there is now an error at the end:
failed to commit transaction (invalid or corrupted package)
Errors occurred, no packages were upgraded.
Should I rerun this command?
Last edited by pacmanpenguin (2024-10-11 08:58:03)
Offline
These all have broken mtrees
drawio-desktop electron28 firefox gdb gdb-common git glib2-devel glib2-docs gnome-bluetooth-3.0 go gperftools graphene gtk4 gtksourceview5 imagemagick jasper lib32-libnghttp3 libblockdev libblockdev-crypto libblockdev-fs libblockdev-loop libblockdev-mdraid libblockdev-nvme libblockdev-part libblockdev-swap libgphoto2 libgsf libimobiledevice-glue libmanette libnftnl libngtcp2 libnm libproxy libspelling libvncserver libwebp libwnck3 libyuv lvm2 mbedtls2 net-tools netpbm networkmanager noto-fonts-emoji npm nvidia ppp python-cffi python-cryptography python-pycodestyle python-pyproject-hooks python-validate-pyproject python-wheel qt6-base qt6-declarative remmina rpi-imager sdl2 signal-desktop systemd-sysvcompat thunderbird unbound vapoursynth vim vim-runtime virtualbox-host-modules-arch webkit2gtk-4.1 webkitgtk-6.0
And these have file deviations
adwaita-cursors adwaita-icon-theme amd-ucode audacity baobab cmake default-cursors djvulibre eog evince evolution-data-server file-roller forticlient-vpn gcr gdk-pixbuf2 geocode-glib-common glib2 gnome-calculator gnome-calendar gnome-characters gnome-clocks gnome-color-manager gnome-console gnome-control-center gnome-disk-utility gnome-font-viewer gnome-logs gnome-music gnome-online-accounts gnome-photos gnome-shell gnome-system-monitor gnome-text-editor gnome-tweaks google-chrome granite7 hicolor-icon-theme htop ibus java-runtime-common jdk-openjdk jdk11-openjdk kdenlive keepassxc lftp libpeas libreoffice-still libutempter malcontent mesa microsoft-edge-stable-bin monero-gui nautilus network-manager-applet nm-connection-editor nodejs nodejs-nopt nvidia-utils opensnitch orca planify postfix purpose qt5-tools rygel scribus shadow shutter simple-scan sublime-text systemd tecla tilix totem v4l-utils virtualbox vtop wasabi-wallet-bin wireshark-qt yelp
archlinux-keyring seems intact, though - you'll have to provide the actual error.
You can also try to pre-update it
pacman --root /mnt --cachedir /mnt/var/cache/pacman/pkg -Sy archlinux-keyring
nb. that the --dbonly pass doesn't update *anything*, it just fixes the broken and missing mtree problems.
The second pass will actually re-install the packages and fix the broken files.
Offline
I tried the commands from the previous comment (but with --noconfirm and redirected stdout and stderr to the log):
pacman --root /mnt --cachedir /mnt/var/cache/pacman/pkg -Sy --dbonly $(pacman --root /mnt -Qnq)
https://0x0.st/X6T4.txt
pacman --root /mnt --cachedir /mnt/var/cache/pacman/pkg -S $(pacman --root /mnt -Qnq)
https://0x0.st/X6Tt.txt
then chroot into /mnt and pacman -Syu
https://0x0.st/X6Tv.txt
I'm not sure if the first commands worked as expected. For the "exists in filesystem" error from pacman -Syu inside the chroot there is a force/overwrite flag; should I rerun it with that flag?
Edit: If I run pacman --root /mnt --cachedir /mnt/var/cache/pacman/pkg -Sy archlinux-keyring beforehand, I get the same error.
keyring_command.log: https://0x0.st/X6T6.txt
Last edited by pacmanpenguin (2024-10-11 09:50:10)
Offline
usr/bin/pacman-key: line 196: /dev/fd/63: No such file or directory
ldconfig: File /usr/lib32/libnghttp3.so.9.2.4 is empty, not checked.
…
Do not chroot!
In doubt you'll have to reset the keyring, https://wiki.archlinux.org/title/Pacman … l_the_keys
However that should™ not be necessary if you don't chroot into the installation and use the pacman of the install iso.
Speaking of which: if that's an older install iso, the keyring might be too dated.
You can try to run "pacman -Sy archlinux-keyring" on the install iso (no chroot, no --root, we're trying to update the keys in the install iso. This will not survive a reboot!) or fetch the latest install iso.
As long as pacman quits with "Errors occurred, no packages were upgraded." you've not effectively changed/fixed the system.
Offline
In Germany we have a saying “Kaum macht man es richtig, schon funktioniert es” which means “As soon as you do it right, it works”.
I am happy to inform you that the system is booting cleanly. Seth, you can't imagine how much you've helped me. I had already accepted my fate of having to completely reinstall everything. My whole weekend is saved! And more importantly, I learned a lot. If you have a buymeacoffee link or something, feel free to send it over to me.
For all those who might find the thread at some point: In the end it was absolutely sufficient to run the following commands (outside chroot):
pacman -Sy archlinux-keyring
pacman --root /mnt --cachedir /mnt/var/cache/pacman/pkg -Sy --dbonly $(pacman --root /mnt -Qnq)
pacman --root /mnt --cachedir /mnt/var/cache/pacman/pkg -S $(pacman --root /mnt -Qnq)
After that I ran pacman -Syu inside the chroot, but there was nothing to update, so I think that step is unnecessary.
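The three commands can be kept together as a small plan printer, sketched below. The function name repair_plan is made up, /mnt as the default root is an assumption matching the mounts used in this thread, and the function only prints the commands so they can be reviewed before running them by hand from the live ISO (outside the chroot):

```shell
# Print the repair sequence for a system mounted at the given root.
# Nothing is executed; repair_plan is an illustrative helper.
repair_plan() {
    root="${1:-/mnt}"
    cache="${root}/var/cache/pacman/pkg"
    cat <<EOF
pacman -Sy archlinux-keyring
pacman --root ${root} --cachedir ${cache} -Sy --dbonly \$(pacman --root ${root} -Qnq)
pacman --root ${root} --cachedir ${cache} -S \$(pacman --root ${root} -Qnq)
EOF
}

# Usage:
#   repair_plan /mnt
```

The first line refreshes the keyring of the live ISO itself, the second rewrites only the package database entries, and the third actually reinstalls the files.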
Some side questions:
1. I created a fresh boot stick with the latest archiso today (with balenaEtcher), and when I tried to luksOpen my /dev/sdc2, I got "No key available with this passphrase". Of course I made sure to load the correct keyboard layout (de-latin1) and even typed the password in the shell to verify it. When I booted from the old boot stick, everything worked. Kinda crazy, any explanation for that? I think I had this once in the past and it was because of a missing initramfs or something.
2. After running the two commands and waiting until everything had finished, umount /mnt was not possible. Based on the PID from fuser -m, it's because of gpg-agent. Why is this still running after everything has finished?
THANK YOU!
Offline
gpg-agent might be related to archlinux-keyring-wkd-sync ?
But even then I'm not really sure why it would access /mnt - you can inspect "ls -l /proc/$(pidof gpg-agent)/fd" to see what files it's accessing.
Is the luksOpen problem still reproducible?
Note that the device node order is not deterministic (e.g. sdb and sdc might swap places at any time), so you might simply have tried to unlock the wrong partition.
Please always remember to mark resolved threads by editing your initial post's subject - so others will know that there's no task left, but maybe a solution to find.
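One way to avoid guessing the device node is to ask blkid which block devices actually carry a LUKS header. A minimal sketch that filters blkid's default output (lines of the form /dev/sdX2: UUID="..." TYPE="crypto_LUKS" PARTUUID="..."); luks_devices is a hypothetical helper name:

```shell
# Read blkid output on stdin and print the device nodes whose TYPE is
# crypto_LUKS.
luks_devices() {
    awk -F': ' '/ TYPE="crypto_LUKS"/ { print $1 }'
}

# Usage (as root on the live ISO):
#   blkid | luks_devices
#   cryptsetup isLuks /dev/sdX2 && echo "LUKS header found"
```

Opening the volume through its stable link under /dev/disk/by-partuuid/ instead of /dev/sdX2 avoids the problem altogether.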
Thanks.
Offline