
#1 2024-04-29 07:41:13

losipai
Member
Registered: 2021-07-29
Posts: 17

[SOLVED] Crash during update ("upgrading systemd"), boot entry gone

The system hung at "upgrading systemd" and I couldn't kill the terminal or change VT or do anything with the keyboard, although I think the cursor could still move.

This is the second time this has happened this month. This time I just turned off the laptop and went to bed, so perhaps there is some useful info I can extract before I start trying to fix it on my own, in case the problem isn't already well known?

Nvidia + Wayland + custom kernel (Asus ROG G14).

Recently the laptop has crashed several times during boot with a black screen and lit keyboard LEDs which are normally off. It might have mentioned something about Nvidia but I'm not sure. Each time, fsck would run after rebooting and Wayland would start normally again.

Last time the pacman crash happened I spent a weekend backing my stuff up over rsync because none of the pacman rescue techniques I could find would let me install anything. I know it sounds handwavy, but I tried a lot of things: the database was corrupted, the signing keys were broken, openssh didn't work, shared libraries were missing, there was no space left on the partition, and nothing would run. Also I have a newborn and my brain is mush.



Update: the boot entry disappeared because the linux-g14 kernel had been deleted due to outdated keys, which is not really related to the crash and the broken pacman state.

Last edited by losipai (2024-05-01 13:27:37)


#2 2024-05-01 07:21:05

losipai
Member
Registered: 2021-07-29
Posts: 17

Re: [SOLVED] Crash during update ("upgrading systemd"), boot entry gone

From installer:

arch-chroot /mnt
chroot: failed to run command /bin/bash: Input/output error
pacstrap -K /mnt bash
==> Creating install root at /mnt
==> Installing packages to /mnt
:: Synchronizing package databases...
error: failed to synchronize all databases (unable to lock database)
==> ERROR: Failed to install packages to new root

rm /var/lib/pacman/db.lck
pacstrap /mnt bash
# OK, installed it
arch-chroot /mnt
chroot: failed to run command /bin/bash: Input/output error
mount
/dev/nvme0n1p2 on /mnt type ext4 (rw,relatime)
/dev/nvme0n1p1 on /mnt/boot type vfat (rw, # rest omitted)
# after unmounting
fsck -f -y /dev/nvme0n1p2
fsck -f -y /dev/nvme0n1p1
nvme smart-log /dev/nvme0n1p2
# only including error info below:
critical_warning   : 0
media_errors   : 0
num_err_log_entries   : 1

# chroot still gives Input/output error
pacman -r /mnt -Qnq | pacman -r /mnt -Syu - --cachedir /mnt/var/cache/pacman/pkg --dbpath /mnt/var/lib/pacman --gpgdir /mnt/etc/pacman.d/gnupg
# dependency resolution fails
pacman -r /mnt -Rs <offending packages> - --cachedir /mnt/var/cache/pacman/pkg --dbpath /mnt/var/lib/pacman --gpgdir /mnt/etc/pacman.d/gnupg
(1/2) removing xxx
(2/2) removing yyy
call to execv failed (Exec format error)
error: command failed to execute correctly

# trying again...
pacman -r /mnt -Qnq | pacman -r /mnt -Syu - --cachedir /mnt/var/cache/pacman/pkg --dbpath /mnt/var/lib/pacman --gpgdir /mnt/etc/pacman.d/gnupg
<package>: xxx exists in filesystem # for every package
Errors occurred, no packages were upgraded.

pacman -r /mnt -Qnq | pacman -r /mnt -Syu - --cachedir /mnt/var/cache/pacman/pkg --dbpath /mnt/var/lib/pacman --gpgdir /mnt/etc/pacman.d/gnupg --overwrite="*"
# 789 packages reinstalled, but with several errors that scroll by too quickly for me to read

mount --bind /proc /mnt/proc
mount -o bind /dev /mnt/dev
pacman -r /mnt -Qnq | pacman -r /mnt -Syu - --cachedir /mnt/var/cache/pacman/pkg --dbpath /mnt/var/lib/pacman --gpgdir /mnt/etc/pacman.d/gnupg --overwrite="*" | tee -a pacman_log.txt
# arch-chroot /mnt now works!
# downgrade nvidia stuff to 535xx
# fix Asus G14 repo keys and reinstall kernel which had been deleted
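
The pacman calls above repeat the same path options every time. A minimal sketch of a helper that builds them once, assuming the broken root is mounted at /mnt as in this post. The function name pacman_root_args is made up for illustration; the option names are the ones already used above (--root is the long form of -r).

```shell
# Print the pacman options that point every path at the mounted root.
# Hypothetical helper, not part of pacman itself.
pacman_root_args() {
    printf '%s\n' \
        --root "${1:-/mnt}" \
        --cachedir "${1:-/mnt}/var/cache/pacman/pkg" \
        --dbpath "${1:-/mnt}/var/lib/pacman" \
        --gpgdir "${1:-/mnt}/etc/pacman.d/gnupg"
}

# Example (not run here): feed the native package list back into a reinstall.
# pacman $(pacman_root_args /mnt) -Qnq | pacman $(pacman_root_args /mnt) -S -
```

The unquoted $(...) relies on word splitting, so this only works as long as the mount point contains no spaces.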

Last edited by losipai (2024-05-01 13:13:42)


#3 2024-05-01 08:04:24

seth
Member
Registered: 2012-09-03
Posts: 52,276

Re: [SOLVED] Crash during update ("upgrading systemd"), boot entry gone

https://bbs.archlinux.org/viewtopic.php?id=293400
https://aur.archlinux.org/packages/nvidia-535xx-dkms
https://wiki.archlinux.org/title/Pacman … an_upgrade

https://bbs.archlinux.org/viewtopic.php … 6#p2168066

error: failed to synchronize all databases (unable to lock database)

https://bbs.archlinux.org/viewtopic.php … 2#p2168282

To just re-install all packages

pacman --root /mnt --cachedir /mnt/var/cache/pacman/pkg -S --dbonly $(pacman --root /mnt -Qnq)
pacman --root /mnt --cachedir /mnt/var/cache/pacman/pkg -S $(pacman --root /mnt -Qnq)

If there are persistent I/O errors, check dmesg to make sure the device isn't just genuinely broken.
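
A rough way to do that dmesg check is to filter the kernel log for messages that usually accompany failing storage. This is only a sketch: the function name storage_errors and the pattern list are assumptions, and the list is certainly not exhaustive.

```shell
# Keep only kernel log lines that commonly show up with storage trouble.
# The patterns are a guess at typical messages, not a complete list.
storage_errors() {
    grep -iE 'I/O error|EXT4-fs error|nvme[0-9a-z]*: .*(timeout|reset)|critical medium error'
}

# Example (not run here):
# dmesg | storage_errors
```

No output does not prove the disk is healthy; it only means none of these patterns appeared in the current log.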


Also I have a newborn and my brain is mush.

I'd tell you to put some beer into the milk, but apparently that's "problematic" and I "should be cancelled" and "get help" :P


#4 2024-05-01 13:23:21

losipai
Member
Registered: 2021-07-29
Posts: 17

Re: [SOLVED] Crash during update ("upgrading systemd"), boot entry gone

Thank you seth, I'm able to boot again after the above steps!

I downgraded to nvidia-535xx as mentioned in the links. Hyprland won't start with nVidia anymore so it's running on the iGPU - who knows what I broke in the process.

seth wrote:

I'd tell you to put some beer into the milk, but apparently that's "problematic" and I "should be cancelled" and "get help"

Sounds about as safe as using Nvidia drivers! :P

Last edited by losipai (2024-05-01 13:23:46)


#5 2024-05-01 13:35:58

seth
Member
Registered: 2012-09-03
Posts: 52,276

Re: [SOLVED] Crash during update ("upgrading systemd"), boot entry gone

Did the dkms build fail?

dkms status
lspci -k
cat /proc/cmdline
pacman -Qs kernel
nvidia-smi # does your GPU respond?
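
For the first question, the dkms status output can also be checked mechanically. A minimal sketch, assuming the "module/version, kernel, arch: installed" line format that dkms prints; the helper name dkms_installed_for is made up.

```shell
# Succeed if `dkms status` output on stdin lists MODULE as installed for KERNEL.
# Hypothetical helper; relies on the ", "-separated line format of dkms status.
dkms_installed_for() {
    awk -F', ' -v m="$1" -v k="$2" '
        index($1, m "/") == 1 && $2 == k && $3 ~ /: installed/ { found = 1 }
        END { exit !found }
    '
}

# Example (not run here):
# dkms status | dkms_installed_for nvidia "$(uname -r)" && echo "module built for running kernel"
```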


#6 2024-05-01 13:49:49

losipai
Member
Registered: 2021-07-29
Posts: 17

Re: [SOLVED] Crash during update ("upgrading systemd"), boot entry gone

dkms status

nvidia/535.171.04, 6.8.7-arch1-1.1-g14, x86_64: installed
rtl8814au/5.8.5.1.r180.gb5a6f96, 6.8.7-arch1-1.1-g14, x86_64: installed

lspci -k

00:00.0 Host bridge: Advanced Micro Devices, Inc. [AMD] Renoir/Cezanne Root Complex
	Subsystem: Advanced Micro Devices, Inc. [AMD] Renoir/Cezanne Root Complex
00:00.2 IOMMU: Advanced Micro Devices, Inc. [AMD] Renoir/Cezanne IOMMU
	Subsystem: Advanced Micro Devices, Inc. [AMD] Renoir/Cezanne IOMMU
00:01.0 Host bridge: Advanced Micro Devices, Inc. [AMD] Renoir PCIe Dummy Host Bridge
00:01.1 PCI bridge: Advanced Micro Devices, Inc. [AMD] Renoir PCIe GPP Bridge
	Subsystem: ASUSTeK Computer Inc. Device 1662
	Kernel driver in use: pcieport
00:02.0 Host bridge: Advanced Micro Devices, Inc. [AMD] Renoir PCIe Dummy Host Bridge
00:02.2 PCI bridge: Advanced Micro Devices, Inc. [AMD] Renoir/Cezanne PCIe GPP Bridge
	Subsystem: ASUSTeK Computer Inc. Device 1662
	Kernel driver in use: pcieport
00:02.4 PCI bridge: Advanced Micro Devices, Inc. [AMD] Renoir/Cezanne PCIe GPP Bridge
	Subsystem: ASUSTeK Computer Inc. Device 1662
	Kernel driver in use: pcieport
00:08.0 Host bridge: Advanced Micro Devices, Inc. [AMD] Renoir PCIe Dummy Host Bridge
00:08.1 PCI bridge: Advanced Micro Devices, Inc. [AMD] Renoir Internal PCIe GPP Bridge to Bus
	Subsystem: Advanced Micro Devices, Inc. [AMD] Renoir Internal PCIe GPP Bridge to Bus
	Kernel driver in use: pcieport
00:14.0 SMBus: Advanced Micro Devices, Inc. [AMD] FCH SMBus Controller (rev 51)
	Subsystem: Advanced Micro Devices, Inc. [AMD] FCH SMBus Controller
	Kernel driver in use: piix4_smbus
	Kernel modules: i2c_piix4, sp5100_tco
00:14.3 ISA bridge: Advanced Micro Devices, Inc. [AMD] FCH LPC Bridge (rev 51)
	Subsystem: Advanced Micro Devices, Inc. [AMD] FCH LPC Bridge
00:18.0 Host bridge: Advanced Micro Devices, Inc. [AMD] Cezanne Data Fabric; Function 0
00:18.1 Host bridge: Advanced Micro Devices, Inc. [AMD] Cezanne Data Fabric; Function 1
00:18.2 Host bridge: Advanced Micro Devices, Inc. [AMD] Cezanne Data Fabric; Function 2
00:18.3 Host bridge: Advanced Micro Devices, Inc. [AMD] Cezanne Data Fabric; Function 3
	Kernel driver in use: k10temp
	Kernel modules: k10temp
00:18.4 Host bridge: Advanced Micro Devices, Inc. [AMD] Cezanne Data Fabric; Function 4
00:18.5 Host bridge: Advanced Micro Devices, Inc. [AMD] Cezanne Data Fabric; Function 5
00:18.6 Host bridge: Advanced Micro Devices, Inc. [AMD] Cezanne Data Fabric; Function 6
00:18.7 Host bridge: Advanced Micro Devices, Inc. [AMD] Cezanne Data Fabric; Function 7
01:00.0 3D controller: NVIDIA Corporation GA107M [GeForce RTX 3050 Ti Mobile] (rev a1)
	Subsystem: ASUSTeK Computer Inc. Device 148c
	Kernel driver in use: nvidia
	Kernel modules: nvidia_drm, nvidia
06:00.0 Network controller: Intel Corporation Wi-Fi 6 AX200 (rev 1a)
	Subsystem: Intel Corporation Device 008c
	Kernel driver in use: iwlwifi
	Kernel modules: iwlwifi
07:00.0 Non-Volatile memory controller: Sandisk Corp IX SN530 NVMe SSD (DRAM-less) (rev 01)
	Subsystem: Sandisk Corp IX SN530 NVMe SSD (DRAM-less)
	Kernel driver in use: nvme
	Kernel modules: nvme
08:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Cezanne [Radeon Vega Series / Radeon Vega Mobile Series] (rev c4)
	Subsystem: ASUSTeK Computer Inc. Device 148c
	Kernel driver in use: amdgpu
	Kernel modules: amdgpu
08:00.1 Audio device: Advanced Micro Devices, Inc. [AMD/ATI] Renoir Radeon High Definition Audio Controller
	Subsystem: Advanced Micro Devices, Inc. [AMD/ATI] Renoir Radeon High Definition Audio Controller
	Kernel driver in use: snd_hda_intel
	Kernel modules: snd_hda_intel
08:00.2 Encryption controller: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 10h-1fh) Platform Security Processor
	Subsystem: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 10h-1fh) Platform Security Processor
	Kernel driver in use: ccp
	Kernel modules: ccp
08:00.3 USB controller: Advanced Micro Devices, Inc. [AMD] Renoir/Cezanne USB 3.1
	Subsystem: Advanced Micro Devices, Inc. [AMD] Renoir/Cezanne USB 3.1
	Kernel driver in use: xhci_hcd
	Kernel modules: xhci_pci
08:00.4 USB controller: Advanced Micro Devices, Inc. [AMD] Renoir/Cezanne USB 3.1
	Subsystem: Advanced Micro Devices, Inc. [AMD] Renoir/Cezanne USB 3.1
	Kernel driver in use: xhci_hcd
	Kernel modules: xhci_pci
08:00.5 Multimedia controller: Advanced Micro Devices, Inc. [AMD] ACP/ACP3X/ACP6x Audio Coprocessor (rev 01)
	Subsystem: ASUSTeK Computer Inc. Device 1662
	Kernel modules: snd_pci_acp3x, snd_rn_pci_acp3x, snd_pci_acp5x, snd_pci_acp6x, snd_acp_pci, snd_rpl_pci_acp6x, snd_pci_ps, snd_sof_amd_renoir, snd_sof_amd_rembrandt, snd_sof_amd_vangogh, snd_sof_amd_acp63
08:00.6 Audio device: Advanced Micro Devices, Inc. [AMD] Family 17h/19h HD Audio Controller
	Subsystem: ASUSTeK Computer Inc. Device 1662
	Kernel driver in use: snd_hda_intel
	Kernel modules: snd_hda_intel
08:00.7 Signal processing controller: Advanced Micro Devices, Inc. [AMD] Sensor Fusion Hub
	Subsystem: Advanced Micro Devices, Inc. [AMD] Sensor Fusion Hub
	Kernel driver in use: pcie_mp2_amd
	Kernel modules: amd_sfh

cat /proc/cmdline

ibt=off pm_debug_messages amd_pmc.dyndbg="+p" acpi.dyndbg="file drivers/acpi/x86/s2idle.c +p" initrd=\initramfs-linux-g14.img root=UUID=1baf899e-df18-41a1-b98d-c1cff60bea28 rw

pacman -Qs kernel

local/dkms 3.0.12-1
    Dynamic Kernel Modules System
local/iptables 1:1.8.10-1
    Linux kernel packet control tool (using legacy interface)
local/kmod 32-1
    Linux kernel module management tools and library
local/lib32-libdrm 2.4.120-1
    Userspace interface to kernel DRM services (32-bit)
local/libdrm 2.4.120-1
    Userspace interface to kernel DRM services
local/libnetfilter_conntrack 1.0.9-2
    Library providing an API to the in-kernel connection tracking state table
local/libnfnetlink 1.0.2-2
    Low-level library for netfilter related kernel/userspace communication
local/libsysprof-capture 46.0-1
    Kernel based performance profiler - capture library
local/linux-api-headers 6.7-1
    Kernel headers sanitized for use in userspace
local/linux-g14 6.8.7.arch1-1.1
    The Linux kernel and modules
local/linux-g14-headers 6.8.7.arch1-1.1
    Headers and scripts for building modules for the Linux kernel
local/mtdev 1.1.6-2
    A stand-alone library which transforms all variants of kernel MT events to the slotted type B protocol

nvidia-smi

# It says "Off" now, used to say 8W or whatever the power draw was.

Last edited by losipai (2024-05-01 13:51:17)


#7 2024-05-01 14:42:21

seth
Member
Registered: 2012-09-03
Posts: 52,276

Re: [SOLVED] Crash during update ("upgrading systemd"), boot entry gone

01:00.0 3D controller: NVIDIA Corporation GA107M [GeForce RTX 3050 Ti Mobile] (rev a1)
	Subsystem: ASUSTeK Computer Inc. Device 148c
	Kernel driver in use: nvidia
	Kernel modules: nvidia_drm, nvidia

This is a pure GPU device, you can use it for DRI_PRIME, but not actually run a display server on it (exclusively)

You can remove ibt=off (nvidia fixed that), but maybe enable https://wiki.archlinux.org/title/NVIDIA … de_setting - use the "nvidia_drm.modeset=1" kernel parameter (modprobe.conf won't do!) to prevent any confusion through device re-ordering when the simpledrm device gets removed by the amdgpu driver (I guess nvidia would then be card1 and amdgpu card2)

Otherwise this looks uncritical, the GPU probably has entered RTD3, https://wiki.archlinux.org/title/PRIME# … Management
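
Whether the GPU is runtime-suspended can be read from sysfs. A small sketch, assuming the card sits at PCI address 0000:01:00.0 (the lspci output only shows 01:00.0, so the 0000 domain is an assumption). The function name pci_runtime_status is made up, and the second argument exists only so the function can be tried against a fake directory tree.

```shell
# Print the runtime power state of a PCI device, e.g. "active" or "suspended".
# $1: full PCI address, $2: sysfs root (defaults to /sys).
pci_runtime_status() {
    cat "${2:-/sys}/bus/pci/devices/$1/power/runtime_status"
}

# Example (not run here):
# pci_runtime_status 0000:01:00.0
```

A value of "suspended" would be consistent with nvidia-smi reporting the card as off while nothing is using it.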


#8 2024-05-05 17:45:54

losipai
Member
Registered: 2021-07-29
Posts: 17

Re: [SOLVED] Crash during update ("upgrading systemd"), boot entry gone

Thanks!
I'm struggling to understand the GPU stuff.

Not sure where ibt=off is set from, can't find this string in /etc or /boot except that it's baked into vmlinuz-linux-g14 for some reason. Maybe an upstream thing?

use the "nvidia_drm.modeset=1" kernel parameter (modprobe.conf won't do!)

It already says Y if I sudo cat /sys/module/nvidia_drm/parameters/modeset, is that enough or do I still need to load the kernel with it?
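
The sysfs file shows the value the module is currently using, not where it came from. Whether the parameter was actually passed at boot can be checked separately against /proc/cmdline. A minimal sketch with a made-up helper name, on_cmdline:

```shell
# Succeed if PARAM appears as a whole word in the kernel command line on stdin.
# Splits on spaces only, so quoted values containing spaces are broken apart;
# that is harmless when looking for a parameter without spaces.
on_cmdline() {
    tr ' ' '\n' | grep -qxF -- "$1"
}

# Example (not run here):
# on_cmdline nvidia_drm.modeset=1 < /proc/cmdline && echo "set on the kernel command line"
```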

This is a pure GPU device, you can use it for DRI_PRIME

Yes, it seems to activate when I run stuff with prime-run. Still trying to get CUDA to work, it's saying CL_PLATFORM_NOT_FOUND_KHR.

Last edited by losipai (2024-05-05 17:54:07)


#9 2024-05-05 19:39:44

seth
Member
Registered: 2012-09-03
Posts: 52,276

Re: [SOLVED] Crash during update ("upgrading systemd"), boot entry gone

ibt=off is a kernel parameter (https://wiki.archlinux.org/title/Kernel_parameters); how it gets set depends on your bootloader

seth wrote:

modprobe.conf won't do!

  - you'll be able to add the parameter once you've figured out the above ;)

https://wiki.archlinux.org/title/GPGPU#CUDA - do you have opencl-nvidia and cuda installed?


#10 2024-05-12 12:36:11

losipai
Member
Registered: 2021-07-29
Posts: 17

Re: [SOLVED] Crash during update ("upgrading systemd"), boot entry gone

seth wrote:

ibt=off is a kernel parameter (https://wiki.archlinux.org/title/Kernel_parameters); how it gets set depends on your bootloader

modprobe.conf won't do! - you'll be able to add the parameter once you've figured out the above ;)

Yes, I use systemd-boot but nothing in my configuration sets that. It's not on the initrd or options lines, or anywhere in any other file. That's why I thought that maybe linux-g14 sets it by default. ¯\_(ツ)_/¯
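
One place a parameter can come from without appearing in any bootloader file is a command line compiled into the kernel (CONFIG_CMDLINE). A sketch for reading it, assuming the running kernel exposes its config at /proc/config.gz, which may or may not be the case for linux-g14. The helper name builtin_cmdline is made up.

```shell
# Print the value of CONFIG_CMDLINE from a kernel config read on stdin.
# Prints nothing if the option is absent.
builtin_cmdline() {
    sed -n 's/^CONFIG_CMDLINE="\(.*\)"$/\1/p'
}

# Example (not run here):
# zcat /proc/config.gz | builtin_cmdline
```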


seth wrote:

https://wiki.archlinux.org/title/GPGPU#CUDA - do you have opencl-nvidia and cuda installed?

Yes!

Last edited by losipai (2024-05-12 12:36:29)


#11 2024-05-12 13:29:37

seth
Member
Registered: 2012-09-03
Posts: 52,276

Re: [SOLVED] Crash during update ("upgrading systemd"), boot entry gone

What if, in addition to "prime-run", you also "export VK_DRIVER_FILES=/usr/share/vulkan/icd.d/nvidia_icd.json"?


#12 2024-05-12 15:54:57

losipai
Member
Registered: 2021-07-29
Posts: 17

Re: [SOLVED] Crash during update ("upgrading systemd"), boot entry gone

No difference.

VK_DRIVER_FILES=/usr/share/vulkan/icd.d/nvidia_icd.json prime-run katrain

...
OpenCL error at /home/user/data/kata/kata0/cpp/neuralnet/openclhelpers.cpp, func err, line 308, error CL_PLATFORM_NOT_FOUND_KHR

I'm trying to get TensorRT to work. But I'll install cuda-tools and see if I can run something simpler first.


#13 2024-05-12 16:11:37

seth
Member
Registered: 2012-09-03
Posts: 52,276

Re: [SOLVED] Crash during update ("upgrading systemd"), boot entry gone

Do the cuda device nodes exist, "ls -l /dev/nvidia*"?


#14 2024-05-12 18:00:44

losipai
Member
Registered: 2021-07-29
Posts: 17

Re: [SOLVED] Crash during update ("upgrading systemd"), boot entry gone

$ ls -l /dev/nvidia*

crw-rw-rw- 1 root root    195, 254 May 12 10:57 /dev/nvidia-modeset
crw-rw-rw- 1 root root    507,   0 May 12 10:57 /dev/nvidia-uvm
crw-rw-rw- 1 root root    507,   1 May 12 10:57 /dev/nvidia-uvm-tools
crw-rw-rw- 1 root root    195,   0 May 12 10:57 /dev/nvidia0
crw-rw-rw- 1 root root    195, 255 May 12 10:57 /dev/nvidiactl

/dev/nvidia-caps:
total 0
cr-------- 1 root root 511, 1 May 12 11:00 nvidia-cap1
cr--r--r-- 1 root root 511, 2 May 12 11:00 nvidia-cap2

Last edited by losipai (2024-05-12 18:03:47)


#15 2024-05-13 06:36:44

seth
Member
Registered: 2012-09-03
Posts: 52,276

Re: [SOLVED] Crash during update ("upgrading systemd"), boot entry gone

crw-rw-rw- 1 root root    195,   0 May 12 10:57 /dev/nvidia0

"yes"


Try to

export OPENCL_VENDOR_PATH=/etc/OpenCL/vendors/nvidia.icd

- the error in #12 is from opencl, not cuda


#16 2024-05-13 17:49:13

losipai
Member
Registered: 2021-07-29
Posts: 17

Re: [SOLVED] Crash during update ("upgrading systemd"), boot entry gone

Same error!
I might need a more straightforward way to verify any of OpenCL, CUDA or TensorRT than what I've been trying to run...
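
One simpler check for the OpenCL side is to look at what the ICD loader has to work with, since CL_PLATFORM_NOT_FOUND_KHR means no platform was found at all. A sketch, assuming the default vendor directory /etc/OpenCL/vendors (the directory of the nvidia.icd path mentioned earlier in the thread). The function name list_opencl_icds is made up.

```shell
# For each *.icd file in DIR, print "<file>: <library named on its first line>".
# $1: vendor directory (defaults to /etc/OpenCL/vendors).
list_opencl_icds() {
    for f in "${1:-/etc/OpenCL/vendors}"/*.icd; do
        [ -e "$f" ] || continue   # no match: the glob stays literal, skip it
        printf '%s: %s\n' "${f##*/}" "$(head -n 1 "$f")"
    done
}

# Example (not run here):
# list_opencl_icds
# clinfo -l    # assumes the clinfo package is installed
```

If nvidia.icd is listed but clinfo -l still shows no platform, the library it names is the next thing to look at.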


#17 2024-05-13 21:46:39

seth
Member
Registered: 2012-09-03
Posts: 52,276

Re: [SOLVED] Crash during update ("upgrading systemd"), boot entry gone
