*Edited TL;DR* My CPU slowly died, and that produced random problems everywhere.
A week ago I updated my system as always, but something seems to have gone wrong and I can no longer get a proper boot.
When I power on the machine, I don't see the GRUB selection window, so maybe something related to the kernel or the bootloader configuration broke after the update.
I tried troubleshooting: secure boot is disabled, and I can't change kernel parameters because I have no access to GRUB.
I also tried to boot from a live USB, but that failed. I downloaded the latest installation medium hoping to at least chroot and run mkinitcpio, but it gets stuck with an ACPI error that I could bypass with the kernel parameter acpi=off (GRUB works on my live USB). Then the text indicating that the initrd is starting shows up, the screen goes completely black, and I have to force a shutdown by holding the power button for about 9 seconds.
I have taken out the GPU just in case (it had started failing and giving errors) to avoid red herrings, as I wasted 2 days wondering why nomodeset wasn't working.
I managed to boot a Kali Linux live USB after some sort of error handling Kali did, but I don't really know what to do from there. Is there any problem with running mkinitcpio from a distribution other than Arch? Is there anything I should have in mind when chrooting from another distribution?
I have not updated the BIOS. The last day I used the system, it seemed to power off properly.
Last edited by blinkingbit (2024-10-18 10:23:34)
Offline
I'd first and foremost check the integrity of your GRUB installation and, if in doubt, reinstall it.
Can you chroot into arch from kali?
Don't forget to mount any boot partition if you have one.
Offline
Is there anything I should have in mind when chrooting from another distribution?
You should be able to install the arch-install-scripts package from the Kali ISO, which will provide arch-chroot(8) and make things a bit simpler.
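For reference, a rough sketch of what the chroot steps could look like once arch-install-scripts is installed. The device names are hypothetical placeholders (check lsblk -f for the real ones), the mount point of the boot partition is an assumption, and the run helper only prints the commands unless DRY_RUN=0 is set. grub-install is left out on purpose because its flags depend on the firmware setup.

```shell
#!/bin/sh
# Sketch: chroot into an installed Arch system from another live distribution.
# ROOT_DEV and BOOT_DEV are hypothetical placeholders, not real device names.
ROOT_DEV="${ROOT_DEV:-/dev/sdXn}"
BOOT_DEV="${BOOT_DEV:-/dev/sdXm}"
MNT="${MNT:-/mnt}"

# With DRY_RUN=1 (the default) commands are only printed, not executed.
run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then
        printf '+ %s\n' "$*"
    else
        "$@"
    fi
}

chroot_repair() {
    run mount "$ROOT_DEV" "$MNT"
    # Assumes a separate boot partition mounted at /boot; skip or adjust otherwise.
    run mount "$BOOT_DEV" "$MNT/boot"
    # Regenerate all initramfs images from the installed presets.
    run arch-chroot "$MNT" mkinitcpio -P
    # Regenerate the GRUB configuration.
    run arch-chroot "$MNT" grub-mkconfig -o /boot/grub/grub.cfg
    run umount -R "$MNT"
}

chroot_repair
```

Running it as-is only lists the commands, which makes it easy to double-check the devices before setting DRY_RUN=0 as root.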
Everything for everyone, nothing for ourselves
Offline
Can you chroot into arch from kali?
Yeah, I could chroot, and I have reinstalled GRUB successfully and also executed mkinitcpio -P. My system is still broken though: after selecting the boot entry in GRUB and using acpi=off to avoid some errors, the message "Loading initial ramdisk..." appears, and after that the screen is just black. Nothing more appears and I have to force the computer off.
I'm not really sure at which step my system hangs. I think it's in the initramfs, because the change happens after the initial ramdisk message, but I don't end up in any kind of emergency shell; at least I can't type anything or see any text on the screen.
This is my first time trying to fix a boot problem, so I could have missed an obvious thing to check while in the chroot. I'm not really sure what I can do while chrooted: is it safe to check logs or update the system? Or is it better to wait until the problem is properly diagnosed to avoid breaking more things?
You should be able to install the arch-install-scripts package from the Kali ISO, which will provide arch-chroot(8) and make things a bit simpler.
I followed these instructions and it seems that was enough, but if I have to do anything more this will be handy, thank you!
Offline
Try to also reinstall the kernel. I'd say your /boot partition got corrupted, damaging GRUB but likely also the kernel image and initramfs.
Offline
I have chrooted again and executed the following:
pacman -Syy
pacman -S linux
The installation output seems successful, but the problem is still there. After selecting the boot entry and seeing the message about loading the initial ramdisk, the screen turns black and stays like that instead of showing the usual login prompt.
The thing I can't wrap my head around is how badly I messed up to not even be able to reinstall arch. I mean, the installation ISO no longer working on my machine is weird, and it works properly on other machines.
Offline
the screen turns black and stays like that instead of showing the usual login prompt.
Try to boot the multi-user.target (2nd link below), if in doubt along with "nomodeset".
how badly I messed up to not even be able to reinstall arch
From the chroot run
sudo LC_ALL=C pacman -Qkk | grep -v ', 0 altered files' | grep -v backup
Offline
I tried to boot with nomodeset and systemd.unit=multi-user.target and it just rebooted, ending up in GRUB again. I also tried systemd.unit=rescue.target, but the same happened.
Output of the pacman check:
warning: amd-ucode: /boot/amd-ucode.img (Permissions mismatch)
warning: amd-ucode: /boot/amd-ucode.img (Modification time mismatch)
warning: anki: /var/lib/pacman/local/anki-24.04.1-2/changelog (Modification time mismatch)
warning: cups: /etc/cups/classes.conf (Permissions mismatch)
warning: cups: /etc/cups/printers.conf (Permissions mismatch)
warning: dkms: /var/lib/pacman/local/dkms-3.0.12-1/install (Modification time mismatch)
warning: fakeroot: /var/lib/pacman/local/fakeroot-1.35-1/install (Modification time mismatch)
warning: filesystem: /var/lock (Modification time mismatch)
warning: filesystem: /var/mail (Modification time mismatch)
warning: filesystem: /var/run (Modification time mismatch)
warning: filesystem: /var/games (GID mismatch)
warning: filesystem: /var/games (Permissions mismatch)
warning: filesystem: /var/spool/mail (Permissions mismatch)
warning: fontconfig: /var/lib/pacman/local/fontconfig-2:2.15.0-2/install (Modification time mismatch)
warning: gdk-pixbuf2: /var/lib/pacman/local/gdk-pixbuf2-2.42.12-1/install (Modification time mismatch)
warning: glibc: /var/db (Permissions mismatch)
warning: grub: /var/lib/pacman/local/grub-2:2.12-2/install (Modification time mismatch)
warning: hpoj: /var/lib/pacman/local/hpoj-0.91-21/install (Modification time mismatch)
warning: intel-oneapi-openmp: /opt/intel/oneapi/compiler/2024.1/lib/libhwloc.so.15 (No such file or directory)
warning: intel-oneapi-tbb: /opt/intel/oneapi/tbb/2021.12/lib/libhwloc.so.15 (No such file or directory)
warning: intel-oneapi-tbb: /opt/intel/oneapi/tbb/2021.12/lib/libtcm.so.1 (No such file or directory)
warning: intel-oneapi-tbb: /opt/intel/oneapi/tbb/2021.12/lib/libtcm_debug.so.1 (No such file or directory)
warning: java-runtime-common: /var/lib/pacman/local/java-runtime-common-3-5/install (Modification time mismatch)
warning: java-runtime-common: /usr/lib/jvm/default (Symlink path mismatch)
warning: java-runtime-common: /usr/lib/jvm/default (Modification time mismatch)
warning: java-runtime-common: /usr/lib/jvm/default-runtime (Symlink path mismatch)
warning: java-runtime-common: /usr/lib/jvm/default-runtime (Modification time mismatch)
warning: lib32-fontconfig: /var/lib/pacman/local/lib32-fontconfig-2:2.15.0-1/install (Modification time mismatch)
warning: lib32-gdk-pixbuf2: /var/lib/pacman/local/lib32-gdk-pixbuf2-2.42.12-1/install (Modification time mismatch)
warning: man-db: /var/lib/pacman/local/man-db-2.12.1-1/install (Modification time mismatch)
warning: materialx: /usr/share/mime/model/materialx.xml (No such file or directory)
warning: nodejs-nopt: /usr/bin/nopt (Permissions mismatch)
warning: nvme-cli: /var/lib/pacman/local/nvme-cli-2.9.1-1/install (Modification time mismatch)
warning: perl-xml-sax: /var/lib/pacman/local/perl-xml-sax-1.02-2/install (Modification time mismatch)
warning: polkit: /var/lib/pacman/local/polkit-124-2/install (Modification time mismatch)
warning: pulseaudio: /var/lib/pacman/local/pulseaudio-17.0-3/install (Modification time mismatch)
amd-ucode: 16 total files, 1 altered file
anki: 1140 total files, 1 altered file
cups: 946 total files, 2 altered files
dkms: 25 total files, 1 altered file
fakeroot: 47 total files, 1 altered file
filesystem: 124 total files, 5 altered files
fontconfig: 356 total files, 1 altered file
gdk-pixbuf2: 370 total files, 1 altered file
glibc: 1614 total files, 1 altered file
grub: 1122 total files, 1 altered file
hpoj: 75 total files, 1 altered file
intel-oneapi-openmp: 174 total files, 1 altered file
intel-oneapi-tbb: 241 total files, 3 altered files
java-runtime-common: 21 total files, 3 altered files
lib32-fontconfig: 17 total files, 1 altered file
lib32-gdk-pixbuf2: 14 total files, 1 altered file
man-db: 485 total files, 1 altered file
materialx: 1193 total files, 1 altered file
nodejs-nopt: 30 total files, 1 altered file
nvme-cli: 222 total files, 1 altered file
warning: rustup: /var/lib/pacman/local/rustup-1.27.1-1/install (Modification time mismatch)
warning: shadow: /usr/bin/groupmems (GID mismatch)
warning: shadow: /usr/bin/groupmems (Permissions mismatch)
warning: shared-mime-info: /var/lib/pacman/local/shared-mime-info-2.4-1/install (Modification time mismatch)
warning: sudo: /var/db (Permissions mismatch)
warning: systemd: /var/log/journal (GID mismatch)
warning: texlive-basic: /var/lib/texmf/arch/installedpkgs/basic.fmts (Modification time mismatch)
warning: texlive-basic: /var/lib/texmf/arch/installedpkgs/basic.maps (Modification time mismatch)
warning: texlive-fontsextra: /var/lib/texmf/arch/installedpkgs/fontsextra.maps (Modification time mismatch)
warning: texlive-fontsrecommended: /var/lib/texmf/arch/installedpkgs/fontsrecommended.maps (Modification time mismatch)
warning: texlive-formatsextra: /var/lib/texmf/arch/installedpkgs/formatsextra.fmts (Modification time mismatch)
warning: texlive-formatsextra: /var/lib/texmf/arch/installedpkgs/formatsextra.maps (Modification time mismatch)
warning: texlive-games: /var/lib/texmf/arch/installedpkgs/games.maps (Modification time mismatch)
warning: texlive-latex: /var/lib/texmf/arch/installedpkgs/latex.fmts (Modification time mismatch)
warning: texlive-latex: /var/lib/texmf/arch/installedpkgs/latex.maps (Modification time mismatch)
warning: texlive-latexextra: /var/lib/texmf/arch/installedpkgs/latexextra.fmts (Modification time mismatch)
warning: texlive-latexextra: /var/lib/texmf/arch/installedpkgs/latexextra.maps (Modification time mismatch)
warning: texlive-mathscience: /var/lib/texmf/arch/installedpkgs/mathscience.fmts (Modification time mismatch)
warning: texlive-mathscience: /var/lib/texmf/arch/installedpkgs/mathscience.maps (Modification time mismatch)
warning: texlive-music: /var/lib/texmf/arch/installedpkgs/music.maps (Modification time mismatch)
warning: texlive-pictures: /var/lib/texmf/arch/installedpkgs/pictures.maps (Modification time mismatch)
warning: ventoy-bin: /var/lib/pacman/local/ventoy-bin-1.0.97-1/install (Modification time mismatch)
warning: xorg-server: /var/lib/pacman/local/xorg-server-21.1.13-1/install (Modification time mismatch)
warning: xorg-server-common: /var/lib/xkb/README.compiled (Modification time mismatch)
perl-xml-sax: 36 total files, 1 altered file
polkit: 218 total files, 1 altered file
pulseaudio: 366 total files, 1 altered file
rustup: 37 total files, 1 altered file
shadow: 588 total files, 1 altered file
shared-mime-info: 252 total files, 1 altered file
sudo: 239 total files, 1 altered file
systemd: 1543 total files, 1 altered file
texlive-basic: 2673 total files, 2 altered files
texlive-fontsextra: 106360 total files, 1 altered file
texlive-fontsrecommended: 5345 total files, 1 altered file
texlive-formatsextra: 769 total files, 2 altered files
texlive-games: 1008 total files, 1 altered file
texlive-latex: 2237 total files, 2 altered files
texlive-latexextra: 6954 total files, 2 altered files
texlive-mathscience: 1086 total files, 2 altered files
texlive-music: 593 total files, 1 altered file
texlive-pictures: 4542 total files, 1 altered file
ventoy-bin: 130 total files, 1 altered file
xorg-server: 49 total files, 1 altered file
xorg-server-common: 15 total files, 1 altered file
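Most of the warnings above are modification-time or permission mismatches, which tend to be noise. The entries reporting a missing file are usually the more interesting ones. A minimal sketch for pulling those out, assuming the warnings have been captured as text; the function name missing_files is hypothetical, and as far as I know pacman prints these warnings on stderr, so a live pipeline would need 2>&1.

```shell
# Hypothetical helper: keep only the "No such file or directory" warnings
# from captured `pacman -Qkk` output and print them as "package path".
missing_files() {
    grep -F '(No such file or directory)' |
        sed -E 's/^warning: ([^:]+): (.*) \(No such file or directory\)$/\1 \2/'
}

# Example with two lines copied from the output above.
missing_files <<'EOF'
warning: cups: /etc/cups/classes.conf (Permissions mismatch)
warning: materialx: /usr/share/mime/model/materialx.xml (No such file or directory)
EOF
```

Inside the chroot this could be fed with something like LC_ALL=C pacman -Qkk 2>&1 | missing_files.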
Offline
it just rebooted
A week ago I updated my system as always
https://bbs.archlinux.org/viewtopic.php?id=298360
Try to downgrade the kernel to 6.10.2 (the LTS kernel is also affected and won't help)
Offline
I have tried downgrading both the linux and linux-headers packages to 6.10.2.arch1-1.
After booting with this kernel version, it constantly reboots after loading the initial ramdisk.
I tried booting with processor.max_cstate=0; still rebooting. I added idle=nomwait and the reboots stopped, but now it's an empty black screen.
It's my first time downgrading the kernel, so maybe I did something wrong. I have the AMD microcode installed, and I don't really know whether there is any kernel module that I should have downgraded as well.
Offline
Did you try to boot w/ "acpi=off"?
Offline
Yes. I have tried acpi=off alone (which ends in a black screen) and acpi=off processor.max_cstate=0 idle=nomwait, which ended in a reboot.
Offline
which ends in a black screen
acpi=off nomodeset
Offline
With nomodeset, after loading the initial ramdisk the monitor reports that the video signal is missing, and shortly after that it reboots.
The same happens if I use idle=nomwait and processor.max_cstate=0 together with acpi=off and nomodeset.
Offline
Please post your complete system journal from the live system:
sudo journalctl -b | curl -F 'file=@-' 0x0.st
Offline
I didn't find anything interesting reading the journal in my previous message, and I'm pretty lost again.
I have been going through my pacman log, but it seems the kernel wasn't updated in the last update before this problem appeared:
[2024-08-02T11:55:01+0200] [PACMAN] Running 'pacman -Syu'
[2024-08-02T11:55:01+0200] [PACMAN] synchronizing package lists
[2024-08-02T11:55:01+0200] [PACMAN] starting full system upgrade
[2024-08-02T11:55:09+0200] [ALPM] transaction started
[2024-08-02T11:55:09+0200] [ALPM] upgraded lz4 (1:1.9.4-3 -> 1:1.10.0-2)
[2024-08-02T11:55:09+0200] [ALPM] upgraded gstreamer (1.24.5-2 -> 1.24.6-1)
[2024-08-02T11:55:09+0200] [ALPM] upgraded cryptsetup (2.7.3-1 -> 2.7.4-1)
[2024-08-02T11:55:09+0200] [ALPM] upgraded gst-plugins-base-libs (1.24.5-2 -> 1.24.6-1)
[2024-08-02T11:55:09+0200] [ALPM] upgraded gst-plugins-bad-libs (1.24.5-2 -> 1.24.6-1)
[2024-08-02T11:55:09+0200] [ALPM] upgraded usd (24.05-2 -> 24.08-1)
[2024-08-02T11:55:10+0200] [ALPM] upgraded blender (17:4.2.0-3 -> 17:4.2.0-4)
[2024-08-02T11:55:10+0200] [ALPM] upgraded discord (0.0.61-1 -> 0.0.62-1)
[2024-08-02T11:55:10+0200] [ALPM] upgraded fzf (0.54.0-1 -> 0.54.3-1)
[2024-08-02T11:55:10+0200] [ALPM] upgraded git (2.45.2-1 -> 2.46.0-1)
[2024-08-02T11:55:10+0200] [ALPM] upgraded gst-libav (1.24.5-2 -> 1.24.6-1)
[2024-08-02T11:55:10+0200] [ALPM] upgraded libgme (0.6.3-4 -> 0.6.3-5)
[2024-08-02T11:55:10+0200] [ALPM] upgraded gst-plugins-bad (1.24.5-2 -> 1.24.6-1)
[2024-08-02T11:55:10+0200] [ALPM] upgraded gst-plugins-base (1.24.5-2 -> 1.24.6-1)
[2024-08-02T11:55:10+0200] [ALPM] upgraded gst-plugins-good (1.24.5-2 -> 1.24.6-1)
[2024-08-02T11:55:10+0200] [ALPM] upgraded gst-plugins-ugly (1.24.5-2 -> 1.24.6-1)
[2024-08-02T11:55:10+0200] [ALPM] upgraded lib32-gstreamer (1.24.5-1 -> 1.24.6-1)
[2024-08-02T11:55:10+0200] [ALPM] upgraded lib32-gst-plugins-base-libs (1.24.5-1 -> 1.24.6-1)
[2024-08-02T11:55:10+0200] [ALPM] upgraded lib32-gst-plugins-base (1.24.5-1 -> 1.24.6-1)
[2024-08-02T11:55:10+0200] [ALPM] upgraded lib32-gst-plugins-good (1.24.5-1 -> 1.24.6-1)
[2024-08-02T11:55:11+0200] [ALPM] upgraded libavif (1.1.0-1 -> 1.1.1-1)
[2024-08-02T11:55:11+0200] [ALPM] upgraded libfabric (1.21.0-1 -> 1.22.0-1)
[2024-08-02T11:55:11+0200] [ALPM] upgraded lua-language-server (3.9.3-1 -> 3.10.0-1)
[2024-08-02T11:55:11+0200] [ALPM] upgraded python-importlib-metadata (7.1.0-1 -> 7.2.1-1)
[2024-08-02T11:55:11+0200] [ALPM] upgraded python-jsonschema (4.22.0-1 -> 4.23.0-1)
[2024-08-02T11:55:11+0200] [ALPM] upgraded tdb (1.4.10-3 -> 1.4.11-1)
[2024-08-02T11:55:11+0200] [ALPM] transaction completed
Neither mesa nor any other driver-related package was updated. There was an update to amd-ucode days after my issue appeared, so maybe that's worth a try. I'm going to try updating my BIOS, as it seems that my board, a PRIME X670-P, has received some updates related to processor voltages (this was mentioned in the post about the random rebooting problem) and I have one of the series mentioned. *Fingers crossed*
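Checking the pacman log by eye works for a short transaction like the one above, but it can be scripted too. A minimal sketch, assuming the log lines keep the "[ALPM] upgraded name (old -> new)" shape shown above; the function names and the WATCH list are hypothetical, and the list is certainly not exhaustive.

```shell
# Hypothetical helpers for reading a pacman.log excerpt on stdin.

# Print "name old-version new-version" for every "[ALPM] upgraded" line.
upgraded_packages() {
    sed -nE 's/^\[[^]]+\] \[ALPM\] upgraded ([^ ]+) \(([^ ]+) -> ([^ )]+)\)$/\1 \2 \3/p'
}

# Assumed, non-exhaustive list of boot- and graphics-related packages.
WATCH='linux|linux-lts|linux-firmware|amd-ucode|mesa|mkinitcpio|grub|systemd'

# Succeed if any upgraded package is on the watch list.
touches_watched() {
    upgraded_packages | grep -Eq "^($WATCH) "
}

# Example with two lines from the log above.
if touches_watched <<'EOF'
[2024-08-02T11:55:09+0200] [ALPM] upgraded lz4 (1:1.9.4-3 -> 1:1.10.0-2)
[2024-08-02T11:55:10+0200] [ALPM] upgraded git (2.45.2-1 -> 2.46.0-1)
EOF
then
    echo "watched package upgraded"
else
    echo "no watched package upgraded"
fi
```

On a real system the input would be /var/log/pacman.log, or the same path under the mount point when working from a live USB.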
*Update edit* After upgrading my BIOS... my computer doesn't start at all. It can't get past POST, so I'm going to RMA the motherboard, just in case. I've read that faulty CPUs are far less common, so I will start with the motherboard, and if the problem persists I will try to RMA the CPU too. My luck with this build can't get worse at this rate ...
Last edited by blinkingbit (2024-08-16 15:58:47)
Offline
Aug 14 21:04:48 localhost.localdomain kernel: amdgpu 0000:11:00.0: [drm] fb0: amdgpudrmfb frame buffer device
Aug 14 21:04:48 localhost.localdomain kernel: amdgpu 0000:11:00.0: amdgpu: [mmhub] page fault (src_id:0 ring:173 vmid:0 pasid:0, for process pid 0 thread pid 0)
Aug 14 21:04:48 localhost.localdomain kernel: amdgpu 0000:11:00.0: amdgpu: in page starting at address 0x0000000000000000 from client 0x12 (VMC)
Aug 14 21:04:48 localhost.localdomain kernel: amdgpu 0000:11:00.0: amdgpu: MMVM_L2_PROTECTION_FAULT_STATUS:0x00043B5A
Aug 14 21:04:48 localhost.localdomain kernel: amdgpu 0000:11:00.0: amdgpu: Faulty UTCL2 client ID: VCNU (0x1d)
Aug 14 21:04:48 localhost.localdomain kernel: amdgpu 0000:11:00.0: amdgpu: MORE_FAULTS: 0x0
Aug 14 21:04:48 localhost.localdomain kernel: amdgpu 0000:11:00.0: amdgpu: WALKER_ERROR: 0x5
Aug 14 21:04:48 localhost.localdomain kernel: amdgpu 0000:11:00.0: amdgpu: PERMISSION_FAULTS: 0x5
Aug 14 21:04:48 localhost.localdomain kernel: amdgpu 0000:11:00.0: amdgpu: MAPPING_ERROR: 0x1
Aug 14 21:04:48 localhost.localdomain kernel: amdgpu 0000:11:00.0: amdgpu: RW: 0x1
Aug 14 21:04:48 localhost.localdomain kernel: amdgpu 0000:11:00.0: amdgpu: [mmhub] page fault (src_id:0 ring:157 vmid:0 pasid:0, for process pid 0 thread pid 0)
Aug 14 21:04:48 localhost.localdomain kernel: amdgpu 0000:11:00.0: amdgpu: in page starting at address 0x0000000000000000 from client 0x12 (VMC)
Aug 14 21:04:48 localhost.localdomain kernel: amdgpu 0000:11:00.0: amdgpu: MMVM_L2_PROTECTION_FAULT_STATUS:0x00003B3A
Aug 14 21:04:48 localhost.localdomain kernel: amdgpu 0000:11:00.0: amdgpu: Faulty UTCL2 client ID: VCNU (0x1d)
Aug 14 21:04:48 localhost.localdomain kernel: amdgpu 0000:11:00.0: amdgpu: MORE_FAULTS: 0x0
Aug 14 21:04:48 localhost.localdomain kernel: amdgpu 0000:11:00.0: amdgpu: WALKER_ERROR: 0x5
Aug 14 21:04:48 localhost.localdomain kernel: amdgpu 0000:11:00.0: amdgpu: PERMISSION_FAULTS: 0x3
Aug 14 21:04:48 localhost.localdomain kernel: amdgpu 0000:11:00.0: amdgpu: MAPPING_ERROR: 0x1
Aug 14 21:04:48 localhost.localdomain kernel: amdgpu 0000:11:00.0: amdgpu: RW: 0x0
This happens right when the GPU gets initialized.
In case the board comes back or whatever: is there a parallel Windows installation? (There are quite a few partitions.)
Offline
In case the board comes back or whatever: is there a parallel Windows installation? (There are quite a few partitions.)
No, there is no parallel Windows installation.
In the end, the motherboard was fine and the culprit was the CPU. I thought CPUs usually don't die that easily, but it seems that AMD had some problems with gen7 voltages, so my processor ended up slowly frying itself over time. After the RMA, the system booted fine with the replacement. I updated the keyring, then did a full system update, and finally, after almost 3 months, the problem is solved.
The current journal looks fine now.
Last edited by blinkingbit (2024-10-18 10:29:22)
Offline