#1 2022-06-22 10:34:36

Lizreu
Member
Registered: 2022-06-22
Posts: 4

[Partially solved] Stock Arch kernels rebooting mid-boot

I'm having an issue where any Arch kernel (stock or modified) doesn't boot anymore on my system, and this seems limited to Arch-based kernels. This includes trying to boot from an Arch LiveUSB - so this isn't limited to my kernel/system configuration. Oddly enough, Manjaro boots fine, and I'm using Manjaro's stock kernel as a replacement for now.

This issue started happening after I moved abroad with my PC, and, specifically:
* I swapped out my GPU for another one (RX 6800 XT -> RX 6500 XT), old one was too bulky to take with me
* Two RAM slots on my mobo got damaged and don't work anymore (probably damage during transport)
* I upgraded the BIOS on my mobo

Current HW:
* CPU: Ryzen 5950X
* RAM: Kingston HyperX, 2x16 GB
* GPU: Sapphire RX 6500 XT
* Motherboard: MSI MPG B550 Gaming Carbon WiFi

This system used to boot perfectly fine before too, just with a different GPU - so this is my main guess right now.

Specifically, the problem looks like this:
* GRUB loads fine.
* After GRUB passes the control to the kernel, it loads for about two seconds and then bootloops.
* This happens before systemd even has a chance to kick in, so I haven't been able to recover any logs from the kernel. There's about two lines of unrelated logs on-screen before this happens.

Things I've tried:
* Stock Arch kernel, modified Arch kernels compiled for my CPU arch - no luck
* Setting 'panic' kernel param - no effect, still instantly reboots
* Updating/removing CPU microcode updates - no effect
* Setting 'nomodeset' (suspecting GPU) - no effect
* Setting 'debug' to capture any extra output - no effect
* Blacklisting the 'radeon' driver (doesn't seem like it actually ever even gets to loading it anyway)

I'm looking for any tips to troubleshoot this further. I've had very limited luck extracting any useful output so far, but there's probably something I still haven't tried.
Once I managed to get it to load into systemd (not sure what I did then, it was a very frantic troubleshooting session just trying everything across the board), but the video output froze as soon as it got to loading the radeon driver. The system did end up loading further up to the login prompt - I managed to recover some systemd logs from that boot session for once. Couldn't replicate it since.

This feels like a hardware or BIOS misconfiguration issue, but oddly enough the Manjaro kernel works fine with no issues, and I can't quite figure out what the Manjaro kernel does differently that lets it boot on my system.

Any help, tips or pointers would be much appreciated.

Last edited by Lizreu (2022-06-22 19:49:57)

#2 2022-06-22 14:32:39

seth
Member
Registered: 2012-09-03
Posts: 29,780

Re: [Partially solved] Stock Arch kernels rebooting mid-boot

Manjaro kernel is same version w/ same kernel parameters?
Did you try the LTS kernel?
"bootloops" => does every cycle start w/ a BIOS post or does the kernel reload?

https://wiki.archlinux.org/title/Genera … bug_output
ignore_loglevel and earlyprintk might help - also "boot_delay=1000" (stalls 1s between every message)

Finally try "acpi=off" and "pcie_aspm=off"
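
When testing parameters like these one at a time, it helps to confirm which ones the booted kernel actually received. A minimal sketch, assuming a POSIX shell; has_param is a hypothetical helper name, not an existing tool:

```shell
#!/bin/sh
# has_param is a hypothetical helper name, not an existing tool.
# Usage: has_param PARAM [CMDLINE_FILE]
# Succeeds if PARAM appears as a whole word on the kernel command line
# (defaults to /proc/cmdline, i.e. the running kernel).
has_param() {
    # Put each space-separated word on its own line, then match PARAM
    # literally (-F) against a whole line (-x), without printing (-q).
    tr -s ' ' '\n' < "${2:-/proc/cmdline}" | grep -Fxq -- "$1"
}

# Example: report which of the suggested parameters this boot used.
# for p in ignore_loglevel boot_delay=1000 acpi=off pcie_aspm=off; do
#     if has_param "$p"; then echo "$p: set"; else echo "$p: not set"; fi
# done
```

This only reports what was passed at boot; it says nothing about whether the kernel honoured the parameter.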

#3 2022-06-22 17:22:00

Lizreu
Member
Registered: 2022-06-22
Posts: 4

Re: [Partially solved] Stock Arch kernels rebooting mid-boot

seth wrote:

Finally try "acpi=off" and "pcie_aspm=off"

"acpi=off" did the trick. Booted from an LTS kernel successfully. Doesn't boot without "acpi=off".

"pcie_aspm=off" makes no difference.

For completeness' sake:

Kernel versions I tried:
Normal kernel - 5.18.2.arch1-1 (not tested yet, but probably works too)
LTS Kernel - 5.15.45-1-lts (works now)
Manjaro kernel - (from AUR) linux-manjaro-xanmod 5.17.7-1 (works without "acpi=off" tweak). I'm not actually sure if it's the stock one Manjaro uses, but both this one and the one coming on the Manjaro LiveCD worked for me.

Other remarks:
"bootloops" => sometimes just hangs entirely, usually restarts with BIOS post.
boot_delay seemed to just hang the kernel with no output at all. Tried anything from 50 to 1000 ms for that one.
"earlyprintk=efi,keep" - no difference.
With "ignore_loglevel" there's a bunch of output, but nothing alarming at a glance. It gets right up to initializing/probing NVMe and SATA devices, from what I can tell. Here's a screencap.

Huge thanks for suggesting an immediate fix that worked, but I guess the question now is why it works. Any ideas?

#4 2022-06-22 18:35:44

loqs
Member
Registered: 2014-03-06
Posts: 14,886

Re: [Partially solved] Stock Arch kernels rebooting mid-boot

What is the VID:PID of the NVMe? I think it needs a quirk to ignore bad subnqn.
Edit:
That only generates a warning, so it should not be causing a boot failure.

Last edited by loqs (2022-06-22 18:40:33)

#5 2022-06-22 19:34:48

Lizreu
Member
Registered: 2022-06-22
Posts: 4

Re: [Partially solved] Stock Arch kernels rebooting mid-boot

loqs wrote:

What is the VID:PID of the NVMe? I think it needs a quirk to ignore bad subnqn.
Edit:
That only generates a warning, so it should not be causing a boot failure.

Yeah, the NVMe stuff has been there forever. I've just ignored it so far.

Small update:
The trick seems to lie in the xanmod kernel patchset. This one boots no-problemo, and Manjaro seems to use it too. I'll keep using just the xanmod patchset without the Manjaro stuff for now, since I tend to forget about stray kernel parameters left over in my configs, and it's pretty close to what I used to use before that.

Scouring the web on the "acpi=off" thing I can't seem to find any good leads, but this seems to suggest something in the stock kernel doesn't want to handle the ACPI interface in the BIOS. Maybe this is related to the BIOS update I did earlier. Or I might just be pulling stuff out of my nethers here.

For the time being I'm content with what I have, so I'll mark this as partially solved: the ACPI mystery is still unsolved, but there are at least two indirect workarounds. Documenting this for any poor soul that might stumble upon this in the future:
* Try "acpi=off" as a kernel parameter
* Try a custom kernel like the xanmod patchset
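
For the first workaround, a rough sketch of making the parameter persistent, assuming GRUB with the usual /etc/default/grub layout. add_kernel_param is a hypothetical helper name, not part of GRUB, and it only prints the edited file so it can be reviewed before anything is changed:

```shell
#!/bin/sh
# add_kernel_param is a hypothetical helper name, not part of GRUB.
# Usage: add_kernel_param PARAM [GRUB_DEFAULTS_FILE]
# Prints the file with PARAM appended to GRUB_CMDLINE_LINUX_DEFAULT.
# The file itself is NOT modified.
# Assumes the value is double-quoted and PARAM contains no '/', '&' or '\'.
add_kernel_param() {
    sed "s/^\(GRUB_CMDLINE_LINUX_DEFAULT=\"[^\"]*\)\"/\1 $1\"/" "${2:-/etc/default/grub}"
}

# Example (review the output before replacing the real file):
#   add_kernel_param acpi=off > /tmp/grub.new
#   diff /etc/default/grub /tmp/grub.new
```

Once the reviewed result is in place as /etc/default/grub, the boot menu still has to be regenerated as root with "grub-mkconfig -o /boot/grub/grub.cfg" before the change takes effect.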

#6 2022-06-22 21:29:16

seth
Member
Registered: 2012-09-03
Posts: 29,780

Re: [Partially solved] Stock Arch kernels rebooting mid-boot

Could also be a kernel config difference (i.e. the xanmod/manjaro config differs from arch in some regard)
You could run "zgrep -i acpi /proc/config.gz | sort" on either kernel and diff the outputs…
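
Spelled out, that comparison could look roughly like this. A sketch only: dump_acpi_config is a hypothetical helper name, the .env file names are just a naming choice, and /proc/config.gz is only there if the kernel was built with CONFIG_IKCONFIG_PROC:

```shell
#!/bin/sh
# dump_acpi_config is a hypothetical helper name, not an existing tool.
# Usage: dump_acpi_config [CONFIG_GZ]
# Prints the sorted ACPI-related lines of a gzip-compressed kernel config
# (defaults to /proc/config.gz, i.e. the running kernel).
dump_acpi_config() {
    zgrep -i acpi "${1:-/proc/config.gz}" | sort
}

# Boot each kernel in turn and save its options under the kernel's name:
#   dump_acpi_config > "$(uname -r).env"
# Then compare the two saved files, e.g.:
#   diff -c 5.18.2-arch1-1.env 5.18.4-xanmod1-1.env
```

Sorting first keeps the diff limited to real option differences instead of ordering noise between the two configs.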

#7 2022-06-23 02:45:02

loqs
Member
Registered: 2014-03-06
Posts: 14,886

Re: [Partially solved] Stock Arch kernels rebooting mid-boot

seth wrote:

Could also be a kernel config difference (i.e. the xanmod/manjaro config differs from arch in some regard)
You could run "zgrep -i acpi /proc/config.gz | sort" on either kernel and diff the outputs…

Or build the Arch linux package with the config from one of the working kernels. Unsupported options will be dropped by the kernel build system.
Edit:
config from linux-manjaro-xanmod 5.17.7-1
https://drive.google.com/file/d/124diFU … sp=sharing linux-5.18.6.arch1-1.1-x86_64.pkg.tar.zst
https://drive.google.com/file/d/1OuMB6U … sp=sharing linux-headers-5.18.6.arch1-1.1-x86_64.pkg.tar.zst

Last edited by loqs (2022-06-23 07:59:57)

#8 2022-06-23 09:56:55

Lizreu
Member
Registered: 2022-06-22
Posts: 4

Re: [Partially solved] Stock Arch kernels rebooting mid-boot

*** ./5.18.2-arch1-1.env        2022-06-23 12:35:23.093372326 +0300
--- ./5.18.4-xanmod1-1.env      2022-06-23 12:34:26.540431769 +0300
***************
*** 9 ****
! # CONFIG_ACPI_DEBUGGER is not set
--- 9,10 ----
! CONFIG_ACPI_DEBUGGER=y
! CONFIG_ACPI_DEBUGGER_USER=y
***************
*** 32 ****
--- 34 ----
+ CONFIG_ACPI_CUSTOM_DSDT_FILE=""
***************
*** 42 ****
! CONFIG_ACPI_CUSTOM_METHOD=m
--- 44 ----
! # CONFIG_ACPI_CUSTOM_METHOD is not set
***************
*** 43 ****
--- 46 ----
+ # CONFIG_ACPI_REDUCED_HARDWARE_ONLY is not set
***************
*** 54 ****
! CONFIG_ACPI_APEI_ERST_DEBUG=m
--- 57 ----
! # CONFIG_ACPI_APEI_ERST_DEBUG is not set
***************
*** 64 ****
! CONFIG_X86_ACPI_CPUFREQ=m
--- 67 ----
! CONFIG_X86_ACPI_CPUFREQ=y
***************
*** 93 ****
! CONFIG_XEN_ACPI_PROCESSOR=m
--- 96 ----
! CONFIG_XEN_ACPI_PROCESSOR=y
***************
*** 98 ****
! # CONFIG_THINKPAD_ACPI_DEBUGFACILITIES is not set
--- 101 ----
! CONFIG_THINKPAD_ACPI_DEBUGFACILITIES=y

Some differences there. Not really sure what to make of all this.

EDIT: I'd tinker around to see if any of these suddenly make the Arch stock kernel work on my hardware, but at the moment I don't have much time. Maybe on the weekend.

Last edited by Lizreu (2022-06-23 09:58:06)

#9 2022-06-23 14:08:07

seth
Member
Registered: 2012-09-03
Posts: 29,780

Re: [Partially solved] Stock Arch kernels rebooting mid-boot

Nothing there really sticks out - try testing the default kernel w/ the xanmod config to see whether it's the config at all.
loqs, being bored, kindly already provided such a kernel for you :)
