Hello,
I use an Intel Hades Canyon NUC (NUC8i7HNK) on a daily basis, and with recent kernels the boot hangs right after POST.
After successful POST, the monitor(s) go black and peripheral lighting turns off. The device does not accept any input. The following kernel versions have caused the issue:
- 5.18.15-arch1-2
- 5.18.16
- 5.19
I'm currently on 5.18.14, which works without issue (update: 5.18.15-arch1-1 also works).
Some general specifications of my system are:
- CPU = Intel i7-8705G
- GPU = AMD ATI Radeon RX Vega M GL (amdgpu driver)
- GPU = Intel HD Graphics 630 (only for optional hardware acceleration)
Things I have tried include:
- Flashing the BIOS
- Adding the spectre_v2=off kernel parameter
I thought I would post this in case anyone runs into this issue. The way to recover the system is by mounting the broken system from inside the Arch USB installer:
mount /dev/<root_partition> /mnt
mount /dev/<boot_partition> /mnt/boot
mount -t proc proc /mnt/proc
mount -t sysfs sys /mnt/sys
mount -o bind /dev /mnt/dev
where
/dev/<{root,boot}_partition>
are the respective partition names on the hard drive (the root partition is the one holding the installed system). For me they were /dev/nvme0n1p1 and /dev/nvme0n1p3 but this will depend on your system. Then, enter the system with
chroot /mnt
Finally, roll back the kernel by installing the good version from pacman cache, like
pacman -U /var/cache/pacman/pkg/linux-<version>.pkg.tar.zst
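To see which kernel versions are still in the cache before picking one, something like the following might help. It is only a minimal sketch: cached_kernels is a hypothetical helper name, it assumes the default cache directory unless another is passed, and it relies on sort -V (GNU coreutils), which only approximates pacman's own version ordering.

```shell
# List cached "linux" kernel packages, oldest version first.
# Assumption: the cache is /var/cache/pacman/pkg unless a directory is given.
cached_kernels() {
    cache_dir="${1:-/var/cache/pacman/pkg}"
    # The glob matches only the main kernel package: linux-headers, linux-lts
    # and the detached .sig files do not fit the pattern.
    for f in "$cache_dir"/linux-[0-9]*.pkg.tar.zst; do
        [ -e "$f" ] && printf '%s\n' "${f##*/}"
    done | sort -V
}
```

Inside the chroot, running cached_kernels with no argument lists the candidates that can be passed to pacman -U.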
Last edited by adigitoleo (2022-09-03 08:35:03)
Offline
Can you post the full journal from the failed boot (e.g. journalctl -b -N, where N is 1, 2 or more)? You should have it, but you will need to find which boot it was. It should be recognizable by the unclean shutdown, unless the root partition failed to mount under the newer kernel and no logs were written.
Offline
Unfortunately, there is no journal entry; it seems the boot fails quite early. Or perhaps the partitions failed to mount, as you said.
Offline
Does linux 5.19, currently in testing, also fail to boot? See Arch_Linux_Archive#How_to_downgrade_one_package on the wiki:
# pacman -U https://archive.archlinux.org/packages/l/linux/linux-5.19.arch1-1-x86_64.pkg.tar.zst
Edit:
If it is still present in 5.19, can you try the kernel from https://bbs.archlinux.org/viewtopic.php … 7#p2048677
Last edited by loqs (2022-08-06 23:53:34)
Offline
Does linux 5.19 [...]
Issue persists with 5.19, no logs. The 5.18.12-00114-g48fda9af1df9 that you linked to works fine. I think I was upgrading kernels fairly regularly of late, so anything before 5.18.14 would probably work. Does this new info help, or do I need to bisect? I'll probably need some guidance with that.
Offline
$ git bisect start
status: waiting for both good and bad commits
$ git bisect bad v5.18.15
status: waiting for good commit(s), bad commit known
$ git bisect good v5.18.14
Bisecting: 79 revisions left to test after this (roughly 6 steps)
[6edb818732fc05fda495f5b3a749bd1cee01398b] iavf: Fix handling of dummy receive descriptors
https://drive.google.com/file/d/1HOVcyR … sp=sharing linux-5.18.14.r79.g6edb818732fc-1-x86_64.pkg.tar.zst
https://drive.google.com/file/d/1XWepGk … sp=sharing linux-headers-5.18.14.r79.g6edb818732fc-1-x86_64.pkg.tar.zst
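The "roughly N steps" figure follows from halving the candidate range on every test. A small sketch of that arithmetic, assuming plain integer halving (bisect_steps is an illustrative name, and git's own estimate may differ slightly for other inputs):

```shell
# Count how many times a range of N revisions can be halved before one is left.
# Assumption: simple integer halving, not git's exact internal estimate.
bisect_steps() {
    n="$1"
    steps=0
    while [ "$n" -gt 1 ]; do
        n=$((n / 2))
        steps=$((steps + 1))
    done
    echo "$steps"
}
```

For example, bisect_steps 79 prints 6, in line with the estimate git reported above.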
Offline
https://drive.google.com/file/d/1HOVcyR … sp=sharing linux-5.18.14.r79.g6edb818732fc-1-x86_64.pkg.tar.zst
This one still works. OK, I've cloned the Arch Linux kernel source from github.com/archlinux/linux and am bisecting from there. I used /proc/config.gz to create the .config file and I'll use
make olddefconfig
make
to build the kernels. Just posting to double check if that is correct or if anything else is required.
Edit: Build failed, looks like I'm missing some kind of header data, let me read through the wiki again. OK silly me, I forgot to check the build deps in the PKGBUILD.
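For reference, the config step described above could look roughly like this. It is a sketch under assumptions: seed_kernel_config is a hypothetical helper name, the running kernel must expose /proc/config.gz, and the first argument is taken to be the kernel source tree.

```shell
# Seed a kernel .config from a gzip-compressed config file.
# Assumptions: the source tree defaults to the current directory and the
# compressed config defaults to /proc/config.gz of the running kernel.
seed_kernel_config() {
    src_tree="${1:-.}"
    config_gz="${2:-/proc/config.gz}"
    gzip -dc "$config_gz" > "$src_tree/.config"
}

# Typical use inside a checked-out kernel tree (not run here):
#   seed_kernel_config .
#   make olddefconfig      # take defaults for options new to this revision
#   make -j"$(nproc)"      # build with one job per CPU
```

The build dependencies listed in the PKGBUILD still need to be installed before make will succeed.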
Last edited by adigitoleo (2022-08-09 10:46:48)
Offline
git bisect good
Bisecting: 39 revisions left to test after this (roughly 5 steps)
[52ee7f5c4811ce6be1becd14d38ba1f8a8a0df81] tcp: Fix data-races around sysctl_tcp_recovery.
https://drive.google.com/file/d/1MkhqG_ … sp=sharing linux-5.18.14.r119.g52ee7f5c4811-1-x86_64.pkg.tar.zst
https://drive.google.com/file/d/1Pd39i4 … sp=sharing linux-headers-5.18.14.r119.g52ee7f5c4811-1-x86_64.pkg.tar.zst
This is the PKGBUILD I have been building from
https://drive.google.com/file/d/1ci7mLM … sp=sharing linux-5.18.14.r119.g52ee7f5c4811-1.src.tar.gz
Offline
linux-5.18.14.r119.g52ee7f5c4811-1-x86_64.pkg.tar.zst
That one also works. I can't seem to find that commit though to start from there. Are you using the arch linux fork or just upstream kernel source?
Offline
It is from the upstream stable kernel tree, which Arch imports directly, with any Arch-specific changes added on top.
https://github.com/archlinux/linux/comm … 8.15-arch1 -> Older 52ee7f5 tcp: Fix data-races around sysctl_tcp_recovery.
https://git.kernel.org/pub/scm/linux/ke … f8a8a0df81
git bisect good
Bisecting: 19 revisions left to test after this (roughly 4 steps)
[5f2d2c2af16f3b1d7e16b1a8af37bda561b282a1] dlm: fix pending remove if msg allocation fails
https://drive.google.com/file/d/1kkZRXR … sp=sharing linux-5.18.14.r139.g5f2d2c2af16f-1-x86_64.pkg.tar.zst
https://drive.google.com/file/d/1G_-4QV … sp=sharing linux-headers-5.18.14.r139.g5f2d2c2af16f-1-x86_64.pkg.tar.zst
Have you tried adding the kernel parameter mitigations=off to one of the bad kernels?
Edit:
If the above is good, try this one next:
https://drive.google.com/file/d/1sj6Eh_ … sp=sharing linux-5.18.14.r149.gdd5663fc13b9-1-x86_64.pkg.tar.zst
https://drive.google.com/file/d/1WqlAUR … sp=sharing linux-headers-5.18.14.r149.gdd5663fc13b9-1-x86_64.pkg.tar.zst
Last edited by loqs (2022-08-10 02:40:52)
Offline
Both of those revisions (r139, r149) also work.
Have you tried adding the kernel parameter mitigations=off to one of the bad kernels?
Tried that with 5.19.1 and still got the same black screen.
Offline
git bisect good
Bisecting: 4 revisions left to test after this (roughly 2 steps)
[49cbb4820e4f1895130755732485afb2d18508f9] watchqueue: make sure to serialize 'wqueue->defunct' properly
https://drive.google.com/file/d/19I4dOP … sp=sharing linux-5.18.14.r154.g49cbb4820e4f-1-x86_64.pkg.tar.zst
https://drive.google.com/file/d/1IckiRn … sp=sharing linux-headers-5.18.14.r154.g49cbb4820e4f-1-x86_64.pkg.tar.zst
Edit:
The bisection is getting very close to 5.18.15 so check if that is good:
https://drive.google.com/file/d/1hpZAql … sp=sharing linux-headers-5.18.15-1-x86_64.pkg.tar.zst
https://drive.google.com/file/d/1uhiQYO … sp=sharing linux-5.18.15-1-x86_64.pkg.tar.zst
Last edited by loqs (2022-08-15 02:51:05)
Offline
linux-5.18.14.r154.g49cbb4820e4f-1-x86_64.pkg.tar.zst
Also works. Thanks for going through this. I haven't installed the headers, by the way, but that's not necessary if I'm just using amdgpu, right?
Offline
git bisect good
Bisecting: 2 revisions left to test after this (roughly 1 step)
[a8f27ccc12baca1394dacd2c21e5bd7bafba3853] ASoC: SOF: pm: add definitions for S4 and S5 states
https://drive.google.com/file/d/1wAyK73 … sp=sharing linux-headers-5.18.14.r156.ga8f27ccc12ba-1-x86_64.pkg.tar.zst
https://drive.google.com/file/d/1rRzBBy … sp=sharing linux-5.18.14.r156.ga8f27ccc12ba-1-x86_64.pkg.tar.zst
Please also see my edit to post #12
Offline
That works, and the 5.18.15 you shared also works (strange, I thought 5.18.15 caused the issue before). However, a 5.18.16 which I had in the pacman cache still causes the issue. Thanks for the help. I think I know how to track it down now: I'll just bisect between those two versions. Surely I'll find the bad commit at some point.
Offline
I would suggest you check 5.18.15.arch1-1 from your system's cache or the ALA before starting on the bisect to confirm that is good.
Offline
5.18.15-arch1-1 works but 5.18.15-arch1-2 does not. That narrows it down a bit. I should have tried those two before, apologies.
Offline
What if you add the boot parameter memtest=0?
Last edited by loqs (2022-08-16 02:18:15)
Offline
What if you add the boot parameter memtest=0?
It works! Now I'm on 5.19.6-arch1-1. Does that mean that I have bad RAM?
I suppose the other option is that the compile flag for enabling memtest was added to the Arch kernel config at some point?
Oh, I found it: https://github.com/archlinux/svntogit-p … 58c0b97976
So memtest wasn't being run before, and then they turned it on, which would look like an unresponsive boot for a few hours, right? Anyway, I'll probably leave it disabled for now by deleting the boot parameter. I can always run memtest when I have time.
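For anyone wanting to check their own system, a quick look at the kernel config and the command line might be enough. This is a minimal sketch: memtest_status is a hypothetical helper name, and it assumes the running kernel exposes /proc/config.gz.

```shell
# Report whether CONFIG_MEMTEST is built in and whether a memtest parameter
# is present on the kernel command line.
# Assumptions: defaults point at the running kernel; both paths can be overridden.
memtest_status() {
    config_gz="${1:-/proc/config.gz}"
    cmdline_file="${2:-/proc/cmdline}"
    if gzip -dc "$config_gz" 2>/dev/null | grep -q '^CONFIG_MEMTEST=y$'; then
        echo "config: CONFIG_MEMTEST enabled"
    else
        echo "config: CONFIG_MEMTEST not enabled"
    fi
    # Split the command line into one parameter per line and keep the last
    # memtest entry, with or without a value.
    param=$(tr ' ' '\n' < "$cmdline_file" | grep -E '^memtest(=|$)' | tail -n 1)
    echo "cmdline: ${param:-no memtest parameter}"
}
```

Run without arguments on the affected machine, it shows both pieces of information at a glance.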
Last edited by adigitoleo (2022-09-03 08:35:29)
Offline