I run Arch Linux on an ASUS TUF Gaming A16 (FA607PV, which has an R9 7940HX and an NVIDIA 4060), and I'm getting random kernel panics on it. (The Caps Lock light suddenly starts blinking and the screen freezes.)
I have tried to get the crash log from journalctl, but it's not there (maybe systemd is unable to write it down). The system also doesn't reboot automatically, so I have to force a reboot (which means /proc/last_kmsg just doesn't exist).
I've just enabled systemd-pstore and kdumpst; maybe they will capture the error log for me (but since this kernel panic is really random, it may take some time before I get one).
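For anyone waiting on the same kind of log, here is a minimal sketch for checking after a forced reboot whether a panic record was saved. It assumes the usual locations: /sys/fs/pstore for the kernel's pstore filesystem, and /var/lib/systemd/pstore, where systemd-pstore archives records by default. The function name list_pstore_records is just illustrative, and reading these directories may need root.

```python
from pathlib import Path

# Assumed default locations; adjust if your setup stores records elsewhere.
PSTORE_DIRS = ["/sys/fs/pstore", "/var/lib/systemd/pstore"]


def list_pstore_records(dirs=PSTORE_DIRS):
    """Return a sorted list of files found under the given pstore directories."""
    records = []
    for d in dirs:
        root = Path(d)
        try:
            if not root.is_dir():
                continue  # directory missing: nothing saved there
            records.extend(p for p in root.rglob("*") if p.is_file())
        except PermissionError:
            # Not running as root; skip what we cannot read.
            continue
    return sorted(records)


if __name__ == "__main__":
    found = list_pstore_records()
    if found:
        for path in found:
            print(path)
    else:
        print("no pstore records found")
```

An empty result only means nothing was captured (or nothing is readable); it does not prove that no panic happened.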
Also, the kernel panic seems to be related to the memory (I bought this laptop with two 32 GB modules preinstalled by ASUS): memtester and memtest86+ always report the same error in every loop, but the Windows memory tester never found any faults on them (I installed these modules in another computer).
Has anyone had a similar issue?
Also, the NVIDIA driver's nv_queue always panics on every boot, and I can't even blacklist it (it doesn't appear in my module list after boot, and neither a modprobe blacklist nor a cmdline blacklist worked for it).
The nv_queue log is the same as the one in this post: https://bbs.archlinux.org/viewtopic.php?id=297997
Does it have any effect on my system, or is there any way to fix this?
(My NVIDIA RTX 4060 Mobile card seems to work well with ollama.)
Last edited by fish4terrisa-MSDSM (2024-08-15 02:52:40)
These crashes seem to be related to the problem noted here: https://forums.developer.nvidia.com/t/nvidia-driver-kernel-random-call-trace/302487/5
However, the crashes still occur, just at a somewhat lower rate.
Maybe that's caused by the nv_queue crashes, so I'll try to bisect the kernel to find the problem (no nv_queue crashes found in linux-lts).
Found out that it seems to be a C-state or PCI MSI problem.
With rcu_nocbs=0-31 processor.max_cstate=5 pci=nomsi added to the kernel command line, it's stable now.
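If you try the same workaround, a quick way to confirm the parameters actually took effect after a reboot is to compare them against /proc/cmdline. This is only a rough sketch: the helper name missing_params is made up for illustration, and the 0-31 range matches my CPU, so adjust it for yours.

```python
from pathlib import Path

# Parameters quoted in the post above; the CPU range is specific to that machine.
WANTED = ["rcu_nocbs=0-31", "processor.max_cstate=5", "pci=nomsi"]


def missing_params(cmdline, wanted=WANTED):
    """Return the wanted parameters that are absent from a kernel command line."""
    present = set(cmdline.split())
    return [p for p in wanted if p not in present]


if __name__ == "__main__":
    proc = Path("/proc/cmdline")
    if proc.exists():
        missing = missing_params(proc.read_text())
        if missing:
            print("missing:", " ".join(missing))
        else:
            print("all parameters active")
    else:
        print("/proc/cmdline not available on this system")
```

This does an exact token match, so rcu_nocbs=0-15 would be reported as missing even though a similar option is set.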
Closed as solved