#1 2022-08-05 17:27:25

Brudhu
Member
Registered: 2022-08-04
Posts: 2

[SOLVED] GPU has fallen off the bus

Hi!

Edit: the problem turned out to be hardware and was fixed after System76 replaced the main board.

I bought a System76 Gazelle laptop (with an RTX 3060 6GB GPU) about a month ago. It came with Ubuntu, and the first thing I did was install a larger SSD in the second slot and install Arch Linux on it (using the archinstall script).

The thing is: around twice a day the UI freezes, the fans go to full speed, and I have to long-press the power button to reset it. Switching to another virtual console with CTRL + ALT + F2 doesn't work either. It doesn't matter whether the GPU is heavily loaded or not (I ran some GPU benchmarks from the Arch Wiki (Benchmarking) and it never froze while running them).

I checked the logs and I get the following message:

Aug 04 11:59:41 archlinux kernel: NVRM: A GPU crash dump has been created. If possible, please run
                                  NVRM: nvidia-bug-report.sh as root to collect this data before
                                  NVRM: the NVIDIA kernel module is unloaded.
Aug 04 11:59:41 archlinux kernel: [133B blob data]
Aug 04 11:59:41 archlinux kernel: NVRM: GPU 0000:01:00.0: GPU has fallen off the bus.
Aug 04 11:59:41 archlinux kernel: NVRM: Xid (PCI:0000:01:00): 79, pid='<unknown>', name=<unknown>, GPU has fallen off the bus.
Aug 04 11:59:41 archlinux kernel: NVRM: GPU at PCI:0000:01:00: GPU-1e54eaed-9458-efa9-3b9b-d332a9300303

When it happens, the log says to run nvidia-bug-report.sh as root to collect debug information, which I did; the output is attached to this post.
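
For anyone comparing their own logs, here is a minimal sketch that counts the Xid codes in a kernel log. The function name xid_summary is made up for illustration, and it assumes the lines look like the journalctl excerpt above (an "NVRM: Xid (PCI:...): <code>" prefix).

```shell
#!/usr/bin/env bash
# Hypothetical helper: read kernel log text on stdin and print one line
# per Xid code seen, as "<code> <count>", sorted by code.
xid_summary() {
    # Keep only the "NVRM: Xid (PCI:...): <code>" part of each matching line.
    grep -oE 'NVRM: Xid \(PCI:[0-9a-fA-F:.]+\): [0-9]+' \
        | awk '{ count[$NF]++ } END { for (c in count) print c, count[c] }' \
        | sort -n
}

# Example (assumes the journal from the previous boot was kept):
#   journalctl -k -b -1 | xid_summary
```

Since the machine needs a hard reset after the freeze, the relevant messages are usually in the previous boot rather than the current one, which is why the example reads boot -1. Run against the excerpt above, it would report a single code, 79.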

I'm using only the NVIDIA GPU, not the Intel one. I set it up by following the Arch Wiki (NVIDIA_Optimus).

System Info:

Operating System: Arch Linux
KDE Plasma Version: 5.25.4
KDE Frameworks Version: 5.96.0
Qt Version: 5.15.5
Kernel Version: 5.18.16-arch1-1 (64-bit)
Graphics Platform: X11
Processors: 20 × 12th Gen Intel® Core™ i7-12700H
Memory: 62,7 GiB of RAM
Graphics Processor: NVIDIA GeForce RTX 3060 Laptop GPU/PCIe/SSE2
Manufacturer: System76
Product Name: Gazelle
System Version: gaze17-3060-b

Kernel Parameters:
(I know some of them may not make sense, but I'm a desperate kernel newbie trying to make my laptop stop crashing)

[luvizotto@archlinux ~]$ cat /proc/cmdline
initrd=\intel-ucode.img initrd=\initramfs-linux.img root=PARTUUID=accf2e61-13f2-4dae-b824-4c5856c99913 rw intel_pstate=no_hwp rootfstype=ext4 ibt=off rcutree.rcu_idle_gp_delay=2 intel_idle.max_cstate=1 pcie_aspm=off nvidia-drm.modeset=1
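
A small sketch for checking which of these workaround parameters are actually active on the running kernel, since it is easy to edit the boot entry and mistype one or forget to reboot. The helper has_kernel_param is hypothetical (not part of any package), and it does not handle quoted parameters that contain spaces.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed if the exact parameter "$1" is on the
# kernel command line. The optional second argument is a file to read
# instead of /proc/cmdline, so a saved copy can be checked as well.
has_kernel_param() {
    # Split on whitespace, one parameter per line, then match a whole line.
    tr -s '[:space:]' '\n' < "${2:-/proc/cmdline}" | grep -qxF -- "$1"
}

# Example:
#   for p in pcie_aspm=off nvidia-drm.modeset=1 ibt=off; do
#       if has_kernel_param "$p"; then echo "set:     $p"
#       else echo "missing: $p"; fi
#   done
```

The match is on the whole parameter including its value, so pcie_aspm=off is found but a bare pcie_aspm is not.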

NVIDIA Info:
Driver Version: 515.65.01

[luvizotto@archlinux ~]$ systemctl list-unit-files | grep nvidia
nvidia-hibernate.service                                                      enabled         disabled
nvidia-persistenced.service                                                   enabled         disabled
nvidia-powerd.service                                                         disabled        disabled
nvidia-resume.service                                                         enabled         disabled
nvidia-suspend.service                                                        disabled        disabled

What I've tried so far:

  • Using linux-lts + nvidia-lts

  • Multiple combinations of kernel parameters (the current config is the best I've found; before that I would get a crash every 2-3 hours)

  • Limiting GPU clocks with

    nvidia-smi -lgc 300,1500

Important information:
It works if I use the original Ubuntu that came installed on the laptop, which should mean it's not a hardware problem (?), but I love Arch and really want this to be fixed.

Any help debugging this would be highly appreciated. Any idea what the problem could be? Please let me know if I missed any important information.

Attachments:
journalctl log example 1
journalctl log example 2
nvidia-bug-report.sh example 1
nvidia-bug-report.sh example 2

Last edited by Brudhu (2022-09-26 14:22:09)

#2 2022-09-26 09:45:11

gustaw.daniel@gmail.com
Member
Registered: 2022-09-26
Posts: 1

Re: [SOLVED] GPU has fallen off the bus

I bought a Hyperbook (from a Polish company similar to System76 that uses the same Clevo case) with an RTX 3060 on the PD5x_7xPNP_PNN_PNT motherboard.

I can't start X on Arch Linux. I tried both with and without the graphics card:
https://wiki.archlinux.org/title/NVIDIA_Optimus

but right now I am only able to run the system in text mode.

There are topics that describe a similar problem:

https://forums.developer.nvidia.com/t/5 … 0ti/228954
https://bugs.archlinux.org/task/75995

#3 2022-09-26 14:20:06

Brudhu
Member
Registered: 2022-08-04
Posts: 2

Re: [SOLVED] GPU has fallen off the bus

Good to know, thanks for that.

In my case it actually was a hardware problem. After some time I was able to reproduce the issue on the stock Ubuntu that came with the laptop. I ended up sending the laptop back to System76 for repair; they replaced the main board, and the problem seems to be fixed since then.
