#1 2025-03-08 15:09:49

der-joel
Member
Registered: 2025-03-08
Posts: 3

[Solved] Nvidia GPU has fallen off the bus

Hey guys,

I switched to Arch as my main OS about half a year ago, and I must say it's the best Linux OS I've used so far.
However, one issue I just can't fix is that my system crashes while the sound keeps playing.
It occurs seemingly at random about twice a day, without being tied to one specific application or to heavy GPU load.
The system logs show: "79, GPU has fallen off the bus"
NVIDIA lists a couple of possible causes for this particular error on their website: https://docs.nvidia.com/deploy/xid-erro … or-listing.
I am pretty sure that the GPU itself is working fine, because I could not reproduce the crash on Windows (dual boot).
The NVIDIA drivers were installed following the official Arch Wiki.

I have searched the internet for a good while and tried the following fixes:
- Updated the drivers (pacman -Syu)
- Cleaned the PCIe slot and power supply
- Updated the BIOS
- Disabled ASPM
- Monitored the GPU temperature (around 65 degrees Celsius, seems normal)
- Set some recommended kernel parameters
- Reinstalled the NVIDIA drivers
- Switched from Wayland to X11

journalctl -b -1 -p "warning":

Mär 08 15:35:37 archlinux kwin_wayland[911]: kwin_wayland_drm: The main thread was hanging temporarily!
Mär 08 15:35:37 archlinux kernel: NVRM: GPU at PCI:0000:27:00: GPU-9ece4788-6d60-e616-f628-a2f7ba1d9229
Mär 08 15:35:37 archlinux kernel: NVRM: Xid (PCI:0000:27:00): 79, GPU has fallen off the bus.
Mär 08 15:35:37 archlinux kernel: NVRM: GPU 0000:27:00.0: GPU has fallen off the bus.
Mär 08 15:35:37 archlinux kernel: NVRM: A GPU crash dump has been created. If possible, please run
                                   NVRM: nvidia-bug-report.sh as root to collect this data before
                                   NVRM: the NVIDIA kernel module is unloaded.
Mär 08 15:35:37 archlinux kernel: NVRM: Xid (PCI:0000:27:00): 154, GPU recovery action changed from 0x0 (None) to 0x1 (GPU Reset Required)
Mär 08 15:35:37 archlinux kwin_wayland[911]: kwin_wayland_drm: Failed to create a framebuffer: Invalid argument
Mär 08 15:35:48 archlinux kwin_wayland[911]: kwin_wayland_drm: Failed to create a framebuffer: Invalid argument
Mär 08 15:35:48 archlinux kwin_wayland[911]: kwin_scene_opengl: A graphics reset attributable to the current GL context occurred.
Mär 08 15:35:48 archlinux kernel: [drm:nv_drm_gem_alloc_nvkms_memory_ioctl [nvidia_drm]] *ERROR* [nvidia-drm] [GPU ID 0x00002700] Failed to allocate NVKMS memory for GEM object
Mär 08 15:35:48 archlinux kernel: [drm:nv_drm_gem_alloc_nvkms_memory_ioctl [nvidia_drm]] *ERROR* [nvidia-drm] [GPU ID 0x00002700] Failed to allocate NVKMS memory for GEM object
Mär 08 15:35:48 archlinux kernel: [drm:nv_drm_gem_alloc_nvkms_memory_ioctl [nvidia_drm]] *ERROR* [nvidia-drm] [GPU ID 0x00002700] Failed to allocate NVKMS memory for GEM object
Mär 08 15:35:48 archlinux kwin_wayland[911]: kwin_wayland_drm: Checking test buffer failed!
Mär 08 15:35:48 archlinux kernel: [drm:nv_drm_gem_alloc_nvkms_memory_ioctl [nvidia_drm]] *ERROR* [nvidia-drm] [GPU ID 0x00002700] Failed to allocate NVKMS memory for GEM object
Mär 08 15:35:48 archlinux kernel: [drm:nv_drm_gem_alloc_nvkms_memory_ioctl [nvidia_drm]] *ERROR* [nvidia-drm] [GPU ID 0x00002700] Failed to allocate NVKMS memory for GEM object
Mär 08 15:35:48 archlinux kernel: [drm:nv_drm_gem_alloc_nvkms_memory_ioctl [nvidia_drm]] *ERROR* [nvidia-drm] [GPU ID 0x00002700] Failed to allocate NVKMS memory for GEM object
Mär 08 15:35:48 archlinux kernel: [drm:nv_drm_gem_alloc_nvkms_memory_ioctl [nvidia_drm]] *ERROR* [nvidia-drm] [GPU ID 0x00002700] Failed to allocate NVKMS memory for GEM object
Mär 08 15:35:48 archlinux kernel: [drm:nv_drm_gem_alloc_nvkms_memory_ioctl [nvidia_drm]] *ERROR* [nvidia-drm] [GPU ID 0x00002700] Failed to allocate NVKMS memory for GEM object
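
For anyone comparing logs, here is a minimal sketch for pulling just the Xid events out of kernel messages like the ones above. The function name xid_events is made up for illustration, and the pattern assumes the "NVRM: Xid (PCI:...): <code>," layout shown in this log.

```shell
# Sketch: read kernel log text on stdin and print one
# "<pci-address> <xid-code>" pair per NVRM Xid line.
# xid_events is a hypothetical helper name, not part of any NVIDIA tool.
xid_events() {
    sed -n 's/.*NVRM: Xid (PCI:\([0-9a-fA-F:.]*\)): \([0-9][0-9]*\),.*/\1 \2/p'
}

# Example usage against the kernel messages of the previous boot:
#   journalctl -k -b -1 --no-pager | xid_events
```

Run against the log above, this should list code 79 followed by code 154 for the same PCI address, which makes it easier to see whether 79 is always the first event.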

Specs:
Kernel: 6.13.5-arch1-1
DE: KDE Plasma 6.3.2
WM: kwin
DSP: wayland
GPU: NVIDIA GeForce GTX 970
Kernel Parameters: GRUB_CMDLINE_LINUX_DEFAULT="loglevel=3 nvidia_drm.modeset=1 nvidia_drm.fbdev=1 nvidia.NVreg_EnableGpuFirmware=0 nvidia.Nvreg_PreserveVideoMemoryAllocations=1 pcie_aspm=off"

pacman -Q | grep "nvidia\|vk\|vulkan":

lib32-nvidia-utils 570.124.04-1
lib32-opencl-nvidia 570.124.04-1
lib32-vkd3d 1.14-1
lib32-vulkan-icd-loader 1.4.304.1-1
nvidia 570.124.04-2
nvidia-settings 570.124.04-1
nvidia-utils 570.124.04-1
opencl-nvidia 570.124.04-1
vulkan-headers 1:1.4.304.1-2
vulkan-icd-loader 1.4.304.1-1
vulkan-tools 1.4.304.1-1

I'd really appreciate any ideas on how to fix this. I absolutely want to keep my current OS setup, but this is too much of a disturbance to ignore.
Thanks in advance!

Last edited by der-joel (2025-03-24 14:50:58)

#2 2025-03-08 18:04:01

xerxes_
Member
Registered: 2018-04-29
Posts: 922

Did you read this on the NVIDIA page you linked?

4.9. Xid 79: GPU has fallen off the bus

This event is logged when the GPU driver attempts to access the GPU over its PCI Express connection and finds that the GPU is not accessible.

This event is often caused by hardware failures on the PCI Express link causing the GPU to be inaccessible due to the link being brought down. Reviewing system event logs and kernel PCI event logs may provide additional indications of the source of the link failures.

This event may also be caused by failing GPU hardware or other driver issues.

So please post the full journalctl log: 'sudo journalctl -b'.
And when the system crashes, can you really not do anything (switch to a different VT, use SysRq)?
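
In case SysRq is new to you, a small sketch for checking whether it is enabled. The helper name sysrq_status is hypothetical; the meaning of the values (0 = off, 1 = everything, anything else = bitmask) follows the kernel's sysrq documentation.

```shell
# Sketch: interpret the value stored in /proc/sys/kernel/sysrq.
# 0 disables SysRq, 1 enables every function, other values are a bitmask
# of allowed functions. sysrq_status is a hypothetical helper name.
sysrq_status() {
    case "$1" in
        0) echo "disabled" ;;
        1) echo "all functions enabled" ;;
        *) echo "bitmask $1 (only some functions enabled)" ;;
    esac
}

# Example usage:
#   sysrq_status "$(cat /proc/sys/kernel/sysrq)"
# To enable all functions until the next reboot (needs root):
#   sysctl kernel.sysrq=1
```

If the keyboard still reacts to SysRq combinations after the crash, the kernel is alive and only the graphics stack is stuck.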

#3 2025-03-08 20:59:51

konstancja
Member
Registered: 2025-03-08
Posts: 2

I was having a similar issue with both the nvidia and nvidia-open drivers on version 570.124.04.

After some digging, it seems this issue may be fixed in 570.124.06, but that version is not in the repositories yet.

As a workaround, adding the following to my kernel command line solved the issue for me. It disables the GSP firmware; note that this is only an option with the proprietary nvidia package:

nvidia.NVreg_EnableGpuFirmware=0

After this is applied, the output of 'nvidia-smi -q' should show the following:

$ nvidia-smi -q | grep GSP
    GSP Firmware Version                  : N/A

#4 2025-03-09 14:13:09

der-joel
Member
Registered: 2025-03-08
Posts: 3

@xerxes_
I tried to switch to a different VT via keyboard shortcuts after the crash, but that did not work.
I had not heard of SysRq before. It sounds promising. I've enabled it and will try it when the next crash occurs.
Here's the full journalctl log.
There doesn't seem to be anything suspicious in it except for this line right before the crash:

Mär 08 15:35:37 archlinux kwin_wayland[911]: kwin_wayland_drm: The main thread was hanging temporarily!

@konstancja
I already set the parameter, since it was mentioned in another forum post. Sadly it did not fix the issue for me.

Thanks for your help!

Last edited by der-joel (2025-03-09 14:17:21)

#5 2025-03-09 16:57:29

seth
Member
Registered: 2012-09-03
Posts: 64,790

Did you forget to attach the dedicated power supply?
Did you select the proper PCIe slot (labeled "PEG")?

#6 2025-03-21 13:31:56

der-joel
Member
Registered: 2025-03-08
Posts: 3

After a bit of tinkering, the issue seems to be fixed. I've waited for a few days and the crash has not reappeared.
Here are the things I changed. I'm not sure which of them fixed it:
- Cleaned my case again, especially the GPU fan
- Changed boot parameters to this: GRUB_CMDLINE_LINUX_DEFAULT="loglevel=3 nvidia_drm.modeset=1 nvidia_drm.fbdev=1 nvidia.NVreg_EnableGpuFirmware=0 pcie_aspm=off"
- Rebuilt the boot partition from scratch and reinstalled the Linux kernel
- Re-installed GRUB
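
For anyone repeating these steps, a minimal sketch for listing which nvidia module parameters are present on a kernel command line. The helper name nvidia_cmdline_params is hypothetical. Parameter names are case-sensitive, so a listing like this makes a misspelled name easier to spot.

```shell
# Sketch: read a kernel command line on stdin and print every
# nvidia*.<param> entry on its own line.
# nvidia_cmdline_params is a hypothetical helper name.
nvidia_cmdline_params() {
    # Split on spaces/tabs, then keep tokens such as nvidia.X or nvidia_drm.X.
    # "|| true" keeps the exit status at 0 when nothing matches.
    tr -s ' \t' '\n\n' | grep -E '^nvidia[a-z_]*\.' || true
}

# Example usage against the running kernel:
#   nvidia_cmdline_params < /proc/cmdline
```

The values the driver actually uses can usually be cross-checked in /proc/driver/nvidia/params, assuming the driver exposes that file on your system.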

I also rewired the GPU power supply and measured the cables with a multimeter. That didn't seem to be the issue, although it could have fixed a loose contact or something similar.
Thanks to everyone who helped and provided tips. It's much appreciated!
I think this can be closed.

Last edited by der-joel (2025-03-24 14:51:24)

#7 2025-03-21 15:09:28

seth
Member
Registered: 2012-09-03
Posts: 64,790

\o/
Mark resolved threads by editing your initial post's subject - so others will know that there's no task left, but maybe a solution to find.
Thanks.
