Hello.
I have a dual-dGPU setup: an RTX 4060 and a GTX 970. The 4060 does the rendering and the 970 drives my old display over DVI-D. For some reason, I recently started running into this issue from time to time: the 970 starts artifacting (the 4060 outputs just fine), the system freezes soon after, and after a reboot the 970 is still artifacting and this error is shown on the 4060 output:
[ OK ] Reached target Graphical Interface.
[ OK ] Mounted /mnt/coldest.
[ 1244.666366] nvidia-modeset: ERROR: GPU:1: Idling display engine timed out: 0x0000957d:0:0:417
I’d think the 970 had finally got cooked after years and years of service, but no: booting into Windows 11 and then back into Linux temporarily fixes the issue. Any advice? I have no idea how to fix this. I'm using the proprietary nvidia driver (because nvidia-open doesn’t support the 970), latest version as of right now.
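To pin down when the timeout shows up relative to updates, it may help to count how often the message appears per boot. A minimal sketch, assuming the message text stays the same as in the line quoted above; `count_modeset_timeouts` is a made-up helper name, not part of any driver tooling:

```shell
# Hypothetical helper: count nvidia-modeset display-engine timeouts
# in kernel log text read from stdin.
count_modeset_timeouts() {
    # grep -c prints the number of matching lines; it exits non-zero
    # when the count is 0, so "|| true" keeps the helper's status clean.
    grep -c 'nvidia-modeset: ERROR: .*Idling display engine timed out' || true
}

# Usage on a live system (needs permission to read the journal):
#   journalctl -k -b -1 | count_modeset_timeouts   # previous boot
#   journalctl -k -b    | count_modeset_timeouts   # current boot
```

Comparing the counts against the dates in /var/log/pacman.log might narrow down which update, if any, lines up with the first occurrence.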
Offline
This is one of those cases where I don't see the benefit of using the 970 for any workload if you actually have access to the newer GPU. Get yourself an adapter cable for the old display protocol and hook it into the new card. DVI did have some issues in a few newer drivers, but afaik these should be fixed by now; it wouldn't work at all if it was still that issue.
Last edited by V1del (2024-11-29 16:08:37)
Online
I also use the 970 for virtual machines with passthrough (either to share my PC with a friend, or to run an older OS which doesn't have drivers for the 4060), so I'd much rather keep it.
It had been working flawlessly since February of this year, when I installed Arch, up until 14th of November, when it first did this.
Plus, booting into Windows to fix the issue seems to imply that it's not even hardware related if it works just fine there and on Linux afterwards too.
Offline
When did this start? After a certain kernel/nvidia driver update? I could see some hiccup with the fbdev enablement or firmware issues if a reboot from Windows works, so try the kernel parameters
nvidia_drm.modeset=1 nvidia_drm.fbdev=0 nvidia.NVreg_EnableGpuFirmware=0
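After rebooting it is worth confirming that the parameters actually reached the kernel. A minimal sketch, assuming they are passed on the kernel command line exactly as written above; `check_cmdline` is a hypothetical helper name:

```shell
# Hypothetical helper: report which of the suggested parameters appear
# in a kernel command line string passed as the first argument.
check_cmdline() {
    cmdline=$1
    for param in nvidia_drm.modeset=1 nvidia_drm.fbdev=0 nvidia.NVreg_EnableGpuFirmware=0; do
        # Pad with spaces so only whole, space-separated words match.
        case " $cmdline " in
            *" $param "*) echo "set:     $param" ;;
            *)            echo "missing: $param" ;;
        esac
    done
}

# On a running system:
#   check_cmdline "$(cat /proc/cmdline)"
```

This only checks the command line; options set through a modprobe.d file would not show up here.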
Online
I unfortunately do not remember, since it didn't start immediately after an update but at some point during runtime later.
I do not have fbdev and GPU firmware enabled, here are my kernel parameters:
rd.luks.name=7e8b3f21-2d15-46ec-8682-951972ede506=root root=/dev/mapper/root rw rootflags=subvol=/@ intel_iommu=on iommu=pt rd.driver.pre=vfio-pci nvidia_drm.modeset=1
But I guess I can try explicitly setting them to 0 and run with these for some time to see if the issue reappears.
Offline
That's one thing I know has changed in the driver in the last few releases: fbdev got enabled by default, as did the GSP firmware.
Online
Well, that'd make all the sense, thank you. I'll test it.
Offline
After testing for a bit, disabling GSP firmware didn't help (artifacting still occurs), and disabling fbdev leads to a black screen after SDDM. I guess I'm out of luck.
Offline
Disabling fbdev will leave you w/ the simplydumb device on 6.12 kernels, the nvidia_drm.modeset=1 hack was removed.
This has caused multiple issues already, https://gitlab.archlinux.org/archlinux/ … /issues/94
Try the behavior w/ the LTS kernel
Offline
I'm using UKIs, how would I go about safely adding the LTS kernel?
From what I can tell from the wiki, I need to install the package, navigate to /etc/mkinitcpio.d/linux-lts.preset, and edit it accordingly.
As for Nvidia drivers, I'd just replace nvidia with nvidia-dkms.
Is this correct or am I missing something? Is there a way to check for packages that depend on a specific kernel so I can replace them with dkms versions?
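For reference, a UKI preset for the LTS kernel might look roughly like the one printed below. This is only a sketch based on the layout of the stock mkinitcpio presets: the `/efi` mount point and the output file names are assumptions, and `print_lts_preset` is a made-up helper that merely prints the text so it can be reviewed before anything is written to /etc/mkinitcpio.d/linux-lts.preset.

```shell
# Hypothetical helper: print a candidate UKI preset for linux-lts.
# Assumes the ESP is mounted at /efi; adjust the paths to your layout.
print_lts_preset() {
    cat <<'EOF'
ALL_kver="/boot/vmlinuz-linux-lts"
PRESETS=('default' 'fallback')
default_uki="/efi/EFI/Linux/arch-linux-lts.efi"
fallback_uki="/efi/EFI/Linux/arch-linux-lts-fallback.efi"
fallback_options="-S autodetect"
EOF
}

# Compare against what is currently installed:
#   print_lts_preset | diff - /etc/mkinitcpio.d/linux-lts.preset
```

With mkinitcpio-built UKIs the kernel command line is normally read from /etc/kernel/cmdline, so the same parameters should apply to both kernels after running `mkinitcpio -P`.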
Offline
Any package that ships a kernel module; doing a
pacman -Qo /usr/lib/modules
will give you a list of packages that write something into that path. FWIW you could also opt for nvidia-lts instead of nvidia-dkms for the LTS kernel (and note that all DKMS variants require the appropriate linux-headers (or linux-lts-headers) package to be present before DKMS can build a module)
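If only the package names are wanted, the owner list can be trimmed down. A minimal sketch, assuming pacman's usual "<path> is owned by <package> <version>" output format; `owners_from_qo` is a hypothetical helper name:

```shell
# Hypothetical helper: extract package names from `pacman -Qo` output
# read from stdin. Each line is expected to look like
#   /usr/lib/modules/ is owned by linux 6.12.1.arch1-1
owners_from_qo() {
    # The package name is the second-to-last field, the version the last.
    awk '/ is owned by / { print $(NF-1) }'
}

# On an Arch system:
#   pacman -Qo /usr/lib/modules | owners_from_qo
```

Anything in that list that is tied to the `linux` package and has no `-lts` or `-dkms` counterpart installed would be a candidate to replace.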
Online
Thank you everyone, I managed to get the LTS kernel working, it was way easier than I thought.
Disabled both GSP and fbdev and logged into Plasma just fine:
[mine_diver@ABLPHA ~]$ sudo cat /sys/module/nvidia_drm/parameters/fbdev
N
[mine_diver@ABLPHA ~]$ cat /proc/driver/nvidia/params | grep EnableGpuFirmware
EnableGpuFirmware: 0
EnableGpuFirmwareLogs: 2
Time to see if it actually helps against the artifacts.
Offline