I am trying to get nvidia-xrun to work on my laptop. It works, but only once. After manually repeating the steps from the script, I narrowed the problem down to 'modprobe' not being able to find the card after it has been removed from the PCI bus and then rescanned. The shortest (non-)working example is the following:
Just after boot and login the card is visible and nvidia is not loaded:
$ lspci | grep NVIDIA
01:00.0 3D controller: NVIDIA Corporation GP107M [GeForce GTX 1050 Ti Mobile] (rev a1)
$ lsmod | grep nvidia
At this point I can load and unload the nvidia module without problems any number of times I like:
$ sudo modprobe nvidia
$ lsmod | grep nvidia
nvidia 34144256 0
$ sudo modprobe -r nvidia
$ lsmod | grep nvidia
The dmesg log is:
[ 227.844178] nvidia: loading out-of-tree module taints kernel.
[ 227.844187] nvidia: module license 'NVIDIA' taints kernel.
[ 227.844188] Disabling lock debugging due to kernel taint
[ 227.853395] nvidia: module verification failed: signature and/or required key missing - tainting kernel
[ 227.866409] nvidia-nvlink: Nvlink Core is being initialized, major device number 234
[ 229.038560] nvidia 0000:01:00.0: enabling device (0006 -> 0007)
[ 229.155129] NVRM: loading NVIDIA UNIX x86_64 Kernel Module 460.39 Thu Jan 21 21:54:06 UTC 2021
The nvidia-xrun script then manages power to the nvidia card through the PCI entries in sysfs. Here 0000:01:00.0 is the address of the card and 0000:00:01.0 is the PCI controller it is attached to. To power it off:
$ sudo tee /sys/bus/pci/devices/0000:01:00.0/remove <<<1
$ lspci | grep NVIDIA
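For anyone who wants to script the reproduction, the removal step could be wrapped roughly as follows. This is only a sketch: `PCI_ROOT`, `gpu_present` and `gpu_remove` are names made up for illustration and are not part of nvidia-xrun. The only real interface used is the sysfs `remove` attribute from the command above, and the function has to run as root.

```shell
#!/bin/sh
# Hypothetical helper, not taken from nvidia-xrun.
# PCI_ROOT is an illustrative variable; it defaults to the real sysfs path
# but can be pointed at a scratch directory for a dry run.
PCI_ROOT="${PCI_ROOT:-/sys/bus/pci}"

# gpu_present ADDR: succeeds if the device still has an entry under devices/.
gpu_present() {
    [ -e "$PCI_ROOT/devices/$1" ]
}

# gpu_remove ADDR: write 1 to the device's remove attribute if it is present.
gpu_remove() {
    if gpu_present "$1"; then
        echo 1 > "$PCI_ROOT/devices/$1/remove"
    else
        echo "device $1 is not present" >&2
        return 1
    fi
}
```

On the real system this would be called as `gpu_remove 0000:01:00.0`, after which `gpu_present 0000:01:00.0` should fail, matching the empty lspci output above.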
To power it on again:
$ sudo tee /sys/bus/pci/devices/0000:00:01.0/power/control <<<on
$ sudo tee /sys/bus/pci/rescan <<<1
$ sudo tee /sys/bus/pci/devices/0000:01:00.0/power/control <<<on
$ lspci | grep NVIDIA
01:00.0 3D controller: NVIDIA Corporation GP107M [GeForce GTX 1050 Ti Mobile] (rev a1)
However, now the modprobe command fails:
$ sudo modprobe nvidia
modprobe: ERROR: could not insert 'nvidia': No such device
The dmesg log is:
[ 813.452711] nvidia-nvlink: Nvlink Core is being initialized, major device number 234
[ 813.453102] nvidia 0000:01:00.0: enabling device (0000 -> 0003)
[ 813.453243] NVRM: The NVIDIA GPU 0000:01:00.0
NVRM: (PCI ID: 10de:1c8c) installed in this system has
NVRM: fallen off the bus and is not responding to commands.
[ 813.453313] nvidia: probe of 0000:01:00.0 failed with error -1
[ 813.453326] NVRM: The NVIDIA probe routine failed for 1 device(s).
[ 813.453326] NVRM: None of the NVIDIA devices were initialized.
[ 813.453478] nvidia-nvlink: Unregistered the Nvlink Core, major device number 234
All my further efforts to revive the card were futile, and only restarting the laptop helps. Does anyone have any advice on the matter?
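One check that may help narrow this down: a PCI device that does not answer on the bus reads back as all ones, so the first four bytes of its config space (vendor and device ID) come out as ff ff ff ff instead of the expected IDs. Below is a minimal sketch of such a check. `gpu_responds` is a made-up name, and I am assuming the sysfs `config` file of the device is readable at this point.

```shell
#!/bin/sh
# Hypothetical helper: gpu_responds CONFIG_FILE
# Succeeds if the first four bytes of the given PCI config space file
# (vendor ID and device ID) are readable and are not all 0xff.
gpu_responds() {
    # od prints the bytes as hex pairs; tr strips the spaces and the newline.
    ids=$(od -An -tx1 -N4 "$1" | tr -d ' \n')
    [ -n "$ids" ] && [ "$ids" != "ffffffff" ]
}
```

Usage after the rescan would be something like `gpu_responds /sys/bus/pci/devices/0000:01:00.0/config && echo responding || echo "not responding"`. If it reports "not responding" before modprobe is even attempted, the card was probably never powered back up properly, and the problem lies in the power-on steps rather than in the nvidia module.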