Hi folks,
I upgraded packages today and after a reboot Xorg now always runs on the Nvidia card. This keeps the fans spinning all the time, making the laptop noisy, and I suspect it adds a constant 3W to power consumption.
I recently switched from a bumblebee setup to PRIME with udev rules. It was working fine until today.
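For reference, the udev rules are basically the standard runtime power management ones from the NVIDIA README/wiki, roughly like this (file name approximate, quoting from memory):
# /etc/udev/rules.d/80-nvidia-pm.rules
# Enable runtime PM for the NVIDIA 3D controller on driver bind, disable it on unbind
ACTION=="bind", SUBSYSTEM=="pci", ATTR{vendor}=="0x10de", ATTR{class}=="0x030200", TEST=="power/control", ATTR{power/control}="auto"
ACTION=="unbind", SUBSYSTEM=="pci", ATTR{vendor}=="0x10de", ATTR{class}=="0x030200", TEST=="power/control", ATTR{power/control}="on"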
lspci output:
00:02.0 VGA compatible controller: Intel Corporation Comet Lake UHD Graphics (rev 04)
02:00.0 3D controller: NVIDIA Corporation TU117M [GeForce GTX 1650 Mobile / Max-Q] (rev a1)
Can you please suggest something to try here?
Last edited by pratclot (2025-05-18 10:44:14)
Offline
The running Xorg process is normal and will have been there while it was working as well. What actually changed? Which packages were part of the update? Check your pacman.log (the kernel and nvidia packages being likely candidates...)
Offline
Also check powertop, top and nvidia-smi to see which device/process actually loads the system/CPU/GPU.
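For example (flags from memory, double-check with nvidia-smi --help-query-gpu; note that running nvidia-smi itself briefly wakes the card):
powertop
nvidia-smi --query-gpu=power.draw,utilization.gpu --format=csv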
Offline
Hey, thanks for the tips!
Here are "nvidia" lines from the pacman log (I included a previous upgrade):
[2025-04-27T10:30:36+0200] [ALPM-SCRIPTLET] ==> dkms remove --no-depmod nvidia/570.133.07 -k 6.14.2-arch1-1
[2025-04-27T10:30:37+0200] [ALPM-SCRIPTLET] ==> dkms remove --no-depmod nvidia/570.133.07 -k 6.14.2-zen1-1-zen
[2025-04-27T10:30:37+0200] [ALPM-SCRIPTLET] Error! nvidia/535.54.03: Missing the module source directory or the symbolic link pointing to it.
[2025-04-27T10:30:38+0200] [ALPM-SCRIPTLET] Error! nvidia/535.54.03: Missing the module source directory or the symbolic link pointing to it.
[2025-04-27T10:30:49+0200] [ALPM] upgraded nvidia-utils (570.133.07-1 -> 570.144-1)
[2025-04-27T10:30:53+0200] [ALPM] upgraded opencl-nvidia (570.133.07-1 -> 570.144-1)
[2025-04-27T10:31:12+0200] [ALPM] upgraded libnvidia-container (1.17.5-1 -> 1.17.6-1)
[2025-04-27T10:31:19+0200] [ALPM] upgraded nvidia-container-toolkit (1.17.5-1 -> 1.17.6-1)
[2025-04-27T10:31:20+0200] [ALPM] upgraded nvidia-dkms (570.133.07-1 -> 570.144-1)
[2025-04-27T10:31:54+0200] [ALPM-SCRIPTLET] ==> dkms install --no-depmod nvidia/570.144 -k 6.14.4-zen1-1-zen
[2025-04-27T10:32:53+0200] [ALPM-SCRIPTLET] ==> dkms install --no-depmod nvidia/570.144 -k 6.14.4-arch1-1
[2025-04-27T10:34:42+0200] [ALPM] running 'nvidia-ctk-cdi.hook'...
[2025-04-27T10:34:42+0200] [ALPM-SCRIPTLET] WARNING: updating nvidia-utils version (570.133.07 -> 570.144) in /etc/cdi/nvidia.yaml using plain string substitution.
[2025-04-27T10:34:42+0200] [ALPM-SCRIPTLET] nvidia-ctk cdi generate --output="/etc/cdi/nvidia.yaml"
[2025-05-16T14:04:20+0200] [ALPM-SCRIPTLET] ==> dkms remove --no-depmod nvidia/570.144 -k 6.14.4-arch1-1
[2025-05-16T14:04:20+0200] [ALPM-SCRIPTLET] ==> dkms remove --no-depmod nvidia/570.144 -k 6.14.4-zen1-1-zen
[2025-05-16T14:04:55+0200] [ALPM] upgraded nvidia-utils (570.144-1 -> 570.144-3)
[2025-05-16T14:05:02+0200] [ALPM] upgraded opencl-nvidia (570.144-1 -> 570.144-3)
[2025-05-16T14:06:04+0200] [ALPM] upgraded nvidia-dkms (570.144-1 -> 570.144-3)
[2025-05-16T14:06:43+0200] [ALPM-SCRIPTLET] ==> dkms install --no-depmod nvidia/570.144 -k 6.14.6-zen1-1-zen
[2025-05-16T14:07:52+0200] [ALPM-SCRIPTLET] ==> dkms install --no-depmod nvidia/570.144 -k 6.14.6-arch1-1
[2025-05-16T14:09:44+0200] [ALPM] running 'nvidia-ctk-cdi.hook'...
I believe about 652 packages were updated, so I am not sure which others to look at. "xorg-server" does not seem to be on the list.
I downgraded linux-zen, rebooted, then downgraded linux-zen-headers, nvidia-utils and nvidia-dkms; the build process fails with these errors:
grep "error:" /var/lib/dkms/nvidia/570.144/build/make.log | sort | uniq
././common/inc/nv-linux.h:1203:9: error: implicit declaration of function ‘dma_is_direct’; did you mean ‘d_is_dir’? [-Wimplicit-function-declaration]
././common/inc/nv-linux.h:480:17: error: implicit declaration of function ‘ioremap_driver_hardened’ [-Wimplicit-function-declaration]
././common/inc/nv-linux.h:480:17: error: initialization of ‘void *’ from ‘int’ makes pointer from integer without a cast [-Wint-conversion]
././common/inc/nv-linux.h:500:11: error: implicit declaration of function ‘ioremap_cache_shared’; did you mean ‘ioremap_cache’? [-Wimplicit-function-declaration]
././common/inc/nv-linux.h:500:9: error: assignment to ‘void *’ from ‘int’ makes pointer from integer without a cast [-Wint-conversion]
././common/inc/nv-linux.h:533:11: error: implicit declaration of function ‘ioremap_driver_hardened_wc’ [-Wimplicit-function-declaration]
././common/inc/nv-linux.h:533:9: error: assignment to ‘void *’ from ‘int’ makes pointer from integer without a cast [-Wint-conversion]
././common/inc/nv-linux.h:707:12: error: implicit declaration of function ‘phys_to_dma’; did you mean ‘nv_phys_to_dma’? [-Wimplicit-function-declaration]
././common/inc/nv-mm.h:112:16: error: too many arguments to function ‘get_user_pages’; expected 4, have 8
././common/inc/nv-mm.h:112:47: error: passing argument 2 of ‘get_user_pages’ makes integer from pointer without a cast [-Wint-conversion]
././common/inc/nv-mm.h:112:60: error: passing argument 4 of ‘get_user_pages’ makes pointer from integer without a cast [-Wint-conversion]
././common/inc/nv-mm.h:218:20: error: too many arguments to function ‘get_user_pages’; expected 4, have 8
...
Of course nvidia-smi does not work after this ("NVIDIA-SMI has failed because it couldn't communicate with the NVIDIA driver"). On 6.14.6 with 570.144-3 (570.144-1 fails to build in the same way) it shows that Xorg is using the driver and not letting the card go to sleep (the laptop has an LED that stays white if the card is sleeping, and red if it is not):
Fri May 16 22:13:33 2025
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 570.144 Driver Version: 570.144 CUDA Version: 12.8 |
|-----------------------------------------+------------------------+----------------------+
| GPU Name Persistence-M | Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap | Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|=========================================+========================+======================|
| 0 NVIDIA GeForce GTX 1650 ... Off | 00000000:02:00.0 Off | N/A |
| N/A 45C P8 3W / 35W | 5MiB / 4096MiB | 0% Default |
| | | N/A |
+-----------------------------------------+------------------------+----------------------+
+-----------------------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=========================================================================================|
| 0 N/A N/A 2584 G /usr/lib/Xorg 4MiB |
+-----------------------------------------------------------------------------------------+
If I remember correctly Xorg was not there before, and the LED definitely stayed white all the time unless I ran a game.
I reinstalled 6.14.6, rebooted, and installed 570.144-3. After that nvidia-smi showed nothing using the card, and once it exited the LED turned white (it had stayed red all the time with the failed 570.144-1 installation). After a reboot, Xorg is back to using the card.
570.144-3 works with 6.14.4, but Xorg still uses the card.
Offline
the build process fails
gcc15 related, irrelevant here.
As V1del mentioned, it's perfectly normal that Xorg "uses" the GPU here; that will not preclude it from entering RTD3.
Running nvidia-smi, however, will definitely wake the GPU, so instead check /sys/bus/pci/devices/0000:02:00.0/power/runtime* as explained in the wiki.
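E.g. (these sysfs reads don't wake the card; adjust the PCI address to yours):
cat /sys/bus/pci/devices/0000:02:00.0/power/runtime_status
cat /sys/bus/pci/devices/0000:02:00.0/power/runtime_suspended_time
cat /sys/bus/pci/devices/0000:02:00.0/power/control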
Did you add NVreg_DynamicPowerManagement=0x02 and NVreg_EnableGpuFirmware=0 ("systool -vm nvidia")?
NB: if you have the nvidia modules in the initramfs, you'll have to recreate it after adding such modprobe configlets.
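A minimal configlet would look like this (file name is only an example):
# /etc/modprobe.d/nvidia-pm.conf
options nvidia NVreg_DynamicPowerManagement=0x02
options nvidia NVreg_EnableGpuFirmware=0
Then verify with "systool -vm nvidia" and, if necessary, rebuild the initramfs with "mkinitcpio -P".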
Offline
Hey seth,
Thank you for the hint!
I added "NVreg_EnableGpuFirmware=0" and now I get the same output as in https://bbs.archlinux.org/viewtopic.php … 7#p2181317: the GPU and the LED turn off after a few moments.
I did not have that option before, though, and it worked. Even better, the GPU would not wait a few moments to shut down as it does now; it would turn off immediately after, for example, a game was closed.
What I have now is Steam (the game runs from it) using the GPU all the time, which was not the case before. Actually, after the update Steam started to give me an "invisible" window, which is "fixable" by the steps here. I assume its webview just does not work with discrete graphics, but even with that setting it still would not leave the GPU alone (although it now has a "visible" window).
With Xorg alone I see the card turn off after about 20 seconds, but when Steam starts to use it, it stays on.
This is something I liked about bumblebee: nothing was able to use the Nvidia card because the module was not loaded. Now I have to "believe" it will work, and it still breaks.
Offline
The GSP is only used by default since some 56yxx driver (560? 565?)
Steam will use your dedicated GPU when it detects it; that's Steam-specific behaviour and you'll have to force it to use the IGP if you don't want that.
This cannot happen if you completely take the GPU off the bus and the driver out of the kernel - steam doesn't "see" that there's a GPU.
So… why do you not use bumblebee if you like that setup better?
Offline
Hey seth,
I stopped using bumblebee after this discussion. In short, pvkrun stopped working, so I had to find another way to play a game.
I installed nvidia-550xx-dkms; the behavior is the same with both 6.14.6 and 6.14.4, so I guess something else has changed. I tried to downgrade the intel packages, no luck yet.
In general, I think what V1del suggested may help find a solution, but I am not sure which packages to look at first; there are just too many.
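I guess I can narrow it down by grepping the log for graphics-related packages, something like:
grep -E "upgraded (xorg|xf86-video|mesa|libva|intel|vulkan)" /var/log/pacman.log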
Steam will use your dedicated GPU when it detects it; that's Steam-specific behaviour and you'll have to force it to use the IGP if you don't want that.
This was not the case before the update. I know it because I was able to keep Steam running all the time and its processes would never use Nvidia. nvidia-smi did not show it in the list of processes, nor did it show Xorg, and the LED was always white if a game was not running, so the fans were always quiet.
It is not possible that I am imagining this; there is an obvious physical change in my environment - loud fans and a red LED staring at me.
To boot, sometimes Firefox would jump on the GPU too now. That never happened before.
If it helps, there is another change I noticed. I use tbsm to start the graphical environment, so the laptop initially boots into a tty with a login prompt. During the whole boot before that the LED would stay red, but once the prompt appeared it would already be white. Once I logged in, tbsm would start X. That would turn on the GPU for some reason (i.e. the LED would become red), but nvidia-smi would show an empty list of processes, and once it exited the LED would immediately become white and always stay that way.
Now the LED shows red once the prompt is displayed, and it turns white after maybe 5 seconds. This tells me that the way the GPU would turn off changed. Roughly, before it would power off immediately after all processes quit using it, and now there is some "smart" analysis that determines when to power it down. It would be helpful to figure out what affects that.
Offline
This was not the case before the update.
This cannot happen if you completely take the GPU off the bus and the driver out of the kernel - steam doesn't "see" that there's a GPU.
To boot, sometimes Firefox would jump on the GPU too now. That never happened before.
Possibly ffmpeg for video acceleration; the animated wallpaper in plasma seems to do that.
This tells me that the way the GPU would turn off changed.
Yes, of course.
Instead of explicitly cutting it from the bus and unloading the driver, the GPU now detects that it is no longer used and moves into D3 - there's really nothing special about this.
__NV_PRIME_RENDER_OFFLOAD=0 env __GLX_VENDOR_LIBRARY_NAME=mesa __EGL_VENDOR_LIBRARY_FILENAMES=/usr/share/glvnd/egl_vendor.d/50_mesa.json VK_DRIVER_FILES=/usr/share/vulkan/icd.d/intel_icd.x86_64.json LIBVA_DRIVER_NAME=iHD
https://wiki.archlinux.org/title/PRIME# … _using_GPU
https://wiki.archlinux.org/title/Hardwa … ing_VA-API
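E.g. as a per-game Steam launch option (untested sketch; %command% is Steam's placeholder for the game's command line):
env __NV_PRIME_RENDER_OFFLOAD=0 __GLX_VENDOR_LIBRARY_NAME=mesa __EGL_VENDOR_LIBRARY_FILENAMES=/usr/share/glvnd/egl_vendor.d/50_mesa.json VK_DRIVER_FILES=/usr/share/vulkan/icd.d/intel_icd.x86_64.json LIBVA_DRIVER_NAME=iHD %command%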
Offline
I think I got it working as it was.
When I switched from bumblebee I commented out the blacklist rules for the Nvidia modules. I uncommented them now, and this particular change made Xorg unable to bind to the Nvidia card, so nvidia-smi is empty after login (the LED even stays white on its own; it turns red a couple of times, but only for a second).
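For reference, the configlet is roughly this (quoting from memory, the file name may differ):
# /etc/modprobe.d/blacklist-nvidia.conf
blacklist nvidia
blacklist nvidia_drm
blacklist nvidia_modeset
blacklist nvidia_uvm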
Now, Steam would still use the GPU for something. At some point I reverted the change from the Reddit link I posted; Steam would process something while nvidia-smi stayed empty (with a red LED), then calm down and the LED would turn white permanently.
One curiosity is how it worked without the blacklist rules on before the update, but I do not really care much for now as everything is back to normal (completely empty nvidia-smi).
Thanks for your ideas folks!
Offline
So you simply end up mimicking bumblebee by blacklisting the nvidia module (and nouveau through nvidia-utils).
Why don't you just disable the nvidia GPU entirely? I don't really get what you're trying to achieve here…
Offline
Not sure if I can call the current behavior "mimicking bumblebee". I do not have to call prime-run for Steam games, they just grab the Nvidia card when they need it, as if it were a Windows system.
I do have to call it for glxspheres64 though, so it could be something Steam-related.
lsmod shows no nvidia modules after the graphical session starts. If I use nvidia-smi once, the modules will be loaded and stick (nvidia, nvidia_uvm, nvidia_modeset), but the card turns off immediately after anything that used it exits.
With bumblebee I remember having to call optirun glxspheres64 to "activate" the Nvidia card so that I could use pvkrun. Otherwise it would use the Intel card since the nvidia modules were not loaded.
And now the modules are loaded at all times, yet the card is used only on demand. Autostart programs, such as Firefox, all stay away from the Nvidia card. To be fair, I could not get Firefox to use Nvidia again even after restarting it.
Offline
If I use nvidia-smi once, the modules will be loaded and stick
Does that also work if you just run "prime-run glxinfo" (and what's the output of that)?
Does the xorg log afterwards suggest that the server has detected and added the GPU?
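E.g.:
prime-run glxinfo | grep "OpenGL renderer"
grep -i nvidia /var/log/Xorg.0.log   # or ~/.local/share/xorg/Xorg.0.log if the server runs under your user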
I do not have to call prime-run for Steam games, they just grab the Nvidia card
Yes, as mentioned that's the steam (specific) default behavior.
Offline
Hey seth,
pratclot wrote: If I use nvidia-smi once, the modules will be loaded and stick
Does that also work if you just run "prime-run glxinfo" (and what's the output of that)?
Yes, the modules are permanently loaded if they were not already, and the output mentions NVIDIA.
Does the xorg log afterwards suggest that the server has detected and added the GPU?
From what I see there is no mention of "nvidia" in the Xorg log at all (case-insensitive grep).
Offline