Machine: Dell XPS 15 9510
GPU: NVIDIA GeForce RTX 3050
Display Manager: XDM
My laptop is ridiculously hot no matter what I am doing. It's to the point that even if I'm disconnected from the internet and only working in emacs, it is uncomfortably hot to the touch and burns through battery at an insane rate. I suspect this is because of the GPU. I want my NVIDIA GPU to be completely powered off (consuming no power at all) unless I run `prime-run <application-name>`.
I followed the instructions here, specifically those under PRIME render offload. I have nvidia, nvidia-utils, nvidia-settings, nvidia-prime, mesa, and mesa-utils installed. prime-run appears to work fine. Proof:
Output from `glxinfo | grep "OpenGL renderer"`:
OpenGL renderer string: Mesa Intel(R) UHD Graphics (TGL GT1)
Output from `prime-run glxinfo | grep "OpenGL renderer"`:
OpenGL renderer string: NVIDIA GeForce RTX 3050 Laptop GPU/PCIe/SSE2
So, I figured I had succeeded. Nope... the laptop is still super hot. Upon further inspection using `nvidia-smi`:
Wed Jan 5 01:00:05 2022
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 495.46       Driver Version: 495.46       CUDA Version: 11.5     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA GeForce ...    On | 00000000:01:00.0 Off |                  N/A |
| N/A   46C    P3    13W /  N/A |      4MiB /  3913MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|    0   N/A  N/A       794      G   /usr/lib/Xorg                       4MiB |
+-----------------------------------------------------------------------------+
Three things to note here: (1) The card is running X. I have no idea why it would be running X; I want it to be running nothing. (2) The GPU is consuming 13W of power!!! I watched it for a while and it was either 6W (at P5 or P8) or 13W (at P8), but never any other value. This is absurd. (3) The GPU is at 0% usage, so it's doing nothing while consuming 13W of power. The battery went down 12 percent in the last five minutes while I only have a browser and a shell open.
How can I get this thing to cool down? Any help is appreciated. If I have to choose between never being able to use the GPU at all and using it like this, I'd opt for the former. I'm about ready to open it up and take the card out.
--------------------------------------------------------------------------------------------------------------------------------------
EDIT:
I was able to stop the card from running xorg by removing `/usr/lib64/xorg/modules/drivers/nvidia_drv.so`. `nvidia-smi` now reports "No running processes found", but still reports the same high power consumption.
--------------------------------------------------------------------------------------------------------------------------------------
EDIT:
Another interesting thing to note: when I do `prime-run libreoffice`, for example, LibreOffice opens as expected, BUT nvidia-smi still shows no running processes, 0% GPU load, and 0 bytes of memory used (but still 13W consumption). Does this mean that prime-run is not actually running applications on the GPU? Is there another way for me to check the load on the GPU?
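For context on what `prime-run` does: as far as I know, the script shipped in Arch's nvidia-prime package is only a thin wrapper that sets the render-offload environment variables documented in NVIDIA's driver README and then runs the given command. The sketch below mirrors that behaviour; the function name `offload_run` is made up for illustration, and the exact contents of the packaged script are an assumption.

```shell
#!/bin/sh
# Sketch of a prime-run style wrapper. offload_run is a hypothetical name.
# It only sets environment variables for the child process; the application
# still has to create a GLX or Vulkan context before the NVIDIA GPU is used.
offload_run() {
    env \
        __NV_PRIME_RENDER_OFFLOAD=1 \
        __VK_LAYER_NV_optimus=NVIDIA_only \
        __GLX_VENDOR_LIBRARY_NAME=nvidia \
        "$@"
}

# Example (commented out so that sourcing this file has no side effects):
# offload_run glxinfo | grep "OpenGL renderer"
```

If that is right, a program that never opens an OpenGL or Vulkan context would plausibly not appear in the nvidia-smi process list even when started this way.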
Another thing to note:
I created the following systemd service to shut the card down on boot.
[Unit]
Description=Nvidia Graphic Card OnBoot Disabler
After=sysinit.target
[Service]
Type=oneshot
RemainAfterExit=yes
ExecStart=/usr/bin/sh -c "echo 1 > /sys/bus/pci/devices/0000:01:00.0/remove; echo '\\_SB.PCI0.PEG1.PEGP._OFF' > /proc/acpi/call"
ExecStop=/usr/bin/sh -c "echo '\\_SB.PCI0.PEG1.PEGP._ON' > /proc/acpi/call; echo 1 > /sys/bus/pci/rescan"
[Install]
WantedBy=sysinit.target
source. I just had to change the ACPI call.
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Edit:
Very interesting:
I decided to use `sensors` to check the current being drawn from the battery. It typically sits at a little less than 1A. However, when I run nvidia-smi the current jumps up to 2-3A, then settles down to about 1.3A. This means (to me) that the GPU isn't actually on until I run nvidia-smi, which is great. Also, when I run `prime-run <app-name>` the current jumps up and settles around 2.3A, which means that the GPU is at least on. The fact that nvidia-smi still says that there are no running processes is a bit weird to me.
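The same measurement can be scripted by reading the battery's sysfs attributes directly. This is a minimal sketch under two assumptions: the battery shows up as `BAT0`, and it exposes `current_now` in microamperes (some batteries expose `power_now` in microwatts instead). The function name `battery_amps` is made up for illustration.

```shell
#!/bin/sh
# Print the battery current in amperes, rounded to two decimals.
# Assumption: the power supply directory contains current_now (microamperes).
# The default path uses BAT0, which may be named differently on your machine.
battery_amps() {
    supply_dir="${1:-/sys/class/power_supply/BAT0}"
    awk '{ printf "%.2f\n", $1 / 1000000 }' "$supply_dir/current_now"
}

# Example (commented out): sample once per second while starting an application.
# while true; do battery_amps; sleep 1; done
```

Comparing the readings before and after launching something with `prime-run` gives a rough indication of whether the card woke up, without querying the NVIDIA driver at all.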
Last edited by manray (2022-01-05 08:42:51)
Did you enable RTD3 Power Management?
https://wiki.archlinux.org/title/PRIME# … Management
Also make sure Xorg is forced to run on your integrated card, because based on what you posted above the NVIDIA card seems to be handling Xorg.
Alternatively you could use a switchable graphics program like optimus-manager and use hybrid mode.
Last edited by echo-84 (2022-01-05 06:21:48)
Did you enable RTD3 Power Management?
I believe so. I added the udev rules as stated, edited /etc/modprobe.d/nvidia-pm.conf as stated, and enabled nvidia-persistenced. That's really all I did.
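A quick way to double-check the modprobe part of that setup is to print the value actually configured for `NVreg_DynamicPowerManagement`. The sketch below assumes the option lives in `/etc/modprobe.d/nvidia-pm.conf` on an `options nvidia ...` line; the function name `rtd3_mode` is made up for illustration.

```shell
#!/bin/sh
# Print the NVreg_DynamicPowerManagement value found in a modprobe config file.
# Prints nothing if no "options nvidia" line sets it.
# Assumption: the file path defaults to the one mentioned in this thread.
rtd3_mode() {
    conf="${1:-/etc/modprobe.d/nvidia-pm.conf}"
    sed -n 's/^[[:space:]]*options[[:space:]]\{1,\}nvidia[[:space:]].*NVreg_DynamicPowerManagement=\([0-9A-Fa-fx]\{1,\}\).*/\1/p' "$conf"
}

# Example (commented out):
# rtd3_mode
```

According to NVIDIA's driver README, as I understand it, `0x02` selects fine-grained runtime power management. The README also describes `/proc/driver/nvidia/gpus/<bus-id>/power`, which reports the runtime D3 status once the module is loaded.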
I decided to use `sensors` to check the current being drawn from the battery. It typically sits at a little less than 1A. However, when I run nvidia-smi the current jumps up to 2-3A, then settles down to about 1.3A. This means (to me) that the GPU isn't actually on until I run nvidia-smi, which is great. Also, when I run `prime-run <app-name>` the current jumps up and settles around 2.3A, which means that the GPU is at least on. The fact that nvidia-smi still says that there are no running processes is a bit weird to me. How can I ensure that the GPU is actually running the intended program without using nvidia-smi?
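For the narrower question of whether the card is powered at all, the kernel's runtime power management status can be read from sysfs, and reading it does not wake the device. This is a minimal sketch assuming the GPU is still present on the PCI bus at `0000:01:00.0` (the bus ID from the nvidia-smi output above); the function name `gpu_runtime_status` is made up for illustration.

```shell
#!/bin/sh
# Print the runtime PM status of a PCI device, e.g. "active" or "suspended".
# Assumption: the default path matches the bus ID shown by nvidia-smi earlier.
# If the device was removed from the bus, the file is missing and the
# function prints "unavailable" and returns a non-zero status.
gpu_runtime_status() {
    dev_dir="${1:-/sys/bus/pci/devices/0000:01:00.0}"
    status_file="$dev_dir/power/runtime_status"
    if [ -r "$status_file" ]; then
        cat "$status_file"
    else
        echo "unavailable"
        return 1
    fi
}

# Example (commented out): watch the state change while launching an application.
# while true; do gpu_runtime_status; sleep 1; done
```

This only shows that the card is awake, not which program is using it; for that, nvidia-smi or a visible difference in the application's performance is still needed.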
I believe I have this issue pretty much solved. I don't think that the card is drawing power while idling. I didn't realize that running nvidia-smi would power the card on.
Also, I found that once I ran a more resource-intensive program with prime-run, it showed up in nvidia-smi. Not sure why that is, but I don't really care. I also ran some rough benchmarks and found that games perform better when I run them with prime-run than without.
Thanks!
prime-run has an "intensiveness" threshold for actually invoking the GPU, afaik by default at the minimum 20MB of VRAM usage for actually deciding to activate the card.
if this is [SOLVED] please mark it as such by editing the title in your first post.