[SOLVED] Laptop, bumblebee and bbswitch

paozaf · 2023-04-30 19:58:33

Hi all,
I have a ThinkPad p14s with a dedicated NVIDIA card.

I installed the proprietary driver, bumblebee, optirun, and TLP.
No custom configurations, except the BusID field in the /etc/bumblebee/xorg.conf.nvidia file.

So far so good... I use optirun to run software with GPU support.
The question is: is this setup fine as it is?
I mean, is the NVIDIA card always powered off (so no power drain) except when I use optirun?

If I install bbswitch the OS is no longer able to use the GPU.

Any suggestion/explanation?

Thanks a lot.

Last edited by paozaf (2023-05-08 10:52:04)

paozaf · 2023-05-02 14:09:43

No one can help?

Fuxino · 2023-05-02 15:48:31

If using optirun to execute applications with the NVIDIA GPU works (e.g. `optirun glxinfo | grep "OpenGL renderer"` shows that it's using NVIDIA), then you should be fine, but I think without bbswitch the GPU is not actually powered off. So if you want that, you will need bbswitch (or another method to power down the GPU, see https://wiki.archlinux.org/title/Hybrid … screte_GPU).

paozaf wrote:

If I install bbswitch the OS is no longer able to use the GPU.

What do you mean exactly? Do you get any errors?

paozaf · 2023-05-02 20:04:43

Thanks for helping!

If I install bbswitch and try to run something with optirun, I get this:

[  188.198584] [ERROR]Cannot access secondary GPU - error: Could not enable discrete graphics card

[  188.198600] [ERROR]Aborting because fallback start is disabled.

Of course I edited /etc/bumblebee/xorg.conf.nvidia by adding the correct BusID (it works with bumblebee but without bbswitch).

Fuxino · 2023-05-02 20:54:44

Can you turn on the card manually as explained here? What's the output of

cat /proc/acpi/bbswitch

after that?

Also, post the output of dmesg after you try running something with optirun

paozaf · 2023-05-02 21:13:08

It is already turned ON (as the cat /proc/acpi/bbswitch command says).

The dmesg output is:

[   35.404876] i915 0000:00:02.0: [drm] Selective fetch area calculation failed in pipe A
[   49.001179] bbswitch: enabling discrete graphics
[   49.072312] pci 0000:03:00.0: Unable to change power state from D3cold to D0, device inaccessible
[   49.138142] pci 0000:03:00.0: Unable to change power state from D3cold to D0, device inaccessible

Note that I also have TLP installed, and I edited its configuration file by adding:

RUNTIME_PM_DENYLIST="03:00.0"
RUNTIME_PM_BLACKLIST="03:00.0"
RUNTIME_PM_DRIVER_DENYLIST="nouveau nvidia"

paozaf · 2023-05-02 21:15:31

When I switch the card on/off I get

[   49.001179] bbswitch: enabling discrete graphics
[   49.072312] pci 0000:03:00.0: Unable to change power state from D3cold to D0, device inaccessible
[   49.138142] pci 0000:03:00.0: Unable to change power state from D3cold to D0, device inaccessible
[   99.924591] nvidia: module license 'NVIDIA' taints kernel.
[   99.924594] Disabling lock debugging due to kernel taint
[  100.062779] nvidia-nvlink: Nvlink Core is being initialized, major device number 509
[  100.062785] NVRM: The NVIDIA GPU 0000:03:00.0
               NVRM: (PCI ID: 10de:1fb7) installed in this system has
               NVRM: fallen off the bus and is not responding to commands.
[  100.063556] nvidia: probe of 0000:03:00.0 failed with error -1
[  100.063569] NVRM: The NVIDIA probe routine failed for 1 device(s).
[  100.063570] NVRM: None of the NVIDIA devices were initialized.
[  100.063650] nvidia-nvlink: Unregistered Nvlink Core, major device number 509
[  125.886710] nvidia-nvlink: Nvlink Core is being initialized, major device number 509
[  125.886715] NVRM: The NVIDIA GPU 0000:03:00.0
               NVRM: (PCI ID: 10de:1fb7) installed in this system has
               NVRM: fallen off the bus and is not responding to commands.
[  125.887536] nvidia: probe of 0000:03:00.0 failed with error -1
[  125.887551] NVRM: The NVIDIA probe routine failed for 1 device(s).
[  125.887552] NVRM: None of the NVIDIA devices were initialized.
[  125.887654] nvidia-nvlink: Unregistered Nvlink Core, major device number 509
[  381.047433] bbswitch: disabling discrete graphics
[  396.552161] bbswitch: enabling discrete graphics
[  396.614623] pci 0000:03:00.0: Unable to change power state from D3cold to D0, device inaccessible
[  396.680732] pci 0000:03:00.0: Unable to change power state from D3cold to D0, device inaccessible

V1del · 2023-05-02 21:44:12

If this thing is remotely modern, drop bumblebee and bbswitch and set up PRIME instead: https://wiki.archlinux.org/title/PRIME# … er_offload

tdtooke · 2023-05-02 22:57:18

PRIME is better, but I know from personal experience that when you're researching this, Bumblebee is the first thing you find. If battery life is not a concern, you might be better off sidestepping the problem and using only your NVIDIA GPU for everything. That's what I ended up doing. The wiki has a how-to on setting that up.

V1del · 2023-05-02 23:14:02

Yes, which is why you need to do your research using more up-to-date sources. It's probably one of the biggest annoyances I have with "linux blogs": long outdated and potentially harmful solutions remain in active circulation despite much better alternatives having been implemented.

All of this long predates the existence of PRIME, and there's absolutely no benefit to any of these methods over using PRIME if you have a Turing+ GPU. You are also likely to lose frames if you opt for "Nvidia only", since that's not actually "nvidia only": you still output everything to the intel card via a roundtrip (unless you have an actual BIOS/UEFI option to switch, which would indeed be best for performance). If you do not, then don't chase old solutions and go with the actively developed and maintained one.

paozaf · 2023-05-03 13:54:46

V1del wrote:

If this thing is remotely modern, drop bumblebee and bbswitch and set up PRIME instead: https://wiki.archlinux.org/title/PRIME# … er_offload

So you are suggesting removing bumblebee and bbswitch to install PRIME?

What about TLP?
Can I leave it as is?

Thanks a lot!

V1del · 2023-05-03 14:14:15

You can leave TLP; by default it blacklists the nvidia card from its internal power management, as that will be handled by the driver itself, and on a PRIME setup the nvidia driver is always loaded. Yes, that's what I'm suggesting.

Last edited by V1del (2023-05-03 14:14:32)

paozaf · 2023-05-03 16:51:05

V1del wrote:

You can leave TLP; by default it blacklists the nvidia card from its internal power management, as that will be handled by the driver itself, and on a PRIME setup the nvidia driver is always loaded. Yes, that's what I'm suggesting.

I uninstalled Bumblebee and tried to install PRIME... it was a mess: every process ended up running on the GPU (as shown by nvidia-smi) and the UI was *unstable*.

I reverted to my previous setup.

I think Bumblebee is easier to install and configure; the only thing missing right now is bbswitch.

Any suggestions on how to make it work?

Thanks

V1del · 2023-05-03 17:04:22

That shouldn't have been the case, assuming you didn't have any xorg.conf hanging around that made every process use the GPU again. The trick is to literally not do anything other than installing nvidia-prime and even that is optional.
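
For context on why nvidia-prime is optional: as far as I know, its prime-run command is only a thin wrapper that sets NVIDIA's render offload environment variables before launching the program. A minimal sketch of that idea follows; offload_run is a hypothetical name used here for illustration, and the variables are the ones NVIDIA documents for PRIME render offload.

```shell
# Hypothetical helper that mimics a PRIME render offload wrapper:
# run the given command with the NVIDIA offload variables set.
offload_run() {
    env __NV_PRIME_RENDER_OFFLOAD=1 \
        __VK_LAYER_NV_optimus=NVIDIA_only \
        __GLX_VENDOR_LIBRARY_NAME=nvidia \
        "$@"
}
```

With something like this defined, `offload_run glxinfo | grep "OpenGL renderer"` should report the NVIDIA GPU, while the same command without the wrapper should report the Intel one.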

bbswitch relied on a very specific functionality of older Optimus systems; it wouldn't surprise me if that functionality no longer exists on newer systems. Beyond that, if the trigger that bbswitch performs leads to a fault in the nvidia driver, there's little to do here. There might be some delay involved, so you'd need to enable the GPU, wait a bit, and only then try to access it. If you are in the situation shown in your log, do things get better if you manually reload the nvidia driver (e.g. with modprobe nvidia) after waiting a while following the "ON" trigger?
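
A rough sketch of that enable, wait, then load sequence, assuming bbswitch is loaded and exposes /proc/acpi/bbswitch. The 5 second default delay is an arbitrary guess, and enable_then_load, BBSWITCH_FILE and LOAD_CMD are hypothetical names introduced here (the two variables only exist so the function can be tried against a scratch file). On a real system it needs root.

```shell
# Switch the discrete GPU on via bbswitch, wait, then load the driver.
# $1: seconds to wait after the ON trigger (default 5, an arbitrary guess)
# BBSWITCH_FILE / LOAD_CMD: hypothetical overrides for trying this safely
enable_then_load() {
    delay="${1:-5}"
    ctl="${BBSWITCH_FILE:-/proc/acpi/bbswitch}"
    # Ask bbswitch to power the card on.
    echo ON > "$ctl" || return 1
    # Give the card some time before touching it.
    sleep "$delay"
    # Load the driver (intentionally unquoted so the command is word-split).
    ${LOAD_CMD:-modprobe nvidia} || return 1
    # Show what bbswitch reports afterwards.
    cat "$ctl"
}
```

If dmesg still shows "Unable to change power state from D3cold to D0" after this, the delay is probably not the issue.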

tdtooke · 2023-05-03 21:17:32

My advice would still be NVIDIA only. In my case my BIOS does have that option, though I had to update the BIOS to get it. Worth noting that I'm on an Acer Nitro, which is definitely not a ThinkPad, so this might not be an option for you. If it is, I really think you should take it. If you're not gaming, it's true you don't need any of this, but if you are, the setup as is just won't cut it.

paozaf · 2023-05-03 21:28:59

V1del wrote:

That shouldn't have been the case, assuming you didn't have any xorg.conf hanging around that made every process use the GPU again. The trick is to literally not do anything other than installing nvidia-prime and even that is optional.

Really?

The guide is super long.

I never manually configured xorg.conf (maybe nvidia did this?).

I executed this (after removing bumblebee and bbswitch)

pacman -S  xf86-video-intel nvidia-prime xrandr vulkan-tools
yay -S nvidia-prime-rtd3pm
systemctl start nvidia-persistenced.service
systemctl enable nvidia-persistenced.service

I also added a kernel parameter via GRUB (nvidia_drm.modeset=1).

Where did I go wrong?

@tdtooke I am interested in turning off the NVIDIA card when not needed...I want to save battery....but thanks for helping!

V1del · 2023-05-03 21:32:47

Disable the nvidia-persistenced service and do not enable it again; it logically does the reverse of what you want. (It is unlikely to have an effect here, as it is mostly relevant on servers that have no active process on the nvidia card, which won't be your case since the Xorg process will generally remain.) Remove xf86-video-intel; you never want this on any system newer than 2010. Remove the nvidia-prime-rtd3pm package, which is just a config file doing unnecessary things, and install the nvidia-prime package from the repos with pacman. You can keep the kernel parameter; that one is actually beneficial.
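
Spelled out as commands, the cleanup above might look roughly like the sketch below. It is a dry run by default and only prints what it would do; run, prime_cleanup and DRY_RUN are hypothetical names for this illustration, and the package names are the ones mentioned in this thread.

```shell
# Dry-run by default: commands are printed, not executed.
# Set DRY_RUN=0 to really run them (requires root).
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

prime_cleanup() {
    # Stop and disable the persistence daemon.
    run systemctl disable --now nvidia-persistenced.service
    # Remove the legacy Intel Xorg driver and the AUR config package.
    run pacman -Rns xf86-video-intel nvidia-prime-rtd3pm
    # Install the repo package that provides prime-run.
    run pacman -S --needed nvidia-prime
}
```

Calling `prime_cleanup` prints the three commands so they can be reviewed before running them for real.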

The only thing on the page I linked to that is relevant is the exact section I linked you to: https://wiki.archlinux.org/title/PRIME# … er_offload (that is three lines of content).

If you still have issues after following the above, post your resulting Xorg log: https://wiki.archlinux.org/title/Xorg#General

Last edited by V1del (2023-05-03 21:42:01)

paozaf · 2023-05-04 09:40:59

Hi,
I did what you suggested.

The UI is stable but I think something is still wrong.

Here is the output of some commands:

$ prime-run glxinfo | grep "OpenGL renderer"
X Error of failed request:  BadValue (integer parameter out of range for operation)
  Major opcode of failed request:  150 (GLX)
  Minor opcode of failed request:  24 (X_GLXCreateNewContext)
  Value in failed request:  0x0
  Serial number of failed request:  50
  Current serial number in output stream:  51


$ prime-run vulkaninfo
ERROR: [Loader Message] Code 0 : loader_scanned_icd_add: Could not get 'vkCreateInstance' via 'vk_icdGetInstanceProcAddr' for ICD libGLX_nvidia.so.0
ERROR: [Loader Message] Code 0 : vkCreateInstance: Found no drivers!
Cannot create Vulkan instance.
This problem is often caused by a faulty installation of the Vulkan driver or attempting to use a GPU that does not support Vulkan.
ERROR at /usr/src/debug/vulkan-tools/Vulkan-Tools-1.3.245/vulkaninfo/vulkaninfo.h:677:vkCreateInstance failed with ERROR_INCOMPATIBLE_DRIVER


$ nvidia-smi
NVIDIA-SMI has failed because it couldn't communicate with the NVIDIA driver. Make sure that the latest NVIDIA driver is installed and running.

Last edited by paozaf (2023-05-04 09:49:55)

V1del · 2023-05-04 11:02:50

As mentioned, post the xorg log as well as the outputs of

pacman -Qs nvidia
pacman -Q linux{,-lts,-zen}{,-headers}
pacman -Qs vulkan
uname -a
sudo dmesg

Last edited by V1del (2023-05-04 11:04:50)

paozaf · 2023-05-04 16:31:38

Surprise!
Now I can run a process on the GPU (I don't know what happened before).

The issue now is that sometimes I hear a fan spin up for no apparent reason (the CPU load is very low).
Could this be due to the GPU?
How can I check if the GPU is really turned off?

The commands you asked for gave me this:

local/cuda 12.1.1-1
    NVIDIA's GPU programming toolkit
local/cudnn 8.8.0.121-1
    NVIDIA CUDA Deep Neural Network library
local/egl-wayland 2:1.1.11-4
    EGLStream-based Wayland external platform
local/libvdpau 1.5-1
    Nvidia VDPAU library
local/libxnvctrl 530.41.03-1
    NVIDIA NV-CONTROL X extension
local/nvidia 530.41.03-9
    NVIDIA drivers for linux
local/nvidia-prime 1.0-4
    NVIDIA Prime Render Offload configuration and utilities
local/nvidia-settings 530.41.03-1
    Tool for configuring the NVIDIA graphics driver
local/nvidia-utils 530.41.03-1
    NVIDIA drivers utilities
local/opencl-nvidia 530.41.03-1
    OpenCL implemention for NVIDIA
linux 6.3.1.arch1-1
linux-headers 6.3.1.arch1-1
error: package 'linux-lts' was not found
error: package 'linux-lts-headers' was not found
error: package 'linux-zen' was not found
error: package 'linux-zen-headers' was not found
local/nvidia-utils 530.41.03-1
    NVIDIA drivers utilities
local/spirv-tools 2022.4-1 (vulkan-devel)
    API and commands for processing SPIR-V modules
local/vulkan-icd-loader 1.3.245-1
    Vulkan Installable Client Driver (ICD) Loader
local/vulkan-tools 1.3.245-1 (vulkan-devel)
    Vulkan Utilities and Tools
Linux LAPTOP 6.3.1-arch1-1 #1 SMP PREEMPT_DYNAMIC Mon, 01 May 2023 17:42:39 +0000 x86_64 GNU/Linux

The journalctl command filtered by the xorg PID is very long...do you need it?

Thanks!

V1del · 2023-05-04 17:07:36

No one was talking about a journal filtered by the Xorg PID. We asked for the Xorg log, which should not be in your journal unless you use GDM, in which case

sudo journalctl --unit=gdm -b

should have the relevant information.

As for checking whether the GPU is really turned off: nvidia-smi wakes the GPU up, so it isn't very useful for this. In addition, individual components like the graphics card usually don't have dedicated sensors for measuring their power draw, so your best bet is to check powertop and verify that the overall power draw is notably lower while the nvidia GPU is not in active use.
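
One more thing that may be worth checking is the kernel's runtime PM state for the device in sysfs, which to my knowledge can be read without waking the card. A minimal sketch, assuming the GPU sits at 0000:03:00.0 (the address in the dmesg output earlier in the thread); gpu_runtime_status and its second argument are hypothetical, the latter only there so the function can be pointed at a fake sysfs tree.

```shell
# Print the kernel's runtime power-management state for a PCI device.
# $1: PCI address (default taken from the dmesg output in this thread)
# $2: sysfs root (hypothetical override, defaults to /sys)
gpu_runtime_status() {
    addr="${1:-0000:03:00.0}"
    root="${2:-/sys}"
    file="$root/bus/pci/devices/$addr/power/runtime_status"
    if [ -r "$file" ]; then
        # Typical values include "active" and "suspended".
        cat "$file"
    else
        # Device missing or attribute not readable.
        echo "unknown"
        return 1
    fi
}
```

A value of suspended would suggest the card is runtime-suspended, while active means it is powered. Whether the card ever reaches suspended depends on the driver's runtime power management being supported and enabled, so treat this as a hint rather than a measurement.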

Last edited by V1del (2023-05-04 17:11:09)

paozaf · 2023-05-04 19:37:15

Yes, I'm using GDM.

The log is this

May 04 21:23:09 LAPTOP systemd[1]: Starting GNOME Display Manager...
May 04 21:23:09 LAPTOP systemd[1]: Started GNOME Display Manager.
May 04 21:23:26 LAPTOP gdm-password][1816]: pam_ecryptfs: Passphrase file wrapped
May 04 21:23:26 LAPTOP gdm-password][1800]: gkr-pam: unable to locate daemon control file
May 04 21:23:26 LAPTOP gdm-password][1800]: gkr-pam: stashed password to try later in open session
May 04 21:23:26 LAPTOP gdm-password][1800]: pam_unix(gdm-password:session): session opened for user paolo(uid=1000) by (uid=0)
May 04 21:23:26 LAPTOP gdm-password][1800]: gkr-pam: unlocked login keyring
May 04 21:23:31 LAPTOP gdm[1048]: Gdm: Child process -1261 was already dead.

The fan looks ok now...

seth · 2023-05-04 19:55:34

https://wiki.archlinux.org/title/Xorg#General

But since V1del also asked for dmesg, we'll just cover both:

sudo journalctl -b | curl -F 'file=@-' 0x0.st
