I have upgraded an older mainboard with a GTX 1660 Ti which I had previously used in another Arch system where it was fully operational, including CUDA.
Now on this machine with a fresh Arch installation, I can't seem to get it working with CUDA despite installing all the drivers and packages.
Kernel:
[dennis@0xDBServer ~]$ uname -a
Linux 0xDBServer 5.16.0-arch1-1 #1 SMP PREEMPT Mon, 10 Jan 2022 20:11:47 +0000 x86_64 GNU/Linux
CPU:
[dennis@0xDBServer ~]$ lscpu
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Address sizes: 36 bits physical, 48 bits virtual
Byte Order: Little Endian
CPU(s): 4
On-line CPU(s) list: 0-3
Vendor ID: GenuineIntel
Model name: Intel(R) Core(TM)2 Quad CPU Q6600 @ 2.40GHz
CPU family: 6
Model: 15
Thread(s) per core: 1
Core(s) per socket: 4
Socket(s): 1
Stepping: 11
CPU max MHz: 2403.0000
CPU min MHz: 1603.0000
BogoMIPS: 4801.07
Flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ht tm pbe syscall nx lm constant_tsc arch_perfmon pebs bts rep_good nopl cpuid aperfmperf pni dtes64 monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr pdcm lahf_lm pti tpr_shadow vnmi flexpriority vpid dtherm
Virtualization features:
Virtualization: VT-x
Caches (sum of all):
L1d: 128 KiB (4 instances)
L1i: 128 KiB (4 instances)
L2: 8 MiB (2 instances)
NUMA:
NUMA node(s): 1
NUMA node0 CPU(s): 0-3
Vulnerabilities:
Itlb multihit: KVM: Mitigation: VMX disabled
L1tf: Mitigation; PTE Inversion; VMX EPT disabled
Mds: Vulnerable: Clear CPU buffers attempted, no microcode; SMT disabled
Meltdown: Mitigation; PTI
Spec store bypass: Vulnerable
Spectre v1: Mitigation; usercopy/swapgs barriers and __user pointer sanitization
Spectre v2: Mitigation; Full generic retpoline, STIBP disabled, RSB filling
Srbds: Not affected
Tsx async abort: Not affected
Mainboard:
[dennis@0xDBServer ~]$ sudo dmidecode
# dmidecode 3.3
Getting SMBIOS data from sysfs.
SMBIOS 2.5 present.
54 structures occupying 1991 bytes.
Table at 0x000FB4F0.
Handle 0x0000, DMI type 0, 24 bytes
BIOS Information
Vendor: American Megatrends Inc.
Version: V1.7
Release Date: 07/29/2008
Address: 0xF0000
Runtime Size: 64 kB
ROM Size: 512 kB
Characteristics:
ISA is supported
PCI is supported
PNP is supported
APM is supported
BIOS is upgradeable
BIOS shadowing is allowed
ESCD support is available
Boot from CD is supported
Selectable boot is supported
BIOS ROM is socketed
EDD is supported
5.25"/1.2 MB floppy services are supported (int 13h)
3.5"/720 kB floppy services are supported (int 13h)
3.5"/2.88 MB floppy services are supported (int 13h)
Print screen service is supported (int 5h)
8042 keyboard services are supported (int 9h)
Serial services are supported (int 14h)
Printer services are supported (int 17h)
CGA/mono video services are supported (int 10h)
ACPI is supported
USB legacy is supported
LS-120 boot is supported
ATAPI Zip drive boot is supported
BIOS boot specification is supported
Targeted content distribution is supported
BIOS Revision: 8.13
Handle 0x0001, DMI type 1, 27 bytes
System Information
Manufacturer: MSI
Product Name: MS-7350
Version: 1.0
Serial Number: To Be Filled By O.E.M.
UUID: Not Present
Wake-up Type: Power Switch
SKU Number: To Be Filled By O.E.M.
Family: To Be Filled By O.E.M.
Relevant packages:
[dennis@0xDBServer ~]$ pacman -Q | grep cuda
cuda 11.5.1-1
cuda-tools 11.5.1-1
[dennis@0xDBServer ~]$ pacman -Q | grep nvidia
lib32-nvidia-cg-toolkit 3.1-7
lib32-nvidia-utils 495.46-1
lib32-opencl-nvidia 495.46-1
nvidia-cg-toolkit 3.1-6
nvidia-dkms 495.46-2
nvidia-settings 495.46-2
nvidia-utils 495.46-2
opencl-nvidia 495.46-2
I have made sure the drivers are loaded and included in the initramfs:
[dennis@0xDBServer ~]$ cat /etc/mkinitcpio.conf | grep MODULES=\(nvidia
MODULES=(nvidia nvidia_modeset nvidia_uvm nvidia_drm)
[dennis@0xDBServer ~]$ lsmod | grep nvidia
nvidia_drm 73728 2
nvidia_uvm 2560000 0
nvidia_modeset 1155072 4 nvidia_drm
nvidia 36970496 175 nvidia_uvm,nvidia_modeset
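The module check above can also be scripted. This is only a minimal sketch: it assumes the four module names from the MODULES= line are the ones that matter, and the function names (`loaded_modules`, `missing_modules`) are made up for illustration.

```python
from pathlib import Path

# Module names taken from the MODULES= line in mkinitcpio.conf above.
REQUIRED = ("nvidia", "nvidia_modeset", "nvidia_uvm", "nvidia_drm")


def loaded_modules(text):
    """Return the set of module names found in /proc/modules or lsmod-style text.

    The module name is the first whitespace-separated field on each line.
    """
    return {line.split()[0] for line in text.splitlines() if line.strip()}


def missing_modules(text, required=REQUIRED):
    """Return the required modules that do not appear in the given text."""
    loaded = loaded_modules(text)
    return [name for name in required if name not in loaded]


if __name__ == "__main__":
    proc = Path("/proc/modules")
    if proc.exists():
        missing = missing_modules(proc.read_text())
        print("missing:", ", ".join(missing) if missing else "none")
    else:
        print("/proc/modules not found")
```

With the lsmod output shown above, `missing_modules` returns an empty list, so the modules themselves are not the problem here.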
I have also checked that I am in the video group because I recalled this used to be necessary in the past for some things:
[dennis@0xDBServer ~]$ groups
games systemd-journal video uucp lp input audio wheel dennis
One of the packages also said upon installation to try nvidia-modprobe if CUDA is not available, so I added that to my autostart:
[dennis@0xDBServer ~]$ cat ~/.config/lxsession/LXDE/autostart
@lxpanel --profile LXDE
@pcmanfm --desktop --profile LXDE
@xscreensaver -no-splash
@sudo nvidia-modprobe -c 0 -u
@conky
Screenshot from nvidia-settings stating CUDA cores are there:
Additional version info via nvidia-smi:
[dennis@0xDBServer ~]$ nvidia-smi
Mon Jan 17 07:04:55 2022
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 495.46 Driver Version: 495.46 CUDA Version: 11.5 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|===============================+======================+======================|
| 0 NVIDIA GeForce ... Off | 00000000:01:00.0 On | N/A |
| 0% 42C P8 9W / 130W | 140MiB / 5943MiB | 3% Default |
| | | N/A |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=============================================================================|
| 0 N/A N/A 402 G /usr/lib/Xorg 138MiB |
+-----------------------------------------------------------------------------+
And yet, despite all of these checks, CUDA is not available (e.g. in Blender), and the deviceQuery sample fails as well:
[dennis@0xDBServer ~]$ /opt/cuda/samples/1_Utilities/deviceQuery/deviceQuery
/opt/cuda/samples/1_Utilities/deviceQuery/deviceQuery Starting...
CUDA Device Query (Runtime API) version (CUDART static linking)
cudaGetDeviceCount returned 802
-> system not yet initialized
Result = FAIL
What am I missing?
Could the mainboard be too old for CUDA?
(There was an older thread from 2020 where CUDA was unavailable with a specific kernel version: https://bbs.archlinux.org/viewtopic.php?id=260036. Could this be the case again here?)
Last edited by Dennis (2022-01-18 14:10:23)
Did you enable passwordless sudo execution for that modprobe command to succeed? Might want to do something akin to https://wiki.archlinux.org/title/NVIDIA … with_NVENC instead.
Yes, I have passwordless execution of that command in my sudoers file. I have also already tried the udev rule you linked. I even tried running the nvidia-modprobe command manually before trying to access CUDA.
While setting up dual boot on that same machine, I tried CUDA under Windows. Windows 10 did not let me install the Nvidia drivers at all, so I used Windows 8.1 instead, and there CUDA worked with Nvidia drivers 472.xxxx. Maybe I need to try an older kernel and an older Nvidia driver version on Arch as well. I saw the 470 driver series in the AUR.
I have tried linux-lts510 and the nvidia-470xx packages from AUR now but the issue remains the same. I do not know what else I should be looking for.
The only suggestion I have is to check whether you need to enable the nvidia-persistenced systemd service. You probably don't intend to have xorg running all the time, and afaik the nvidia driver basically suspends the card when xorg isn't running, though xorg apparently was active in these first attempts. The error message from the CUDA sample also reads like the system is "just" not entirely initialized by the time you run the example.
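To see the raw initialization result without the samples, one can call the driver API directly. The following is a rough sketch, not a definitive diagnostic: it assumes the driver library is installed as `libcuda.so.1`, the code table is a small hand-copied subset of the `CUresult` values in `cuda.h`, and the helper names (`describe`, `probe_driver`) are hypothetical. The deviceQuery sample reports a runtime API error, but code 802 carries the same "system not ready" meaning in the driver API.

```python
import ctypes

# Hand-copied subset of CUresult values from cuda.h; deliberately not exhaustive.
KNOWN_CODES = {
    0: "CUDA_SUCCESS",
    3: "CUDA_ERROR_NOT_INITIALIZED",
    100: "CUDA_ERROR_NO_DEVICE",
    802: "CUDA_ERROR_SYSTEM_NOT_READY",
    803: "CUDA_ERROR_SYSTEM_DRIVER_MISMATCH",
    999: "CUDA_ERROR_UNKNOWN",
}


def describe(code):
    """Return the symbolic name for a result code, if it is in the table above."""
    return KNOWN_CODES.get(code, f"unlisted code {code}")


def probe_driver(libname="libcuda.so.1"):
    """Call cuInit(0) and return its result code.

    Returns None if the library cannot be loaded (assumed library name).
    """
    try:
        lib = ctypes.CDLL(libname)
    except OSError:
        return None
    lib.cuInit.argtypes = [ctypes.c_uint]
    lib.cuInit.restype = ctypes.c_int
    return lib.cuInit(0)


if __name__ == "__main__":
    code = probe_driver()
    if code is None:
        print("libcuda.so.1 could not be loaded")
    else:
        print(f"cuInit returned {code}: {describe(code)}")
```

If this prints 802 as well, the failure happens at driver initialization and is not specific to the sample or to Blender.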
I have enabled and started the nvidia-persistenced systemd service as you suggested, and I also added my user to the nvidia-persistenced group (because the files under /var/run/nvidia-persistenced use that group as owner). I also verified again that the files under /dev/nvidia* are accessible.
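For reference, the access check on the device nodes can be sketched like this. It assumes that read and write access for the current user is what counts, and `check_nodes` is just an illustrative name.

```python
import glob
import os


def check_nodes(pattern="/dev/nvidia*"):
    """Map each path matching the pattern to True if the current user
    has both read and write access to it, else False."""
    return {
        path: os.access(path, os.R_OK | os.W_OK)
        for path in sorted(glob.glob(pattern))
    }


if __name__ == "__main__":
    nodes = check_nodes()
    if not nodes:
        print("no /dev/nvidia* nodes found")
    for path, ok in nodes.items():
        print(f"{path}: {'ok' if ok else 'NO ACCESS'}")
```

An empty result would mean the device nodes were never created, which is the situation nvidia-modprobe is meant to fix.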
Unfortunately the issue remains. I tried inside and outside xorg.
However, a "dmesg | grep error" revealed "nvidia-nvswitch: probe of 0000:00:14.0 failed with error -22". I will have to investigate what that means.
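As a starting point for that investigation: kernel probe errors are negative errno values, so -22 corresponds to EINVAL ("Invalid argument"). A small sketch that pulls such lines out of dmesg output and decodes the number follows. The regular expression is an assumption based on the single message quoted above, and `parse_probe_failures` is a hypothetical helper name.

```python
import errno
import os
import re

# Pattern modeled on the one dmesg line quoted above; other drivers may log differently.
PROBE_FAIL = re.compile(
    r"(?P<driver>[\w-]+): probe of (?P<device>\S+) failed with error (?P<code>-?\d+)"
)


def parse_probe_failures(text):
    """Return one dict per probe failure found in dmesg-style text."""
    results = []
    for match in PROBE_FAIL.finditer(text):
        number = abs(int(match.group("code")))  # the kernel logs -errno
        results.append({
            "driver": match.group("driver"),
            "device": match.group("device"),
            "errno": errno.errorcode.get(number, "unknown"),
            "message": os.strerror(number),
        })
    return results


if __name__ == "__main__":
    line = "nvidia-nvswitch: probe of 0000:00:14.0 failed with error -22"
    for failure in parse_probe_failures(line):
        print(failure)
```

This only decodes the number. It does not explain why the nvidia-nvswitch driver tried to bind to the device at 0000:00:14.0 in the first place.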
Last edited by Dennis (2022-01-18 07:52:13)
Well, I have given up on this combination of mainboard and GPU. I replaced the card with an even older GeForce 8800 GTS, which I am running with the nouveau driver. I just can't use that machine for CUDA work. Thanks again for all the suggestions.