Recently, after an update CUDA hasn't been working at all on the linux-ck kernel. Trying to use OBS with NVENC enabled doesn't work, using CUDA with TensorFlow doesn't work either, etc. etc. However, as soon as I boot into the standard linux kernel, everything works fine.
I'm using nvidia-dkms so the driver works across both kernels. I have linux-ck-headers installed and DKMS appears to build and install the module for that kernel without errors, so nothing else seems wrong except that no part of CUDA works when using the linux-ck kernel.
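One way to double-check the DKMS side is to look for the running kernel in the output of `dkms status`. The following is a minimal sketch, not a definitive check: the function name `nvidia_dkms_installed` is made up for illustration, and the match is deliberately loose because the exact line format of `dkms status` differs between dkms versions. It only confirms that a module was built and installed for the kernel, not that CUDA will work with it.

```shell
#!/bin/sh
# Succeeds if the `dkms status` listing on stdin reports an installed nvidia
# module for the given kernel release (default: the running kernel).
# nvidia_dkms_installed is a hypothetical helper name, not part of dkms.
nvidia_dkms_installed() {
    release="${1:-$(uname -r)}"
    # Keep lines that mention nvidia, the kernel release, and "installed".
    grep -i 'nvidia' | grep -F -- "$release" | grep -q 'installed'
}

# Typical use (requires dkms to be installed):
#   dkms status | nvidia_dkms_installed && echo "nvidia built for $(uname -r)"
```

Passing a release explicitly, for example the linux-ck release string while booted into the stock kernel, lets the same check be run from either kernel.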
Any help is appreciated.
EDIT: Tried on linux-zen and it works fine there, so I'm not sure why it fails only on linux-ck.
Last edited by SilverMight (2018-08-01 17:13:41)
Are you using modprobed-db also? Maybe you are missing a module. If you can share the project I can try reproducing it locally.
Last edited by inglor (2018-07-30 10:26:14)
Don't believe so, however I have tried running nvidia-modprobe to no avail. I'll give that a shot.
The kernel can be found at https://aur.archlinux.org/packages/linux-ck/
Sorry, I wasn't clear. If you tell me the steps to reproduce it, I can give it a go on my PC, which has linux-ck with DKMS and CUDA available. This is why I was asking for a project.
[..]Trying to use OBS with NVENC enabled doesn't work, using CUDA with TensorFlow doesn't work either, etc. etc. However, as soon as I boot into the standard linux kernel, everything works fine. [..]
Is this coming from a project?
My bad, yes. TensorFlow is pretty large so I'd recommend installing OBS (sudo pacman -S obs-studio), going to File -> Settings and then Output and then changing the recording encoder from Software to Hardware (NVENC), then hit start recording.
Same here.
When I use linux-ck-haswell from repo-ck, I can't run deviceQuery from the CUDA samples:
$ cd "cuda sample's directory"
$ ./bin/x86_64/linux/release/deviceQuery
./bin/x86_64/linux/release/deviceQuery Starting...
CUDA Device Query (Runtime API) version (CUDART static linking)
cudaGetDeviceCount returned 30
-> unknown error
Result = FAIL
However, I have installed nvidia-dkms and the modules load:
$ lsmod | grep nvidia
nvidia_drm 45056 10
nvidia_modeset 1093632 8 nvidia_drm
nvidia 14061568 825 nvidia_modeset
drm_kms_helper 196608 2 nvidia_drm,i915
drm 466944 13 drm_kms_helper,nvidia_drm,i915
ipmi_msghandler 57344 2 ipmi_devintf,nvidia
If I switch back to the stock linux kernel, CUDA works fine.
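The lsmod listing above shows nvidia, nvidia_modeset and nvidia_drm but no nvidia_uvm, a module the CUDA driver relies on. Checking for it after a failed CUDA run may help narrow things down. This is a diagnostic sketch only, not a confirmed cause of the failure in this thread; `module_loaded` is a made-up helper name, and its optional second argument exists purely so the function can be pointed at a saved listing.

```shell
#!/bin/sh
# Succeeds if the named module appears in a /proc/modules-style listing.
# module_loaded is a hypothetical helper; the second argument (listing path)
# defaults to /proc/modules.
module_loaded() {
    mod="$1"
    listing="${2:-/proc/modules}"
    # The module name is the first whitespace-separated field on each line.
    awk -v m="$mod" '$1 == m { found = 1 } END { exit !found }' "$listing"
}

if module_loaded nvidia_uvm; then
    echo "nvidia_uvm is loaded"
else
    echo "nvidia_uvm is not loaded"
fi
```

If the module is missing, `modprobe nvidia_uvm` as root and the kernel log are reasonable next places to look.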
Just tried that and got the same results as you.
./deviceQuery Starting...
CUDA Device Query (Runtime API) version (CUDART static linking)
cudaGetDeviceCount returned 30
-> unknown error
Result = FAIL
Same here (linux-ck with nvidia-dkms, which is supposed to be working fine). Could it be that nvidia-dkms (the module) is built with gcc8 and CUDA only supports gcc7?
I don't think so, since it works fine on any other kernel except the -ck one.
I enabled NUMA on the linux-ck kernel, recompiled, and it works fine.
$ ./deviceQuery
./deviceQuery Starting...
CUDA Device Query (Runtime API) version (CUDART static linking)
Detected 1 CUDA Capable device(s)
Device 0: "GeForce GTX 1080"
CUDA Driver Version / Runtime Version 9.2 / 9.2
CUDA Capability Major/Minor version number: 6.1
Total amount of global memory: 8116 MBytes (8510701568 bytes)
(20) Multiprocessors, (128) CUDA Cores/MP: 2560 CUDA Cores
GPU Max Clock rate: 1835 MHz (1.84 GHz)
Memory Clock rate: 5005 Mhz
Memory Bus Width: 256-bit
L2 Cache Size: 2097152 bytes
Maximum Texture Dimension Size (x,y,z) 1D=(131072), 2D=(131072, 65536), 3D=(16384, 16384, 16384)
Maximum Layered 1D Texture Size, (num) layers 1D=(32768), 2048 layers
Maximum Layered 2D Texture Size, (num) layers 2D=(32768, 32768), 2048 layers
Total amount of constant memory: 65536 bytes
Total amount of shared memory per block: 49152 bytes
Total number of registers available per block: 65536
Warp size: 32
Maximum number of threads per multiprocessor: 2048
Maximum number of threads per block: 1024
Max dimension size of a thread block (x,y,z): (1024, 1024, 64)
Max dimension size of a grid size (x,y,z): (2147483647, 65535, 65535)
Maximum memory pitch: 2147483647 bytes
Texture alignment: 512 bytes
Concurrent copy and kernel execution: Yes with 2 copy engine(s)
Run time limit on kernels: Yes
Integrated GPU sharing Host Memory: No
Support host page-locked memory mapping: Yes
Alignment requirement for Surfaces: Yes
Device has ECC support: Disabled
Device supports Unified Addressing (UVA): Yes
Device supports Compute Preemption: Yes
Supports Cooperative Kernel Launch: Yes
Supports MultiDevice Co-op Kernel Launch: Yes
Device PCI Domain ID / Bus ID / location ID: 0 / 66 / 0
Compute Mode:
< Default (multiple host threads can use ::cudaSetDevice() with device simultaneously) >
deviceQuery, CUDA Driver = CUDART, CUDA Driver Version = 9.2, CUDA Runtime Version = 9.2, NumDevs = 1
Result = PASS
$ uname -a
Linux tiamat 4.17.11-1-ck #1 SMP PREEMPT Wed Aug 1 07:42:19 BST 2018 x86_64 GNU/Linux
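To see whether a given kernel was built with NUMA before recompiling, its config can be inspected. The sketch below assumes the kernel exposes its config at /proc/config.gz, which requires CONFIG_IKCONFIG_PROC and is not something stated in this thread; `has_numa` is a made-up helper name, and a path to a plain-text or gzipped config file can be passed instead.

```shell
#!/bin/sh
# Succeeds if the given kernel config contains CONFIG_NUMA=y.
# Defaults to /proc/config.gz, which exists only when the kernel was built
# with CONFIG_IKCONFIG_PROC (an assumption, not confirmed for linux-ck here).
# has_numa is a hypothetical helper name.
has_numa() {
    cfg="${1:-/proc/config.gz}"
    # Decompress .gz configs, read anything else as plain text.
    case "$cfg" in
        *.gz) gzip -dc -- "$cfg" ;;
        *)    cat -- "$cfg" ;;
    esac 2>/dev/null | grep -q '^CONFIG_NUMA=y$'
}

if has_numa; then
    echo "CONFIG_NUMA=y"
else
    echo "CONFIG_NUMA is not enabled (or the config could not be read)"
fi
```

Run on the failing linux-ck kernel and on the stock kernel, this gives a quick way to compare the two on this one option.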
But linux-ck's PKGBUILD says that enabling this feature is not recommended on single-CPU platforms.
With CUDA you are using the GPU as a processor, so it is not a single CPU platform anymore.
Just recompiled with NUMA enabled, and that definitely fixed the issue. Thanks for the fix.
I will add a comment to the PKGBUILD for CUDA users and reference this discussion. Thank you.