[SOLVED] CUDA not working on linux-ck kernel

SilverMight · 2018-07-29 18:05:21

Recently, after an update, CUDA hasn't been working at all on the linux-ck kernel. Trying to use OBS with NVENC enabled doesn't work, and using CUDA with TensorFlow doesn't work either. However, as soon as I boot into the standard linux kernel, everything works fine.

I'm using nvidia-dkms so the driver works with both kernels. I have linux-ck-headers installed and DKMS seems to build the module against it without problems, so nothing else looks wrong except that no part of CUDA works on the linux-ck kernel.

Any help is appreciated.

EDIT: Tried on linux-zen and it works fine there, so I have no idea why it fails only on linux-ck.

Last edited by SilverMight (2018-08-01 17:13:41)

inglor · 2018-07-30 10:25:16

Are you using modprobed-db also? Maybe you are missing a module. If you can share the project I can try reproducing it locally.

Last edited by inglor (2018-07-30 10:26:14)

SilverMight · 2018-07-30 15:49:59

inglor wrote:

Are you using modprobed-db also? Maybe you are missing a module. If you can share the project I can try reproducing it locally.

I don't believe so; however, I have tried running nvidia-modprobe to no avail. I'll give that a shot.

The kernel can be found at https://aur.archlinux.org/packages/linux-ck/

inglor · 2018-07-30 17:08:55

SilverMight wrote:

The kernel can be found at https://aur.archlinux.org/packages/linux-ck/

Sorry, I wasn't clear. If you tell me the steps to reproduce it, I can give it a go on my PC, which has linux-ck with DKMS and CUDA available. This is why I was asking for a project.

SilverMight wrote:

[..]Trying to use OBS with NVENC enabled doesn't work, and using CUDA with TensorFlow doesn't work either. However, as soon as I boot into the standard linux kernel, everything works fine. [..]

Is this coming from a project?

SilverMight · 2018-07-31 14:41:41

My bad, yes. TensorFlow is pretty large, so I'd recommend installing OBS (sudo pacman -S obs-studio), going to File -> Settings -> Output, changing the recording encoder from Software to Hardware (NVENC), and then hitting Start Recording.

huyizheng · 2018-07-31 15:11:12

Same here.
When I use linux-ck-haswell from repo-ck, I can't run deviceQuery from the CUDA samples:

$ cd "cuda sample's directory"
$ ./bin/x86_64/linux/release/deviceQuery
./bin/x86_64/linux/release/deviceQuery Starting...

 CUDA Device Query (Runtime API) version (CUDART static linking)

cudaGetDeviceCount returned 30
-> unknown error
Result = FAIL

However, I have installed nvidia-dkms and the modules are loaded:

$ lsmod | grep nvidia
nvidia_drm             45056  10
nvidia_modeset       1093632  8 nvidia_drm
nvidia              14061568  825 nvidia_modeset
drm_kms_helper        196608  2 nvidia_drm,i915
drm                   466944  13 drm_kms_helper,nvidia_drm,i915
ipmi_msghandler        57344  2 ipmi_devintf,nvidia

If I switch back to the linux kernel, CUDA works fine.
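
For anyone checking the same thing, here is a rough sketch of a helper that reads lsmod output on stdin and prints which expected modules are missing. The name missing_modules is made up for illustration. Including nvidia_uvm in the usage line is an assumption: as far as I know CUDA relies on that module in addition to the ones listed above, but nothing in this thread confirms it is involved.

```shell
# Hypothetical helper: read `lsmod` output on stdin and print every module
# named in the arguments that does not appear in it.
missing_modules() {
    # First column of lsmod is the module name; skip the "Module" header line.
    loaded=$(awk '$1 != "Module" { print $1 }')
    for mod in "$@"; do
        # -x: whole-line match, -F: treat the name as a fixed string
        printf '%s\n' "$loaded" | grep -qxF -- "$mod" || echo "$mod"
    done
}

# Usage (nvidia_uvm is an assumption, see above):
#   lsmod | missing_modules nvidia nvidia_modeset nvidia_drm nvidia_uvm
```

An empty result only means the listed modules are loaded; it does not prove that CUDA can initialise.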

SilverMight · 2018-07-31 20:51:18

huyizheng wrote:

Same here.
When I use linux-ck-haswell from repo-ck, I can't run deviceQuery from the CUDA samples:

$ cd "cuda sample's directory"
$ ./bin/x86_64/linux/release/deviceQuery
./bin/x86_64/linux/release/deviceQuery Starting...

 CUDA Device Query (Runtime API) version (CUDART static linking)

cudaGetDeviceCount returned 30
-> unknown error
Result = FAIL

However, I have installed nvidia-dkms and the modules are loaded:

$ lsmod | grep nvidia
nvidia_drm             45056  10
nvidia_modeset       1093632  8 nvidia_drm
nvidia              14061568  825 nvidia_modeset
drm_kms_helper        196608  2 nvidia_drm,i915
drm                   466944  13 drm_kms_helper,nvidia_drm,i915
ipmi_msghandler        57344  2 ipmi_devintf,nvidia

If I switch back to the linux kernel, CUDA works fine.

Just tried that and got the same results as you.

./deviceQuery Starting...

 CUDA Device Query (Runtime API) version (CUDART static linking)

cudaGetDeviceCount returned 30
-> unknown error
Result = FAIL

inglor · 2018-07-31 21:32:22

Same here (linux-ck with nvidia-dkms, which is supposed to be working fine). Could it be that nvidia-dkms (the module) is built with gcc8 and CUDA only supports gcc7?

SilverMight · 2018-07-31 21:45:16

inglor wrote:

Same here (linux-ck with nvidia-dkms, which is supposed to be working fine). Could it be that nvidia-dkms (the module) is built with gcc8 and CUDA only supports gcc7?

I don't think so, since it works fine on any other kernel except the -ck one.

inglor · 2018-08-01 07:41:55

I enabled NUMA on the linux-ck kernel, recompiled, and it works fine.

$ ./deviceQuery
./deviceQuery Starting...

 CUDA Device Query (Runtime API) version (CUDART static linking)

Detected 1 CUDA Capable device(s)

Device 0: "GeForce GTX 1080"
  CUDA Driver Version / Runtime Version          9.2 / 9.2
  CUDA Capability Major/Minor version number:    6.1
  Total amount of global memory:                 8116 MBytes (8510701568 bytes)
  (20) Multiprocessors, (128) CUDA Cores/MP:     2560 CUDA Cores
  GPU Max Clock rate:                            1835 MHz (1.84 GHz)
  Memory Clock rate:                             5005 Mhz
  Memory Bus Width:                              256-bit
  L2 Cache Size:                                 2097152 bytes
  Maximum Texture Dimension Size (x,y,z)         1D=(131072), 2D=(131072, 65536), 3D=(16384, 16384, 16384)
  Maximum Layered 1D Texture Size, (num) layers  1D=(32768), 2048 layers
  Maximum Layered 2D Texture Size, (num) layers  2D=(32768, 32768), 2048 layers
  Total amount of constant memory:               65536 bytes
  Total amount of shared memory per block:       49152 bytes
  Total number of registers available per block: 65536
  Warp size:                                     32
  Maximum number of threads per multiprocessor:  2048
  Maximum number of threads per block:           1024
  Max dimension size of a thread block (x,y,z): (1024, 1024, 64)
  Max dimension size of a grid size    (x,y,z): (2147483647, 65535, 65535)
  Maximum memory pitch:                          2147483647 bytes
  Texture alignment:                             512 bytes
  Concurrent copy and kernel execution:          Yes with 2 copy engine(s)
  Run time limit on kernels:                     Yes
  Integrated GPU sharing Host Memory:            No
  Support host page-locked memory mapping:       Yes
  Alignment requirement for Surfaces:            Yes
  Device has ECC support:                        Disabled
  Device supports Unified Addressing (UVA):      Yes
  Device supports Compute Preemption:            Yes
  Supports Cooperative Kernel Launch:            Yes
  Supports MultiDevice Co-op Kernel Launch:      Yes
  Device PCI Domain ID / Bus ID / location ID:   0 / 66 / 0
  Compute Mode:
     < Default (multiple host threads can use ::cudaSetDevice() with device simultaneously) >

deviceQuery, CUDA Driver = CUDART, CUDA Driver Version = 9.2, CUDA Runtime Version = 9.2, NumDevs = 1
Result = PASS

$ uname -a
Linux tiamat 4.17.11-1-ck #1 SMP PREEMPT Wed Aug 1 07:42:19 BST 2018 x86_64 GNU/Linux
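
To check whether a given kernel was built with NUMA before rebuilding, something like the following sketch may help. It assumes the kernel exposes its config at /proc/config.gz, which needs CONFIG_IKCONFIG_PROC (I have not verified that linux-ck sets it). numa_status is just a hypothetical helper name.

```shell
# Hypothetical helper: print "enabled", "disabled" or "unknown" for CONFIG_NUMA.
# Argument: path to a kernel config (plain text or .gz).
# Default: /proc/config.gz, present only with CONFIG_IKCONFIG_PROC (assumption).
numa_status() {
    cfg="${1:-/proc/config.gz}"
    if [ ! -r "$cfg" ]; then
        echo "unknown"
        return 1
    fi
    # Decompress .gz configs, read anything else as plain text.
    if case "$cfg" in
           *.gz) gzip -cd -- "$cfg" ;;
           *)    cat -- "$cfg" ;;
       esac | grep -q '^CONFIG_NUMA=y$'; then
        echo "enabled"
    else
        echo "disabled"
    fi
}
```

Calling numa_status with no argument checks the running kernel; passing a path checks a saved config instead, for example the one in the kernel build directory.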

huyizheng · 2018-08-01 09:31:05

inglor wrote:

I enabled NUMA on the linux-ck kernel, recompiled, and it works fine.

But linux-ck's PKGBUILD says that enabling this feature is not recommended on single-CPU platforms.

progandy · 2018-08-01 09:45:47

With CUDA you are using the GPU as a processor, so it is not a single CPU platform anymore.

SilverMight · 2018-08-01 17:14:37

Just compiled with NUMA enabled, and it definitely fixed the issue. Thanks for the fix.

graysky · 2018-08-01 17:48:37

I will add a comment to the PKGBUILD for CUDA users and reference this discussion, thank you.
