#1 2018-07-29 18:05:21

SilverMight
Member
Registered: 2017-11-20
Posts: 25

[SOLVED] CUDA not working on linux-ck kernel

Recently, after an update, CUDA stopped working entirely on the linux-ck kernel. OBS with NVENC enabled doesn't work, and neither does CUDA with TensorFlow. However, as soon as I boot into the standard linux kernel, everything works fine.

I'm using nvidia-dkms so the driver works with both kernels. I have linux-ck-headers installed and dkms seems to build the module for it without problems, so nothing else seems wrong except that no part of CUDA works on the linux-ck kernel.
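To double-check the "dkms seems to install fine" part, something like the following could be used. It is only a sketch: `dkms_has_nvidia` is a made-up helper name, and it assumes `dkms status` prints one line per module in the usual "module, version, kernel, arch: installed" form (newer dkms releases print "module/version" instead, which this also matches).

```shell
# Succeeds if DKMS reports an installed nvidia module for the given kernel.
# Reads `dkms status` output on stdin; the kernel release is $1
# (defaults to the running kernel). Hypothetical helper, not part of dkms.
dkms_has_nvidia() {
    kver="${1:-$(uname -r)}"
    # Match the kernel release as a whole comma-separated field, then
    # require an nvidia entry on that line that is marked as installed.
    grep -F -- ", $kver," | grep -q '^nvidia.*installed'
}

# Usage (assumes dkms is installed):
#   dkms status | dkms_has_nvidia && echo "nvidia module built for $(uname -r)"
```

A positive result only shows that the module was built and installed for that kernel; it says nothing about whether CUDA itself can initialise.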

Any help is appreciated.

EDIT: Tried on linux-zen and it seems to work fine there, so I'm a bit clueless as to why it won't work on linux-ck.

Last edited by SilverMight (2018-08-01 17:13:41)

#2 2018-07-30 10:25:16

inglor
Trusted User (TU)
Registered: 2008-07-22
Posts: 81

Re: [SOLVED] CUDA not working on linux-ck kernel

Are you using modprobed-db also? Maybe you are missing a module. If you can share the project I can try reproducing it locally.

Last edited by inglor (2018-07-30 10:26:14)

#3 2018-07-30 15:49:59

SilverMight
Member
Registered: 2017-11-20
Posts: 25

Re: [SOLVED] CUDA not working on linux-ck kernel

inglor wrote:

Are you using modprobed-db also? Maybe you are missing a module. If you can share the project I can try reproducing it locally.

I don't believe so; however, I have tried running nvidia-modprobe, to no avail. I'll give that a shot.

The kernel can be found at https://aur.archlinux.org/packages/linux-ck/

#4 2018-07-30 17:08:55

inglor
Trusted User (TU)
Registered: 2008-07-22
Posts: 81

Re: [SOLVED] CUDA not working on linux-ck kernel

SilverMight wrote:

Sorry, I wasn't clear. If you tell me the steps to reproduce it, I can give it a go on my PC, where I have linux-ck with DKMS and CUDA available. This is why I was asking for a project.

SilverMight wrote:

[..]Trying to use OBS with NVENC enabled doesn't work, using CUDA with TensorFlow doesn't work either, etc. etc. However, as soon as I boot into the standard linux kernel, everything works fine. [..]

Is this coming from a project?

#5 2018-07-31 14:41:41

SilverMight
Member
Registered: 2017-11-20
Posts: 25

Re: [SOLVED] CUDA not working on linux-ck kernel

My bad, yes. TensorFlow is pretty large, so I'd recommend installing OBS (sudo pacman -S obs-studio), going to File -> Settings -> Output, changing the recording encoder from Software to Hardware (NVENC), and then hitting Start Recording.

#6 2018-07-31 15:11:12

huyizheng
Member
Registered: 2018-05-15
Posts: 20

Re: [SOLVED] CUDA not working on linux-ck kernel

Same here.
When I use linux-ck-haswell from repo-ck, I can't run deviceQuery from the CUDA samples:

$ cd "cuda sample's directory"
$ ./bin/x86_64/linux/release/deviceQuery
./bin/x86_64/linux/release/deviceQuery Starting...

 CUDA Device Query (Runtime API) version (CUDART static linking)

cudaGetDeviceCount returned 30
-> unknown error
Result = FAIL

However, I have installed nvidia-dkms and the modules load fine:

$ lsmod | grep nvidia
nvidia_drm             45056  10
nvidia_modeset       1093632  8 nvidia_drm
nvidia              14061568  825 nvidia_modeset
drm_kms_helper        196608  2 nvidia_drm,i915
drm                   466944  13 drm_kms_helper,nvidia_drm,i915
ipmi_msghandler        57344  2 ipmi_devintf,nvidia

If I switch back to the standard linux kernel, CUDA works fine.

#7 2018-07-31 20:51:18

SilverMight
Member
Registered: 2017-11-20
Posts: 25

Re: [SOLVED] CUDA not working on linux-ck kernel

huyizheng wrote:

Same here.
When I use linux-ck-haswell in repo-ck. I can't run the deviceQuery in cuda samples:

[..]

Just tried that and got the same results as you.

./deviceQuery Starting...

 CUDA Device Query (Runtime API) version (CUDART static linking)

cudaGetDeviceCount returned 30
-> unknown error
Result = FAIL

#8 2018-07-31 21:32:22

inglor
Trusted User (TU)
Registered: 2008-07-22
Posts: 81

Re: [SOLVED] CUDA not working on linux-ck kernel

Same here, hmm (linux-ck with nvidia-dkms, which is supposed to be working fine). Could it be that nvidia-dkms (the module) is built with gcc8 and CUDA only supports gcc7?

#9 2018-07-31 21:45:16

SilverMight
Member
Registered: 2017-11-20
Posts: 25

Re: [SOLVED] CUDA not working on linux-ck kernel

inglor wrote:

Same here, hmm (linux-ck with nvidia-dkms, which is supposed to be working fine). Could it be that nvidia-dkms (the module) is built with gcc8 and CUDA only supports gcc7?

I don't think so, since it works fine on any other kernel except the -ck one.

#10 2018-08-01 07:41:55

inglor
Trusted User (TU)
Registered: 2008-07-22
Posts: 81

Re: [SOLVED] CUDA not working on linux-ck kernel

I enabled NUMA on the linux-ck kernel, recompiled, and it works fine.

$ ./deviceQuery
./deviceQuery Starting...

 CUDA Device Query (Runtime API) version (CUDART static linking)

Detected 1 CUDA Capable device(s)

Device 0: "GeForce GTX 1080"
  CUDA Driver Version / Runtime Version          9.2 / 9.2
  CUDA Capability Major/Minor version number:    6.1
  Total amount of global memory:                 8116 MBytes (8510701568 bytes)
  (20) Multiprocessors, (128) CUDA Cores/MP:     2560 CUDA Cores
  GPU Max Clock rate:                            1835 MHz (1.84 GHz)
  Memory Clock rate:                             5005 Mhz
  Memory Bus Width:                              256-bit
  L2 Cache Size:                                 2097152 bytes
  Maximum Texture Dimension Size (x,y,z)         1D=(131072), 2D=(131072, 65536), 3D=(16384, 16384, 16384)
  Maximum Layered 1D Texture Size, (num) layers  1D=(32768), 2048 layers
  Maximum Layered 2D Texture Size, (num) layers  2D=(32768, 32768), 2048 layers
  Total amount of constant memory:               65536 bytes
  Total amount of shared memory per block:       49152 bytes
  Total number of registers available per block: 65536
  Warp size:                                     32
  Maximum number of threads per multiprocessor:  2048
  Maximum number of threads per block:           1024
  Max dimension size of a thread block (x,y,z): (1024, 1024, 64)
  Max dimension size of a grid size    (x,y,z): (2147483647, 65535, 65535)
  Maximum memory pitch:                          2147483647 bytes
  Texture alignment:                             512 bytes
  Concurrent copy and kernel execution:          Yes with 2 copy engine(s)
  Run time limit on kernels:                     Yes
  Integrated GPU sharing Host Memory:            No
  Support host page-locked memory mapping:       Yes
  Alignment requirement for Surfaces:            Yes
  Device has ECC support:                        Disabled
  Device supports Unified Addressing (UVA):      Yes
  Device supports Compute Preemption:            Yes
  Supports Cooperative Kernel Launch:            Yes
  Supports MultiDevice Co-op Kernel Launch:      Yes
  Device PCI Domain ID / Bus ID / location ID:   0 / 66 / 0
  Compute Mode:
     < Default (multiple host threads can use ::cudaSetDevice() with device simultaneously) >

deviceQuery, CUDA Driver = CUDART, CUDA Driver Version = 9.2, CUDA Runtime Version = 9.2, NumDevs = 1
Result = PASS
$ uname -a
Linux tiamat 4.17.11-1-ck #1 SMP PREEMPT Wed Aug 1 07:42:19 BST 2018 x86_64 GNU/Linux
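
For anyone who wants to confirm whether their own kernel was built with NUMA support before recompiling, here is a rough sketch. `numa_status` is a made-up helper name. It assumes the kernel exposes its build configuration at /proc/config.gz (which needs CONFIG_IKCONFIG_PROC and may not be the case for every kernel); otherwise pass the path to a saved config file as the argument.

```shell
# Print "enabled", "disabled" or "unknown" depending on whether the given
# kernel config has CONFIG_NUMA=y. Hypothetical helper, not a standard tool.
# Usage: numa_status [config-file]   (defaults to /proc/config.gz)
numa_status() {
    cfg="${1:-/proc/config.gz}"
    if [ ! -r "$cfg" ]; then
        # No readable config to inspect, so no conclusion can be drawn
        echo "unknown"
        return 1
    fi
    # Decompress only when the file name ends in .gz; read plain configs as-is
    if case "$cfg" in
           *.gz) gzip -dc "$cfg" ;;
           *)    cat "$cfg" ;;
       esac | grep -q '^CONFIG_NUMA=y$'; then
        echo "enabled"
    else
        echo "disabled"
    fi
}
```

Running `numa_status` with no argument on the affected linux-ck kernel would be expected to print "disabled" if the missing NUMA support described above is the cause.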

#11 2018-08-01 09:31:05

huyizheng
Member
Registered: 2018-05-15
Posts: 20

Re: [SOLVED] CUDA not working on linux-ck kernel

inglor wrote:

I enabled NUMA on the linux-ck kernel, recompiled, and it works fine.

[..]

But linux-ck's PKGBUILD says that enabling this feature is not recommended on single-CPU platforms.

#12 2018-08-01 09:45:47

progandy
Member
Registered: 2012-05-17
Posts: 5,047

Re: [SOLVED] CUDA not working on linux-ck kernel

With CUDA you are using the GPU as a processor, so it is not a single CPU platform anymore.


#13 2018-08-01 17:14:37

SilverMight
Member
Registered: 2017-11-20
Posts: 25

Re: [SOLVED] CUDA not working on linux-ck kernel

Just compiled with NUMA enabled, and that definitely fixed the issue. Thanks for the fix.

#14 2018-08-01 17:48:37

graysky
Wiki Maintainer
From: :wq
Registered: 2008-12-01
Posts: 10,444

Re: [SOLVED] CUDA not working on linux-ck kernel

I will add a comment to the PKGBUILD for CUDA users and reference this discussion. Thank you.

