[SOLVED] Unable to locate CUDA libraries and establish connection

Kvar_ispw17 · 2018-01-09 13:26:32

The problems are:

NVIDIA Nsight cannot find the NVIDIA GeForce GT 630M card.
CUDA-compiled programs do not run correctly on the GPU.

Some information about the environment:

$ uname -a
Linux dedsec 4.14.12-1-ARCH #1 SMP PREEMPT Fri Jan 5 18:19:34 UTC 2018 x86_64 GNU/Linux
$ lspci | grep VGA
00:02.0 VGA compatible controller: Intel Corporation 3rd Gen Core processor Graphics Controller (rev 09)
01:00.0 VGA compatible controller: NVIDIA Corporation GF108M [GeForce GT 620M/630M/635M/640M LE] (rev a1)
$ echo $PATH
/opt/cuda/bin:/opt/cuda/bin:/usr/local/sbin:/usr/local/bin:/usr/bin:/opt/android-sdk/platform-tools:/opt/android-sdk/tools:/opt/android-sdk/tools/bin:/opt/cuda/bin:/usr/lib/jvm/default/bin:/usr/bin/site_perl:/usr/bin/vendor_perl:/usr/bin/core_perl

I have both an Intel HD 4000 and an NVIDIA GeForce GT 630M graphics card, so I use Bumblebee to run programs.
The optirun command works fine with other programs, but when I type:

$ optirun nsight
OpenJDK 64-Bit Server VM warning: ignoring option MaxPermSize=256m; support was removed in 8.0
CompilerOracle: exclude java/lang/reflect/Array.newInstance
### Excluding compile: static java.lang.reflect.Array::newInstance
OpenJDK 64-Bit Server VM warning: You have loaded library /home/kvar_ispw17/.eclipse/org.eclipse.platform_4.4.1_939389857_linux_gtk_x86_64/configuration/org.eclipse.osgi/19/0/.cp/libvp_linux_x86_64.so which might have disabled stack guard. The VM will try to fix the stack guard now.
It's highly recommended that you fix the library with 'execstack -c <libfile>', or link it with '-z noexecstack'.

Then a window appears showing:

Unable to locate CUDA libraries and establish connection with CUDA driver.
unknown event sm_cta_launched

And here is the output of the CUDA test utilities:

[root@dedsec release]# ./deviceQuery
./deviceQuery Starting...

 CUDA Device Query (Runtime API) version (CUDART static linking)

Detected 1 CUDA Capable device(s)

Device 0: "GeForce GT 630M"
  CUDA Driver Version / Runtime Version          9.1 / 9.1
  CUDA Capability Major/Minor version number:    2.1
  Total amount of global memory:                 964 MBytes (1011220480 bytes)
MapSMtoCores for SM 2.1 is undefined.  Default to use 64 Cores/SM
MapSMtoCores for SM 2.1 is undefined.  Default to use 64 Cores/SM
  ( 2) Multiprocessors, ( 64) CUDA Cores/MP:     128 CUDA Cores
  GPU Max Clock rate:                            950 MHz (0.95 GHz)
  Memory Clock rate:                             900 Mhz
  Memory Bus Width:                              128-bit
  L2 Cache Size:                                 131072 bytes
  Maximum Texture Dimension Size (x,y,z)         1D=(65536), 2D=(65536, 65535), 3D=(2048, 2048, 2048)
  Maximum Layered 1D Texture Size, (num) layers  1D=(16384), 2048 layers
  Maximum Layered 2D Texture Size, (num) layers  2D=(16384, 16384), 2048 layers
  Total amount of constant memory:               65536 bytes
  Total amount of shared memory per block:       49152 bytes
  Total number of registers available per block: 32768
  Warp size:                                     32
  Maximum number of threads per multiprocessor:  1536
  Maximum number of threads per block:           1024
  Max dimension size of a thread block (x,y,z): (1024, 1024, 64)
  Max dimension size of a grid size    (x,y,z): (65535, 65535, 65535)
  Maximum memory pitch:                          2147483647 bytes
  Texture alignment:                             512 bytes
  Concurrent copy and kernel execution:          Yes with 1 copy engine(s)
  Run time limit on kernels:                     No
  Integrated GPU sharing Host Memory:            No
  Support host page-locked memory mapping:       Yes
  Alignment requirement for Surfaces:            Yes
  Device has ECC support:                        Disabled
  Device supports Unified Addressing (UVA):      Yes
  Supports Cooperative Kernel Launch:            No
  Supports MultiDevice Co-op Kernel Launch:      No
  Device PCI Domain ID / Bus ID / location ID:   0 / 1 / 0
  Compute Mode:
     < Default (multiple host threads can use ::cudaSetDevice() with device simultaneously) >

deviceQuery, CUDA Driver = CUDART, CUDA Driver Version = 9.1, CUDA Runtime Version = 9.1, NumDevs = 1
Result = PASS
[root@dedsec release]# ./bandwidthTest 
[CUDA Bandwidth Test] - Starting...
Running on...

 Device 0: GeForce GT 630M
 Quick Mode

 Host to Device Bandwidth, 1 Device(s)
 PINNED Memory Transfers
   Transfer Size (Bytes)	Bandwidth(MB/s)
   33554432			5520.8

 Device to Host Bandwidth, 1 Device(s)
 PINNED Memory Transfers
   Transfer Size (Bytes)	Bandwidth(MB/s)
   33554432			5316.6

 Device to Device Bandwidth, 1 Device(s)
 PINNED Memory Transfers
   Transfer Size (Bytes)	Bandwidth(MB/s)
   33554432			24882.4

Result = PASS

NOTE: The CUDA Samples are not meant for performance measurements. Results may vary when GPU Boost is enabled.

However, when I run the other CUDA sample programs, all of them fail with a runtime error:

[root@dedsec release]# optirun ./lineOfSight 
[./lineOfSight] - Starting...
MapSMtoCores for SM 2.1 is undefined.  Default to use 64 Cores/SM
GPU Device 0: "GeForce GT 630M" with compute capability 2.1

CUDA error at lineOfSight.cu:166 code=18(cudaErrorInvalidTexture) "cudaBindTextureToArray(g_HeightFieldTex, heightFieldArray, channelDesc)"

Any ideas about this? Thanks in advance!

Last edited by Kvar_ispw17 (2018-01-09 23:45:47)

jean_no · 2018-01-09 16:53:48

Hi

cuda 9.1.85-1 has removed support for Compute Capability 2.0 and 2.1.

A+
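To make the check above concrete, here is a minimal sketch (not part of the original reply) that reads the compute capability from deviceQuery output and compares it against a minimum. The threshold of 3.0 for CUDA 9.x and the function names `parse_capability` and `is_supported` are assumptions made for illustration.

```python
import re

# Assumption: CUDA 9.x toolkits only target compute capability 3.0 and newer.
MIN_SUPPORTED = (3, 0)


def parse_capability(device_query_output):
    """Extract (major, minor) from deviceQuery's capability line."""
    match = re.search(
        r"CUDA Capability Major/Minor version number:\s*(\d+)\.(\d+)",
        device_query_output,
    )
    if match is None:
        raise ValueError("no compute capability line found")
    return int(match.group(1)), int(match.group(2))


def is_supported(capability, minimum=MIN_SUPPORTED):
    """Tuples compare element by element, so (2, 1) < (3, 0)."""
    return capability >= minimum


# Line copied from the deviceQuery output in the first post.
sample = "  CUDA Capability Major/Minor version number:    2.1"
capability = parse_capability(sample)
print(capability, is_supported(capability))
```

This would explain why deviceQuery and bandwidthTest pass (they only query the device and copy memory) while samples that use device code, such as lineOfSight, fail at runtime.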

Kvar_ispw17 · 2018-01-09 23:45:14

jean_no wrote:

Hi
cuda 9.1.85-1 has removed support for Compute Capability 2.0 and 2.1.
A+

Thanks a lot!
I just found this page: cuda-c-programming-guide.
It says:

The compute capability comprises a major revision number X and a minor revision number Y and is denoted by X.Y.
Devices with the same major revision number are of the same core architecture. The major revision number is 7 for devices based on the Volta architecture, 6 for devices based on the Pascal architecture, 5 for devices based on the Maxwell architecture, 3 for devices based on the Kepler architecture, 2 for devices based on the Fermi architecture, and 1 for devices based on the Tesla architecture.
The minor revision number corresponds to an incremental improvement to the core architecture, possibly including new features.
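As a small illustration of the X.Y scheme quoted above, the sketch below splits a capability string and looks up the architecture by major number. The table only contains the entries listed in the quoted passage, and the names `ARCHITECTURES`, `split_capability` and `architecture_of` are made up for this example.

```python
# Major revision number -> architecture, as listed in the quoted passage.
ARCHITECTURES = {
    7: "Volta",
    6: "Pascal",
    5: "Maxwell",
    3: "Kepler",
    2: "Fermi",
    1: "Tesla",
}


def split_capability(text):
    """Split a string like '2.1' into (major, minor) integers."""
    major, minor = text.strip().split(".")
    return int(major), int(minor)


def architecture_of(text):
    """Return the architecture name for a capability string, or 'unknown'."""
    major, _minor = split_capability(text)
    return ARCHITECTURES.get(major, "unknown")


# The GT 630M reports 2.1 in deviceQuery.
print(architecture_of("2.1"))
```

So the GT 630M, with compute capability 2.1, is a Fermi device.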

Anyway, this laptop really is old (produced in 2011). Thanks for your answer!

Kvar_ispw17 · 2018-01-10 00:48:35

Now it's running perfectly.
