The problems are:
NVIDIA nsight cannot find the NVIDIA GeForce GT 630M card.
The CUDA compiled programs cannot run correctly on GPU.
Some information about the environment:
$ uname -a
Linux dedsec 4.14.12-1-ARCH #1 SMP PREEMPT Fri Jan 5 18:19:34 UTC 2018 x86_64 GNU/Linux
$ lspci | grep VGA
00:02.0 VGA compatible controller: Intel Corporation 3rd Gen Core processor Graphics Controller (rev 09)
01:00.0 VGA compatible controller: NVIDIA Corporation GF108M [GeForce GT 620M/630M/635M/640M LE] (rev a1)
$ echo $PATH
/opt/cuda/bin:/opt/cuda/bin:/usr/local/sbin:/usr/local/bin:/usr/bin:/opt/android-sdk/platform-tools:/opt/android-sdk/tools:/opt/android-sdk/tools/bin:/opt/cuda/bin:/usr/lib/jvm/default/bin:/usr/bin/site_perl:/usr/bin/vendor_perl:/usr/bin/core_perl
I have both an Intel HD 4000 and an NVIDIA GeForce GT 630M graphics card, so I use bumblebee to run programs.
The command optirun runs fine on other programs but when I type:
$ optirun nsight
OpenJDK 64-Bit Server VM warning: ignoring option MaxPermSize=256m; support was removed in 8.0
CompilerOracle: exclude java/lang/reflect/Array.newInstance
### Excluding compile: static java.lang.reflect.Array::newInstance
OpenJDK 64-Bit Server VM warning: You have loaded library /home/kvar_ispw17/.eclipse/org.eclipse.platform_4.4.1_939389857_linux_gtk_x86_64/configuration/org.eclipse.osgi/19/0/.cp/libvp_linux_x86_64.so which might have disabled stack guard. The VM will try to fix the stack guard now.
It's highly recommended that you fix the library with 'execstack -c <libfile>', or link it with '-z noexecstack'.
Then a window appears that says:
Unable to locate CUDA libraries and establish connection with CUDA driver.
unknown event sm_cta_launched
And here is the output of the CUDA test utilities:
[root@dedsec release]# ./deviceQuery
./deviceQuery Starting...
CUDA Device Query (Runtime API) version (CUDART static linking)
Detected 1 CUDA Capable device(s)
Device 0: "GeForce GT 630M"
CUDA Driver Version / Runtime Version 9.1 / 9.1
CUDA Capability Major/Minor version number: 2.1
Total amount of global memory: 964 MBytes (1011220480 bytes)
MapSMtoCores for SM 2.1 is undefined. Default to use 64 Cores/SM
MapSMtoCores for SM 2.1 is undefined. Default to use 64 Cores/SM
( 2) Multiprocessors, ( 64) CUDA Cores/MP: 128 CUDA Cores
GPU Max Clock rate: 950 MHz (0.95 GHz)
Memory Clock rate: 900 Mhz
Memory Bus Width: 128-bit
L2 Cache Size: 131072 bytes
Maximum Texture Dimension Size (x,y,z) 1D=(65536), 2D=(65536, 65535), 3D=(2048, 2048, 2048)
Maximum Layered 1D Texture Size, (num) layers 1D=(16384), 2048 layers
Maximum Layered 2D Texture Size, (num) layers 2D=(16384, 16384), 2048 layers
Total amount of constant memory: 65536 bytes
Total amount of shared memory per block: 49152 bytes
Total number of registers available per block: 32768
Warp size: 32
Maximum number of threads per multiprocessor: 1536
Maximum number of threads per block: 1024
Max dimension size of a thread block (x,y,z): (1024, 1024, 64)
Max dimension size of a grid size (x,y,z): (65535, 65535, 65535)
Maximum memory pitch: 2147483647 bytes
Texture alignment: 512 bytes
Concurrent copy and kernel execution: Yes with 1 copy engine(s)
Run time limit on kernels: No
Integrated GPU sharing Host Memory: No
Support host page-locked memory mapping: Yes
Alignment requirement for Surfaces: Yes
Device has ECC support: Disabled
Device supports Unified Addressing (UVA): Yes
Supports Cooperative Kernel Launch: No
Supports MultiDevice Co-op Kernel Launch: No
Device PCI Domain ID / Bus ID / location ID: 0 / 1 / 0
Compute Mode:
< Default (multiple host threads can use ::cudaSetDevice() with device simultaneously) >
deviceQuery, CUDA Driver = CUDART, CUDA Driver Version = 9.1, CUDA Runtime Version = 9.1, NumDevs = 1
Result = PASS
[root@dedsec release]# ./bandwidthTest
[CUDA Bandwidth Test] - Starting...
Running on...
Device 0: GeForce GT 630M
Quick Mode
Host to Device Bandwidth, 1 Device(s)
PINNED Memory Transfers
Transfer Size (Bytes) Bandwidth(MB/s)
33554432 5520.8
Device to Host Bandwidth, 1 Device(s)
PINNED Memory Transfers
Transfer Size (Bytes) Bandwidth(MB/s)
33554432 5316.6
Device to Device Bandwidth, 1 Device(s)
PINNED Memory Transfers
Transfer Size (Bytes) Bandwidth(MB/s)
33554432 24882.4
Result = PASS
NOTE: The CUDA Samples are not meant for performance measurements. Results may vary when GPU Boost is enabled.
However, when I run actual CUDA programs, all of them fail with a runtime error:
[root@dedsec release]# optirun ./lineOfSight
[./lineOfSight] - Starting...
MapSMtoCores for SM 2.1 is undefined. Default to use 64 Cores/SM
GPU Device 0: "GeForce GT 630M" with compute capability 2.1
CUDA error at lineOfSight.cu:166 code=18(cudaErrorInvalidTexture) "cudaBindTextureToArray(g_HeightFieldTex, heightFieldArray, channelDesc)"
Any ideas about this? Thanks in advance!
Last edited by Kvar_ispw17 (2018-01-09 23:45:47)
Hi
cuda 9.1.85-1 no longer supports compute capability 2.0 and 2.1 (Fermi).
A+
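The check behind this answer can be sketched in a few lines of C++. This is only a minimal sketch: it assumes 3.0 is the lowest compute capability that CUDA 9.x accepts, and the helper names `parse_capability` and `is_supported` are hypothetical (they are not part of the CUDA toolkit). The capability string is the one deviceQuery prints, e.g. "2.1".

```cpp
#include <string>
#include <utility>

// Hypothetical helper: parse a compute capability string such as "2.1"
// into a (major, minor) pair. A string without a dot is read as "X.0".
std::pair<int, int> parse_capability(const std::string& s) {
    const auto dot = s.find('.');
    const int major = std::stoi(s.substr(0, dot));
    const int minor =
        (dot == std::string::npos) ? 0 : std::stoi(s.substr(dot + 1));
    return {major, minor};
}

// Hypothetical helper: true if the device capability is at least the
// minimum the toolkit accepts. std::pair compares major first, then minor.
bool is_supported(const std::string& device_cc, const std::string& minimum_cc) {
    return parse_capability(device_cc) >= parse_capability(minimum_cc);
}
```

With the assumed minimum of "3.0", `is_supported("2.1", "3.0")` is false for the GT 630M, while any Kepler or newer card passes.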
THANKS A LOT!
I just found a webpage here: cuda-c-programming-guide.
And I found this:
The compute capability comprises a major revision number X and a minor revision number Y and is denoted by X.Y.
Devices with the same major revision number are of the same core architecture. The major revision number is 7 for devices based on the Volta architecture, 6 for devices based on the Pascal architecture, 5 for devices based on the Maxwell architecture, 3 for devices based on the Kepler architecture, 2 for devices based on the Fermi architecture, and 1 for devices based on the Tesla architecture.
The minor revision number corresponds to an incremental improvement to the core architecture, possibly including new features.
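As a rough illustration of the passage above, the major revision number can be mapped to an architecture name like this. It is a minimal C++ sketch that covers only the architectures listed in the quoted text; the function name `architecture_name` is made up for the example.

```cpp
#include <string>

// Hypothetical helper: map the major revision number of a compute
// capability to the architecture name given in the quoted guide text.
// Majors not mentioned there (e.g. 4) fall through to "unknown".
std::string architecture_name(int major) {
    switch (major) {
        case 1: return "Tesla";
        case 2: return "Fermi";
        case 3: return "Kepler";
        case 5: return "Maxwell";
        case 6: return "Pascal";
        case 7: return "Volta";
        default: return "unknown";
    }
}
```

For the GT 630M, deviceQuery reports capability 2.1, so `architecture_name(2)` gives "Fermi".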
Anyway, this laptop is really old (produced in 2011). Thanks for your answer!
Now it's running perfectly.