#1 2013-10-02 12:04:40

labotsirc
Member
Registered: 2013-08-21
Posts: 108

GPU fails, then works...keeps alternating like that.

For some reason my GPU, a GTX 765M, is not working as expected under Linux.
It alternates between an error and a correct execution, so every time I need to run a CUDA application I have to make two attempts. The same happens with games.

First Attempt -- fail

[cristobal@orion release]$ optirun -vv ./nbody 
[  473.394936] [DEBUG]Reading file: /etc/bumblebee/bumblebee.conf
[  473.395094] [INFO]Configured driver: nvidia
[  473.395182] [DEBUG]optirun version 3.2.1 starting...
[  473.395187] [DEBUG]Active configuration:
[  473.395190] [DEBUG] bumblebeed config file: /etc/bumblebee/bumblebee.conf
[  473.395192] [DEBUG] X display: :8
[  473.395195] [DEBUG] LD_LIBRARY_PATH: /usr/lib/nvidia:/usr/lib32/nvidia
[  473.395207] [DEBUG] Socket path: /var/run/bumblebee.socket
[  473.395209] [DEBUG] Accel/display bridge: auto
[  473.395211] [DEBUG] VGL Compression: proxy
[  473.395213] [DEBUG] VGLrun extra options: 
[  473.395216] [DEBUG] Primus LD Path: /usr/lib/primus:/usr/lib32/primus
[  473.395230] [DEBUG]Using auto-detected bridge virtualgl
[  481.623364] [INFO]Response: No - error: [XORG] (EE) Server terminated successfully (0). Closing log file.

[  481.623387] [ERROR]Cannot access secondary GPU - error: [XORG] (EE) Server terminated successfully (0). Closing log file.

[  481.623391] [DEBUG]Socket closed.
[  481.623415] [ERROR]Aborting because fallback start is disabled.
[  481.623419] [DEBUG]Killing all remaining processes.
[cristobal@orion release]$ 

Second attempt -- works

[cristobal@orion release]$ optirun -vv ./nbody 
[  951.741742] [DEBUG]Reading file: /etc/bumblebee/bumblebee.conf
[  951.742256] [INFO]Configured driver: nvidia
[  951.742487] [DEBUG]optirun version 3.2.1 starting...
[  951.742558] [DEBUG]Active configuration:
[  951.742593] [DEBUG] bumblebeed config file: /etc/bumblebee/bumblebee.conf
[  951.742637] [DEBUG] X display: :8
[  951.742672] [DEBUG] LD_LIBRARY_PATH: /usr/lib/nvidia:/usr/lib32/nvidia
[  951.742703] [DEBUG] Socket path: /var/run/bumblebee.socket
[  951.742743] [DEBUG] Accel/display bridge: auto
[  951.742774] [DEBUG] VGL Compression: proxy
[  951.742804] [DEBUG] VGLrun extra options: 
[  951.742834] [DEBUG] Primus LD Path: /usr/lib/primus:/usr/lib32/primus
[  951.742902] [DEBUG]Using auto-detected bridge virtualgl
[  952.617527] [INFO]Response: Yes. X is active.

[  952.617540] [INFO]Running application using virtualgl.
[  952.617631] [DEBUG]Process vglrun started, PID 4279.
Run "nbody -benchmark [-numbodies=<numBodies>]" to measure perfomance.
	-fullscreen       (run n-body simulation in fullscreen mode)
	-fp64             (use double precision floating point values for simulation)
	-hostmem          (stores simulation data in host memory)
	-benchmark        (run benchmark to measure performance) 
	-numbodies=<N>    (number of bodies (>= 1) to run in simulation) 
	-device=<d>       (where d=0,1,2.... for the CUDA device to use)
	-numdevices=<i>   (where i=(number of CUDA devices > 0) to use for simulation)
	-compare          (compares simulation results running once on the default GPU and once on the CPU)
	-cpu              (run n-body simulation on the CPU)
	-tipsy=<file.bin> (load a tipsy model file for simulation)

> Windowed mode
> Simulation data stored in video memory
> Single precision floating point simulation
> 1 Devices used for simulation
> Compute 3.0 CUDA device: [GeForce GTX 765M]
[  954.523442] [DEBUG]SIGCHILD received, but wait failed with No child processes
[  954.523457] [DEBUG]Socket closed.
[  954.523472] [DEBUG]Killing all remaining processes.
[cristobal@orion release]$ 

This alternating behavior shows up with every GPU-based program.
Any ideas?
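
As a stopgap until the cause is found, the two-attempt routine could be wrapped in a small retry helper. This is only a sketch: the function name retry is made up for illustration, and it assumes optirun exits with a nonzero status when it cannot access the secondary GPU. A program that fails for its own reasons would also be rerun.

```sh
#!/bin/sh
# retry MAX_TRIES COMMAND [ARGS...]
# Hypothetical helper: runs COMMAND up to MAX_TRIES times and stops at the
# first attempt that exits with status 0.
retry() {
    max_tries=$1
    shift
    attempt=1
    while [ "$attempt" -le "$max_tries" ]; do
        if "$@"; then
            return 0
        fi
        echo "attempt $attempt failed" >&2
        attempt=$((attempt + 1))
    done
    # Every attempt failed.
    return 1
}

# Usage: retry 2 optirun ./nbody
```

This hides the symptom and does not fix whatever makes the first X server start fail.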

Last edited by labotsirc (2013-10-03 13:19:04)

#2 2013-12-26 23:23:04

k0tb4tzen
Member
Registered: 2013-11-12
Posts: 32

Re: GPU fails, then works...keeps alternating like that.

I have the same problem with kernel 3.12-4: it's impossible to start a Steam game on the dedicated graphics card. I have an NVIDIA GeForce GT 750M in an Acer Aspire V5-573G.

first attempt:

optirun -vv glxgears
[ 1433.667049] [DEBUG]Reading file: /etc/bumblebee/bumblebee.conf
[ 1433.667214] [INFO]Configured driver: nvidia
[ 1433.667290] [DEBUG]optirun version 3.2.1 starting...
[ 1433.667300] [DEBUG]Active configuration:
[ 1433.667307] [DEBUG] bumblebeed config file: /etc/bumblebee/bumblebee.conf
[ 1433.667313] [DEBUG] X display: :8
[ 1433.667320] [DEBUG] LD_LIBRARY_PATH: /usr/lib32/primus:/usr/lib/nvidia:/usr/lib32/nvidia
[ 1433.667327] [DEBUG] Socket path: /var/run/bumblebee.socket
[ 1433.667334] [DEBUG] Accel/display bridge: auto
[ 1433.667341] [DEBUG] VGL Compression: proxy
[ 1433.667348] [DEBUG] VGLrun extra options: 
[ 1433.667355] [DEBUG] Primus LD Path: /usr/lib/primus:/usr/lib32/primus
[ 1433.667373] [DEBUG]Using auto-detected bridge virtualgl
[ 1438.456231] [INFO]Response: No - error: [XORG] (EE) Server terminated successfully (0). Closing log file.

[ 1438.456244] [ERROR]Cannot access secondary GPU - error: [XORG] (EE) Server terminated successfully (0). Closing log file.

[ 1438.456249] [DEBUG]Socket closed.
[ 1438.456259] [ERROR]Aborting because fallback start is disabled.
[ 1438.456263] [DEBUG]Killing all remaining processes.

second attempt:

optirun -vv glxgears
[ 1484.436462] [DEBUG]Reading file: /etc/bumblebee/bumblebee.conf
[ 1484.436676] [INFO]Configured driver: nvidia
[ 1484.436797] [DEBUG]optirun version 3.2.1 starting...
[ 1484.436805] [DEBUG]Active configuration:
[ 1484.436808] [DEBUG] bumblebeed config file: /etc/bumblebee/bumblebee.conf
[ 1484.436811] [DEBUG] X display: :8
[ 1484.436813] [DEBUG] LD_LIBRARY_PATH: /usr/lib32/primus:/usr/lib/nvidia:/usr/lib32/nvidia
[ 1484.436816] [DEBUG] Socket path: /var/run/bumblebee.socket
[ 1484.436818] [DEBUG] Accel/display bridge: auto
[ 1484.436821] [DEBUG] VGL Compression: proxy
[ 1484.436823] [DEBUG] VGLrun extra options: 
[ 1484.436826] [DEBUG] Primus LD Path: /usr/lib/primus:/usr/lib32/primus
[ 1484.436843] [DEBUG]Using auto-detected bridge virtualgl
[ 1485.129080] [INFO]Response: Yes. X is active.

[ 1485.129092] [INFO]Running application using virtualgl.
[ 1485.129173] [DEBUG]Process vglrun started, PID 3266.

EDIT:
As suggested in https://bbs.archlinux.org/viewtopic.php … 3#p1309653, the following command fixed it for me:

sudo tee /sys/module/rcutree/parameters/rcu_idle_gp_delay <<<1
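
To see what the parameter was set to before changing it, something along these lines might help. It is a rough sketch, not a tested fix: the helper name set_param is invented here, and it has to run as root because it writes the file directly instead of going through sudo tee. A value written under /sys lasts only until the next reboot.

```sh
#!/bin/sh
# Path taken from the command above; it exists only on kernels that expose
# this parameter.
PARAM=/sys/module/rcutree/parameters/rcu_idle_gp_delay

# set_param FILE VALUE
# Hypothetical helper: reports the old value, writes the new one, then reads
# the file back so the change can be confirmed.
set_param() {
    file=$1
    value=$2
    if [ ! -w "$file" ]; then
        echo "cannot write $file (missing, or not running as root)" >&2
        return 1
    fi
    old=$(cat "$file")
    printf '%s\n' "$value" > "$file" || return 1
    echo "$file: was $old, now $(cat "$file")"
}

# Usage (as root): set_param "$PARAM" 1
```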

Last edited by k0tb4tzen (2013-12-26 23:30:16)
