#1 2022-05-14 11:14:23

feng
Member
Registered: 2019-03-26
Posts: 11

Nvidia docker: Error response from daemon

I have two legacy Tesla K80 GPUs on my server. Since the tensorflow and pytorch packages in the repo do not support sm37, I switched to docker containers. I followed the wiki article https://wiki.archlinux.org/title/Docker … commended)

Unfortunately, when I ran `docker run -it -u $(id -u):$(id -g) --gpus all -v /cache/code:/code tensorflow/tensorflow:2.7.1-gpu bash` yesterday to bring up my container, I got the following error:

> docker: Error response from daemon: failed to create shim task: OCI runtime create failed: runc create failed: unable to start container process: error during container init: error running hook #0: error running hook: exit status 1, stdout: , stderr: nvidia-container-cli: initialization error: driver rpc error: timed out: unknown.



This is very strange, as I remember I did not have this problem when starting an Nvidia GPU container several weeks ago.


After a few hours, I got the container running by executing `sudo pkill -SIGHUP dockerd` and `sudo systemctl restart docker` before `docker run -it -u $(id -u):$(id -g) --gpus all -v /cache/code:/code tensorflow/tensorflow:2.7.1-gpu bash`.
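
For reference, the workaround above can be wrapped in a small script. This is only a sketch of the same three commands in the same order; the names `run`, `retry_gpu_container` and the `DRY_RUN` switch are made up here for illustration and are not part of docker or the Nvidia tooling. It does not address whatever causes the timeout.

```shell
#!/bin/sh
# Sketch of the workaround from this post: signal and restart the Docker
# daemon, then retry the GPU container.
# DRY_RUN is a hypothetical switch added for illustration: when set to 1,
# each command is printed instead of executed.

run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

retry_gpu_container() {
    # Send SIGHUP to dockerd, as in the post.
    run sudo pkill -SIGHUP dockerd
    # Restart the Docker service.
    run sudo systemctl restart docker
    # Start the container again with the same options as before.
    run docker run -it -u "$(id -u):$(id -g)" --gpus all \
        -v /cache/code:/code tensorflow/tensorflow:2.7.1-gpu bash
}
```

After sourcing the file, `DRY_RUN=1; retry_gpu_container` only prints the three commands, and calling `retry_gpu_container` without `DRY_RUN` set actually runs them.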

But this is just a workaround. I would like to know whether anyone has had the same problem and whether there is a proper solution.

Last edited by feng (2022-05-14 11:14:37)
