Wednesday, July 20, 2022

Install cuDNN 8

Official doc:

https://docs.nvidia.com/deeplearning/cudnn/install-guide/index.html#package-manager-ubuntu-install 


Download header and libs

https://developer.nvidia.com/rdp/cudnn-download

Procedure

  1. Navigate to your <cudnnpath> directory containing the cuDNN tar file.
  2. Unzip the cuDNN package.
    $ tar -xvf cudnn-linux-x86_64-8.x.x.x_cudaX.Y-archive.tar.xz
  3. Copy the following files into the CUDA toolkit directory.
    $ sudo cp cudnn-*-archive/include/cudnn*.h /usr/local/cuda/include 
    $ sudo cp -P cudnn-*-archive/lib/libcudnn* /usr/local/cuda/lib64 
    $ sudo chmod a+r /usr/local/cuda/include/cudnn*.h /usr/local/cuda/lib64/libcudnn*
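
To confirm the copy worked, one option is to read the version macros back from the installed header. This is a minimal sketch, assuming the cuDNN 8 layout in which CUDNN_MAJOR, CUDNN_MINOR and CUDNN_PATCHLEVEL are defined in cudnn_version.h. The cudnn_version helper name is hypothetical, not part of cuDNN.

```shell
# Print MAJOR.MINOR.PATCHLEVEL parsed from a cudnn_version.h-style header.
# Assumption: the header defines CUDNN_MAJOR, CUDNN_MINOR and CUDNN_PATCHLEVEL.
cudnn_version() {
    header="${1:-/usr/local/cuda/include/cudnn_version.h}"
    [ -r "$header" ] || { echo "header not readable: $header" >&2; return 1; }
    awk '$1 == "#define" && $2 == "CUDNN_MAJOR"      { major = $3 }
         $1 == "#define" && $2 == "CUDNN_MINOR"      { minor = $3 }
         $1 == "#define" && $2 == "CUDNN_PATCHLEVEL" { patch = $3 }
         END {
             if (major == "") exit 1
             printf "%s.%s.%s\n", major, minor, patch
         }' "$header"
}

# Usage, after the copy step:
#   cudnn_version
```

A non-zero exit status means the header is missing or does not contain the expected macros, which usually points at a wrong target directory in step 3.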

Wednesday, June 8, 2022

Docker error: could not select device driver

Problem:

docker: Error response from daemon: could not select device driver "" with capabilities: [[gpu]].

Solution:

Install nvidia-docker

Install guide: https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/install-guide.html#docker


Installing on Ubuntu and Debian

The following steps can be used to set up NVIDIA Container Toolkit on Ubuntu LTS (16.04, 18.04, 20.04) and Debian (Stretch, Buster) distributions.

Setting up Docker

Docker-CE on Ubuntu can be set up using Docker’s official convenience script:

$ curl https://get.docker.com | sh \
  && sudo systemctl --now enable docker

See also

Follow the official instructions for more details and post-install actions.

Setting up NVIDIA Container Toolkit

Set up the package repository and the GPG key:

$ distribution=$(. /etc/os-release;echo $ID$VERSION_ID) \
      && curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey | sudo gpg --dearmor -o /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg \
      && curl -s -L https://nvidia.github.io/libnvidia-container/$distribution/libnvidia-container.list | \
            sed 's#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://#g' | \
            sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list
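
The first line of that command builds the repository path from /etc/os-release. The sketch below shows what the value looks like. The distribution_id function and its optional file argument are hypothetical, added only so the expression can be tried against any os-release-style file.

```shell
# Print the "$ID$VERSION_ID" string used in the repository URL.
# The optional argument (path to an os-release file) is for illustration only.
distribution_id() {
    os_release="${1:-/etc/os-release}"
    # Source in a subshell so ID and VERSION_ID do not leak into the caller.
    ( . "$os_release" && printf '%s%s\n' "$ID" "$VERSION_ID" )
}

# Usage:
#   distribution_id        # e.g. ubuntu20.04 on Ubuntu 20.04
```

If the printed value is not one of the supported distributions, the curl for libnvidia-container.list will not return a usable list file.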

Note

To get access to experimental features and access to release candidates, you may want to add the experimental branch to the repository listing:

$  distribution=$(. /etc/os-release;echo $ID$VERSION_ID) \
      && curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey | sudo gpg --dearmor -o /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg \
      && curl -s -L https://nvidia.github.io/libnvidia-container/experimental/$distribution/libnvidia-container.list | \
         sed 's#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://#g' | \
         sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list

If running apt update after configuring the repositories raises an error about a conflict in the Signed-By option, see the relevant troubleshooting section.

Install the nvidia-docker2 package (and dependencies) after updating the package listing:

$ sudo apt-get update
$ sudo apt-get install -y nvidia-docker2

Restart the Docker daemon to complete the installation after setting the default runtime:

$ sudo systemctl restart docker
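
After the restart, it can help to check that Docker actually registered the NVIDIA runtime before retrying the failing command. A minimal sketch, assuming `docker info` prints a "Runtimes:" line listing the registered runtimes. has_nvidia_runtime is a hypothetical helper that reads that output on stdin.

```shell
# Succeed if the "Runtimes:" line of `docker info` output (read from stdin)
# mentions an nvidia runtime.
# Assumption: `docker info` prints a line of the form "Runtimes: ... nvidia ...".
has_nvidia_runtime() {
    grep -E '^[[:space:]]*Runtimes:' | grep -qw 'nvidia'
}

# Usage:
#   docker info 2>/dev/null | has_nvidia_runtime && echo "nvidia runtime registered"
```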





Tuesday, May 31, 2022

GPG error: "public key is not available" in Ubuntu

For both server and container:

Official solution

https://developer.nvidia.com/blog/updating-the-cuda-linux-gpg-repository-key/

Remove the outdated signing key

Debian, Ubuntu, WSL

$ sudo apt-key del 7fa2af80
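
To check whether the old key is still in the apt keyring, the key listing can be searched for its ID. This is a sketch, assuming `apt-key list` prints fingerprints as space-separated groups of hex digits, so a plain grep for 7fa2af80 could miss it. has_key is a hypothetical helper that reads the listing on stdin.

```shell
# Succeed if the key listing on stdin contains the given key ID.
# Spaces are stripped and case is normalized before matching.
has_key() {
    key_id=$(printf '%s' "$1" | tr 'a-f' 'A-F')
    tr -d ' ' | tr 'a-f' 'A-F' | grep -q "$key_id"
}

# Usage:
#   apt-key list 2>/dev/null | has_key 7fa2af80 && echo "old key still present"
```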

Install the new key

For Debian-based distributions, including Ubuntu, you must also install the new package or manually install the new signing key.

Install the new cuda-keyring package

To avoid the need for manual key installation steps, NVIDIA is providing a new helper package to automate the installation of new signing keys for NVIDIA repositories. 

Replace $distro/$arch in the following commands with values appropriate for your OS; for example:

  • ubuntu1604/x86_64
  • ubuntu1804/cross-linux-sbsa
  • ubuntu1804/ppc64el
  • ubuntu1804/sbsa
  • ubuntu1804/x86_64
  • ubuntu2004/cross-linux-sbsa
  • ubuntu2004/sbsa
  • ubuntu2004/x86_64
  • ubuntu2204/sbsa
  • ubuntu2204/x86_64
  • wsl-ubuntu/x86_64

Debian, Ubuntu, WSL

$ wget https://developer.download.nvidia.com/compute/cuda/repos/$distro/$arch/cuda-keyring_1.0-1_all.deb
$ sudo dpkg -i cuda-keyring_1.0-1_all.deb
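
The $distro/$arch substitution can be scripted. The sketch below builds the download URL from the two values. Both function names are hypothetical. ubuntu_distro assumes that $distro is the os-release ID followed by VERSION_ID with the dot removed, which matches the ubuntuNNNN entries in the list above but not wsl-ubuntu.

```shell
# Build the cuda-keyring download URL for a given $distro and $arch.
cuda_keyring_url() {
    distro="$1"
    arch="$2"
    printf 'https://developer.download.nvidia.com/compute/cuda/repos/%s/%s/cuda-keyring_1.0-1_all.deb\n' \
        "$distro" "$arch"
}

# Assumption: $distro is ID plus VERSION_ID without the dot (ubuntu + 20.04 -> ubuntu2004).
# The optional argument (path to an os-release file) is for illustration only.
ubuntu_distro() {
    os_release="${1:-/etc/os-release}"
    ( . "$os_release" && printf '%s%s\n' "$ID" "$(printf '%s' "$VERSION_ID" | tr -d '.')" )
}

# Usage:
#   wget "$(cuda_keyring_url "$(ubuntu_distro)" x86_64)"
```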

Common issues and solutions on Debian-based distros

Here are some common errors that we’ve helped people with.

Duplicate .list entries

E: Conflicting values set for option Signed-By regarding source https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64/ /: /usr/share/keyrings/cuda-archive-keyring.gpg !=
E: The list of sources could not be read.

Solution: If you previously used add-apt-repository to enable the CUDA repository, then remove the duplicate entry.

sudo sed -i '/developer\.download\.nvidia\.com\/compute\/cuda\/repos/d' /etc/apt/sources.list

Also check for and remove cuda*.list files under the /etc/apt/sources.list.d/ directory.
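
Before deleting anything, it may be easier to list which files actually contain a CUDA repository entry. A minimal sketch, where find_cuda_sources is a hypothetical helper name:

```shell
# List files under the given paths that contain a CUDA repository entry.
# Exits non-zero when nothing matches.
find_cuda_sources() {
    grep -rl 'developer\.download\.nvidia\.com/compute/cuda/repos' "$@" 2>/dev/null
}

# Usage:
#   find_cuda_sources /etc/apt/sources.list /etc/apt/sources.list.d
```

More than one file in the output is a likely source of the Signed-By conflict.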

----------------------------------------------

Error

Err:6 http://packages.microsoft.com/repos/azurecore focal InRelease

  The following signatures couldn't be verified because the public key is not available: NO_PUBKEY EB3E94ADBE1229CF

Reading package lists... Done

W: GPG error: http://packages.microsoft.com/repos/azurecore focal InRelease: The following signatures couldn't be verified because the public key is not available: NO_PUBKEY EB3E94ADBE1229CF

Solution

sudo apt-key adv --keyserver hkp://keyserver.ubuntu.com:80 --recv-keys EB3E94ADBE1229CF
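
The key ID to pass to --recv-keys is the hex string after NO_PUBKEY in the error. When several repositories fail, the IDs can be pulled out of the apt output in one go. A sketch, with missing_pubkeys as a hypothetical helper that reads apt output on stdin:

```shell
# Print the unique key IDs that follow "NO_PUBKEY" in apt output read from stdin.
missing_pubkeys() {
    grep -oE 'NO_PUBKEY [0-9A-F]+' | awk '{ print $2 }' | sort -u
}

# Usage:
#   sudo apt-get update 2>&1 | missing_pubkeys
```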

Problem in Docker

RUN sudo apt-key adv --fetch-keys https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2004/x86_64/3bf863cc.pub

or (replace $distro/$arch as described above):


RUN wget https://developer.download.nvidia.com/compute/cuda/repos/$distro/$arch/cuda-keyring_1.0-1_all.deb
RUN sudo dpkg -i cuda-keyring_1.0-1_all.deb


Update 2023-07-05

Error:

gpgkeys: key F60F4B3D7FA2AF80 not found on keyserver





The solution is:
wget -qO - https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64/7fa2af80.pub | sudo apt-key add -

as documented here: https://developer.nvidia.com/cuda-downloads -> Linux -> x86_64 -> Ubuntu -> 18.04 -> deb (network)



Conflicting key error from "sudo apt update" when the duplicate entry is not in /etc/apt/sources.list

E: Conflicting values set for option Signed-By regarding source https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64/ /: /usr/share/keyrings/cuda-archive-keyring.gpg !=

Solution 1:
Remove /etc/apt/sources.list.d/cuda* and /etc/apt/sources.list.d/nccl*.
Delete the CUDA key.

Solution 2:

sudo sed -i '/developer\.download\.nvidia\.com\/compute\/cuda\/repos/d' /etc/apt/sources.list.d/*
sudo sed -i '/developer\.download\.nvidia\.com\/compute\/machine-learning\/repos/d' /etc/apt/sources.list.d/*
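
Because sed -i edits the files in place, it can be worth previewing the result first. The sketch below applies the same two delete patterns without -i, so nothing is modified. preview_without_nvidia_repos is a hypothetical helper name.

```shell
# Print a sources file with the CUDA and machine-learning repository entries
# removed, leaving the file itself untouched (same patterns as Solution 2, no -i).
preview_without_nvidia_repos() {
    sed -e '/developer\.download\.nvidia\.com\/compute\/cuda\/repos/d' \
        -e '/developer\.download\.nvidia\.com\/compute\/machine-learning\/repos/d' "$1"
}

# Usage:
#   for f in /etc/apt/sources.list.d/*; do
#       echo "== $f"
#       preview_without_nvidia_repos "$f"
#   done
```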

from:

https://askubuntu.com/questions/1424040/e-conflicting-values-set-for-option-signed-by-regarding-source-https-develope/1424054#1424054