[15] NVIDIA Container Toolkit – Installation

March 13, 2022, by Adam [zicherka] Nogły

Install the NVIDIA Container Toolkit to use the host's GPU from inside containers.

[1] Install the NVIDIA driver on the host system, see here.

[2] Install Podman, see here.

[3] Install the NVIDIA Container Toolkit.

[root@vlsr05 ~]# curl https://nvidia.github.io/nvidia-docker/centos8/nvidia-docker.repo > /etc/yum.repos.d/nvidia-docker.repo
[root@vlsr05 ~]# dnf install nvidia-container-toolkit
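
Before continuing, it can help to confirm that the host commands from steps [1] and [2] are actually on the PATH. The following is a minimal sketch only; the function name check_prereqs is hypothetical and not part of the original procedure.

```shell
# Hypothetical helper (not from the original article): report which of the
# given commands are missing from PATH. Returns non-zero if any is missing.
check_prereqs() {
    missing=0
    for cmd in "$@"; do
        if ! command -v "$cmd" >/dev/null 2>&1; then
            echo "missing: $cmd"
            missing=1
        fi
    done
    return "$missing"
}

# Steps [1] and [2] should have provided these two commands on the host.
check_prereqs nvidia-smi podman || echo "install the missing tools before continuing"
```

If both commands are found, the call prints nothing.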

[4] If SELinux is enabled, adjust the policy.

[root@vlsr05 ~]# mcedit nvidiasmi.te
# create a new file
module nvidiasmi 1.0;
require {
        type container_runtime_tmpfs_t;
        type container_t;
        type xserver_misc_device_t;
        class file { open read };
        class chr_file { getattr ioctl open read write };
}
#============= container_t ==============
allow container_t container_runtime_tmpfs_t:file { open read };
allow container_t xserver_misc_device_t:chr_file { getattr ioctl open read write };

[root@vlsr05 ~]# checkmodule -m -M -o nvidiasmi.mod nvidiasmi.te
[root@vlsr05 ~]# semodule_package --outfile nvidiasmi.pp --module nvidiasmi.mod
[root@vlsr05 ~]# semodule -i nvidiasmi.pp
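
The three build commands above can be collected into a small helper so they are easy to repeat for another module. This is a sketch under assumptions: the names run_or_print and build_selinux_module and the DRY_RUN variable are hypothetical and not from the original. With DRY_RUN=1 the commands are only printed, so the sketch can be tried without touching the loaded policy.

```shell
# Hypothetical helpers (not from the original article).
# run_or_print: execute a command, or only print it when DRY_RUN=1.
run_or_print() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# build_selinux_module NAME: compile NAME.te, package it, and load it,
# using the same checkmodule / semodule_package / semodule calls as above.
build_selinux_module() {
    name="$1"
    run_or_print checkmodule -m -M -o "${name}.mod" "${name}.te" &&
    run_or_print semodule_package --outfile "${name}.pp" --module "${name}.mod" &&
    run_or_print semodule -i "${name}.pp"
}

# Dry run: show what would be executed for nvidiasmi.te
DRY_RUN=1 build_selinux_module nvidiasmi
```

Without DRY_RUN set, the function runs the commands for real and stops at the first one that fails.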

[5] Here is an example of how to use [nvidia-smi] with containers.

# confirm which CUDA images are available
[root@vlsr05 ~]# curl -s https://registry.hub.docker.com/v1/repositories/nvidia/cuda/tags | sed "s/,/\n/g" | grep name
 "name": "10.0-base"}
 "name": "10.0-base-centos6"}
 "name": "10.0-base-centos7"}
 "name": "10.0-base-ubi7"}
 "name": "10.0-base-ubuntu14.04"}
 "name": "10.0-base-ubuntu16.04"}
 "name": "10.0-base-ubuntu18.04"}
 "name": "10.0-cudnn7-devel"}
 "name": "10.0-cudnn7-devel-centos6"}
 "name": "10.0-cudnn7-devel-centos7"}
. . . . .
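
The sed/grep pipeline above leaves the JSON punctuation around each tag. To get bare tag names, the same idea can be sketched as below. The function name list_tag_names is hypothetical, and the sample input is an assumed shape of the response (an array of objects with a "name" field), not captured output. The sketch uses tr instead of a "\n" in the sed replacement, because the latter is not portable across sed implementations.

```shell
# Hypothetical helper (not from the original article): read the tag list
# JSON on stdin and print one bare tag name per line.
list_tag_names() {
    tr ',' '\n' | sed -n 's/.*"name": *"\([^"]*\)".*/\1/p'
}

# Assumed sample of the response format, for illustration only.
sample='[{"layer": "", "name": "10.0-base"}, {"layer": "", "name": "10.0-base-centos7"}]'
printf '%s\n' "$sample" | list_tag_names
```

In real use, the output of the curl command shown above would be piped into list_tag_names in place of the sample string.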

# pull the CUDA 11.4 image and run [nvidia-smi]
[root@vlsr05 ~]# podman run -e NVIDIA_VISIBLE_DEVICES=all nvidia/cuda:11.4.0-base nvidia-smi
✔ docker.io/nvidia/cuda:11.4.0-base
Trying to pull docker.io/nvidia/cuda:11.4.0-base...
Getting image source signatures
Copying blob c549ccf8d472 done
Copying blob e7569c5d1eae done
Copying blob cbdf501d7752 done
Copying blob c74b3fce51ac done
Copying blob 7a1f1c045e33 done
Copying config 13cf6a46b9 done
Writing manifest to image destination
Storing signatures
Sat Mar 12 16:10:07 2022
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 510.47.03    Driver Version: 510.47.03    CUDA Version: 11.6     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA GeForce ...  Off  | 00000000:01:00.0 Off |                  N/A |
| 36%   37C    P0    N/A /  75W |      0MiB /  2048MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+
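
For scripted checks, the driver and CUDA versions can be pulled out of the header line of this output. This is a minimal sketch; the function name parse_smi_header is hypothetical, and the input line is copied from the output above.

```shell
# Hypothetical helper (not from the original article): extract the driver
# and CUDA versions from the nvidia-smi header line read on stdin.
parse_smi_header() {
    sed -n 's/.*Driver Version: *\([0-9.]*\).*CUDA Version: *\([0-9.]*\).*/driver=\1 cuda=\2/p'
}

# Header line taken from the output shown above.
header='| NVIDIA-SMI 510.47.03    Driver Version: 510.47.03    CUDA Version: 11.6     |'
printf '%s\n' "$header" | parse_smi_header
```

The "CUDA Version" field is reported by the host driver, which is why it reads 11.6 here even though the image is tagged 11.4.0.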

# start an interactive session and run [nvidia-smi]
[root@vlsr05 ~]# podman run -it -e NVIDIA_VISIBLE_DEVICES=all nvidia/cuda:11.4.0-base bash
root@38d8279cb6ec:/# nvidia-smi
Sat Mar 12 16:11:40 2022
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 510.47.03    Driver Version: 510.47.03    CUDA Version: 11.6     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA GeForce ...  Off  | 00000000:01:00.0 Off |                  N/A |
| 35%   37C    P0    N/A /  75W |      0MiB /  2048MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+
root@38d8279cb6ec:/# exit
exit

[root@vlsr05 ~]# podman images
REPOSITORY             TAG          IMAGE ID      CREATED       SIZE
docker.io/nvidia/cuda  11.4.0-base  13cf6a46b953  8 months ago  129 MB