[1] CUDA – Installation
March 11, 2022
Install CUDA (Compute Unified Device Architecture), the GPU computing platform (GPGPU, General-Purpose computing on Graphics Processing Units) provided by NVIDIA.
To use CUDA, your computer must have an NVIDIA graphics card that supports CUDA. You can verify this on the following page (most products from the last few years are compatible): https://developer.nvidia.com/cuda-gpus.
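Before consulting that list, it can help to find out which NVIDIA device the machine actually contains. The following is a minimal sketch, assuming a Linux system where `lspci` (from pciutils) is installed. The helper name `list_nvidia_gpus` is hypothetical; it only filters text and does not tell you whether the card supports CUDA.

```shell
#!/usr/bin/env bash
# Hypothetical helper: keep only the lspci lines that describe an NVIDIA
# display device (VGA or 3D controller). Reads lspci output on stdin.
list_nvidia_gpus() {
    grep -i 'nvidia' | grep -Ei 'vga compatible controller|3d controller'
}

# Run only when lspci is installed; otherwise do nothing.
if command -v lspci >/dev/null 2>&1; then
    lspci | list_nvidia_gpus || echo "No NVIDIA display device found"
fi
```

The model name printed this way can then be looked up on the compatibility page.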
In addition, the CUDA version you need depends on your graphics card and on the driver it requires: https://docs.nvidia.com/cuda/cuda-toolkit-release-notes/index.html.
CUDA Toolkit | Linux x86_64 Driver Version | Windows x86_64 Driver Version |
CUDA 11.6 Update 1 | >=510.47.03 | >=511.65 |
CUDA 11.6 GA | >=510.39.01 | >=511.23 |
CUDA 11.5 Update 2 | >=495.29.05 | >=496.13 |
CUDA 11.5 Update 1 | >=495.29.05 | >=496.13 |
CUDA 11.5 GA | >=495.29.05 | >=496.04 |
CUDA 11.4 Update 4 | >=470.82.01 | >=472.50 |
CUDA 11.4 Update 3 | >=470.82.01 | >=472.50 |
CUDA 11.4 Update 2 | >=470.57.02 | >=471.41 |
CUDA 11.4 Update 1 | >=470.57.02 | >=471.41 |
CUDA 11.4.0 GA | >=470.42.01 | >=471.11 |
CUDA 11.3.1 Update 1 | >=465.19.01 | >=465.89 |
CUDA 11.3.0 GA | >=465.19.01 | >=465.89 |
CUDA 11.2.2 Update 2 | >=460.32.03 | >=461.33 |
CUDA 11.2.1 Update 1 | >=460.32.03 | >=461.09 |
CUDA 11.2.0 GA | >=460.27.03 | >=460.82 |
CUDA 11.1.1 Update 1 | >=455.32 | >=456.81 |
CUDA 11.1 GA | >=455.23 | >=456.38 |
CUDA 11.0.3 Update 1 | >= 450.51.06 | >= 451.82 |
CUDA 11.0.2 GA | >= 450.51.05 | >= 451.48 |
CUDA 11.0.1 RC | >= 450.36.06 | >= 451.22 |
CUDA 10.2.89 | >= 440.33 | >= 441.22 |
CUDA 10.1 (10.1.105 general release, and updates) | >= 418.39 | >= 418.96 |
CUDA 10.0.130 | >= 410.48 | >= 411.31 |
CUDA 9.2 (9.2.148 Update 1) | >= 396.37 | >= 398.26 |
CUDA 9.2 (9.2.88) | >= 396.26 | >= 397.44 |
CUDA 9.1 (9.1.85) | >= 390.46 | >= 391.29 |
CUDA 9.0 (9.0.76) | >= 384.81 | >= 385.54 |
CUDA 8.0 (8.0.61 GA2) | >= 375.26 | >= 376.51 |
CUDA 8.0 (8.0.44) | >= 367.48 | >= 369.30 |
CUDA 7.5 (7.5.16) | >= 352.31 | >= 353.66 |
CUDA 7.0 (7.0.28) | >= 346.46 | >= 347.62 |
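The comparison against the table can be scripted. This is a sketch only, assuming GNU coreutils (`sort -V`) and an installed driver that provides `nvidia-smi`. The helper `version_ge` and the variable names are hypothetical, and the required version is the Linux value for CUDA 11.6 Update 1 from the table above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeeds when version $1 is >= version $2.
# Relies on `sort -V` (version sort) from GNU coreutils.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Minimum Linux driver for CUDA 11.6 Update 1, taken from the table.
required="510.47.03"

if command -v nvidia-smi >/dev/null 2>&1; then
    # Driver version of the first GPU as reported by nvidia-smi.
    installed="$(nvidia-smi --query-gpu=driver_version --format=csv,noheader | head -n 1)"
    if version_ge "$installed" "$required"; then
        echo "Driver $installed is new enough for CUDA 11.6 Update 1"
    else
        echo "Driver $installed is older than the required $required"
    fi
fi
```

To check against another toolkit release, replace `required` with the matching value from the table.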
Both archived and the latest versions of the CUDA Toolkit can be downloaded from this page: https://developer.nvidia.com/cuda-toolkit-archive.
GCC support also has to be taken into account. The table below lists the highest GCC version supported by each CUDA version.
CUDA version | max supported GCC version |
11.4.1+, 11.5, 11.6 | 11 |
11.1, 11.2, 11.3, 11.4.0 | 10 |
11 | 9 |
10.1, 10.2 | 8 |
9.2, 10.0 | 7 |
9.0, 9.1 | 6 |
8 | 5.3 |
7 | 4.9 |
5.5, 6 | 4.8 |
4.2, 5 | 4.6 |
4.1 | 4.5 |
4.0 | 4.4 |
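The GCC limit can be checked in the same way. The sketch below covers only the CUDA 9.0 to 11.6 rows of the table; the helper `max_gcc_for_cuda` and the variable `cuda_release` are hypothetical names. Because it matches on major.minor only, it treats every 11.4 release as limited to GCC 10, which is stricter than the table for 11.4.1 and later.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the highest supported GCC major version for a
# CUDA release given as major.minor, following the table rows for 9.0-11.6.
max_gcc_for_cuda() {
    case "$1" in
        11.5|11.6)           echo 11 ;;
        11.1|11.2|11.3|11.4) echo 10 ;;  # conservative: 11.4.1+ accepts GCC 11
        11.0)                echo 9 ;;
        10.1|10.2)           echo 8 ;;
        9.2|10.0)            echo 7 ;;
        9.0|9.1)             echo 6 ;;
        *)                   return 1 ;;  # release not covered by this sketch
    esac
}

cuda_release="11.6"   # assumed target release; change as needed

if command -v gcc >/dev/null 2>&1; then
    # `gcc -dumpversion` prints e.g. "11" or "11.2.1"; keep the major part.
    gcc_major="$(gcc -dumpversion | cut -d. -f1)"
    limit="$(max_gcc_for_cuda "$cuda_release")"
    if [ "$gcc_major" -le "$limit" ]; then
        echo "GCC $gcc_major is supported by CUDA $cuda_release"
    else
        echo "GCC $gcc_major is too new for CUDA $cuda_release (max $limit)"
    fi
fi
```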
[1] Install the NVIDIA graphics driver for your graphics card, see here.
[2] Install CUDA from the official NVIDIA repository.
This example relies on the official NVIDIA repository that you configured while installing the driver in the previous step [1].
[root@vlsr05 ~]# dnf install cuda
[root@vlsr05 ~]# mcedit /etc/profile.d/cuda116.sh
# create a new file
export PATH=/usr/local/cuda-11.6/bin${PATH:+:${PATH}}
export LD_LIBRARY_PATH=/usr/local/cuda-11.6/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}
[root@vlsr05 ~]# source /etc/profile.d/cuda116.sh
[root@vlsr05 ~]# nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2022 NVIDIA Corporation
Built on Thu_Feb_10_18:23:41_PST_2022
Cuda compilation tools, release 11.6, V11.6.112
Build cuda_11.6.r11.6/compiler.30978841_0
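The `${PATH:+:${PATH}}` form in the profile script appends a colon and the old value only when the variable is already set and non-empty. This avoids a trailing colon, which in `PATH` would be read as the current directory. The sketch below isolates that expansion; the function `prepend_path` is a hypothetical helper for illustration and is not part of the CUDA installation.

```shell
#!/usr/bin/env bash
# Hypothetical helper: prepend directory $1 to the colon-separated list $2.
# ${2:+:$2} expands to ":<old value>" only when $2 is set and non-empty,
# so an empty list does not produce a trailing colon.
prepend_path() {
    printf '%s\n' "$1${2:+:$2}"
}

prepend_path /usr/local/cuda-11.6/lib64 ""        # prints /usr/local/cuda-11.6/lib64
prepend_path /usr/local/cuda-11.6/bin "/usr/bin"  # prints /usr/local/cuda-11.6/bin:/usr/bin
```

This matters mostly for `LD_LIBRARY_PATH`, which is often unset on a fresh system.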
[3] Verify the installation by running the commands below as a regular user.
Starting with CUDA 11.6, the samples are no longer shipped with the installer and must be downloaded and built separately from GitHub: https://github.com/NVIDIA/cuda-samples.
# clone the samples
[user01@vlsr05 ~]$ git clone https://github.com/NVIDIA/cuda-samples.git
[user01@vlsr05 ~]$ cd ./cuda-samples/Samples/1_Utilities/deviceQuery
# build the deviceQuery sample
[user01@vlsr05 deviceQuery]$ make
# run the deviceQuery sample
[user01@vlsr05 deviceQuery]$ ./deviceQuery
./deviceQuery Starting...
 CUDA Device Query (Runtime API) version (CUDART static linking)
Detected 1 CUDA Capable device(s)
Device 0: "NVIDIA GeForce GTX 1050"
  CUDA Driver Version / Runtime Version 11.6 / 11.6
  CUDA Capability Major/Minor version number: 6.1
  Total amount of global memory: 2000 MBytes (2097152000 bytes)
  (005) Multiprocessors, (128) CUDA Cores/MP: 640 CUDA Cores
  GPU Max Clock rate: 1468 MHz (1.47 GHz)
  Memory Clock rate: 3504 Mhz
  Memory Bus Width: 128-bit
  L2 Cache Size: 1048576 bytes
  Maximum Texture Dimension Size (x,y,z) 1D=(131072), 2D=(131072, 65536), 3D=(16384, 16384, 16384)
  Maximum Layered 1D Texture Size, (num) layers 1D=(32768), 2048 layers
  Maximum Layered 2D Texture Size, (num) layers 2D=(32768, 32768), 2048 layers
  Total amount of constant memory: 65536 bytes
  Total amount of shared memory per block: 49152 bytes
  Total shared memory per multiprocessor: 98304 bytes
  Total number of registers available per block: 65536
  Warp size: 32
  Maximum number of threads per multiprocessor: 2048
  Maximum number of threads per block: 1024
  Max dimension size of a thread block (x,y,z): (1024, 1024, 64)
  Max dimension size of a grid size (x,y,z): (2147483647, 65535, 65535)
  Maximum memory pitch: 2147483647 bytes
  Texture alignment: 512 bytes
  Concurrent copy and kernel execution: Yes with 2 copy engine(s)
  Run time limit on kernels: No
  Integrated GPU sharing Host Memory: No
  Support host page-locked memory mapping: Yes
  Alignment requirement for Surfaces: Yes
  Device has ECC support: Disabled
  Device supports Unified Addressing (UVA): Yes
  Device supports Managed Memory: Yes
  Device supports Compute Preemption: Yes
  Supports Cooperative Kernel Launch: Yes
  Supports MultiDevice Co-op Kernel Launch: Yes
  Device PCI Domain ID / Bus ID / location ID: 0 / 1 / 0
  Compute Mode:
    < Default (multiple host threads can use ::cudaSetDevice() with device simultaneously) >
deviceQuery, CUDA Driver = CUDART, CUDA Driver Version = 11.6, CUDA Runtime Version = 11.6, NumDevs = 1
Result = PASS
# try running the bandwidthTest sample
[user01@vlsr05 deviceQuery]$ cd ~/cuda-samples/Samples/1_Utilities/bandwidthTest/
[user01@vlsr05 bandwidthTest]$ make
[user01@vlsr05 bandwidthTest]$ ./bandwidthTest
[CUDA Bandwidth Test] - Starting...
Running on...
 Device 0: NVIDIA GeForce GTX 1050
 Quick Mode
 Host to Device Bandwidth, 1 Device(s)
 PINNED Memory Transfers
   Transfer Size (Bytes) Bandwidth(GB/s)
   32000000 12.7
 Device to Host Bandwidth, 1 Device(s)
 PINNED Memory Transfers
   Transfer Size (Bytes) Bandwidth(GB/s)
   32000000 13.2
 Device to Device Bandwidth, 1 Device(s)
 PINNED Memory Transfers
   Transfer Size (Bytes) Bandwidth(GB/s)
   32000000 98.4
Result = PASS
NOTE: The CUDA Samples are not meant for performance measurements. Results may vary when GPU Boost is enabled.
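Both samples end with a `Result = PASS` line, so the check can be automated instead of reading the full output. This is a minimal sketch, assuming the samples were cloned into the home directory and built as shown above. The helper `sample_passed` and the variable `samples_dir` are hypothetical names.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeeds when the text on stdin contains the
# "Result = PASS" line that the samples print on success.
sample_passed() {
    grep 'Result = PASS' >/dev/null
}

# Assumed location of the built samples (matches the clone into ~ above).
samples_dir="$HOME/cuda-samples/Samples/1_Utilities"

for name in deviceQuery bandwidthTest; do
    bin="$samples_dir/$name/$name"
    # Skip samples that have not been built yet.
    if [ -x "$bin" ]; then
        if "$bin" | sample_passed; then
            echo "$name: PASS"
        else
            echo "$name: FAIL"
        fi
    fi
done
```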