validation.cuda

C API

cuda_example_py

teachcompute.validation.cuda.cuda_example_py.cuda_device_count() → int

Returns the number of CUDA devices.
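
A minimal sketch, assuming the module was built with CUDA support:

    from teachcompute.validation.cuda.cuda_example_py import cuda_device_count

    n = cuda_device_count()
    print(f"{n} CUDA device(s) detected")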

teachcompute.validation.cuda.cuda_example_py.cuda_device_memory(device: int = 0) → tuple

Returns the free and total memory for a particular device.

teachcompute.validation.cuda.cuda_example_py.cuda_devices_memory() → list

Returns the free and total memory for all devices.
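
A short sketch of both memory queries, assuming cuda_devices_memory returns one (free, total) pair per device, consistent with the descriptions above:

    from teachcompute.validation.cuda.cuda_example_py import (
        cuda_device_memory,
        cuda_devices_memory,
    )

    # One device.
    free, total = cuda_device_memory(0)
    print(f"device 0: {free} bytes free out of {total}")

    # All devices.
    for device_id, (free, total) in enumerate(cuda_devices_memory()):
        print(f"device {device_id}: {free}/{total}")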

teachcompute.validation.cuda.cuda_example_py.cuda_version() → int

Returns the CUDA version the project was compiled with.

teachcompute.validation.cuda.cuda_example_py.get_device_prop(device_id: int = 0) → dict

Returns the device properties.
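
A sketch combining both introspection calls; the exact keys of the returned dictionary are not specified here, so the loop simply prints whatever is available:

    from teachcompute.validation.cuda.cuda_example_py import (
        cuda_version,
        get_device_prop,
    )

    print("compiled with CUDA", cuda_version())

    # get_device_prop returns a dict of device properties.
    for key, value in sorted(get_device_prop(0).items()):
        print(f"{key} = {value}")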

teachcompute.validation.cuda.cuda_example_py.vector_add(v1: numpy.ndarray[numpy.float32], v2: numpy.ndarray[numpy.float32], cuda_device: int = 0, repeat: int = 1) → numpy.ndarray[numpy.float32]

Computes the additions of two vectors of the same size with CUDA.

Parameters:
  • v1 – array

  • v2 – array

  • repeat – number of times to repeat the addition

  • cuda_device – device id (if multiple devices)

Returns:

the element-wise addition of the two arrays
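
A sketch checking the CUDA result against numpy; both inputs must be float32 arrays of the same size:

    import numpy
    from teachcompute.validation.cuda.cuda_example_py import vector_add

    v1 = numpy.random.rand(1024).astype(numpy.float32)
    v2 = numpy.random.rand(1024).astype(numpy.float32)

    res = vector_add(v1, v2, cuda_device=0, repeat=1)
    numpy.testing.assert_allclose(res, v1 + v2, rtol=1e-5)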

teachcompute.validation.cuda.cuda_example_py.vector_sum0(vect: numpy.ndarray[numpy.float32], max_threads: int = 256, cuda_device: int = 0) → float

Computes the sum of all coefficients with CUDA. Naive method.

Parameters:
  • vect – array

  • max_threads – number of threads to use (it must be a power of 2)

  • cuda_device – device id (if multiple devices)

Returns:

sum

teachcompute.validation.cuda.cuda_example_py.vector_sum6(vect: numpy.ndarray[numpy.float32], max_threads: int = 256, cuda_device: int = 0) → float

Computes the sum of all coefficients with CUDA. More efficient method.

Parameters:
  • vect – array

  • max_threads – number of threads to use (it must be a power of 2)

  • cuda_device – device id (if multiple devices)

Returns:

sum

teachcompute.validation.cuda.cuda_example_py.vector_sum_atomic(vect: numpy.ndarray[numpy.float32], max_threads: int = 256, cuda_device: int = 0) → float

Computes the sum of all coefficients with CUDA. Uses atomicAdd.

Parameters:
  • vect – array

  • max_threads – number of threads to use (it must be a power of 2)

  • cuda_device – device id (if multiple devices)

Returns:

sum
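
The three reduction functions share the same signature, so a single sketch covers vector_sum0, vector_sum6 and vector_sum_atomic; results may differ slightly from numpy because of float32 accumulation:

    import numpy
    from teachcompute.validation.cuda.cuda_example_py import (
        vector_sum0,
        vector_sum6,
        vector_sum_atomic,
    )

    vect = numpy.random.rand(2**16).astype(numpy.float32)
    expected = float(vect.sum())

    for fct in [vector_sum0, vector_sum6, vector_sum_atomic]:
        # max_threads must be a power of 2.
        got = fct(vect, max_threads=256, cuda_device=0)
        print(f"{fct.__name__}: {got} (expected {expected})")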

cuda_gemm

teachcompute.validation.cuda.cuda_gemm.matmul_v1_cuda(n_rows1: int, n_cols1: int, A: int, n_rows2: int, n_cols2: int, B: int, C: int, transA: bool = False, transB: bool = False) → int

Naive implementation of a matrix multiplication supporting transposition on CUDA.

Parameters:
  • n_rows1 – number of rows for A

  • n_cols1 – number of columns for A

  • A – pointer on the CUDA device

  • n_rows2 – number of rows for B

  • n_cols2 – number of columns for B

  • B – pointer on the CUDA device

  • C – allocated pointer on the CUDA device

  • transA – whether A needs to be transposed

  • transB – whether B needs to be transposed

teachcompute.validation.cuda.cuda_gemm.matmul_v2_cuda(n_rows1: int, n_cols1: int, A: int, n_rows2: int, n_cols2: int, B: int, C: int, transA: bool = False, transB: bool = False) → int

Naive tiled implementation of a matrix multiplication supporting transposition on CUDA.

Parameters:
  • n_rows1 – number of rows for A

  • n_cols1 – number of columns for A

  • A – pointer on the CUDA device

  • n_rows2 – number of rows for B

  • n_cols2 – number of columns for B

  • B – pointer on the CUDA device

  • C – allocated pointer on the CUDA device

  • transA – whether A needs to be transposed

  • transB – whether B needs to be transposed

teachcompute.validation.cuda.cuda_gemm.matmul_v3_cuda(n_rows1: int, n_cols1: int, A: int, n_rows2: int, n_cols2: int, B: int, C: int, transA: bool = False, transB: bool = False) → int

Implementation of a matrix multiplication supporting transposition on CUDA. It proceeds by blocks within tiles.

Parameters:
  • n_rows1 – number of rows for A

  • n_cols1 – number of columns for A

  • A – pointer on the CUDA device

  • n_rows2 – number of rows for B

  • n_cols2 – number of columns for B

  • B – pointer on the CUDA device

  • C – allocated pointer on the CUDA device

  • transA – whether A needs to be transposed

  • transB – whether B needs to be transposed
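
All three functions share the same calling convention: A, B and C are raw device addresses passed as integers, so the caller allocates the buffers. A hedged sketch using torch as the allocator (torch is an assumption here, any CUDA allocator exposing raw addresses would do, and the kernels are assumed to operate on float32):

    import torch
    from teachcompute.validation.cuda.cuda_gemm import matmul_v1_cuda

    n, k, m = 64, 32, 48
    A = torch.rand(n, k, dtype=torch.float32, device="cuda")
    B = torch.rand(k, m, dtype=torch.float32, device="cuda")
    C = torch.empty(n, m, dtype=torch.float32, device="cuda")

    # data_ptr() returns the raw device address as a Python int.
    matmul_v1_cuda(n, k, A.data_ptr(), k, m, B.data_ptr(), C.data_ptr())

    torch.testing.assert_close(C, A @ B)

The same call works with matmul_v2_cuda and matmul_v3_cuda.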

cuda_monitor

teachcompute.validation.cuda.cuda_monitor.cuda_version() → int

Returns the CUDA version the project was compiled with.

teachcompute.validation.cuda.cuda_monitor.nvml_device_get_count() → int

Returns the number of GPU units.

teachcompute.validation.cuda.cuda_monitor.nvml_device_get_memory_info(device: int = 0) → tuple

Returns the free, used, and total memory for a GPU device.

teachcompute.validation.cuda.cuda_monitor.nvml_init() → None

Initializes memory management through the NVML library.

teachcompute.validation.cuda.cuda_monitor.nvml_shutdown() → None

Shuts down memory management through the NVML library.
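
A sketch of the full monitoring lifecycle, assuming the memory tuple unpacks in the order given above (free, used, total):

    from teachcompute.validation.cuda.cuda_monitor import (
        nvml_init,
        nvml_device_get_count,
        nvml_device_get_memory_info,
        nvml_shutdown,
    )

    nvml_init()
    try:
        for device_id in range(nvml_device_get_count()):
            free, used, total = nvml_device_get_memory_info(device_id)
            print(f"device {device_id}: free={free} used={used} total={total}")
    finally:
        # Always release NVML, even if a query fails.
        nvml_shutdown()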