validation.cuda
C API
cuda_example_py
- onnx_extended.validation.cuda.cuda_example_py.vector_add(v1: numpy.ndarray[numpy.float32], v2: numpy.ndarray[numpy.float32], cuda_device: int = 0) → numpy.ndarray[numpy.float32]
Computes the element-wise sum of two vectors of the same size with CUDA.
- Parameters:
v1 – first array (float32)
v2 – second array (float32), same size as v1
cuda_device – id of the CUDA device to use (relevant when several are available)
- Returns:
element-wise sum of the two arrays
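A minimal usage sketch, assuming onnx_extended was built with CUDA support. The wrapper `add_vectors` is a hypothetical helper, not part of the library; it converts the inputs to contiguous float32 arrays, checks that the sizes match, and falls back to NumPy when the extension cannot be imported so that the snippet stays runnable on a machine without a GPU.

```python
import numpy as np

try:
    # Only importable when onnx_extended was built with CUDA support.
    from onnx_extended.validation.cuda.cuda_example_py import vector_add
except ImportError:
    vector_add = None  # extension not available in this environment


def add_vectors(v1, v2, cuda_device=0):
    """Hypothetical wrapper: validates the inputs, then calls vector_add."""
    v1 = np.ascontiguousarray(v1, dtype=np.float32)
    v2 = np.ascontiguousarray(v2, dtype=np.float32)
    if v1.shape != v2.shape:
        raise ValueError("v1 and v2 must have the same size")
    if vector_add is None:
        # NumPy fallback, used only when the CUDA extension is missing.
        return v1 + v2
    return vector_add(v1, v2, cuda_device)


result = add_vectors([1, 2, 3], [10, 20, 30])
print(result)
```

Casting up front avoids passing a float64 array to a function whose signature only declares float32 inputs.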
- onnx_extended.validation.cuda.cuda_example_py.vector_sum0(vect: numpy.ndarray[numpy.float32], max_threads: int = 256, cuda_device: int = 0) → float
Computes the sum of all coefficients with CUDA, using a naive method.
- Parameters:
vect – array (float32)
max_threads – number of threads to use (must be a power of 2)
cuda_device – id of the CUDA device to use (relevant when several are available)
- Returns:
sum
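Since max_threads must be a power of 2, it can help to check the value before calling into CUDA. The sketch below is illustrative only: `is_power_of_two` and `cuda_sum0` are hypothetical helpers, and the NumPy fallback is there solely so the snippet runs when the CUDA extension is not installed.

```python
import numpy as np

try:
    # Only importable when onnx_extended was built with CUDA support.
    from onnx_extended.validation.cuda.cuda_example_py import vector_sum0
except ImportError:
    vector_sum0 = None  # extension not available in this environment


def is_power_of_two(n):
    """True when n is a positive power of two (1, 2, 4, 8, ...)."""
    return n > 0 and (n & (n - 1)) == 0


def cuda_sum0(vect, max_threads=256, cuda_device=0):
    """Hypothetical wrapper: checks max_threads, then calls vector_sum0."""
    if not is_power_of_two(max_threads):
        raise ValueError(f"max_threads={max_threads} is not a power of 2")
    vect = np.ascontiguousarray(vect, dtype=np.float32)
    if vector_sum0 is None:
        # NumPy fallback, used only when the CUDA extension is missing.
        return float(vect.sum())
    return vector_sum0(vect, max_threads, cuda_device)


total = cuda_sum0(np.ones(1024, dtype=np.float32), max_threads=256)
print(total)
```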
- onnx_extended.validation.cuda.cuda_example_py.vector_sum6(vect: numpy.ndarray[numpy.float32], max_threads: int = 256, cuda_device: int = 0) → float
Computes the sum of all coefficients with CUDA, using a more efficient method than vector_sum0.
- Parameters:
vect – array (float32)
max_threads – number of threads to use (must be a power of 2)
cuda_device – id of the CUDA device to use (relevant when several are available)
- Returns:
sum
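Because vector_sum0 and vector_sum6 share the same signature, one way to validate them is to compare both against a float64 NumPy reference. The sketch below assumes a CUDA build of onnx_extended; `relative_errors` is a hypothetical helper, and when the extension cannot be imported only the NumPy float32 sum is reported.

```python
import numpy as np

try:
    # Only importable when onnx_extended was built with CUDA support.
    from onnx_extended.validation.cuda.cuda_example_py import (
        vector_sum0,
        vector_sum6,
    )
    candidates = {"vector_sum0": vector_sum0, "vector_sum6": vector_sum6}
except ImportError:
    candidates = {}  # no CUDA build: only the NumPy float32 sum is checked


def relative_errors(vect, max_threads=256, cuda_device=0):
    """Hypothetical helper: relative error of each sum vs. a float64 reference."""
    vect = np.ascontiguousarray(vect, dtype=np.float32)
    reference = float(vect.sum(dtype=np.float64))
    scale = max(abs(reference), 1e-12)  # avoid dividing by zero
    errors = {"numpy_float32": abs(float(vect.sum()) - reference) / scale}
    for name, fn in candidates.items():
        value = fn(vect, max_threads, cuda_device)
        errors[name] = abs(value - reference) / scale
    return errors


for name, err in relative_errors(np.arange(1000, dtype=np.float32)).items():
    print(f"{name}: relative error {err:.2e}")
```

Float32 sums depend on the order of accumulation, so small differences between the two CUDA functions and NumPy are expected on large inputs and do not by themselves indicate a bug.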