validation.cuda

C API

cuda_example_py

onnx_extended.validation.cuda.cuda_example_py.vector_add(v1: numpy.ndarray[numpy.float32], v2: numpy.ndarray[numpy.float32], cuda_device: int = 0) → numpy.ndarray[numpy.float32]

Computes the element-wise sum of two vectors of the same size with CUDA.

Parameters:
  • v1 – first array (float32)

  • v2 – second array (float32), same size as v1

  • cuda_device – device id (if there are multiple devices)

Returns:

element-wise sum of the two arrays
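
A minimal usage sketch, assuming `onnx_extended` was built with CUDA support and a CUDA device is available. The import guard and the NumPy comparison are illustrative additions, not part of the library.

```python
import numpy as np

try:
    # The extension module only exists in builds compiled with CUDA.
    from onnx_extended.validation.cuda.cuda_example_py import vector_add
except ImportError:
    vector_add = None

# Both inputs are float32 arrays of the same size, as the signature requires.
v1 = np.arange(8, dtype=np.float32)
v2 = np.ones(8, dtype=np.float32)
expected = v1 + v2  # CPU reference computed with NumPy

if vector_add is not None:
    result = vector_add(v1, v2, cuda_device=0)
    np.testing.assert_allclose(result, expected)
```

Comparing against the NumPy result is a quick way to confirm that the CUDA build and the selected device work as expected.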

onnx_extended.validation.cuda.cuda_example_py.vector_sum0(vect: numpy.ndarray[numpy.float32], max_threads: int = 256, cuda_device: int = 0) → float

Computes the sum of all coefficients with CUDA. Naive method.

Parameters:
  • vect – array

  • max_threads – number of threads to use (it must be a power of 2)

  • cuda_device – device id (if there are multiple devices)

Returns:

sum
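
A short sketch of a call that respects the power-of-2 constraint on `max_threads`, assuming a CUDA-enabled build. The helper `is_power_of_two` is hypothetical and only illustrates the constraint; it is not part of `onnx_extended`.

```python
import numpy as np

try:
    # The extension module only exists in builds compiled with CUDA.
    from onnx_extended.validation.cuda.cuda_example_py import vector_sum0
except ImportError:
    vector_sum0 = None


def is_power_of_two(n: int) -> bool:
    """Hypothetical helper: True when n is a positive power of 2."""
    return n > 0 and (n & (n - 1)) == 0


vect = np.arange(1024, dtype=np.float32)
max_threads = 256
assert is_power_of_two(max_threads)

if vector_sum0 is not None:
    total = vector_sum0(vect, max_threads=max_threads, cuda_device=0)
    # Accumulation order may differ between CPU and GPU,
    # so the comparison uses a relative tolerance.
    np.testing.assert_allclose(total, vect.sum(dtype=np.float64), rtol=1e-5)
```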

onnx_extended.validation.cuda.cuda_example_py.vector_sum6(vect: numpy.ndarray[numpy.float32], max_threads: int = 256, cuda_device: int = 0) → float

Computes the sum of all coefficients with CUDA. More efficient method.

Parameters:
  • vect – array

  • max_threads – number of threads to use (it must be a power of 2)

  • cuda_device – device id (if there are multiple devices)

Returns:

sum
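
One way to check the "more efficient" claim on a given machine is to time `vector_sum0` and `vector_sum6` on the same input. This is a rough sketch that assumes a CUDA-enabled build; `time_call` is a hypothetical helper, and the measured times include host-to-device transfers, so they are only indicative.

```python
import time

import numpy as np

try:
    # The extension module only exists in builds compiled with CUDA.
    from onnx_extended.validation.cuda.cuda_example_py import (
        vector_sum0,
        vector_sum6,
    )
except ImportError:
    vector_sum0 = vector_sum6 = None


def time_call(fn, *args, repeat=5, **kwargs):
    """Hypothetical helper: returns (last result, best wall-clock time in seconds)."""
    best = float("inf")
    result = None
    for _ in range(repeat):
        start = time.perf_counter()
        result = fn(*args, **kwargs)
        best = min(best, time.perf_counter() - start)
    return result, best


vect = np.ones(1 << 20, dtype=np.float32)
# CPU reference computed with NumPy, accumulated in float64.
reference, cpu_time = time_call(np.sum, vect, dtype=np.float64)

if vector_sum6 is not None:
    for name, fn in [("vector_sum0", vector_sum0), ("vector_sum6", vector_sum6)]:
        total, elapsed = time_call(fn, vect, max_threads=256, cuda_device=0)
        print(f"{name}: sum={total:.1f} best={elapsed:.6f}s")
```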