validation.cpu#
C API#
_validation#
- class onnx_extended.validation.cpu._validation.ElementTime#
- onnx_extended.validation.cpu._validation.benchmark_cache(size: int, verbose: bool = True) float #
Runs a benchmark to measure cache performance. The function measures the time for N random accesses in an array of size N and returns the time divided by N. It copies random elements taken from an array of that size to random positions in another array of the same size. It does that size times and returns the average time per move. See example Measuring CPU performance.
- Parameters:
size – array size
- Returns:
average time per move
The function measures the time spent in this loop.
for (int64_t i = 0; i < arr_size; ++i) {
  // index k will jump forth and back, to generate cache misses
  int64_t k = (i / 2) + (i % 2) * arr_size / 2;
  arr_b[k] = arr_a[k] + 1;
}
The code is benchmark_cache.
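The access pattern of the loop above can be reproduced in pure Python to see how the index alternates between the two halves of the array. This is only an illustrative sketch: `access_order` and `time_per_move` are hypothetical helper names, not part of onnx_extended, and timings measured in Python are not comparable to the ones returned by the C++ function.

```python
import time


def access_order(arr_size):
    """Return the indices visited by the loop, in order (hypothetical helper)."""
    # Same formula as the C++ loop, with integer arithmetic:
    # even i stays in the first half, odd i jumps to the second half.
    return [(i // 2) + (i % 2) * arr_size // 2 for i in range(arr_size)]


def time_per_move(arr_size):
    """Average time per element copy, in seconds (illustrative only)."""
    arr_a = [0.0] * arr_size
    arr_b = [0.0] * arr_size
    order = access_order(arr_size)
    begin = time.perf_counter()
    for k in order:
        arr_b[k] = arr_a[k] + 1
    return (time.perf_counter() - begin) / arr_size


print(access_order(8))  # [0, 4, 1, 5, 2, 6, 3, 7]
```

For an even `arr_size`, every index is visited exactly once, but two consecutive accesses are always about half an array apart.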
- onnx_extended.validation.cpu._validation.benchmark_cache_tree(n_rows: int = 100000, n_features: int = 50, n_trees: int = 200, tree_size: int = 4096, max_depth: int = 10, search_step: int = 64) List[onnx_extended.validation.cpu._validation.ElementTime] #
Simulates the prediction of a random forest. Returns the time taken by every row for a function doing random additions between an element from the same short buffer and another one taken from a list of trees. See example Measuring CPU performance.
- Parameters:
n_rows – number of rows in the whole batch
n_features – number of features
n_trees – number of trees
tree_size – size of a tree (= number of nodes * sizeof(node) / sizeof(float))
max_depth – depth of a tree
search_step – evaluate every…
- Returns:
array of the times taken for every row
The code is benchmark_cache_tree.
- onnx_extended.validation.cpu._validation.murmurhash3_bytes_s32(key: str, seed: int = 0) int #
Calls murmurhash3_bytes_s32 from scikit-learn.
- Parameters:
key – string
seed – unsigned integer
- Returns:
hash
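For readers who want to check a hash value without the compiled extension, here is a pure-Python sketch of the MurmurHash3 x86 32-bit algorithm, returned as a signed 32-bit integer. It assumes the key is hashed as its UTF-8 bytes, which the documentation above does not state; `murmurhash3_s32` is a hypothetical name, not the library function.

```python
def murmurhash3_s32(key, seed=0):
    """MurmurHash3 (x86, 32-bit) of a string, as a signed 32-bit integer."""
    mask = 0xFFFFFFFF

    def rotl(x, r):
        # 32-bit left rotation
        return ((x << r) | (x >> (32 - r))) & mask

    def mix(k):
        k = (k * 0xCC9E2D51) & mask
        k = rotl(k, 15)
        return (k * 0x1B873593) & mask

    data = key.encode("utf-8")  # assumption: UTF-8 encoding of the key
    h = seed & mask
    n_blocks = len(data) // 4
    for i in range(n_blocks):
        # body: one little-endian 32-bit block at a time
        k = int.from_bytes(data[4 * i:4 * i + 4], "little")
        h ^= mix(k)
        h = rotl(h, 13)
        h = (h * 5 + 0xE6546B64) & mask
    tail = data[4 * n_blocks:]
    if tail:
        # remaining 1 to 3 bytes
        h ^= mix(int.from_bytes(tail, "little"))
    # finalization
    h ^= len(data)
    h ^= h >> 16
    h = (h * 0x85EBCA6B) & mask
    h ^= h >> 13
    h = (h * 0xC2B2AE35) & mask
    h ^= h >> 16
    # reinterpret the unsigned result as a signed 32-bit integer
    return h - (1 << 32) if h & 0x80000000 else h


print(murmurhash3_s32("test"))  # -1167338989, i.e. 0xBA6BD213 read as signed
```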
- onnx_extended.validation.cpu._validation.vector_add(v1: numpy.ndarray[numpy.float32], v2: numpy.ndarray[numpy.float32]) numpy.ndarray[numpy.float32] #
Computes the addition of two vectors of any dimensions. It assumes both vectors have the same dimensions (no broadcasting).
- Parameters:
v1 – first vector
v2 – second vector
- Returns:
new vector
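The "no broadcast" contract can be illustrated with a small pure-Python reference that works on nested lists. This is a sketch of the expected semantics only; `vector_add_reference` is a hypothetical name, and raising `ValueError` on a shape mismatch is an assumption about reasonable behaviour, not something the documentation above specifies.

```python
def vector_add_reference(v1, v2):
    """Element-wise sum of two nested lists with identical shapes."""
    if isinstance(v1, list) != isinstance(v2, list):
        raise ValueError("shapes differ")
    if not isinstance(v1, list):
        # leaf: two scalars
        return v1 + v2
    if len(v1) != len(v2):
        # no broadcasting: every dimension must match exactly
        raise ValueError("shapes differ")
    return [vector_add_reference(a, b) for a, b in zip(v1, v2)]


print(vector_add_reference([[1.0, 2.0], [3.0, 4.0]], [[10.0, 20.0], [30.0, 40.0]]))
# [[11.0, 22.0], [33.0, 44.0]]
```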
- onnx_extended.validation.cpu._validation.vector_sum(n_columns: int, values: List[float], by_rows: bool) float #
Computes the sum of all elements in an array by rows or by columns. This function is slower than vector_sum_array because it copies the data from an array to a std::vector. This copy (and the allocation) costs more than the computation itself.
- Parameters:
n_columns – number of columns
values – all values in an array
by_rows – by rows or by columns
- Returns:
sum of all elements
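The difference between the two traversal orders can be sketched in pure Python. The sketch assumes the values are stored as a flat row-major buffer and that `by_rows=True` means the outer loop runs over rows; both are assumptions, and `sum_flat` is a hypothetical name. The result is the same either way; only the memory access pattern changes.

```python
def sum_flat(n_columns, values, by_rows):
    """Sum a flat row-major buffer, walking it by rows or by columns."""
    n_rows = len(values) // n_columns
    total = 0.0
    if by_rows:
        # Consecutive reads are adjacent in memory (stride 1).
        for i in range(n_rows):
            for j in range(n_columns):
                total += values[i * n_columns + j]
    else:
        # Consecutive reads are n_columns elements apart.
        for j in range(n_columns):
            for i in range(n_rows):
                total += values[i * n_columns + j]
    return total


values = [float(i) for i in range(12)]  # 4 rows x 3 columns
print(sum_flat(3, values, True), sum_flat(3, values, False))  # 66.0 66.0
```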
- onnx_extended.validation.cpu._validation.vector_sum_array(n_columns: int, values: numpy.ndarray[numpy.float32], by_rows: bool) float #
Computes the sum of all elements in an array by rows or by columns.
- Parameters:
n_columns – number of columns
values – all values in an array
by_rows – by rows or by columns
- Returns:
sum of all elements
- onnx_extended.validation.cpu._validation.vector_sum_array_parallel(n_columns: int, values: numpy.ndarray[numpy.float32], by_rows: bool) float #
Computes the sum of all elements in an array by rows or by columns. The computation is parallelized.
- Parameters:
n_columns – number of columns
values – all values in an array
by_rows – by rows or by columns
- Returns:
sum of all elements
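One common way to parallelize such a sum is to split the rows into chunks, sum each chunk independently, then add the partial results. The sketch below illustrates that idea with Python threads; it is not the library's implementation, whose threading mechanism is not described here, and `parallel_sum` and `n_workers` are hypothetical names.

```python
from concurrent.futures import ThreadPoolExecutor


def parallel_sum(n_columns, values, n_workers=4):
    """Sum a flat row-major buffer by reducing chunks of rows in parallel."""
    n_rows = len(values) // n_columns
    # ceiling division, so that at most n_workers chunks are created
    rows_per_chunk = max(1, -(-n_rows // n_workers))
    chunks = [
        values[start * n_columns:(start + rows_per_chunk) * n_columns]
        for start in range(0, n_rows, rows_per_chunk)
    ]
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        partial_sums = list(pool.map(sum, chunks))
    # final reduction of the per-chunk results
    return float(sum(partial_sums))


values = [float(i) for i in range(12)]  # 4 rows x 3 columns
print(parallel_sum(3, values))  # 66.0
```

With floating-point data, the chunked reduction may differ from a sequential sum in the last digits because the additions are grouped differently.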
- onnx_extended.validation.cpu._validation.vector_sum_array_avx(n_columns: int, values: numpy.ndarray[numpy.float32]) float #
Computes the sum of all elements in an array by rows or by columns. The computation uses AVX instructions (see AVX API).
- Parameters:
n_columns – number of columns
values – all values in an array
- Returns:
sum of all elements
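A 256-bit AVX register holds eight float32 values, so a vectorized sum typically keeps eight running totals and combines them at the end. The following pure-Python sketch mimics that scheme to show the order of operations; whether the library follows exactly this layout is an assumption, and `lane_sum` is a hypothetical name.

```python
def lane_sum(values, n_lanes=8):
    """Sum values with n_lanes independent accumulators, then reduce them."""
    lanes = [0.0] * n_lanes
    n_full = len(values) - len(values) % n_lanes
    for start in range(0, n_full, n_lanes):
        # one "vector add": each lane accumulates its own element
        for lane in range(n_lanes):
            lanes[lane] += values[start + lane]
    # horizontal reduction of the lanes
    total = sum(lanes)
    # scalar loop for the elements that do not fill a whole vector
    for v in values[n_full:]:
        total += v
    return total


print(lane_sum([float(i) for i in range(20)]))  # 190.0
```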
- onnx_extended.validation.cpu._validation.vector_sum_array_avx_parallel(n_columns: int, values: numpy.ndarray[numpy.float32]) float #
Computes the sum of all elements in an array by rows or by columns. The computation uses AVX instructions and parallelization (see AVX API).
- Parameters:
n_columns – number of columns
values – all values in an array
- Returns:
sum of all elements
vector_function_cy#
- onnx_extended.validation.cython.vector_function_cy.vector_add_c(v1, v2)#
Computes the addition of two tensors of the same shape.
- Parameters:
v1 – first tensor
v2 – second tensor
- Returns:
result.