onnx_extended.ortcy

It supports any onnxruntime C API greater than the following version:

<<<

from onnx_extended.ortcy.wrap.ortinf import get_ort_c_api_supported_version

print(get_ort_c_api_supported_version())

>>>

    16

get_ort_c_api_supported_version

onnx_extended.ortcy.wrap.ortinf.get_ort_c_api_supported_version()

Returns the supported version of the onnxruntime C API.

ort_get_available_providers

onnx_extended.ortcy.wrap.ortinf.ort_get_available_providers()

Returns the list of available providers.
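
For example, the returned list can be used to decide whether CUDA should be enabled before creating a session; a minimal sketch:

    from onnx_extended.ortcy.wrap.ortinf import ort_get_available_providers

    providers = ort_get_available_providers()
    print(providers)
    # Only enable CUDA when the provider is actually available.
    use_cuda = "CUDAExecutionProvider" in providers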

OrtSession

class onnx_extended.ortcy.wrap.ortinf.OrtSession(filename, graph_optimization_level=-1, enable_cuda=False, cuda_device_id=0, set_denormal_as_zero=False, optimized_file_path=None, inter_op_num_threads=-1, intra_op_num_threads=-1, custom_libs=None)

Wrapper around the onnxruntime C API, implemented with Cython.

Parameters:
  • filename – filename (str) or bytes of a model serialized in memory

  • graph_optimization_level – level of graph optimization (node fusion among others), see onnxruntime Graph Optimizations

  • enable_cuda – use the CUDA provider

  • cuda_device_id – CUDA device id

  • set_denormal_as_zero – when a tensor contains many denormal numbers, execution slows down; this option makes onnxruntime treat them as zero

  • optimized_file_path – if specified, the optimized model is written to this path

  • inter_op_num_threads – number of threads used to parallelize the execution of the graph across nodes

  • intra_op_num_threads – number of threads used to parallelize the execution within nodes

  • custom_libs – list of shared libraries implementing custom operators to register with the session

New in version 0.2.0.
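
A minimal sketch of a session creation; model.onnx is a placeholder for any ONNX file on disk:

    from onnx_extended.ortcy.wrap.ortinf import OrtSession

    session = OrtSession(
        "model.onnx",             # or the bytes of a model serialized in memory
        enable_cuda=False,        # switch to True when a CUDA device is available
        intra_op_num_threads=-1,  # -1 keeps onnxruntime's default
    )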

get_input_count(self)

Returns the number of inputs.

get_output_count(self)

Returns the number of outputs.
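
Both counts can be used to check a model before calling one of the run methods; a sketch, assuming session was created as above:

    n_inputs = session.get_input_count()
    n_outputs = session.get_output_count()
    print(f"model has {n_inputs} input(s) and {n_outputs} output(s)")
    # run supports at most 10 inputs and 10 outputs.
    assert n_inputs <= 10 and n_outputs <= 10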

run(self, list inputs)

Runs the inference. The number of inputs and outputs must not exceed 10.
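
A sketch of the generic call, assuming a model with two float inputs and that the method returns the list of output arrays:

    import numpy as np

    x = np.random.rand(2, 3).astype(np.float32)
    y = np.random.rand(2, 3).astype(np.float32)
    # At most 10 inputs and 10 outputs are supported.
    outputs = session.run([x, y])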

run_1_1(self, ndarray input1)

Runs the inference assuming the model has one input and one output.
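
For a one-input, one-output model the dedicated method avoids building a list; a sketch under the same assumptions:

    x = np.random.rand(4, 4).astype(np.float32)
    result = session.run_1_1(x)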

run_2(self, ndarray input1, ndarray input2)

Runs the inference assuming the model has two inputs.
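
And similarly for a two-input model; a sketch, with output handling assumed to mirror run:

    a = np.random.rand(3).astype(np.float32)
    b = np.random.rand(3).astype(np.float32)
    outputs = session.run_2(a, b)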