onnx_extended.ortcy#
It supports any onnxruntime implementing a C API version greater than or equal to the following one:
<<<
from onnx_extended.ortcy.wrap.ortinf import get_ort_c_api_supported_version
print(get_ort_c_api_supported_version())
>>>
16
get_ort_c_api_supported_version#
- onnx_extended.ortcy.wrap.ortinf.get_ort_c_api_supported_version()#
Returns the supported version of the onnxruntime C API.
ort_get_available_providers#
- onnx_extended.ortcy.wrap.ortinf.ort_get_available_providers()#
Returns the list of available execution providers, as shown below.
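For example, the sketch below prints the providers detected at runtime; the exact list depends on the installed onnxruntime build:
<<<
from onnx_extended.ortcy.wrap.ortinf import ort_get_available_providers

# The output depends on the installed onnxruntime build;
# CPUExecutionProvider is always part of the list.
print(ort_get_available_providers())
>>>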
OrtSession#
- class onnx_extended.ortcy.wrap.ortinf.OrtSession(filename, graph_optimization_level=-1, enable_cuda=False, cuda_device_id=0, set_denormal_as_zero=False, optimized_file_path=None, inter_op_num_threads=-1, intra_op_num_threads=-1, custom_libs=None)#
Wrapper around the onnxruntime C API, implemented with Cython. A short usage sketch follows the parameter list.
- Parameters:
filename – filename (str) or bytes of a model serialized in memory
graph_optimization_level – level of graph optimization (node fusion), see onnxruntime Graph Optimizations
enable_cuda – use the CUDA execution provider
cuda_device_id – CUDA device id
set_denormal_as_zero – if enabled, denormal numbers are treated as zero; a tensor holding many denormal values can significantly slow down the execution
optimized_file_path – path used to write the optimized model, if specified
inter_op_num_threads – number of threads used to parallelize the execution of the graph across nodes
intra_op_num_threads – number of threads used to parallelize the execution within nodes
custom_libs – list of custom libraries to load, for models relying on custom operators
New in version 0.2.0.
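A minimal usage sketch, assuming a serialized model stored in a hypothetical file model.onnx:
<<<
from onnx_extended.ortcy.wrap.ortinf import OrtSession

# "model.onnx" is a placeholder for any serialized ONNX model.
session = OrtSession("model.onnx", enable_cuda=False)
print(session.get_input_count(), session.get_output_count())
>>>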
- get_input_count(self)#
Returns the number of inputs.
- get_output_count(self)#
Returns the number of outputs.
- run(self, list inputs)#
Runs the inference. The number of inputs and outputs must not exceed 10. See the sketch after the method list.
- run_1_1(self, ndarray input1)#
Runs the inference assuming the model has one input and one output.
- run_2(self, ndarray input1, ndarray input2)#
Runs the inference assuming the model has two inputs.
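As an illustration, the sketch below assumes a model with one float input and one output; model.onnx is a placeholder name:
<<<
import numpy
from onnx_extended.ortcy.wrap.ortinf import OrtSession

session = OrtSession("model.onnx")  # placeholder model file
x = numpy.random.rand(3, 4).astype(numpy.float32)

# generic entry point, takes a list of inputs
got = session.run([x])
# specialized entry point for models with one input and one output
res = session.run_1_1(x)
>>>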