ortcy#

It supports any onnxruntime C API with a version greater than:

<<<

from onnx_extended.ortcy.wrap.ortinf import get_ort_c_api_supported_version

print(get_ort_c_api_supported_version())

>>>

    16

get_ort_c_api_supported_version#

onnx_extended.ortcy.wrap.ortinf.get_ort_c_api_supported_version()#

Returns the supported version of onnxruntime C API.

ort_get_available_providers#

onnx_extended.ortcy.wrap.ortinf.ort_get_available_providers()#

Returns the list of available providers.
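
A quick check; the returned list depends on how the installed onnxruntime was built (for instance CPUExecutionProvider, plus CUDAExecutionProvider for a CUDA build), so no output is shown here.

<<<

from onnx_extended.ortcy.wrap.ortinf import ort_get_available_providers

# the content depends on the installed onnxruntime build
print(ort_get_available_providers())

>>>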

OrtSession#

class onnx_extended.ortcy.wrap.ortinf.OrtSession(filename, graph_optimization_level=-1, enable_cuda=False, cuda_device_id=0, set_denormal_as_zero=False, optimized_file_path=None, inter_op_num_threads=-1, intra_op_num_threads=-1, custom_libs=None)#

Wrapper around the onnxruntime C API, based on Cython.

Parameters:
  • filename – a filename (str) or bytes for a model serialized in memory

  • graph_optimization_level – graph optimization level (node fusion), see onnxruntime Graph Optimizations

  • enable_cuda – use CUDA provider

  • cuda_device_id – CUDA device id

  • set_denormal_as_zero – flush denormal numbers to zero; execution slows down when a tensor contains many denormal numbers

  • optimized_file_path – if specified, path used to write the optimized model

  • inter_op_num_threads – number of threads used to parallelize the execution of the graph across nodes

  • intra_op_num_threads – number of threads used to parallelize the execution within nodes

  • custom_libs – list of libraries implementing custom operators (kernels) to load when the session is created

New in version 0.2.0.
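
A minimal sketch of the constructor, assuming onnx is installed; the Add model below is only an illustration, built in memory and passed as bytes.

<<<

from onnx import TensorProto
from onnx.helper import (
    make_graph, make_model, make_node, make_opsetid, make_tensor_value_info)
from onnx_extended.ortcy.wrap.ortinf import OrtSession

# a toy model computing Y = X1 + X2
X1 = make_tensor_value_info("X1", TensorProto.FLOAT, [None, None])
X2 = make_tensor_value_info("X2", TensorProto.FLOAT, [None, None])
Y = make_tensor_value_info("Y", TensorProto.FLOAT, [None, None])
node = make_node("Add", ["X1", "X2"], ["Y"])
graph = make_graph([node], "add", [X1, X2], [Y])
onx = make_model(graph, opset_imports=[make_opsetid("", 18)])

# the constructor accepts a filename (str) or the serialized model (bytes)
session = OrtSession(onx.SerializeToString())

>>>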

get_input_count(self)#

Returns the number of inputs.

get_output_count(self)#

Returns the number of outputs.
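
With the two-input, one-output Add model built in the constructor sketch above:

<<<

print(session.get_input_count(), session.get_output_count())

>>>

    2 1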

run(self, list inputs)#

Runs the inference. The number of inputs and outputs must not exceed 10.
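
Continuing the toy Add sketch from the constructor example, run takes the inputs as a list of numpy arrays:

<<<

import numpy as np

x1 = np.random.rand(2, 3).astype(np.float32)
x2 = np.random.rand(2, 3).astype(np.float32)

# the outputs, a list with a single array equal to x1 + x2 in this sketch
got = session.run([x1, x2])

>>>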

run_1_1(self, ndarray input1)#

Runs the inference assuming the model has one input and one output.
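
A self-contained sketch for the one input, one output case; the Neg model is only an illustration, not part of the API.

<<<

import numpy as np
from onnx import TensorProto
from onnx.helper import (
    make_graph, make_model, make_node, make_opsetid, make_tensor_value_info)
from onnx_extended.ortcy.wrap.ortinf import OrtSession

# a toy model computing Y = -X
X = make_tensor_value_info("X", TensorProto.FLOAT, [None])
Y = make_tensor_value_info("Y", TensorProto.FLOAT, [None])
graph = make_graph([make_node("Neg", ["X"], ["Y"])], "neg", [X], [Y])
onx = make_model(graph, opset_imports=[make_opsetid("", 18)])

session = OrtSession(onx.SerializeToString())
y = session.run_1_1(np.array([1, -2], dtype=np.float32))

>>>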

run_2(self, ndarray input1, ndarray input2)#

Runs the inference assuming the model has two inputs.
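
And with the two-input Add sketch again, run_2 avoids building the list:

<<<

import numpy as np

x1 = np.random.rand(2, 3).astype(np.float32)
x2 = np.random.rand(2, 3).astype(np.float32)

# equivalent to session.run([x1, x2]) for a two-input model
got = session.run_2(x1, x2)

>>>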