Cython Binding of onnxruntime

onnxruntime implements a Python API based on pybind11. That API is custom and does not leverage the C API. This package implements the class OrtSession, whose binding is based on cython and is faster. The difference is significant when onnxruntime deals with small tensors.

<<<

import numpy
from onnx import TensorProto
from onnx.helper import (
    make_model,
    make_node,
    make_graph,
    make_tensor_value_info,
    make_opsetid,
)
from onnx_extended.ortcy.wrap.ortinf import OrtSession

X = make_tensor_value_info("X", TensorProto.FLOAT, [None, None])
Y = make_tensor_value_info("Y", TensorProto.FLOAT, [None, None])
Z = make_tensor_value_info("Z", TensorProto.FLOAT, [None, None])
node = make_node("Add", ["X", "Y"], ["Z"])
graph = make_graph([node], "add", [X, Y], [Z])
onnx_model = make_model(graph, opset_imports=[make_opsetid("", 18)], ir_version=8)

with open("model.onnx", "wb") as f:
    f.write(onnx_model.SerializeToString())

session = OrtSession("model.onnx")
x = numpy.random.randn(2, 3).astype(numpy.float32)
y = numpy.random.randn(2, 3).astype(numpy.float32)
got = session.run([x, y])

print(got)

>>>

    [array([[ 1.378,  0.677,  3.26 ],
           [-0.803,  0.329, -0.245]], dtype=float32)]

The signature differs from onnxruntime's session.run(None, {"X": x, "Y": y}) in order to increase performance. This binding supports custom operators as well. The benchmark Measuring onnxruntime performance against a cython binding compares onnxruntime to this new binding.