yet-another-onnxruntime-extensions
yet-another-onnxruntime-extensions (yaourt) is an experimental library
of ONNX Runtime extensions: custom C++ operators,
profiling utilities, and plotting helpers.
The source code is available on
GitHub.
Installation
pip install yet-another-onnxruntime-extensions
Note
The pre-built wheel includes the sparse CPU operators only. The fused-kernel CUDA operators must be compiled from source with a CUDA-enabled CMake build; see Getting Started for instructions.
import yaourt
print(yaourt.__version__)
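The installed version can also be read from the package metadata without importing the package, which can be handy in setup scripts. The following is a minimal sketch using only the standard library; the helper name installed_version is hypothetical, and the distribution name is the one used in the pip command above.

```python
from importlib import metadata


def installed_version(dist_name: str = "yet-another-onnxruntime-extensions"):
    """Return the installed version string of dist_name, or None if absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        # The distribution is not installed in the current environment.
        return None


if __name__ == "__main__":
    print(installed_version())
```

A return value of None means the wheel is not installed in the active environment.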
Key Features
yaourt.ortops: Custom C++ Operators
Sparse CPU operators are shipped as pre-built binaries inside the wheel and registered with ONNX Runtime via SPARSE_CPU_LIB_PATH. Fused-kernel CUDA operators (FUSED_KERNEL_CUDA_LIB_PATH) require a CUDA-enabled build; see Getting Started for CMake build instructions.

yaourt.tools: Profiling Tools
Parse ONNX Runtime JSON profiling output into pandas DataFrames (js_profile_to_dataframe()) and visualize per-operator timings and execution timelines with matplotlib (plot_ort_profile(), plot_ort_profile_timeline()).

yaourt.plot: Benchmark and Plot Helpers
Horizontal benchmark comparison histograms with error bars (hhistograms()) and tensor histogram utilities for model analysis (plot_histogram()).

yaourt.reference: Reference Evaluator
A pure-Python ONNX evaluator useful for testing and debugging custom operators without a full ONNX Runtime build.
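To make the profiling item more concrete, the sketch below aggregates per-operator time from a JSON list of trace events using only the standard library. It is not the implementation of js_profile_to_dataframe(). The field names (cat, dur, args.op_name) and the microsecond unit are assumptions about the profile layout that ONNX Runtime typically emits, and the sample events are made up for illustration.

```python
import json
from collections import defaultdict

# Made-up events mimicking the assumed layout of an ONNX Runtime profile.
SAMPLE_PROFILE = json.dumps([
    {"cat": "Session", "name": "model_run", "dur": 500, "ts": 0, "args": {}},
    {"cat": "Node", "name": "Gemm_0_kernel_time", "dur": 120, "ts": 10,
     "args": {"op_name": "Gemm"}},
    {"cat": "Node", "name": "Relu_1_kernel_time", "dur": 30, "ts": 140,
     "args": {"op_name": "Relu"}},
    {"cat": "Node", "name": "Gemm_2_kernel_time", "dur": 80, "ts": 180,
     "args": {"op_name": "Gemm"}},
])


def total_time_per_op(profile_json: str) -> dict:
    """Sum the 'dur' field of node events, grouped by operator type."""
    totals = defaultdict(int)
    for event in json.loads(profile_json):
        if event.get("cat") != "Node":
            continue  # skip session-level events
        op_name = event.get("args", {}).get("op_name")
        if op_name is not None:
            totals[op_name] += event.get("dur", 0)
    return dict(totals)


totals = total_time_per_op(SAMPLE_PROFILE)
# Print operators from slowest to fastest (durations assumed in microseconds).
for op, dur in sorted(totals.items(), key=lambda kv: -kv[1]):
    print(f"{op}: {dur} us")
```

A DataFrame-based workflow would typically perform the same grouping with a groupby on the operator column; the dictionary version here only illustrates the idea.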
Quick Start
Run inference with ONNX Runtime:
import numpy as np
import onnxruntime
from yaourt.doc import demo_mlp_model
model = demo_mlp_model("") # filename argument is unused
sess = onnxruntime.InferenceSession(
model.SerializeToString(), providers=["CPUExecutionProvider"]
)
x = np.random.randn(3, 10).astype(np.float32)
(output,) = sess.run(None, {"x": x})
print("Output shape:", output.shape)
Load the custom C++ operators:
import onnxruntime as ort
from yaourt.ortops import SPARSE_CPU_LIB_PATH
opts = ort.SessionOptions()
opts.register_custom_ops_library(str(SPARSE_CPU_LIB_PATH))
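Registering the library only affects sessions that are created with those options. The sketch below shows one possible next step: checking that the shared library exists, then passing the options to an InferenceSession. The helper names checked_library_path and session_with_custom_ops are hypothetical and not part of yaourt; the sketch assumes the model bytes describe a graph whose custom operators are provided by the registered library.

```python
from pathlib import Path


def checked_library_path(lib_path) -> str:
    """Return lib_path as a string, failing early if the file is missing."""
    path = Path(lib_path)
    if not path.is_file():
        raise FileNotFoundError(f"custom-op library not found: {path}")
    return str(path)


def session_with_custom_ops(model_bytes: bytes, lib_path):
    """Create a CPU InferenceSession with the custom-op library registered."""
    # Imported lazily so the path check above works without onnxruntime.
    import onnxruntime as ort

    opts = ort.SessionOptions()
    opts.register_custom_ops_library(checked_library_path(lib_path))
    # The options must be passed to the session for the registration to apply.
    return ort.InferenceSession(
        model_bytes, sess_options=opts, providers=["CPUExecutionProvider"]
    )
```

With a serialized model that uses the sparse operators, this would be called as session_with_custom_ops(model_bytes, SPARSE_CPU_LIB_PATH). Failing early on a missing file tends to give a clearer error than the one raised from inside the runtime.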