yobx.helpers.mini_onnx_builder

class yobx.helpers.mini_onnx_builder.MiniOnnxBuilder(target_opset: int = 18, ir_version: int = 10, sep: str = '___')[source][source]

Simplified builder to create very simple ONNX models that store tensors (numpy arrays or torch tensors) as initializers and expose them as model outputs. The resulting model has no inputs — it simply returns the stored values when executed.

Parameters:
  • target_opset – default ONNX opset version (default: 18)

  • ir_version – ONNX IR version (default: 10)

  • sep – separator used to build composite output names (default: "___")

Typical usage — save a plain numpy array and round-trip it through ONNX:

<<<

import numpy as np
from yobx.helpers.mini_onnx_builder import MiniOnnxBuilder
from yobx.reference import ExtendedReferenceEvaluator

builder = MiniOnnxBuilder()
builder.append_output_initializer(
    "weights", np.array([1.0, 2.0, 3.0], dtype=np.float32)
)
model = builder.to_onnx()

ref = ExtendedReferenceEvaluator(model)
(weights,) = ref.run(None, {})
print(weights)  # np.array([1.0, 2.0, 3.0], dtype=np.float32)

>>>

    [1. 2. 3.]

For serializing arbitrary nested Python structures (dicts, tuples, lists, torch tensors, DynamicCache …) prefer the higher-level helpers create_onnx_model_from_input_tensors() and create_input_tensors_from_onnx_model().

append_output_dict(name: str, tensors: Dict[str, ndarray | torch.Tensor])[source][source]

Adds two outputs: a string tensor holding the keys and a sequence of tensors holding the values.

The output names are name___keys and name___values.
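The composite naming convention can be sketched in plain Python; pack_dict below is a hypothetical helper that mirrors how the two outputs are named, not the builder's actual code:

```python
import numpy as np

SEP = "___"  # default separator used for composite output names

def pack_dict(name, tensors):
    # Hypothetical sketch: one string tensor for the keys and a plain
    # list for the values, exposed under the composite names.
    keys = np.array(list(tensors), dtype=np.str_)
    values = list(tensors.values())
    return {f"{name}{SEP}keys": keys, f"{name}{SEP}values": values}

packed = pack_dict("cache", {"k": np.zeros(2), "v": np.ones(2)})
print(sorted(packed))  # ['cache___keys', 'cache___values']
```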

append_output_initializer(name: str, tensor: ndarray | torch.Tensor, randomize: bool = False)[source][source]

Adds an initializer as an output. The initializer name is prefixed with t_; the output name is name. If randomize is True, the tensor values are not stored but replaced by randomly generated ones.

append_output_sequence(name: str, tensors: List[ndarray | torch.Tensor])[source][source]

Adds a sequence of initializers as an output. The initializer names are prefixed with seq_; the output name is name.

to_onnx() ModelProto[source][source]

Converts the builder content into ONNX.

Returns:

the onnx.ModelProto

yobx.helpers.mini_onnx_builder.create_input_tensors_from_onnx_model(proto: str | ModelProto, device: str = 'cpu', engine: str = 'ExtendedReferenceEvaluator', sep: str = '___') Any[source][source]

Deserializes tensors stored with function create_onnx_model_from_input_tensors(). It relies on ExtendedReferenceEvaluator to restore the tensors.

Parameters:
  • proto – an onnx.ModelProto or a path to an ONNX file

  • device – moves the restored tensors to this device

  • engine – runtime used to execute the model, "ExtendedReferenceEvaluator" (the default) or "onnxruntime"

  • sep – separator used in composite output names

Returns:

restored data

See Dumps intermediate results of a torch model for an example.

from yobx.helpers.mini_onnx_builder import (
    create_input_tensors_from_onnx_model,
)
from yobx.helpers import string_type

restored = create_input_tensors_from_onnx_model("attention_inputs.onnx")
for k, v in restored.items():
    print(f"{k}: {string_type(v, with_shape=True, with_min_max=True)}")

yobx.helpers.mini_onnx_builder.create_onnx_model_from_input_tensors(inputs: Any, switch_low_high: bool | None = None, randomize: bool = False, sep: str = '___') ModelProto[source][source]

Creates a model proto including all the values as initializers. They can be restored by executing the model. The inputs are assumed to be smaller than 2 GB, the protobuf limit; nothing is implemented yet to work around that limit.

Parameters:
  • inputs – anything

  • switch_low_high – if None, it defaults to sys.byteorder != "big"

  • randomize – if True, float tensors are not stored but randomized to save space

  • sep – separator

Returns:

onnx.ModelProto

The function raises an error when an input type is not supported. An example:

from yobx.helpers.mini_onnx_builder import (
    create_onnx_model_from_input_tensors,
)
import onnx

proto = create_onnx_model_from_input_tensors(
    dict(
        query_states=query_states,
        key_states=key_states,
        value_states=value_states,
        cu_seqlens=cu_seqlens,
        max_seqlen=(cu_seqlens[1:] - cu_seqlens[:-1]).max(),
        scaling=self.scaling,
        attn_output=attn_output,
    )
)
onnx.save(proto, "attention_inputs.onnx")

yobx.helpers.mini_onnx_builder.proto_from_array(arr: torch.Tensor, name: str | None = None, verbose: int = 0) TensorProto[source][source]

Converts a torch Tensor into an onnx.TensorProto.

Parameters:
  • arr – tensor to convert

  • name – name given to the TensorProto

  • verbose – if positive, displays the type and shape

Returns:

an onnx.TensorProto