ExtendedModelContainer#

ExtendedModelContainer extends onnx.model_container.ModelContainer to handle large ONNX models whose weight tensors are stored outside the main .onnx file.

Motivation#

Standard ONNX files embed every initializer directly in the protobuf. For large models — those whose weights reach several gigabytes — this approach becomes impractical: the protobuf cannot exceed 2 GB and loading everything into memory before running the model is wasteful.

ONNX addresses this via the external data mechanism: an initializer can reference a separate binary file instead of carrying its bytes inline. onnx.model_container.ModelContainer provides the basic scaffolding for this pattern; ExtendedModelContainer builds on top of it with two important additions:

PyTorch tensor support — large initializers may be torch.Tensor or torch.nn.Parameter objects in addition to numpy.ndarray. The serialization path converts them transparently so callers do not have to worry about dtype mismatches.
Inline local functions — setting container.inline = True inlines every local function defined in the model before writing it to disk, which reduces runtime overhead for exporters that emit function-heavy graphs.

How it works#

An ExtendedModelContainer has two main attributes:

model_proto — the onnx.ModelProto describing the graph. Large initializers appear here with data_location = EXTERNAL and a location key that acts as a symbolic handle (e.g. "#weight").
large_initializers — a plain Python dict mapping each symbolic handle to the actual tensor data (numpy.ndarray, torch.Tensor, or onnx.TensorProto).

Save#

save iterates over every external tensor in the proto, looks up the corresponding data in large_initializers, converts it to raw bytes, and writes it to disk. Two layouts are supported:

all_tensors_to_one_file=True (default) — all weights are packed into a single <model>.onnx.data file. Each tensor’s offset and length fields are updated in the copy of the proto that is saved.
all_tensors_to_one_file=False — each weight gets its own <name>.weight file sitting next to the .onnx file.

The method returns a (possibly modified) copy of model_proto that contains the final file locations so the caller can inspect or validate it.

Load#

load calls onnx.load() with load_external_data=False to read the graph structure, then calls the parent’s _load_large_initializers to memory-map / load the weight files referenced by the proto.

Basic example#

The snippet below builds a tiny ONNX model that has one external initializer, saves it through ExtendedModelContainer, and reloads it.

<<<

import os
import tempfile
import numpy as np
import onnx
import onnx.helper as oh
from yobx.container import ExtendedModelContainer

# ---- Build a minimal model with one external initializer ----
data = np.ones((4, 8), dtype=np.float32)

x_info = oh.make_tensor_value_info("x", onnx.TensorProto.FLOAT, [4, 8])
y_info = oh.make_tensor_value_info("y", onnx.TensorProto.FLOAT, [4, 8])

# Declare the initializer as external (symbolic location "#weight")
init = onnx.TensorProto()
init.data_type = onnx.TensorProto.FLOAT
init.name = "weight"
init.data_location = onnx.TensorProto.EXTERNAL
for d in data.shape:
    init.dims.append(d)
ext = init.external_data.add()
ext.key = "location"
ext.value = "#weight"

graph = oh.make_graph(
    [oh.make_node("Add", ["x", "weight"], ["y"])],
    "demo",
    [x_info],
    [y_info],
    initializer=[init],
)
model = oh.make_model(graph, opset_imports=[oh.make_opsetid("", 18)])

# ---- Populate the container and save ----
container = ExtendedModelContainer()
container.model_proto = model
container.set_large_initializers({"#weight": data})

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "demo.onnx")
    saved_proto = container.save(path)
    print("Saved proto type :", type(saved_proto).__name__)
    print("Files written     :", sorted(os.listdir(tmp)))

    # ---- Round-trip: reload from disk ----
    loaded = ExtendedModelContainer().load(path)
    weight = next(iter(loaded.large_initializers.values()))
    print("Loaded weight shape:", weight.shape)
    print("Loaded weight dtype:", weight.dtype)

>>>

    Saved proto type : ModelProto
    Files written     : ['demo.onnx', 'demo.onnx.data']
    Loaded weight shape: (4, 8)
    Loaded weight dtype: float32

PyTorch tensor example#

When large initializers are torch.Tensor objects the container serializes them automatically. This is the typical workflow when exporting a PyTorch model to ONNX and deferring disk writes to a later stage.

<<<

import os
import tempfile
import numpy as np
import onnx
import onnx.helper as oh
import torch
from yobx.container import ExtendedModelContainer

data = torch.ones(4, 8, dtype=torch.float32)

x_info = oh.make_tensor_value_info("x", onnx.TensorProto.FLOAT, [4, 8])
y_info = oh.make_tensor_value_info("y", onnx.TensorProto.FLOAT, [4, 8])

init = onnx.TensorProto()
init.data_type = onnx.TensorProto.FLOAT
init.name = "weight"
init.data_location = onnx.TensorProto.EXTERNAL
for d in data.shape:
    init.dims.append(d)
ext = init.external_data.add()
ext.key = "location"
ext.value = "#weight"

graph = oh.make_graph(
    [oh.make_node("Add", ["x", "weight"], ["y"])],
    "demo_torch",
    [x_info],
    [y_info],
    initializer=[init],
)
model = oh.make_model(graph, opset_imports=[oh.make_opsetid("", 18)])

container = ExtendedModelContainer()
container.model_proto = model
container.set_large_initializers({"#weight": data})

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "demo_torch.onnx")
    container.save(path)
    print("Files written:", sorted(os.listdir(tmp)))

    loaded = ExtendedModelContainer().load(path)
    weight = next(iter(loaded.large_initializers.values()))
    print("Reloaded weight shape:", weight.shape)

>>>

    Files written: ['demo_torch.onnx', 'demo_torch.onnx.data']
    Reloaded weight shape: (4, 8)