.. _l-design-graph-builder:

============
GraphBuilder
============

:class:`yobx.xbuilder.GraphBuilder` simplifies the programmatic construction
and optimization of ONNX graphs. It is the primary tool used to convert a
:class:`torch.fx.Graph` into a :class:`onnx.ModelProto`, but it can equally be
used standalone to build or transform any ONNX graph from scratch.

Class Hierarchy
===============

:class:`GraphBuilder` is composed of three cooperative base classes:

* :class:`_BuilderRuntime` — evaluates small constant sub-expressions
  (e.g. the ``[0, 0, -1]`` passed to a ``Reshape`` node) so the builder can
  resolve ``-1`` to the correct symbolic formula and fold constants early.
* :class:`_ShapeRuntime` — handles *value-as-shape* tracking needed by
  operators such as ``Shape``, ``Gather``, ``Concat``, and ``Slice`` when
  their outputs feed directly into a ``Reshape``.
* :class:`_InferenceRuntime` — walks the graph node by node, dispatching each
  node to the matching per-operator handler in
  :mod:`yobx.xshape.shape_type_compute` so that shapes and types are tracked
  for every intermediate result.

Two helper classes round out the public API:

* :class:`FunctionOptions` — controls whether (and how) a sub-graph is
  exported as a reusable ONNX local function.
* :class:`OptimizationOptions` — selects which optimization passes run inside
  :meth:`to_onnx`.

.. _builder-api-make:

Building a graph from scratch
=============================

The simplest workflow is:

1. Construct a :class:`GraphBuilder` with an opset version.
2. Call :meth:`make_tensor_input` to declare each graph input.
3. Call :meth:`make_node` (or the short-hand ``g.op.<OpType>(...)`` syntax)
   to add operators.
4. Call :meth:`make_tensor_output` to declare each graph output.
5. Call :meth:`to_onnx` to obtain a :class:`onnx.ModelProto`.

.. runpython::
    :showcode:

    import numpy as np
    import onnx
    from yobx.helpers.onnx_helper import pretty_onnx
    from yobx.xbuilder import GraphBuilder

    TFLOAT = onnx.TensorProto.FLOAT

    # 1. create builder targeting opset 18
    g = GraphBuilder(18, ir_version=10)

    # 2. declare inputs
    g.make_tensor_input("X", TFLOAT, ("batch", "seq", 64))
    g.make_tensor_input("W", TFLOAT, (64, 32))

    # 3. add a MatMul node via the short-hand op accessor
    result = g.op.MatMul("X", "W")

    # 4. declare the output and export
    g.make_tensor_output(
        result, elem_type=TFLOAT, shape=("batch", "seq", 32), indexed=False
    )

    model = g.to_onnx()
    print(f"nodes : {len(model.graph.node)}")
    print(f"opset : {model.opset_import[0].version}")
    print(f"output : {model.graph.output[0].name}")
    print(pretty_onnx(model))

Loading an existing model
=========================

Passing an existing :class:`onnx.ModelProto` to the constructor loads it into
the builder so its nodes and initializers can be inspected, modified, or
re-optimized.

.. runpython::
    :showcode:

    import onnx
    import onnx.helper as oh
    from yobx.xbuilder import GraphBuilder

    TFLOAT = onnx.TensorProto.FLOAT

    model = oh.make_model(
        oh.make_graph(
            [
                oh.make_node("Add", ["X", "Y"], ["T"]),
                oh.make_node("Relu", ["T"], ["Z"]),
            ],
            "add_relu",
            [
                oh.make_tensor_value_info("X", TFLOAT, ["batch", 4]),
                oh.make_tensor_value_info("Y", TFLOAT, ["batch", 4]),
            ],
            [oh.make_tensor_value_info("Z", TFLOAT, ["batch", 4])],
        ),
        opset_imports=[oh.make_opsetid("", 18)],
        ir_version=10,
    )

    g = GraphBuilder(model)
    print("input shapes:", {n: g.get_shape(n) for n in g.input_names})
    print("nodes :", [node.op_type for node in g.nodes])

Initializers
============

Initializers (model weights and constants) are added with
:meth:`make_initializer`. The builder deduplicates small integer arrays
automatically: if the same value is added twice, it returns the name of the
first occurrence rather than creating a duplicate initializer.
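The deduplication behaviour can be pictured with a small stand-alone sketch.
The ``InitializerCache`` class below is invented for illustration only (it is
not the builder's actual implementation); it keys arrays by dtype, shape, and
raw bytes, which is one straightforward way to detect identical constants:

.. code-block:: python

    import numpy as np


    class InitializerCache:
        """Toy deduplication cache: identical arrays map to one name."""

        def __init__(self):
            self._by_content = {}  # (dtype, shape, bytes) -> first name seen
            self._arrays = {}      # name -> stored array

        def add(self, name, array):
            key = (str(array.dtype), array.shape, array.tobytes())
            if key in self._by_content:
                # same value already stored: reuse the first name
                return self._by_content[key]
            self._by_content[key] = name
            self._arrays[name] = array
            return name


    cache = InitializerCache()
    a = cache.add("axis_0", np.array([0], dtype=np.int64))
    b = cache.add("axis_0_copy", np.array([0], dtype=np.int64))
    print(a, b)  # both resolve to "axis_0"

Only one tensor is stored; later callers silently receive the existing name,
which is why code adding an initializer should always use the returned name
rather than the name it requested.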
.. runpython::
    :showcode:

    import numpy as np
    import onnx
    from yobx.xbuilder import GraphBuilder

    TFLOAT = onnx.TensorProto.FLOAT

    g = GraphBuilder(18, ir_version=10)
    g.make_tensor_input("X", TFLOAT, ("batch", 64))

    # Add a weight matrix as an initializer
    W = np.random.randn(64, 32).astype(np.float32)
    w_name = g.make_initializer("W", W, source="example")

    result = g.op.MatMul("X", w_name)
    g.make_tensor_output(result, elem_type=TFLOAT, shape=("batch", 32), indexed=False)

    model = g.to_onnx()
    print("initializer name :", list(g.initializers_dict)[0])
    print("initializer shape:", list(g.initializers_dict.values())[0].shape)

.. _builder-api:

Shape and type tracking
=======================

:class:`GraphBuilder` inherits the full :class:`ShapeBuilder` interface.
Shapes and types are registered for every intermediate result as nodes are
added, and are used during optimization and for populating ``value_info`` in
the exported proto. See :ref:`l-design-expected-api`.

Dynamic shapes
==============

When some input dimensions are unknown at graph-construction time, they are
represented as strings (e.g. ``"batch"``, ``"seq"``). For graphs that are
later exported for dynamic-shape inference with ``torch.export``, the builder
accepts a ``dynamic_shapes`` dictionary that maps input names to per-axis
dimension objects (:class:`torch.export.Dim` or :class:`WrapDim`).
:meth:`register_dynamic_objects_from_shape` registers any string dimension
names encountered in a shape so that they are tracked as symbolic dimensions.
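The registration step amounts to scanning shape tuples for string axis names
and recording each one once. A hypothetical, simplified version (the function
name below is made up; the real method also creates dimension objects):

.. code-block:: python

    def collect_symbolic_dims(shape, known=None):
        """Collect string axis names from a shape such as ("batch", "seq", 64)."""
        known = set() if known is None else known
        for dim in shape:
            if isinstance(dim, str):
                # a string dimension is symbolic; integers are concrete
                known.add(dim)
        return known


    dims = collect_symbolic_dims(("batch", "seq", 64))
    dims = collect_symbolic_dims(("batch", 32), known=dims)
    print(sorted(dims))  # ['batch', 'seq']

Note that ``"batch"`` appears in both shapes but is tracked once: two inputs
sharing the same dimension name are constrained to the same runtime size.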
.. runpython::
    :showcode:

    import onnx
    from yobx.xbuilder import GraphBuilder

    TFLOAT = onnx.TensorProto.FLOAT

    g = GraphBuilder(18, ir_version=10)
    g.make_tensor_input("X", TFLOAT, ("batch", "seq", 64))
    g.make_tensor_input("Y", TFLOAT, ("batch", "seq", 64))

    # symbolic dimensions are tracked automatically once shapes are set
    result = g.op.Add("X", "Y")
    g.make_tensor_output(
        result, elem_type=TFLOAT, shape=("batch", "seq", 64), indexed=False
    )

    model = g.to_onnx()
    out = model.graph.output[0]
    dims = [
        d.dim_param if d.dim_param else d.dim_value
        for d in out.type.tensor_type.shape.dim
    ]
    print("output shape:", dims)

Optimizations
=============

:meth:`to_onnx` runs a sequence of optimization passes by default. The set of
passes is controlled by :class:`OptimizationOptions`.

Default passes (in order):

.. list-table::
    :header-rows: 1
    :widths: 25 75

    * - Pass
      - Effect
    * - ``remove_unused``
      - Remove nodes whose outputs are never consumed.
    * - ``constant_folding``
      - Evaluate operators such as ``Transpose``, ``Cast``, ``Reshape``,
        ``Concat``, ``Add``, ``Mul``, etc. when all inputs are constants and
        fold the result into an initializer.
    * - ``remove_identity``
      - Remove ``Identity`` nodes.
    * - ``remove_duplicated_initializer``
      - Merge identical constant initializers into a single tensor, removing
        redundant copies.
    * - ``patterns``
      - Apply user-supplied or built-in fusion patterns (e.g. ``"default"``
        enables the default set of ONNX-to-ONNX rewrites).
    * - ``order``
      - Reorder nodes to reduce peak memory by moving each ``Shape`` /
        ``Size`` node immediately after the node that produces its input
        (controlled by :class:`OrderAlgorithm`, default ``SHAPE``).
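To make the mechanics concrete, here is a toy version of what a pass like
``remove_identity`` does, written over plain ``(op_type, inputs, outputs)``
triples instead of real ``NodeProto`` objects (a simplified sketch, not the
builder's code; it also ignores subtleties such as graph inputs aliased to
outputs):

.. code-block:: python

    def remove_identity(nodes, graph_outputs):
        """Drop Identity nodes and forward their input name to consumers."""
        rename = {}  # old result name -> name it should be read as
        kept = []
        for op_type, inputs, outputs in nodes:
            inputs = [rename.get(i, i) for i in inputs]
            if op_type == "Identity" and outputs[0] not in graph_outputs:
                # skip the node; later consumers read the input directly
                rename[outputs[0]] = inputs[0]
                continue
            kept.append((op_type, inputs, outputs))
        return kept


    nodes = [
        ("Identity", ["X"], ["X2"]),
        ("Relu", ["X2"], ["Z"]),
    ]
    print(remove_identity(nodes, graph_outputs={"Z"}))
    # [('Relu', ['X'], ['Z'])]

An ``Identity`` producing a graph output is kept, since removing it would
rename the output the caller expects.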
.. runpython::
    :showcode:

    import onnx
    import onnx.helper as oh
    from yobx.xbuilder import GraphBuilder, OptimizationOptions

    TFLOAT = onnx.TensorProto.FLOAT

    model = oh.make_model(
        oh.make_graph(
            [
                oh.make_node("Identity", ["X"], ["X2"]),
                oh.make_node("Relu", ["X2"], ["Z"]),
            ],
            "id_relu",
            [oh.make_tensor_value_info("X", TFLOAT, [None, 4])],
            [oh.make_tensor_value_info("Z", TFLOAT, [None, 4])],
        ),
        opset_imports=[oh.make_opsetid("", 18)],
        ir_version=10,
    )

    opts = OptimizationOptions(remove_identity=True)
    g = GraphBuilder(model, optimization_options=opts)
    optimized = g.to_onnx()
    print("nodes before:", len(model.graph.node))
    print("nodes after :", len(optimized.graph.node))

Optimization report
===================

Passing ``return_optimize_report=True`` to :meth:`to_onnx` makes the method
return a ``(model, stats)`` tuple instead of just the model. ``stats`` is a
list of dictionaries — one entry per optimization pass — that records how many
nodes were added or removed and how long each pass took.

.. list-table::
    :header-rows: 1
    :widths: 25 75

    * - Key
      - Description
    * - ``pattern``
      - Name of the optimization pass (e.g. ``"remove_identity"``,
        ``"constant_folding"``, ``"TransposeTranspose"`` …).
    * - ``added``
      - Number of nodes added by this pass.
    * - ``removed``
      - Number of nodes removed by this pass.
    * - ``time_in``
      - Wall-clock time spent in this pass (seconds).
    * - ``iteration``
      - Iteration number (only for pattern-based passes).
    * - ``match_index``
      - Sequential index of the match within the iteration (pattern passes).
    * - ``instances``
      - Number of times the pattern was matched (pattern passes).

The list can be converted to a :class:`pandas.DataFrame` for quick
exploration:
.. runpython::
    :showcode:

    import pandas
    import onnx
    import onnx.helper as oh
    from yobx.xbuilder import GraphBuilder, OptimizationOptions

    TFLOAT = onnx.TensorProto.FLOAT

    model = oh.make_model(
        oh.make_graph(
            [
                oh.make_node("Identity", ["X"], ["X2"]),
                oh.make_node("Transpose", ["X2"], ["T"], perm=[1, 0]),
                oh.make_node("Transpose", ["T"], ["Z"], perm=[1, 0]),
            ],
            "demo",
            [oh.make_tensor_value_info("X", TFLOAT, [3, 4])],
            [oh.make_tensor_value_info("Z", TFLOAT, [3, 4])],
        ),
        opset_imports=[oh.make_opsetid("", 18)],
        ir_version=10,
    )

    opts = OptimizationOptions(patterns="default")
    g = GraphBuilder(model, infer_shapes_options=True, optimization_options=opts)
    # to_onnx returns a (model, stats) tuple when return_optimize_report=True
    optimized, stats = g.to_onnx(return_optimize_report=True)

    df = pandas.DataFrame(stats)
    # keep only rows that have numeric added/removed counts
    df["added"] = df["added"].fillna(0).astype(int)
    df["removed"] = df["removed"].fillna(0).astype(int)
    print(df[["pattern", "added", "removed", "time_in"]].to_string(index=False))

    print(f"\nnodes before: {len(model.graph.node)}")
    print(f"nodes after : {len(optimized.graph.node)}")

The report can be aggregated by pass name:
.. runpython::
    :showcode:

    import pandas
    import onnx
    import onnx.helper as oh
    from yobx.xbuilder import GraphBuilder, OptimizationOptions

    TFLOAT = onnx.TensorProto.FLOAT

    model = oh.make_model(
        oh.make_graph(
            [
                oh.make_node("Identity", ["X"], ["X2"]),
                oh.make_node("Transpose", ["X2"], ["T"], perm=[1, 0]),
                oh.make_node("Transpose", ["T"], ["Z"], perm=[1, 0]),
            ],
            "demo",
            [oh.make_tensor_value_info("X", TFLOAT, [3, 4])],
            [oh.make_tensor_value_info("Z", TFLOAT, [3, 4])],
        ),
        opset_imports=[oh.make_opsetid("", 18)],
        ir_version=10,
    )

    opts = OptimizationOptions(patterns="default")
    g = GraphBuilder(model, infer_shapes_options=True, optimization_options=opts)
    # unpack the (model, stats) tuple returned with return_optimize_report=True
    optimized, stats = g.to_onnx(return_optimize_report=True)

    df = pandas.DataFrame(stats)
    for c in ["added", "removed"]:
        df[c] = df[c].fillna(0).astype(int)

    agg = df.groupby("pattern")[["added", "removed", "time_in"]].sum()
    agg = agg[(agg["added"] > 0) | (agg["removed"] > 0)].sort_values(
        "removed", ascending=False
    )
    print(agg.to_string())

Local functions
===============

A sub-graph can be exported as a reusable ONNX local function (a
``FunctionProto``) by passing a :class:`FunctionOptions` instance to
:meth:`to_onnx`.

.. runpython::
    :showcode:

    import onnx
    from yobx.xbuilder import GraphBuilder, FunctionOptions

    TFLOAT = onnx.TensorProto.FLOAT

    g = GraphBuilder(18, ir_version=10, as_function=True)
    g.make_tensor_input("X", TFLOAT, ("batch", 64))
    r = g.op.Relu("X")
    g.make_tensor_output(r, indexed=False)

    func = g.to_onnx(
        function_options=FunctionOptions(
            export_as_function=True,
            name="MyRelu",
            domain="my.domain",
        ),
        inline=False,
    )
    print(type(func).__name__)
    print("function name  :", func.name)
    print("function domain:", func.domain)

.. _l-graphbuilder-debugging-env:

Debugging
=========

:class:`GraphBuilder` respects several environment variables that help narrow
down construction or optimization problems:
.. list-table::
    :header-rows: 1
    :widths: 30 70

    * - Environment variable
      - Effect
    * - ``ONNXSTOP=<name>``
      - Raises an exception the moment result ``<name>`` is created.
    * - ``ONNXSTOPSHAPE=<name>``
      - Raises an exception the moment result ``<name>`` receives a shape.
    * - ``ONNXSTOPTYPE=<name>``
      - Raises an exception the moment result ``<name>`` receives a type.
    * - ``ONNXSTOPOUTPUT=<name>``
      - Raises an exception the moment a node produces output ``<name>``.
    * - ``ONNXSTOPVALUESHAPE=<name>``
      - Prints extra information for shape-as-value tracking (e.g. inputs to
        ``Reshape``).
    * - ``ONNXCST=1``
      - Prints which constant is being evaluated.
    * - ``ONNXFUNC=1``
      - Prints details when nodes from a local function domain are added.
    * - ``ONNXSHAPECOMPUTE=1``
      - Raises an exception when a shape is missing for a result that should
        have one.
    * - ``NULLSHAPE=1``
      - Raises an exception as soon as a null/empty shape is encountered.
    * - ``ONNXDYNDIM=<name>``
      - Prints a message every time dynamic dimension ``<name>`` is used.
    * - ``PRINTNAME=<name>``
      - Prints a message every time a node producing ``<name>`` is added.

In addition, :meth:`get_debug_msg` returns a detailed text dump of the
builder's internal state (known shapes, types, ranks, constants, and node
list) which can be printed or logged whenever an assertion fails.

:meth:`pretty_text` returns a human-readable representation of the whole
graph (inputs, initializers, nodes, outputs) and is useful for quick visual
inspection:

.. runpython::
    :showcode:

    import onnx
    import onnx.helper as oh
    from yobx.xbuilder import GraphBuilder

    TFLOAT = onnx.TensorProto.FLOAT

    model = oh.make_model(
        oh.make_graph(
            [
                oh.make_node("Add", ["X", "Y"], ["T"]),
                oh.make_node("Relu", ["T"], ["Z"]),
            ],
            "add_relu",
            [
                oh.make_tensor_value_info("X", TFLOAT, ["batch", 4]),
                oh.make_tensor_value_info("Y", TFLOAT, ["batch", 4]),
            ],
            [oh.make_tensor_value_info("Z", TFLOAT, ["batch", 4])],
        ),
        opset_imports=[oh.make_opsetid("", 18)],
        ir_version=10,
    )

    g = GraphBuilder(model)
    print(g.pretty_text())
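The environment variables above must be set before the builder runs. A short
sketch of how they can be enabled from Python (the result name ``hidden_3`` is
made up for illustration; any variable set this way affects only the current
process and its children):

.. code-block:: python

    import os

    # stop as soon as the builder creates the result named "hidden_3"
    os.environ["ONNXSTOP"] = "hidden_3"
    # also trace which constants get evaluated along the way
    os.environ["ONNXCST"] = "1"

    # ... then run the export; the builder raises when "hidden_3" is created,
    # and get_debug_msg() can be printed from the resulting traceback.

Setting the variables on the command line (``ONNXSTOP=hidden_3 python
export.py``) is equivalent and avoids leaving them set for later runs.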