GraphBuilder#

yobx.xbuilder.GraphBuilder simplifies the programmatic construction and optimization of ONNX graphs. It is the primary tool used to convert a torch.fx.Graph into an onnx.ModelProto, but it can equally be used standalone to build or transform any ONNX graph from scratch.

Class Hierarchy#

GraphBuilder is composed of three cooperative base classes:

_BuilderRuntime — evaluates small constant sub-expressions (e.g. the [0, 0, -1] passed to a Reshape node) so the builder can resolve -1 to the correct symbolic formula and fold constants early.
_ShapeRuntime — handles value-as-shape tracking needed by operators such as Shape, Gather, Concat, and Slice when their outputs feed directly into a Reshape.
_InferenceRuntime — walks the graph node by node, dispatching each node to the matching per-operator handler in yobx.xshape.shape_type_compute so that shapes and types are tracked for every intermediate result.
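The node-by-node dispatch described for _InferenceRuntime can be pictured with a small standalone sketch. It does not use yobx: the SHAPE_HANDLERS registry, the infer_shapes function and the tuple-based node encoding are assumptions made up for illustration. Only the general idea (one handler per operator type, a shape recorded for every intermediate result) comes from the description above.

```python
from typing import Callable, Dict, List, Tuple, Union

Dim = Union[int, str]  # a static size or a symbolic name such as "batch"
Shape = Tuple[Dim, ...]
Node = Tuple[str, List[str], str]  # (op_type, input names, output name)


def _shape_elementwise(inputs: List[Shape]) -> Shape:
    # Toy rule: unary ops and same-shape binary ops keep the first input's shape.
    return inputs[0]


def _shape_matmul(inputs: List[Shape]) -> Shape:
    # Toy rule: (..., m, k) @ (k, n) -> (..., m, n), without batch broadcasting.
    a, b = inputs
    if a[-1] != b[0]:
        raise ValueError(f"incompatible inner dimensions: {a[-1]} vs {b[0]}")
    return a[:-1] + (b[-1],)


# Hypothetical registry: one handler per operator type.
SHAPE_HANDLERS: Dict[str, Callable[[List[Shape]], Shape]] = {
    "Relu": _shape_elementwise,
    "Add": _shape_elementwise,
    "MatMul": _shape_matmul,
}


def infer_shapes(nodes: List[Node], known: Dict[str, Shape]) -> Dict[str, Shape]:
    """Walk the nodes in order and record a shape for every result."""
    shapes = dict(known)
    for op_type, inputs, output in nodes:
        handler = SHAPE_HANDLERS[op_type]
        shapes[output] = handler([shapes[name] for name in inputs])
    return shapes


shapes = infer_shapes(
    [("MatMul", ["X", "W"], "T"), ("Relu", ["T"], "Z")],
    {"X": ("batch", "seq", 64), "W": (64, 32)},
)
print(shapes["Z"])  # ('batch', 'seq', 32)
```

Keeping the rules in a dictionary keyed by operator type means supporting a new operator only requires registering one more function, which is the appeal of a per-operator handler layout.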

Two helper classes round out the public API:

FunctionOptions — controls whether (and how) a sub-graph is exported as a reusable ONNX local function.
OptimizationOptions — selects which optimization passes run inside to_onnx.

Protocol API#

GraphBuilder satisfies a hierarchy of protocols defined in yobx.typing. Callers that do not need the full concrete class should type-annotate against the narrowest protocol that covers the methods they actually use:

Protocol	Scope
`GraphBuilderProtocol`	Core construction API: inputs, outputs, initializers, nodes, opset registration, type/shape/sequence accessors, and `to_onnx()`.
`GraphBuilderExtendedProtocol`	Extends the core protocol with `main_opset`, `op` (the `OpsetProtocol` helper), `set_type_shape_unary_op()`, constant queries, and `get_debug_msg()`. Required by the `yobx.sklearn` converters.
`GraphBuilderTorchProtocol`	Extends `GraphBuilderExtendedProtocol` with the full torch-exporter surface: rank helpers, device helpers, dynamic-shape helpers, sub-builder / local-function support, and miscellaneous utilities used by `FxGraphInterpreter`.
`GraphBuilderPatternOptimizationProtocol`	The read-only view of the graph exposed to pattern-optimization authors inside `match()`. Satisfied by `GraphBuilderPatternOptimization`, not by `GraphBuilder` directly.
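Annotating against a narrow protocol can be sketched with typing.Protocol. NodeAdder, ToyBuilder and add_relu below are hypothetical names used only to illustrate structural typing. They are not the protocols from yobx.typing, whose exact method signatures are not reproduced here.

```python
from typing import List, Protocol, Tuple, runtime_checkable


@runtime_checkable
class NodeAdder(Protocol):
    """Hypothetical narrow protocol: only the one method the helper needs."""

    def make_node(self, op_type: str, inputs: List[str], outputs: List[str]) -> str: ...


class ToyBuilder:
    """Stand-in builder; it never inherits from NodeAdder."""

    def __init__(self) -> None:
        self.nodes: List[Tuple[str, List[str], List[str]]] = []

    def make_node(self, op_type: str, inputs: List[str], outputs: List[str]) -> str:
        self.nodes.append((op_type, inputs, outputs))
        return outputs[0]


def add_relu(g: NodeAdder, name: str) -> str:
    # The helper depends on the protocol, not on a concrete builder class.
    return g.make_node("Relu", [name], [f"{name}_relu"])


g = ToyBuilder()
out = add_relu(g, "X")
print(out, isinstance(g, NodeAdder))  # X_relu True
```

Any object with a compatible make_node method is accepted by add_relu, so a helper written this way can be tested with a lightweight stand-in instead of a full builder.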

The op property returns an object that satisfies OpsetProtocol, which resolves g.op.Add(x, y)-style attribute-access dispatch to ONNX node creation.
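Attribute-access dispatch of this kind is commonly built on __getattr__. The sketch below shows the general technique under that assumption and is not the yobx implementation: ToyOpset, ToyGraph and the _toy_<op>_<input> naming scheme are invented for the example.

```python
from typing import Callable, List, Tuple


class ToyGraph:
    """Minimal stand-in that only records nodes."""

    def __init__(self) -> None:
        self.nodes: List[Tuple[str, Tuple[str, ...], str]] = []

    def make_node(self, op_type: str, inputs: Tuple[str, ...]) -> str:
        # Invented naming scheme: derive the result name from the op and first input.
        output = f"_toy_{op_type.lower()}_{inputs[0]}"
        self.nodes.append((op_type, inputs, output))
        return output


class ToyOpset:
    """Turns op.Add(x, y) into graph.make_node("Add", (x, y))."""

    def __init__(self, graph: ToyGraph) -> None:
        self._graph = graph

    def __getattr__(self, op_type: str) -> Callable[..., str]:
        # Called only for attributes that are not found normally,
        # so every unknown name is treated as an operator type.
        def create(*inputs: str) -> str:
            return self._graph.make_node(op_type, inputs)

        return create


graph = ToyGraph()
op = ToyOpset(graph)
t = op.Add("X", "Y")
z = op.Relu(t)
print(z)  # _toy_relu__toy_add_X
print([node[0] for node in graph.nodes])  # ['Add', 'Relu']
```

Because each call returns the name of the result, calls can be chained the same way the MatMul example in the next section passes names between operators.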

Building a graph from scratch#

The simplest workflow is:

Construct a GraphBuilder with an opset version.
Call make_tensor_input to declare each graph input.
Call make_node (or the short-hand g.op.<OpType>(…) syntax) to add operators.
Call make_tensor_output to declare each graph output.
Call to_onnx to obtain an onnx.ModelProto.

<<<

import onnx
from yobx.helpers.onnx_helper import pretty_onnx
from yobx.xbuilder import GraphBuilder

TFLOAT = onnx.TensorProto.FLOAT

# 1. create builder targeting opset 18
g = GraphBuilder(18, ir_version=10)

# 2. declare inputs
g.make_tensor_input("X", TFLOAT, ("batch", "seq", 64))
g.make_tensor_input("W", TFLOAT, (64, 32))

# 3. add a MatMul node via the short-hand op accessor
result = g.op.MatMul("X", "W")

# 4. declare the output and export
g.make_tensor_output(
    result, elem_type=TFLOAT, shape=("batch", "seq", 32), indexed=False
)
model = g.to_onnx()
print(f"nodes  : {len(model.graph.node)}")
print(f"opset  : {model.opset_import[0].version}")
print(f"output : {model.graph.output[0].name}")
print(pretty_onnx(model))

>>>

    nodes  : 1
    opset  : 18
    output : _onx_matmul_X
    opset: domain='' version=18
    input: name='X' type=dtype('float32') shape=['batch', 'seq', 64]
    input: name='W' type=dtype('float32') shape=[64, 32]
    MatMul(X, W) -> _onx_matmul_X
    output: name='_onx_matmul_X' type=dtype('float32') shape=['batch', 'seq', 32]

Loading an existing model#

Passing an existing onnx.ModelProto to the constructor loads it into the builder so its nodes and initializers can be inspected, modified, or re-optimized.

<<<

import onnx
import onnx.helper as oh
from yobx.xbuilder import GraphBuilder

TFLOAT = onnx.TensorProto.FLOAT

model = oh.make_model(
    oh.make_graph(
        [
            oh.make_node("Add", ["X", "Y"], ["T"]),
            oh.make_node("Relu", ["T"], ["Z"]),
        ],
        "add_relu",
        [
            oh.make_tensor_value_info("X", TFLOAT, ["batch", 4]),
            oh.make_tensor_value_info("Y", TFLOAT, ["batch", 4]),
        ],
        [oh.make_tensor_value_info("Z", TFLOAT, ["batch", 4])],
    ),
    opset_imports=[oh.make_opsetid("", 18)],
    ir_version=10,
)

g = GraphBuilder(model)
print("input  shapes:", {n: g.get_shape(n) for n in g.input_names})
print("nodes        :", [node.op_type for node in g.nodes])

>>>

    input  shapes: {'X': ('batch', 4), 'Y': ('batch', 4)}
    nodes        : ['Add', 'Relu']

Initializers#

Initializers (model weights and constants) are added with make_initializer. The builder deduplicates small integer arrays automatically: if the same value is added twice, it returns the name of the first occurrence rather than creating a duplicate initializer.

<<<

import numpy as np
import onnx
from yobx.xbuilder import GraphBuilder

TFLOAT = onnx.TensorProto.FLOAT

g = GraphBuilder(18, ir_version=10)
g.make_tensor_input("X", TFLOAT, ("batch", 64))

# Add a weight matrix as an initializer
W = np.random.randn(64, 32).astype(np.float32)
w_name = g.make_initializer("W", W, source="example")

result = g.op.MatMul("X", w_name)
g.make_tensor_output(result, elem_type=TFLOAT, shape=("batch", 32), indexed=False)
model = g.to_onnx()
print("initializer name :", list(g.initializers_dict)[0])
print("initializer shape:", list(g.initializers_dict.values())[0].shape)

>>>

    initializer name : W
    initializer shape: (64, 32)
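The deduplication of small integer arrays mentioned at the start of this section can be pictured as a lookup keyed on the stored values. This is only a sketch using plain Python lists. ToyInitializers, its max_size threshold and the key layout are assumptions, since the rule yobx applies to decide what counts as "small" is not stated here.

```python
from typing import Dict, List, Tuple


class ToyInitializers:
    """Stores constants and reuses the first name for repeated small int vectors."""

    def __init__(self, max_size: int = 8) -> None:
        self.max_size = max_size  # assumed threshold for "small"
        self.values: Dict[str, List] = {}
        self._by_value: Dict[Tuple[int, ...], str] = {}

    def make_initializer(self, name: str, value: List) -> str:
        is_small_int = len(value) <= self.max_size and all(
            isinstance(v, int) for v in value
        )
        if is_small_int:
            key = tuple(value)
            if key in self._by_value:
                # Same content already stored: return the existing name.
                return self._by_value[key]
            self._by_value[key] = name
        self.values[name] = value
        return name


inits = ToyInitializers()
first = inits.make_initializer("shape_a", [0, 0, -1])
second = inits.make_initializer("shape_b", [0, 0, -1])
weight = inits.make_initializer("W", [0.5, 1.5])
print(first, second, weight)  # shape_a shape_a W
print(sorted(inits.values))  # ['W', 'shape_a']
```

Callers therefore need to use the returned name rather than the name they asked for, which is why the example above passes w_name to MatMul instead of the literal "W".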

Shape and type tracking#

GraphBuilder inherits the full ShapeBuilder interface. Shapes and types are registered for every intermediate result as nodes are added, and are used during optimization and for populating value_info in the exported proto. See Expected API.

Dynamic shapes#

When some input dimensions are unknown at graph-construction time, they are represented as strings (e.g. "batch", "seq"). For graphs that are later exported for dynamic-shape inference with torch.export, the builder accepts a dynamic_shapes dictionary that maps input names to per-axis dimension objects (torch.export.Dim or WrapDim).

register_dynamic_objects_from_shape registers any string dimension names encountered in a shape so that they are tracked as symbolic dimensions.

<<<

import onnx
from yobx.xbuilder import GraphBuilder

TFLOAT = onnx.TensorProto.FLOAT

g = GraphBuilder(18, ir_version=10)
g.make_tensor_input("X", TFLOAT, ("batch", "seq", 64))
g.make_tensor_input("Y", TFLOAT, ("batch", "seq", 64))

# symbolic dimensions are tracked automatically once shapes are set
result = g.op.Add("X", "Y")
g.make_tensor_output(
    result, elem_type=TFLOAT, shape=("batch", "seq", 64), indexed=False
)
model = g.to_onnx()

out = model.graph.output[0]
dims = [
    d.dim_param if d.dim_param else d.dim_value for d in out.type.tensor_type.shape.dim
]
print("output shape:", dims)

>>>

    output shape: ['batch', 'seq', 64]
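Tracking string dimension names can be sketched as building a registry from the declared input shapes. The function below is a standalone illustration: collect_symbolic_dims and its return layout are hypothetical and do not reproduce register_dynamic_objects_from_shape.

```python
from typing import Dict, List, Tuple, Union

Dim = Union[int, str]


def collect_symbolic_dims(
    shapes: Dict[str, Tuple[Dim, ...]],
) -> Dict[str, List[Tuple[str, int]]]:
    """Map every symbolic dimension name to the (input, axis) pairs using it."""
    registry: Dict[str, List[Tuple[str, int]]] = {}
    for input_name, shape in shapes.items():
        for axis, dim in enumerate(shape):
            if isinstance(dim, str):  # strings denote unknown, named dimensions
                registry.setdefault(dim, []).append((input_name, axis))
    return registry


registry = collect_symbolic_dims(
    {"X": ("batch", "seq", 64), "Y": ("batch", "seq", 64)}
)
print(registry)
# {'batch': [('X', 0), ('Y', 0)], 'seq': [('X', 1), ('Y', 1)]}
```

A registry like this makes it visible that X and Y share the same "batch" and "seq" dimensions, which is the information a builder needs to keep the output shape of Add symbolic.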

Optimizations#

to_onnx runs a sequence of optimization passes by default. The set of passes is controlled by OptimizationOptions.

Default passes (in order):

Pass	Effect
`remove_unused`	Remove nodes whose outputs are never consumed.
`constant_folding`	Evaluate operators such as `Transpose`, `Cast`, `Reshape`, `Concat`, `Add`, `Mul`, etc. when all inputs are constants and fold the result into an initializer.
`remove_identity`	Remove `Identity` nodes.
`remove_duplicated_initializer`	Merge identical constant initializers into a single tensor, removing redundant copies.
`patterns`	Apply user-supplied or built-in fusion patterns (e.g. `"default"` enables the default set of ONNX-to-ONNX rewrites).
`order`	Reorder nodes to reduce peak memory by moving each `Shape` / `Size` node immediately after the node that produces its input (controlled by `OrderAlgorithm`, default `SHAPE`).
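The idea behind the order pass can be sketched on a toy node list. The code below is an assumption about the general approach, not the OrderAlgorithm implementation: move_shape_nodes_up and the tuple-based node encoding are invented, and the sketch assumes that no Shape or Size node reads the output of another Shape or Size node.

```python
from typing import List, Tuple

Node = Tuple[str, List[str], List[str]]  # (op_type, inputs, outputs)
SHAPE_OPS = ("Shape", "Size")


def move_shape_nodes_up(nodes: List[Node]) -> List[Node]:
    """Place every Shape/Size node right after the producer of its input."""
    shape_nodes = [n for n in nodes if n[0] in SHAPE_OPS]
    other_nodes = [n for n in nodes if n[0] not in SHAPE_OPS]
    produced = {name for _, _, outputs in other_nodes for name in outputs}

    # Shape/Size nodes reading a graph input have no producer: emit them first.
    ordered = [n for n in shape_nodes if n[1][0] not in produced]
    for node in other_nodes:
        ordered.append(node)
        ordered.extend(s for s in shape_nodes if s[1][0] in node[2])
    return ordered


nodes = [
    ("MatMul", ["X", "W"], ["T"]),
    ("Relu", ["T"], ["R"]),
    ("Neg", ["R"], ["N"]),
    ("Shape", ["T"], ["S"]),
    ("Reshape", ["N", "S"], ["Z"]),
]
reordered = move_shape_nodes_up(nodes)
print([n[0] for n in reordered])
# ['MatMul', 'Shape', 'Relu', 'Neg', 'Reshape']
```

In the original order T must stay alive until the late Shape node runs. After the move its last consumer is Relu, so a runtime can release T earlier while only the small shape tensor S is kept.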

<<<

import onnx
import onnx.helper as oh
from yobx.xbuilder import GraphBuilder, OptimizationOptions

TFLOAT = onnx.TensorProto.FLOAT

model = oh.make_model(
    oh.make_graph(
        [
            oh.make_node("Identity", ["X"], ["X2"]),
            oh.make_node("Relu", ["X2"], ["Z"]),
        ],
        "id_relu",
        [oh.make_tensor_value_info("X", TFLOAT, [None, 4])],
        [oh.make_tensor_value_info("Z", TFLOAT, [None, 4])],
    ),
    opset_imports=[oh.make_opsetid("", 18)],
    ir_version=10,
)

opts = OptimizationOptions(remove_identity=True)
g = GraphBuilder(model, optimization_options=opts)
optimized = g.to_onnx()
print("nodes before:", len(model.graph.node))
print("nodes after :", len(optimized.graph.node))

>>>

    nodes before: 2
    nodes after : 1
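What remove_identity and remove_unused do can be illustrated on a plain list of nodes. This is a simplified sketch, not the yobx code: the two functions below are hypothetical, they assume the nodes are already in topological order, and an Identity that directly produces a graph output is left in place.

```python
from typing import Dict, List, Set, Tuple

Node = Tuple[str, List[str], List[str]]  # (op_type, inputs, outputs)


def remove_identity(nodes: List[Node], graph_outputs: Set[str]) -> List[Node]:
    """Drop Identity nodes and reconnect their consumers to the original value."""
    alias: Dict[str, str] = {}
    kept: List[Node] = []
    for op_type, inputs, outputs in nodes:
        inputs = [alias.get(name, name) for name in inputs]
        if op_type == "Identity" and outputs[0] not in graph_outputs:
            alias[outputs[0]] = inputs[0]
            continue
        kept.append((op_type, inputs, outputs))
    return kept


def remove_unused(nodes: List[Node], graph_outputs: Set[str]) -> List[Node]:
    """Keep only nodes whose outputs are consumed, scanning from the end."""
    needed = set(graph_outputs)
    kept: List[Node] = []
    for op_type, inputs, outputs in reversed(nodes):
        if needed.intersection(outputs):
            needed.update(inputs)
            kept.append((op_type, inputs, outputs))
    return kept[::-1]


toy = [
    ("Identity", ["X"], ["X2"]),
    ("Relu", ["X2"], ["Z"]),
    ("Neg", ["X"], ["unused"]),
]
step1 = remove_identity(toy, {"Z"})
step2 = remove_unused(step1, {"Z"})
print(step2)  # [('Relu', ['X'], ['Z'])]
```

The first step mirrors the example above (two nodes become one Relu reading X directly). The second step additionally drops the Neg node because nothing consumes its output.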

Optimization report#

Passing return_optimize_report=True to to_onnx makes the optimization statistics available alongside the exported model. In the example below they are read from the report.stats attribute of the returned object. stats is a list of dictionaries, one entry per optimization step, that records how many nodes were added or removed and how long the step took.

Key	Description
`pattern`	Name of the optimization pass or step (e.g. `"remove_identity"`, `"constant_folding"`, `"match_TransposeTransposePattern"` …).
`added`	Number of nodes added by this pass.
`removed`	Number of nodes removed by this pass.
`time_in`	Wall-clock time spent in this pass (seconds).
`iteration`	Iteration number (only for pattern-based passes).
`match_index`	Sequential index of the match within the iteration (pattern passes).
`instances`	Number of times the pattern was matched (pattern passes).
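Because each entry is a plain dictionary with the keys listed above, the report can also be aggregated without extra dependencies. The sketch below uses a hand-written sample with made-up numbers, and the summarize helper is hypothetical. It assumes that added, removed and time_in may be missing from some entries and treats them as zero.

```python
from collections import defaultdict
from typing import Any, Dict, List

# Hand-written sample in the documented layout; the numbers are made up.
stats: List[Dict[str, Any]] = [
    {"pattern": "remove_identity", "added": 1, "removed": 2, "time_in": 3.0e-05},
    {"pattern": "remove_unused", "added": 0, "removed": 0, "time_in": 2.0e-05},
    {"pattern": "constant_folding", "time_in": 1.0e-05},  # counts may be absent
    {"pattern": "remove_unused", "added": 0, "removed": 0, "time_in": 1.0e-05},
]


def summarize(rows: List[Dict[str, Any]]) -> Dict[str, Dict[str, float]]:
    """Sum calls, node counts and time per pass name."""
    totals: Dict[str, Dict[str, float]] = defaultdict(
        lambda: {"calls": 0, "added": 0, "removed": 0, "time_in": 0.0}
    )
    for row in rows:
        entry = totals[row["pattern"]]
        entry["calls"] += 1
        for key in ("added", "removed", "time_in"):
            entry[key] += row.get(key, 0)
    return dict(totals)


summary = summarize(stats)
net = sum(e["added"] - e["removed"] for e in summary.values())
print(summary["remove_unused"]["calls"], net)  # 2 -1
```

Grouping by pass name is useful because the same pass, such as remove_unused, typically appears several times in a single report.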

The list can be converted to a pandas.DataFrame for quick exploration:

<<<

import pandas
import onnx
import onnx.helper as oh
from yobx.xbuilder import GraphBuilder, OptimizationOptions

TFLOAT = onnx.TensorProto.FLOAT

model = oh.make_model(
    oh.make_graph(
        [
            oh.make_node("Identity", ["X"], ["X2"]),
            oh.make_node("Transpose", ["X2"], ["T"], perm=[1, 0]),
            oh.make_node("Transpose", ["T"], ["Z"], perm=[1, 0]),
        ],
        "demo",
        [oh.make_tensor_value_info("X", TFLOAT, [3, 4])],
        [oh.make_tensor_value_info("Z", TFLOAT, [3, 4])],
    ),
    opset_imports=[oh.make_opsetid("", 18)],
    ir_version=10,
)

opts = OptimizationOptions(patterns="default")
g = GraphBuilder(model, infer_shapes_options=True, optimization_options=opts)
optimized = g.to_onnx(return_optimize_report=True)

df = pandas.DataFrame(optimized.report.stats)
# entries without added/removed counts get 0 so both columns can be cast to int
df["added"] = df["added"].fillna(0).astype(int)
df["removed"] = df["removed"].fillna(0).astype(int)
print(df[["pattern", "added", "removed", "time_in"]].to_string(index=False))
print(f"\nnodes before: {len(model.graph.node)}")
print(f"nodes after : {len(optimized.graph.node)}")

>>>

                                         pattern  added  removed      time_in
                        dynamic_dimension_naming      0        0 1.825000e-05
                check_A-dynamic_dimension_naming      0        0 1.026000e-05
                                 check_A-opt-sub      0        0 8.326000e-06
                                 remove_identity      1        2 3.419500e-05
                         check_remove_identity-0      0        0 6.391001e-06
                                   remove_unused      0        0 1.741700e-05
                           check_remove_unused-1      0        0 5.889999e-06
                                constant_folding      0        0 8.919000e-06
                apply_constant_folding_new_inits      0        0          NaN
                        check_constant_folding-2      0        0 5.639000e-06
                                   remove_unused      0        0 1.140400e-05
                           check_remove_unused-3      0        0 5.006999e-06
                                        patterns      0        1 5.047453e-03
                                check_pattern_00      0        0 1.208800e-05
                 match_BatchNormalizationPattern      0        0 7.896999e-06
         match_BatchNormalizationTrainingPattern      0        0 3.875000e-06
                               match_CastPattern      0        0 3.679999e-06
                           match_CastCastPattern      0        0 2.874001e-06
                       match_ConcatGatherPattern      0        0 3.088000e-06
                      match_ConcatReshapePattern      0        0 4.213000e-06
                       match_ConvBiasNullPattern      0        0 2.811999e-06
                            match_PadConvPattern      0        0 2.413000e-06
                             match_ExpandPattern      0        0 3.125000e-06
              match_ExpandUnsqueezeExpandPattern      0        0 2.346998e-06
                       match_GatherConcatPattern      0        0 2.637000e-06
                       match_GatherGatherPattern      0        0 2.216000e-06
                        match_GatherShapePattern      0        0 2.864999e-06
                               match_GeluPattern      0        0 9.720006e-07
                           match_IdentityPattern      0        0 2.266200e-05
                          match_LeakyReluPattern      0        0 1.049211e-03
              match_MulUnsqueezeUnsqueezePattern      0        0 1.964800e-05
                            match_ReshapePattern      0        0 6.918001e-06
                     match_ReshapeSqueezePattern      0        0 4.711999e-06
         match_ShapeBasedReshapeIsSqueezePattern      0        0 4.748001e-06
             match_ShapeBasedStaticExpandPattern      0        0 3.363000e-06
      match_ShapeBasedEditDistanceReshapePattern      0        0 3.499999e-06
                 match_ShapeBasedIdentityPattern      0        0 8.875000e-06
                 match_ShapedBasedReshapePattern      0        0 3.474999e-06
             match_ShapeBasedSameChildrenPattern      0        0 3.038000e-06
            match_ShapeBasedShapeShapeAddPattern      0        0 2.646000e-06
                     match_ShapeTransposePattern      0        0 2.883000e-06
                     match_UnsqueezeShapePattern      0        0 2.696999e-06
                     match_ReshapeReshapePattern      0        0 3.313000e-06
                       match_SameChildrenPattern      0        0 8.353001e-06
              match_SameChildrenFromInputPattern      0        0 7.140001e-06
        match_SoftmaxCrossEntropyLossCastPattern      0        0 1.923494e-03
                         match_SqueezeAddPattern      0        0 7.332001e-06
             match_SqueezeBinaryUnsqueezePattern      0        0 3.077999e-06
                   match_SqueezeUnsqueezePattern      0        0 3.873001e-06
                match_StaticConcatReshapePattern      0        0 4.842999e-06
                  match_SwapExpandReshapePattern      0        0 3.452000e-06
                match_SwapExpandUnsqueezePattern      0        0 2.692999e-06
                          match_SwapUnaryPattern      0        0 2.416400e-05
             match_SwapUnsqueezeTransposePattern      0        0 9.410000e-06
                    match_TransposeGatherPattern      0        0 2.838000e-06
          match_TransposeReshapeTransposePattern      0        0 6.640001e-06
                 match_TransposeTransposePattern      0        0 3.709300e-05
          match_UnsqueezeOrSqueezeReshapePattern      0        0 4.014000e-06
                   match_UnsqueezeReshapePattern      0        0 2.644001e-06
                 match_UnsqueezeUnsqueezePattern      0        0 2.704001e-06
                  match_FunctionAttentionPattern      0        0 3.453000e-06
               match_FunctionAttentionGQAPattern      0        0 3.611000e-06
                         insert_and_remove_nodes      0        0 8.707700e-05
                 apply_TransposeTransposePattern      1        2 1.530000e-04
                               check_pattern_A10      0        0 1.056000e-06
                               check_pattern_A20      0        0 8.576000e-06
                         remove_duplicated_shape      0        0 2.389001e-06
                               check_pattern_BD0      0        0 4.809001e-06
                           remove_identity_nodes      0        0 1.483900e-05
                               check_pattern_BI0      0        0 4.363001e-06
                                   remove_unused      0        0 1.233000e-05
                              check_pattern_BUS0      0        0 3.878000e-06
                         build_graph_for_pattern      0        0 9.689000e-06
                                     iteration_0      0        0 3.573477e-03
                 match_BatchNormalizationPattern      0        0 4.162001e-06
         match_BatchNormalizationTrainingPattern      0        0 2.171000e-06
                               match_CastPattern      0        0 1.807999e-06
                           match_CastCastPattern      0        0 1.618000e-06
                       match_ConcatGatherPattern      0        0 2.167000e-06
                      match_ConcatReshapePattern      0        0 2.813998e-06
                       match_ConvBiasNullPattern      0        0 1.986000e-06
                            match_PadConvPattern      0        0 7.021999e-06
                             match_ExpandPattern      0        0 1.826002e-06
              match_ExpandUnsqueezeExpandPattern      0        0 1.593999e-06
                       match_GatherConcatPattern      0        0 1.803000e-06
                       match_GatherGatherPattern      0        0 1.468001e-06
                        match_GatherShapePattern      0        0 1.935001e-06
                               match_GeluPattern      0        0 7.629988e-07
                           match_IdentityPattern      0        0 2.522000e-06
                          match_LeakyReluPattern      0        0 5.534001e-06
              match_MulUnsqueezeUnsqueezePattern      0        0 2.282000e-06
                            match_ReshapePattern      0        0 2.078999e-06
                     match_ReshapeSqueezePattern      0        0 2.019000e-06
         match_ShapeBasedReshapeIsSqueezePattern      0        0 2.399001e-06
             match_ShapeBasedStaticExpandPattern      0        0 1.738001e-06
      match_ShapeBasedEditDistanceReshapePattern      0        0 1.692000e-06
                 match_ShapeBasedIdentityPattern      0        0 1.931001e-06
                 match_ShapedBasedReshapePattern      0        0 1.736000e-06
             match_ShapeBasedSameChildrenPattern      0        0 1.758999e-06
            match_ShapeBasedShapeShapeAddPattern      0        0 1.614000e-06
                     match_ShapeTransposePattern      0        0 1.726001e-06
                     match_UnsqueezeShapePattern      0        0 1.513001e-06
                     match_ReshapeReshapePattern      0        0 1.733000e-06
                       match_SameChildrenPattern      0        0 3.588999e-06
              match_SameChildrenFromInputPattern      0        0 4.701000e-06
        match_SoftmaxCrossEntropyLossCastPattern      0        0 4.132000e-06
                         match_SqueezeAddPattern      0        0 1.450000e-06
             match_SqueezeBinaryUnsqueezePattern      0        0 1.445000e-06
                   match_SqueezeUnsqueezePattern      0        0 1.772001e-06
                match_StaticConcatReshapePattern      0        0 1.674000e-06
                  match_SwapExpandReshapePattern      0        0 1.557000e-06
                match_SwapExpandUnsqueezePattern      0        0 1.299000e-06
                          match_SwapUnaryPattern      0        0 1.602999e-06
             match_SwapUnsqueezeTransposePattern      0        0 1.621000e-06
                    match_TransposeGatherPattern      0        0 1.275999e-06
          match_TransposeReshapeTransposePattern      0        0 1.360999e-06
                 match_TransposeTransposePattern      0        0 1.528000e-06
          match_UnsqueezeOrSqueezeReshapePattern      0        0 1.565000e-06
                   match_UnsqueezeReshapePattern      0        0 1.766000e-06
                 match_UnsqueezeUnsqueezePattern      0        0 1.486998e-06
                  match_FunctionAttentionPattern      0        0 1.656999e-06
               match_FunctionAttentionGQAPattern      0        0 2.564000e-06
                               check_pattern_A20      0        0 7.670000e-06
                         remove_duplicated_shape      0        0 1.864000e-06
                               check_pattern_BD0      0        0 5.178001e-06
                           remove_identity_nodes      0        0 1.377400e-05
                               check_pattern_BI0      0        0 4.239000e-06
                                   remove_unused      0        0 1.035100e-05
                              check_pattern_BUS0      0        0 3.571000e-06
                         build_graph_for_pattern      0        0 7.932000e-06
                                     iteration_1      0        0 2.375970e-04
                 match_BatchNormalizationPattern      0        0 2.831200e-05
         match_BatchNormalizationTrainingPattern      0        0 4.658999e-06
         match_CastLayerNormalizationCastPattern      0        0 3.526000e-06
                               match_CastPattern      0        0 1.531000e-06
                     match_CastCastBinaryPattern      0        0 2.180999e-06
                           match_CastCastPattern      0        0 1.294000e-06
                         match_CastOpCastPattern      0        0 2.931001e-06
                           match_ClipClipPattern      0        0 2.345001e-06
                        match_ConcatEmptyPattern      0        0 2.344999e-06
                       match_ConcatGatherPattern      0        0 1.782000e-06
                      match_ConcatReshapePattern      0        0 2.142000e-06
                   match_ConcatTwiceUnaryPattern      0        0 3.117000e-06
              match_ConstantToInitializerPattern      0        0 2.338000e-06
                       match_ConvBiasNullPattern      0        0 1.683000e-06
                            match_PadConvPattern      0        0 1.373000e-06
                            match_DropoutPattern      0        0 2.036000e-06
                             match_ExpandPattern      0        0 1.529001e-06
                    match_ExpandBroadcastPattern      0        0 1.824999e-06
                         match_ExpandSwapPattern      0        0 2.589999e-06
              match_ExpandUnsqueezeExpandPattern      0        0 1.461000e-06
                       match_GatherConcatPattern      0        0 1.697001e-06
                       match_GatherGatherPattern      0        0 1.389000e-06
                       match_GathersSplitPattern      0        0 2.233999e-06
                        match_GatherShapePattern      0        0 1.208400e-05
                               match_GeluPattern      0        0 8.600000e-07
                           match_IdentityPattern      0        0 1.994000e-06
                 match_LayerNormalizationPattern      0        0 2.192999e-06
            match_LayerNormalizationScalePattern      0        0 2.435001e-06
                          match_LeakyReluPattern      0        0 4.488998e-06
                            match_MaxReluPattern      0        0 1.812999e-06
                    match_MulMulMulScalarPattern      0        0 2.685001e-06
              match_MulUnsqueezeUnsqueezePattern      0        0 1.464001e-06
                             match_NotNotPattern      0        0 6.414000e-06
                           match_NotWherePattern      0        0 2.282000e-06
                      match_ReduceArgTopKPattern      0        0 2.564000e-06
                      match_ReduceReshapePattern      0        0 2.186000e-06
                 match_ReduceSumNormalizePattern      0        0 2.104001e-06
                            match_ReshapePattern      0        0 1.709001e-06
               match_ReshapeMatMulReshapePattern      0        0 1.861001e-06
                        match_Reshape2Of3Pattern      0        0 1.761000e-06
               match_ReshapeReshapeBinaryPattern      0        0 1.939999e-06
                     match_ReshapeSqueezePattern      0        0 1.687000e-06
                      match_GemmTransposePattern      0        0 2.163000e-06
                  match_MatMulReshape2Of3Pattern      0        0 2.149001e-06
                       match_MulMulMatMulPattern      0        0 1.710001e-06
         match_ShapeBasedReshapeIsSqueezePattern      0        0 1.623999e-06
             match_ShapeBasedStaticExpandPattern      0        0 1.320001e-06
             match_ShapeBasedConcatExpandPattern      0        0 1.726999e-06
      match_ShapeBasedEditDistanceReshapePattern      0        0 1.623999e-06
                 match_ShapeBasedIdentityPattern      0        0 1.635999e-06
          match_ShapeBasedExpandBroadcastPattern      0        0 2.006998e-06
    match_ShapeBasedExpandBroadcastMatMulPattern      0        0 1.809000e-06
      match_ShapeBasedExpandCastWhereSwapPattern      0        0 1.696000e-06
               match_ShapeBasedExpandSwapPattern      0        0 1.962999e-06
              match_ShapeBasedMatMulToMulPattern      0        0 1.934999e-06
                 match_ShapedBasedReshapePattern      0        0 1.767999e-06
             match_ShapeBasedSameChildrenPattern      0        0 1.593000e-06
            match_ShapeBasedShapeShapeAddPattern      0        0 1.351000e-06
                     match_ShapeTransposePattern      0        0 1.324001e-06
                     match_UnsqueezeShapePattern      0        0 1.409000e-06
                     match_ReshapeReshapePattern      0        0 1.543000e-06
                    match_RotaryEmbeddingPattern      0        0 2.304001e-06
                       match_SameChildrenPattern      0        0 3.507001e-06
              match_SameChildrenFromInputPattern      0        0 4.048001e-06
                match_SequenceConstructAtPattern      0        0 2.377001e-06
          match_SplitToSequenceSequenceAtPattern      0        0 1.868000e-06
                         match_SliceSlicePattern      0        0 2.488001e-06
                        match_SlicesSplitPattern      0        0 1.876999e-06
        match_SoftmaxCrossEntropyLossCastPattern      0        0 3.846000e-06
                        match_SplitConcatPattern      0        0 1.962000e-06
                         match_SqueezeAddPattern      0        0 1.303000e-06
             match_SqueezeBinaryUnsqueezePattern      0        0 1.230001e-06
                   match_SqueezeUnsqueezePattern      0        0 1.351000e-06
                match_StaticConcatReshapePattern      0        0 1.680000e-06
                            match_Sub1MulPattern      0        0 1.876000e-06
                  match_SwapExpandReshapePattern      0        0 1.358001e-06
                match_SwapExpandUnsqueezePattern      0        0 1.529001e-06
                 match_SwapRangeAddScalarPattern      0        0 2.183000e-06
                          match_SwapUnaryPattern      0        0 1.491000e-06
             match_SwapUnsqueezeTransposePattern      0        0 1.473001e-06
                  match_SwitchOrderBinaryPattern      0        0 2.334000e-06
            match_SwitchReshapeActivationPattern      0        0 2.175000e-06
              match_TransposeEqualReshapePattern      0        0 1.820999e-06
                    match_TransposeGatherPattern      0        0 1.324999e-06
                    match_TransposeMatMulPattern      0        0 1.798000e-06
             match_TransposeReshapeMatMulPattern      0        0 2.163000e-06
          match_TransposeReshapeTransposePattern      0        0 1.569999e-06
                 match_TransposeTransposePattern      0        0 4.527001e-06
                     match_UnsqueezeEqualPattern      0        0 1.928001e-06
          match_UnsqueezeOrSqueezeReshapePattern      0        0 1.659999e-06
                   match_UnsqueezeReshapePattern      0        0 1.558999e-06
                 match_UnsqueezeUnsqueezePattern      0        0 1.372000e-06
                           match_WhereAddPattern      0        0 1.927001e-06
                   match_RotaryConcatPartPattern      0        0 1.856999e-06
                  match_FunctionAttentionPattern      0        0 1.764000e-06
               match_FunctionAttentionGQAPattern      0        0 2.493000e-06
                 match_FunctionCausalMaskPattern      0        0 2.099001e-06
           match_FunctionCausalMaskMulAddPattern      0        0 1.933000e-06
                match_FunctionCosSinCachePattern      0        0 1.901000e-06
        match_FunctionHalfRotaryEmbeddingPattern      0        0 1.833001e-06
                   match_RMSNormalizationPattern      0        0 1.783001e-06
                match_RMSNormalizationMulPattern      0        0 1.929000e-06
                               check_pattern_A20      0        0 8.536001e-06
                         remove_duplicated_shape      0        0 1.704000e-06
                               check_pattern_BD0      0        0 4.172001e-06
                           remove_identity_nodes      0        0 1.408300e-05
                               check_pattern_BI0      0        0 4.182000e-06
                                   remove_unused      0        0 1.023500e-05
                              check_pattern_BUS0      0        0 3.859001e-06
                         build_graph_for_pattern      0        0 7.670000e-06
                                     iteration_2      0        0 4.211450e-04
                 match_BatchNormalizationPattern      0        0 2.671000e-06
         match_BatchNormalizationTrainingPattern      0        0 2.032000e-06
         match_CastLayerNormalizationCastPattern      0        0 1.765000e-06
                               match_CastPattern      0        0 1.557999e-06
                     match_CastCastBinaryPattern      0        0 2.042001e-06
                           match_CastCastPattern      0        0 1.512999e-06
                         match_CastOpCastPattern      0        0 1.942000e-06
                           match_ClipClipPattern      0        0 1.723000e-06
                        match_ConcatEmptyPattern      0        0 1.732000e-06
                       match_ConcatGatherPattern      0        0 1.589000e-06
                      match_ConcatReshapePattern      0        0 1.873001e-06
                   match_ConcatTwiceUnaryPattern      0        0 1.806000e-06
              match_ConstantToInitializerPattern      0        0 1.586001e-06
                       match_ConvBiasNullPattern      0        0 1.262999e-06
                            match_PadConvPattern      0        0 1.384000e-06
                            match_DropoutPattern      0        0 1.451999e-06
                             match_ExpandPattern      0        0 1.482998e-06
                    match_ExpandBroadcastPattern      0        0 1.337001e-06
                         match_ExpandSwapPattern      0        0 1.433000e-06
              match_ExpandUnsqueezeExpandPattern      0        0 1.229000e-06
                       match_GatherConcatPattern      0        0 1.494998e-06
                       match_GatherGatherPattern      0        0 1.293000e-06
                       match_GathersSplitPattern      0        0 1.304001e-06
                        match_GatherShapePattern      0        0 1.322000e-06
                               match_GeluPattern      0        0 9.089999e-07
                           match_IdentityPattern      0        0 1.812001e-06
                 match_LayerNormalizationPattern      0        0 1.371000e-06
            match_LayerNormalizationScalePattern      0        0 1.344000e-06
                          match_LeakyReluPattern      0        0 3.318999e-06
                            match_MaxReluPattern      0        0 1.349001e-06
                    match_MulMulMulScalarPattern      0        0 1.529999e-06
              match_MulUnsqueezeUnsqueezePattern      0        0 1.692999e-06
                             match_NotNotPattern      0        0 1.344000e-06
                           match_NotWherePattern      0        0 1.860000e-06
                      match_ReduceArgTopKPattern      0        0 1.788001e-06
                      match_ReduceReshapePattern      0        0 1.599001e-06
                 match_ReduceSumNormalizePattern      0        0 1.352999e-06
                            match_ReshapePattern      0        0 1.652001e-06
               match_ReshapeMatMulReshapePattern      0        0 1.401000e-06
                        match_Reshape2Of3Pattern      0        0 1.492999e-06
               match_ReshapeReshapeBinaryPattern      0        0 1.425000e-06
                     match_ReshapeSqueezePattern      0        0 1.570001e-06
                      match_GemmTransposePattern      0        0 1.516000e-06
                  match_MatMulReshape2Of3Pattern      0        0 1.544999e-06
                       match_MulMulMatMulPattern      0        0 1.335000e-06
         match_ShapeBasedReshapeIsSqueezePattern      0        0 1.495000e-06
             match_ShapeBasedStaticExpandPattern      0        0 1.539001e-06
             match_ShapeBasedConcatExpandPattern      0        0 1.541999e-06
      match_ShapeBasedEditDistanceReshapePattern      0        0 1.353999e-06
                 match_ShapeBasedIdentityPattern      0        0 1.330000e-06
          match_ShapeBasedExpandBroadcastPattern      0        0 1.350001e-06
    match_ShapeBasedExpandBroadcastMatMulPattern      0        0 1.467000e-06
      match_ShapeBasedExpandCastWhereSwapPattern      0        0 1.298000e-06
               match_ShapeBasedExpandSwapPattern      0        0 1.426000e-06
              match_ShapeBasedMatMulToMulPattern      0        0 1.455001e-06
                 match_ShapedBasedReshapePattern      0        0 1.896000e-06
             match_ShapeBasedSameChildrenPattern      0        0 1.791999e-06
            match_ShapeBasedShapeShapeAddPattern      0        0 1.426000e-06
                     match_ShapeTransposePattern      0        0 1.278000e-06
                     match_UnsqueezeShapePattern      0        0 1.356999e-06
                     match_ReshapeReshapePattern      0        0 1.553000e-06
                    match_RotaryEmbeddingPattern      0        0 1.422999e-06
                       match_SameChildrenPattern      0        0 3.374000e-06
              match_SameChildrenFromInputPattern      0        0 3.800998e-06
                match_SequenceConstructAtPattern      0        0 1.480001e-06
          match_SplitToSequenceSequenceAtPattern      0        0 1.438000e-06
                         match_SliceSlicePattern      0        0 1.468001e-06
                        match_SlicesSplitPattern      0        0 1.373000e-06
        match_SoftmaxCrossEntropyLossCastPattern      0        0 3.556001e-06
                        match_SplitConcatPattern      0        0 2.045999e-06
                         match_SqueezeAddPattern      0        0 1.496001e-06
             match_SqueezeBinaryUnsqueezePattern      0        0 1.246000e-06
                   match_SqueezeUnsqueezePattern      0        0 1.524000e-06
                match_StaticConcatReshapePattern      0        0 1.595001e-06
                            match_Sub1MulPattern      0        0 1.547000e-06
                  match_SwapExpandReshapePattern      0        0 1.770000e-06
                match_SwapExpandUnsqueezePattern      0        0 1.418999e-06
                 match_SwapRangeAddScalarPattern      0        0 1.488001e-06
                          match_SwapUnaryPattern      0        0 1.394001e-06
             match_SwapUnsqueezeTransposePattern      0        0 1.356000e-06
                  match_SwitchOrderBinaryPattern      0        0 1.742999e-06
            match_SwitchReshapeActivationPattern      0        0 1.689999e-06
              match_TransposeEqualReshapePattern      0        0 1.442000e-06
                    match_TransposeGatherPattern      0        0 1.349999e-06
                    match_TransposeMatMulPattern      0        0 1.614000e-06
             match_TransposeReshapeMatMulPattern      0        0 1.553000e-06
          match_TransposeReshapeTransposePattern      0        0 1.296001e-06
                 match_TransposeTransposePattern      0        0 1.333001e-06
                     match_UnsqueezeEqualPattern      0        0 1.450000e-06
          match_UnsqueezeOrSqueezeReshapePattern      0        0 1.439001e-06
                   match_UnsqueezeReshapePattern      0        0 1.488001e-06
                 match_UnsqueezeUnsqueezePattern      0        0 1.266000e-06
                           match_WhereAddPattern      0        0 1.186001e-06
                   match_RotaryConcatPartPattern      0        0 1.357999e-06
                  match_FunctionAttentionPattern      0        0 1.495000e-06
               match_FunctionAttentionGQAPattern      0        0 2.130000e-06
                 match_FunctionCausalMaskPattern      0        0 1.537001e-06
           match_FunctionCausalMaskMulAddPattern      0        0 1.286000e-06
                match_FunctionCosSinCachePattern      0        0 1.406001e-06
        match_FunctionHalfRotaryEmbeddingPattern      0        0 1.353999e-06
                   match_RMSNormalizationPattern      0        0 1.482000e-06
                match_RMSNormalizationMulPattern      0        0 1.512000e-06
                       match_AttentionGQAPattern      0        0 1.992999e-06
                               check_pattern_A20      0        0 6.460999e-06
                         remove_duplicated_shape      0        0 1.483000e-06
                               check_pattern_BD0      0        0 4.905000e-06
                           remove_identity_nodes      0        0 1.389600e-05
                               check_pattern_BI0      0        0 3.755000e-06
                                   remove_unused      0        0 9.897001e-06
                              check_pattern_BUS0      0        0 3.671001e-06
                         build_graph_for_pattern      0        0 7.326002e-06
                                     iteration_3      0        0 3.307160e-04
                 match_BatchNormalizationPattern      0        0 2.618001e-06
         match_BatchNormalizationTrainingPattern      0        0 1.842001e-06
         match_CastLayerNormalizationCastPattern      0        0 2.018001e-06
                               match_CastPattern      0        0 1.589999e-06
                     match_CastCastBinaryPattern      0        0 1.382999e-06
                           match_CastCastPattern      0        0 1.245000e-06
                         match_CastOpCastPattern      0        0 1.773000e-06
                           match_ClipClipPattern      0        0 1.375000e-06
                        match_ConcatEmptyPattern      0        0 1.573000e-06
                       match_ConcatGatherPattern      0        0 1.568000e-06
                      match_ConcatReshapePattern      0        0 1.799001e-06
                   match_ConcatTwiceUnaryPattern      0        0 1.814000e-06
              match_ConstantToInitializerPattern      0        0 1.765000e-06
                       match_ConvBiasNullPattern      0        0 1.410999e-06
                            match_PadConvPattern      0        0 1.341001e-06
                            match_DropoutPattern      0        0 1.699998e-06
                             match_ExpandPattern      0        0 1.344999e-06
                    match_ExpandBroadcastPattern      0        0 1.348000e-06
                         match_ExpandSwapPattern      0        0 1.384000e-06
              match_ExpandUnsqueezeExpandPattern      0        0 1.322000e-06
                       match_GatherConcatPattern      0        0 1.601002e-06
                       match_GatherGatherPattern      0        0 1.349001e-06
                       match_GathersSplitPattern      0        0 1.434999e-06
                        match_GatherShapePattern      0        0 1.413000e-06
                               match_GeluPattern      0        0 9.510004e-07
                           match_IdentityPattern      0        0 1.490998e-06
                 match_LayerNormalizationPattern      0        0 1.721000e-06
            match_LayerNormalizationScalePattern      0        0 1.373999e-06
                          match_LeakyReluPattern      0        0 3.566001e-06
                            match_MaxReluPattern      0        0 1.962000e-06
                    match_MulMulMulScalarPattern      0        0 1.592000e-06
              match_MulUnsqueezeUnsqueezePattern      0        0 1.386999e-06
                             match_NotNotPattern      0        0 1.325001e-06
                           match_NotWherePattern      0        0 1.239001e-06
                      match_ReduceArgTopKPattern      0        0 2.194000e-06
                      match_ReduceReshapePattern      0        0 1.471999e-06
                 match_ReduceSumNormalizePattern      0        0 1.765000e-06
                            match_ReshapePattern      0        0 1.635000e-06
               match_ReshapeMatMulReshapePattern      0        0 1.407001e-06
                        match_Reshape2Of3Pattern      0        0 1.572000e-06
               match_ReshapeReshapeBinaryPattern      0        0 1.405000e-06
                     match_ReshapeSqueezePattern      0        0 1.597000e-06
                          match_MatMulAddPattern      0        0 2.183000e-06
                      match_GemmTransposePattern      0        0 1.388000e-06
                  match_MatMulReshape2Of3Pattern      0        0 1.466999e-06
                       match_MulMulMatMulPattern      0        0 1.504999e-06
         match_ShapeBasedReshapeIsSqueezePattern      0        0 1.561000e-06
             match_ShapeBasedStaticExpandPattern      0        0 1.565000e-06
             match_ShapeBasedConcatExpandPattern      0        0 1.547000e-06
      match_ShapeBasedEditDistanceReshapePattern      0        0 1.688999e-06
                 match_ShapeBasedIdentityPattern      0        0 1.539001e-06
          match_ShapeBasedExpandBroadcastPattern      0        0 1.420000e-06
    match_ShapeBasedExpandBroadcastMatMulPattern      0        0 1.291999e-06
      match_ShapeBasedExpandCastWhereSwapPattern      0        0 1.466000e-06
               match_ShapeBasedExpandSwapPattern      0        0 1.540000e-06
              match_ShapeBasedMatMulToMulPattern      0        0 1.306000e-06
                 match_ShapedBasedReshapePattern      0        0 1.938999e-06
             match_ShapeBasedSameChildrenPattern      0        0 1.598000e-06
            match_ShapeBasedShapeShapeAddPattern      0        0 1.485001e-06
                     match_ShapeTransposePattern      0        0 1.339000e-06
                     match_UnsqueezeShapePattern      0        0 1.540999e-06
                     match_ReshapeReshapePattern      0        0 1.578001e-06
                    match_RotaryEmbeddingPattern      0        0 1.458000e-06
                       match_SameChildrenPattern      0        0 3.142999e-06
              match_SameChildrenFromInputPattern      0        0 3.540001e-06
                match_SequenceConstructAtPattern      0        0 1.603001e-06
          match_SplitToSequenceSequenceAtPattern      0        0 1.556000e-06
                         match_SliceSlicePattern      0        0 1.447999e-06
                        match_SlicesSplitPattern      0        0 1.398001e-06
        match_SoftmaxCrossEntropyLossCastPattern      0        0 2.890001e-06
                        match_SplitConcatPattern      0        0 1.666000e-06
                         match_SqueezeAddPattern      0        0 1.388000e-06
             match_SqueezeBinaryUnsqueezePattern      0        0 1.231001e-06
                   match_SqueezeUnsqueezePattern      0        0 1.520999e-06
                match_StaticConcatReshapePattern      0        0 1.625000e-06
                            match_Sub1MulPattern      0        0 1.458000e-06
                  match_SwapExpandReshapePattern      0        0 3.204001e-06
                match_SwapExpandUnsqueezePattern      0        0 1.381999e-06
                 match_SwapRangeAddScalarPattern      0        0 1.365001e-06
                          match_SwapUnaryPattern      0        0 1.482000e-06
             match_SwapUnsqueezeTransposePattern      0        0 1.683000e-06
                  match_SwitchOrderBinaryPattern      0        0 1.461000e-05
            match_SwitchReshapeActivationPattern      0        0 2.141000e-05
              match_TransposeEqualReshapePattern      0        0 3.870000e-06
                    match_TransposeGatherPattern      0        0 2.602001e-06
                    match_TransposeMatMulPattern      0        0 2.207000e-06
             match_TransposeReshapeMatMulPattern      0        0 2.030001e-06
          match_TransposeReshapeTransposePattern      0        0 2.277999e-06
                 match_TransposeTransposePattern      0        0 1.576000e-06
                     match_UnsqueezeEqualPattern      0        0 2.111999e-06
          match_UnsqueezeOrSqueezeReshapePattern      0        0 2.123001e-06
                   match_UnsqueezeReshapePattern      0        0 1.850000e-06
                 match_UnsqueezeUnsqueezePattern      0        0 1.431001e-06
                           match_WhereAddPattern      0        0 1.427999e-06
                   match_RotaryConcatPartPattern      0        0 1.876999e-06
                  match_FunctionAttentionPattern      0        0 1.793002e-06
               match_FunctionAttentionGQAPattern      0        0 2.488001e-06
                 match_FunctionCausalMaskPattern      0        0 1.766999e-06
           match_FunctionCausalMaskMulAddPattern      0        0 1.259999e-06
                match_FunctionCosSinCachePattern      0        0 1.475000e-06
        match_FunctionHalfRotaryEmbeddingPattern      0        0 1.394001e-06
                   match_RMSNormalizationPattern      0        0 1.929000e-06
                match_RMSNormalizationMulPattern      0        0 1.390999e-06
                       match_AttentionGQAPattern      0        0 1.662998e-06
                               check_pattern_A20      0        0 8.887000e-06
                         remove_duplicated_shape      0        0 1.568000e-06
                               check_pattern_BD0      0        0 4.238000e-06
                           remove_identity_nodes      0        0 1.304600e-05
                               check_pattern_BI0      0        0 4.172000e-06
                                   remove_unused      0        0 9.439998e-06
                              check_pattern_BUS0      0        0 3.703000e-06
                         build_graph_for_pattern      0        0 7.035000e-06
                                check_patterns-4      0        0 7.928000e-06
                                   remove_unused      0        0 9.664000e-06
                           check_remove_unused-5      0        0 3.980002e-06
                                 remove_identity      0        0 1.177800e-05
                         check_remove_identity-6      0        0 3.310000e-06
                                constant_folding      0        0 7.044000e-06
                apply_constant_folding_new_inits      0        0          NaN
                        check_constant_folding-7      0        0 3.206000e-06
                                   remove_unused      0        0 7.207000e-06
                           check_remove_unused-8      0        0 2.864001e-06
                   remove_duplicated_initializer      0        0 1.390001e-06
           check_remove_duplicated_initializer-9      0        0 3.235000e-06
                                 remove_identity      0        0 9.383000e-06
                        check_remove_identity-10      0        0 3.122999e-06
                                   remove_unused      0        0 5.909000e-06
                          check_remove_unused-11      0        0 2.962001e-06
                                           order      0        0 3.436300e-05
                                    check_orderA      0        0 5.322001e-06
                                    check_orderL      0        0 4.018000e-06
                                     shape_order      0        0 1.597100e-05
                                           order      0        0          NaN
                                  check_order-12      0        0 4.267000e-06
                                    optimization      0        2 5.345884e-03
    
    nodes before: 3
    nodes after : 1

The report can be aggregated by pass name, keeping only the passes that added or removed at least one node:

<<<

import pandas
import onnx
import onnx.helper as oh
from yobx.xbuilder import GraphBuilder, OptimizationOptions

TFLOAT = onnx.TensorProto.FLOAT

model = oh.make_model(
    oh.make_graph(
        [
            oh.make_node("Identity", ["X"], ["X2"]),
            oh.make_node("Transpose", ["X2"], ["T"], perm=[1, 0]),
            oh.make_node("Transpose", ["T"], ["Z"], perm=[1, 0]),
        ],
        "demo",
        [oh.make_tensor_value_info("X", TFLOAT, [3, 4])],
        [oh.make_tensor_value_info("Z", TFLOAT, [3, 4])],
    ),
    opset_imports=[oh.make_opsetid("", 18)],
    ir_version=10,
)

# Run the default pattern set and ask to_onnx to return the optimization report.
opts = OptimizationOptions(patterns="default")
g = GraphBuilder(model, infer_shapes_options=True, optimization_options=opts)
art = g.to_onnx(return_optimize_report=True)

# Load the report into a DataFrame; missing counts are treated as zero.
df = pandas.DataFrame(art.report.stats)
for c in ["added", "removed"]:
    df[c] = df[c].fillna(0).astype(int)
# Sum per pass name, then keep only the passes that changed the graph.
agg = df.groupby("pattern")[["added", "removed", "time_in"]].sum()
agg = agg[(agg["added"] > 0) | (agg["removed"] > 0)].sort_values(
    "removed", ascending=False
)
print(agg.to_string())

>>>

                                     added  removed   time_in
    pattern                                                  
    apply_TransposeTransposePattern      1        2  0.000119
    optimization                         0        2  0.004356
    remove_identity                      1        2  0.000047
    patterns                             0        1  0.004098
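
When pandas is not available, the same aggregation can be sketched with the standard library. The sketch below assumes that `report.stats` is a list of dictionaries with the keys `pattern`, `added`, `removed`, and `time_in`; this is inferred from the columns used above and is not confirmed. `aggregate_stats` is a hypothetical helper, and the sample rows are made up for illustration only.

```python
from collections import defaultdict


def aggregate_stats(stats):
    """Sum added/removed/time_in per pass name, keeping passes that changed the graph."""
    totals = defaultdict(lambda: {"added": 0, "removed": 0, "time_in": 0.0})
    for row in stats:
        t = totals[row["pattern"]]
        # Missing or None values are treated as zero, like fillna(0) above.
        t["added"] += int(row.get("added") or 0)
        t["removed"] += int(row.get("removed") or 0)
        t["time_in"] += float(row.get("time_in") or 0.0)
    changed = {k: v for k, v in totals.items() if v["added"] > 0 or v["removed"] > 0}
    # Largest number of removed nodes first.
    return dict(sorted(changed.items(), key=lambda kv: kv[1]["removed"], reverse=True))


# Illustrative rows only; real rows would come from art.report.stats.
sample = [
    {"pattern": "match_CastPattern", "added": 0, "removed": 0, "time_in": 1.5e-06},
    {"pattern": "apply_TransposeTransposePattern", "added": 1, "removed": 2, "time_in": 1.2e-04},
    {"pattern": "remove_identity", "removed": 1, "time_in": 1.0e-05},
    {"pattern": "remove_identity", "added": 1, "removed": 1, "time_in": 1.2e-05},
]

for name, t in aggregate_stats(sample).items():
    print(f"{name:35s} {t['added']:5d} {t['removed']:7d} {t['time_in']:.6f}")
```

Rows that share a pass name are summed, which is why a pass that runs several times appears once in the aggregated view.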

Local functions#

A sub-graph can be exported as a reusable ONNX local function (a FunctionProto) by passing a FunctionOptions instance to to_onnx.

<<<

import onnx
from yobx.xbuilder import GraphBuilder, FunctionOptions

TFLOAT = onnx.TensorProto.FLOAT

g = GraphBuilder(18, ir_version=10, as_function=True)
g.make_tensor_input("X", TFLOAT, ("batch", 64))
r = g.op.Relu("X")
g.make_tensor_output(r, indexed=False)

func = g.to_onnx(
    function_options=FunctionOptions(
        export_as_function=True,
        name="MyRelu",
        domain="my.domain",
    ),
    inline=False,
)
proto = func.proto
print(type(proto).__name__)
print("function name  :", proto.name)
print("function domain:", proto.domain)

>>>

    FunctionProto
    function name  : MyRelu
    function domain: my.domain

Debugging GraphBuilder with Environment Variables#

GraphBuilder respects several environment variables that help narrow down construction or optimization problems:

Environment variable	Effect
`ONNXSTOP=<name>`	Raises an exception the moment result `<name>` is created.
`ONNXSTOPSHAPE=<name>`	Raises an exception the moment result `<name>` receives a shape.
`ONNXSTOPTYPE=<name>`	Raises an exception the moment result `<name>` receives a type.
`ONNXSTOPOUTPUT=<name>`	Raises an exception the moment a node produces output `<name>`.
`ONNXSTOPVALUESHAPE=<name>`	Prints extra information for value-as-shape tracking (e.g. inputs to `Reshape`).
`ONNXCST=1`	Prints which constant is being evaluated.
`ONNXFUNC=1`	Prints details when nodes from a local function domain are added.
`ONNXSHAPECOMPUTE=1`	Raises an exception when a shape is missing for a result that should have one.
`NULLSHAPE=1`	Raises an exception as soon as a null/empty shape is encountered.
`ONNXDYNDIM=<name>`	Prints a message every time dynamic dimension `<name>` is used.
`PRINTNAME=<name>`	Prints a message every time a node producing `<name>` is added.
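
These variables can be set in the shell before launching Python. To scope them to a single piece of code instead, one option is a small context manager such as the sketch below. `debug_env` is a hypothetical helper, not part of yobx, and the sketch assumes the variables are read while the block executes rather than once at import time, which is not confirmed here.

```python
import os
from contextlib import contextmanager


@contextmanager
def debug_env(**variables):
    """Temporarily set environment variables, restoring the previous values on exit."""
    previous = {name: os.environ.get(name) for name in variables}
    os.environ.update({name: str(value) for name, value in variables.items()})
    try:
        yield
    finally:
        for name, old in previous.items():
            if old is None:
                # The variable did not exist before: remove it again.
                os.environ.pop(name, None)
            else:
                os.environ[name] = old


# Hypothetical usage: stop as soon as the result named "T" is created.
with debug_env(ONNXSTOP="T", ONNXCST=1):
    # Build or optimize the graph here, e.g. GraphBuilder(model).to_onnx().
    print(os.environ["ONNXSTOP"], os.environ["ONNXCST"])
```

If the variables turn out to be read at import time, setting them in the shell (for example `ONNXSTOP=T python script.py`) is the safer choice.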

In addition, get_debug_msg returns a detailed text dump of the builder’s internal state (known shapes, types, ranks, constants, and node list) which can be printed or logged whenever an assertion fails.

pretty_text returns a human-readable representation of the whole graph (inputs, initializers, nodes, outputs) and is useful for quick visual inspection:

<<<

import onnx
import onnx.helper as oh
from yobx.xbuilder import GraphBuilder

TFLOAT = onnx.TensorProto.FLOAT

model = oh.make_model(
    oh.make_graph(
        [
            oh.make_node("Add", ["X", "Y"], ["T"]),
            oh.make_node("Relu", ["T"], ["Z"]),
        ],
        "add_relu",
        [
            oh.make_tensor_value_info("X", TFLOAT, ["batch", 4]),
            oh.make_tensor_value_info("Y", TFLOAT, ["batch", 4]),
        ],
        [oh.make_tensor_value_info("Z", TFLOAT, ["batch", 4])],
    ),
    opset_imports=[oh.make_opsetid("", 18)],
    ir_version=10,
)

g = GraphBuilder(model)
print(g.pretty_text())

>>>

    
    dyn---: batch -> WrapSym(batch)
    dynrev: batch -> [('batch', SymInt(batch))]
    dynsrc: batch -> [{batch:('input_name', 'X'), batch:('axis', 0)}, {batch:('input_name', 'Y'), batch:('axis', 0)}, {batch:('input_name', 'Z'), batch:('axis', 0)}]
    opset: : 18
    input:: X                                                                       |T1: batch x 4
    input:: Y                                                                       |T1: batch x 4
    Add: X, Y -> T                                                                  |T1: batch x 4
    Relu: T -> Z                                                                    |T1: batch x 4
    output:: Z                                                                      |T1: batch x 4