Exported ONNX with Dynamic Shapes

The following script shows the exported program for many short cases and follows :ref:`l-plot-export-with-dynamic-shape` to retrieve an ONNX model equivalent to the original model.
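
Each case below is a small torch.nn.Module exported with a dynamic first dimension named ``batch``. A minimal sketch of the underlying mechanism, using only the public torch.export API (run_exporter, used in the script below, wraps this kind of export together with the conversion to ONNX):

import torch

class Model(torch.nn.Module):
    def forward(self, x):
        return torch.relu(x)

x = torch.randn(2, 3, 4)
# the first dimension is declared dynamic and named "batch"
batch = torch.export.Dim("batch")
ep = torch.export.export(Model(), (x,), dynamic_shapes={"x": {0: batch}})
print(ep)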

<<<

import inspect
import textwrap
import pandas
from onnx_diagnostic.helpers import string_type
from onnx_diagnostic.helpers.onnx_helper import pretty_onnx
from onnx_diagnostic.torch_export_patches.eval import discover, run_exporter
from onnx_diagnostic.ext_test_case import unit_test_going

cases = discover()
print()
print(":ref:`Summary <ledx-summary-exported-program>`")
print()
sorted_cases = sorted(cases.items())
if unit_test_going():
    sorted_cases = sorted_cases[:3]
for name, cls_model in sorted_cases:
    print(f"* :ref:`{name} <ledx-model-case-export-{name}>`")
print()
print()

obs = []
for name, cls_model in sorted_cases:
    print()
    print(f".. _ledx-model-case-export-{name}:")
    print()
    print(name)
    print("=" * len(name))
    print()
    print("forward")
    print("+++++++")
    print()
    print(".. code-block:: python")
    print()
    src = inspect.getsource(cls_model.forward)
    if src:
        print(textwrap.indent(textwrap.dedent(src), "    "))
    else:
        print("    # code is missing")
    print()
    print()
    for exporter in ("custom", "dynamo-ir"):
        expname = exporter.replace("export-", "")
        print()
        print(expname)
        print("+" * len(expname))
        print()
        res = run_exporter(exporter, cls_model, True, quiet=True)
        case_ref = f":ref:`{name} <ledx-model-case-export-{name}>`"
        expo = exporter.split("-", maxsplit=1)[-1]
        if "inputs" in res:
            print(f"* **inputs:** ``{string_type(res['inputs'], with_shape=True)}``")
        if "dynamic_shapes" in res:
            print(f"* **shapes:** ``{string_type(res['dynamic_shapes'])}``")
        print()
        print()
        if "onx" in res:
            print(".. code-block:: text")
            print()
            print(textwrap.indent(pretty_onnx(res["onx"]), "    "))
            print()
            print()
            if "error" not in res:
                obs.append(dict(case=case_ref, error="", exporter=expo))
        if "error" in res:
            print("**FAILED**")
            print()
            print(".. code-block:: text")
            print()
            err = str(res["error"])
            if err:
                print(textwrap.indent(err, "    "))
            else:
                print("    # no error found for the failure")
            print()
            print()
            obs.append(dict(case=case_ref, error="FAIL", exporter=expo))

print()
print(".. _ledx-summary-exported-program:")
print()
print("Summary")
print("+++++++")
print()
df = pandas.DataFrame(obs)
piv = df.pivot(index="case", columns="exporter", values="error")
print(piv.to_markdown(tablefmt="rst"))
print()

>>>

Summary

AtenAsStrided

forward

def forward(self, x):
    y = torch.as_strided(x, (2, 2, 8, 4), (128, 8, 16, 1))
    return y
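
torch.as_strided reinterprets the input storage with fixed sizes and strides: here the stride 128 is exactly 2*8*8, one element of the first dimension of a contiguous (batch, 2, 8, 8) tensor. Hard-coded strides only describe a fixed layout, which makes the operator difficult to express with a symbolic batch dimension; the custom exporter reports a failure below. A quick sketch of the eager behaviour:

import torch

x = torch.arange(2 * 2 * 8 * 8, dtype=torch.float32).reshape(2, 2, 8, 8)
print(x.stride())  # (128, 64, 8, 1) for a contiguous tensor
y = torch.as_strided(x, (2, 2, 8, 4), (128, 8, 16, 1))
print(y.shape)     # torch.Size([2, 2, 8, 4])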

custom

FAILED

The implementation is still incorrect, x='x', shape=('batch', 2, 8, 8), size=[2, 2, 8, 4], stride=[128, 8, 16, 1], storage_offset=None
--DEBUG--
[GraphBuilder-XRS] Message starts, there are 0 initializers, 0 nodes, 1 inputs, 1 outputs.
--CONSTRAINTS--
    batch = {'s77'}
    s77 = {'batch'}
--SHAPE--
_dynamic_examples=
dynamic_objects=
   batch = 'batch'
   s77 = 's77'
dynamic_objects_rev=
   'batch' = <class 'list'>
     tuple
       'batch'
       ERR**: <class 'torch.SymInt'>:'batch'
dynamic_dimensions_source={'batch': [{'axis': 0, 'input_name': 'x'}]}
dynamic_dimensions_source_flat=['x']
output_dynamic_dimensions_source_flat=None
dynamic_alias={'s77': 'batch'}
dynamic_shapes={'x': {0: Dim('batch', min=0)}}
_known_shapes={'x': ('batch', 2, 8, 8)}
_known_types={'x': 1}
_known_value_shape={}
_known_constants=[]
_known_ranks={}
--PARAMETERS--
_parameter_renaming=
--TORCH-USERS--
    as_strided -> {output}
    x -> {as_strided}
--TORCH-SHAPES--
    x: ('run_node', ('', ('val', torch.float32, torch.Size([s77, 2, 8, 8])))) --- 1:4:('batch', 2, 8, 8):
    as_strided: ('run_node', ('', ('val', torch.float32, torch.Size([2, 2, 8, 4])))) --- :::
--ONNX--
-- EXEPATH --
export-export_options=ExportOptions()
-- process.graph_module --
ExportedProgram:
    class GraphModule(torch.nn.Module):
        def forward(self, x: "f32[s77, 2, 8, 8]"):
             # File: ~/github/onnx-diagnostic/onnx_diagnostic/torch_export_patches/eval/model_cases.py:186 in forward, code: y = torch.as_strided(x, (2, 2, 8, 4), (128, 8, 16, 1))
            as_strided: "f32[2, 2, 8, 4]" = torch.ops.aten.as_strided.default(x, [2, 2, 8, 4], [128, 8, 16, 1]);  x = None
            return (as_strided,)
        
Graph signature: 
    # inputs
    x: USER_INPUT

    # outputs
    as_strided: USER_OUTPUT

Range constraints: {s77: VR[0, int_oo]}

-- process.graph_module.graph --
graph():
    %x : [num_users=1] = placeholder[target=x]
    %as_strided : [num_users=1] = call_function[target=torch.ops.aten.as_strided.default](args = (%x, [2, 2, 8, 4], [128, 8, 16, 1]), kwargs = {})
    return (as_strided,)
-- process.progress --
node 1/3 target=aten.as_strided.default
-- 1 INPUTS
[GraphBuilder-XRS.make_tensor_input] x[1:batchx2x8x8]
-- 0 INITIALIZERS
-- 0 OUTPUTS
[GraphBuilder-XRS] Message completed, there are 0 initializers, 0 nodes, 1 inputs, 1 outputs.

dynamo-ir

  • inputs: #1[(T1s2x2x8x8,)]

  • shapes: dict(x:{0:Dim(batch)})

<
   ir_version: 10,
   opset_import: ["" : 18],
   producer_name: "pytorch",
   producer_version: "2.9.0.dev20250701+cu126"
>
main_graph (float[batch,2,8,8] x) => (float[2,2,8,4] as_strided) 
   <int64[4] val_0, int64[4] val_1, int64[1] neg_1, int64[1] rank_tensor, int64 indices, seq(float[]) one_seq, int64 rank_0, float[?] self_flatten>
{
   [node_Constant_0] val_0 = Constant <value: tensor = int64[4] {2,2,8,4}> ()
   [node_Constant_1] val_1 = Constant <value: tensor = int64[4] {128,8,16,1}> ()
   [n0] neg_1 = Constant <value_ints: ints = [-1]> ()
   [node_Constant_2] rank_tensor = Constant <value: tensor = int64[1] rank_tensor {4}> ()
   [n3] indices = Constant <value_int: int = 0> ()
   [n4] one_seq = SequenceEmpty ()
   [n5] rank_0 = Constant <value_int: int = 4> ()
   [n6_2] one_seq_16, indices_17 = Loop (rank_0, "", one_seq, indices) <body: graph = loop_body (int64 i, bool cond_in,  one_seq_1,  indices_2) => (bool cond_out,  one_seq_15,  indices_13) 
      <int64 rank_3_cast, int64 tmp, int64 int64_1_cast, int64 j, int64[1] j_tensor, int64[1] size_dim_j, int64[?] size_after_j, int64[1] stride_dim_j, int64 int64_0_cast, int64 int64_1_5_cast, int64[?] tmp_6, int64[?] add_value, int64 int64_0_7_cast, bool cond, float[1] tmp_14>
{
      [node_Constant_1] rank_3_cast = Constant <value: tensor = int64 rank_3_cast {4}> ()
      [n2_2] tmp = Sub (rank_3_cast, i)
      [node_Constant_3] int64_1_cast = Constant <value: tensor = int64 int64_1_cast {1}> ()
      [n5_2] j = Sub (tmp, int64_1_cast)
      [n6] j_tensor = Reshape (j, neg_1)
      [n7] size_dim_j = Gather <axis: int = 0> (val_0, j_tensor)
      [n8] size_after_j = Slice (val_0, j_tensor, rank_tensor)
      [n9] stride_dim_j = Gather <axis: int = 0> (val_1, j_tensor)
      [n10] indices_4 = Expand (indices_2, size_after_j)
      [node_Constant_5] int64_0_cast = Constant <value: tensor = int64 int64_0_cast {0}> ()
      [node_Constant_7] int64_1_5_cast = Constant <value: tensor = int64 int64_1_5_cast {1}> ()
      [n15] tmp_6 = Range (int64_0_cast, size_dim_j, int64_1_5_cast)
      [n16] add_value = Mul (tmp_6, stride_dim_j)
      [node_Constant_9] int64_0_7_cast = Constant <value: tensor = int64 int64_0_7_cast {0}> ()
      [n19] cond = Equal (i, int64_0_7_cast)
      [n20] shape_11 = If (cond) <then_branch: graph = thenGraph_39 () => (int64[1] shape) {
         [n0_3] shape = Identity (size_dim_j)
      }, else_branch: graph = elseGraph_39 () => (int64[] shape_10) 
         <float[1] tmp_8>
{
         [n0_4] ones = ConcatFromSequence <axis: int = 0> (one_seq_1)
         [n1_3] tmp_8 = Cast <to: int = 1> (size_dim_j)
         [n2_3] shape_9 = Concat <axis: int = 0> (tmp_8, ones)
         [n3_3] shape_10 = Cast <to: int = 7> (shape_9)
      }>
      [n21] add_value_12 = Reshape (add_value, shape_11)
      [n22] indices_13 = Add (indices_4, add_value_12)
      [n23] tmp_14 = Constant <value_floats: floats = [1]> ()
      [n24] one_seq_15 = SequenceInsert (one_seq_1, tmp_14)
      [n25] cond_out = Identity (cond_in)
   }>
   [n8_2] self_flatten = Reshape (x, neg_1)
   [n10_2] storage_offset_cast = CastLike (indices, indices_17)
   [n11_2] indices_19 = Add (indices_17, storage_offset_cast)
   [n12_2] as_strided = Gather (self_flatten, indices_19)
}

AtenInterpolate

forward

def forward(self, x):
    y = torch.nn.functional.interpolate(
        x,
        scale_factor=2.0,
        mode="bilinear",
        recompute_scale_factor=False,
    )
    return y
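
With scale_factor=2.0 and mode="bilinear" the two spatial dimensions are doubled while the batch and channel dimensions are kept, which is what both exported graphs encode with a Resize node. A quick sketch:

import torch

x = torch.randn(2, 2, 3, 4)
y = torch.nn.functional.interpolate(
    x, scale_factor=2.0, mode="bilinear", recompute_scale_factor=False
)
print(y.shape)  # torch.Size([2, 2, 6, 8])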

custom

  • inputs: #1[(T1s2x2x3x4,)]

  • shapes: dict(x:{0:Dim(batch)})

<
   ir_version: 8,
   opset_import: ["" : 18],
   metadata_props: ["large_model": "False", "inline": "True", "external_threshold": "1024", "function_options": "FunctionOptions()", "optimize": "True", "n_initializers": "1", "n_large_initializers": "0", "size_initializers": "16", "size_large_initializers": "0", "n_nodes": "3", "n_nodes_other_domain": "0", "mask_outputs": "[True]", "input_args": "(T1s2x2x3x4[-3.3515872955322266,2.3441219329833984:A-0.05102310386913208],)", "input_kwargs": "None", "optimizations": "OptimizationOptions(constant_folding={'Transpose', 'Cast', 'Reshape', 'Concat'}, patterns=[BatchNormalizationPattern(), BatchNormalizationTrainingPattern(), CastLayerNormalizationCastPattern(), CastPattern(), CastCastBinaryPattern(), CastOpCastPattern(), ClipClipPattern(), ComputationCastOpCastPattern(), ConcatGatherPattern(), ConvBiasNullPattern(), DropoutPattern(), ExpandPattern(), ExpandBroadcastPattern(), ExpandSwapPattern(), GeluPattern(), IdentityPattern(), LayerNormalizationPattern(), LayerNormalizationScalePattern(), LeakyReluPattern(), MulMulMulScalarPattern(), ReduceReshapePattern(), ReduceSumNormalizePattern(), ReshapePattern(), ReshapeMatMulReshapePattern(), Reshape2Of3Pattern(), ReshapeReshapeBinaryPattern(), MatMulAddPattern(), GemmTransposePattern(), MatMulReshape2Of3Pattern(), MulMulMatMulPattern(), ReshapeReshapePattern(), RotaryConcatPartPattern(), SameChildrenPattern(), SequenceConstructAtPattern(), SliceSlicePattern(), SlicesSplitPattern(), SoftmaxCrossEntropyLossCastPattern(), SplitConcatPattern(), SqueezeAddPattern(), SqueezeUnsqueezePattern(), Sub1MulPattern(), SwitchOrderBinaryPattern(), SwitchReshapeActivationPattern(), TransposeEqualReshapePattern(), TransposeMatMulPattern(), TransposeReshapeMatMulPattern(), TransposeReshapeTransposePattern(), TransposeTransposePattern(), UnsqueezeEqualPattern(), UnsqueezeUnsqueezePattern()])", "dynamic_shapes": "{'x': {0: Dim('batch', min=0)}}", "_discovered_shape_constraints": "{'batch': {'s77'}, 's77': {'batch'}}", "_known_value_shapes": "{
  _onx_concat_x::Shape:2: ('batch', 2, 6, 8),
  x::Shape:2: ('batch', 2),
}"]
>
experiment (float[batch,2,3,4] x) => (float[batch,2,6,8] output_0) 
   <int64[2] init7_s2_6_8 =  {6,8}, float[batch,2,6,8] upsample_bilinear2d, int64[2] "x::Shape:2">
{
   [upsample_bicubic2d_vec] "x::Shape:2" = Shape <end: int = 2, start: int = 0> (x)
   [upsample_bicubic2d_vec2] "_onx_concat_x::Shape:2" = Concat <axis: int = 0> ("x::Shape:2", init7_s2_6_8)
   [upsample_bicubic2d_vec3] output_0 = Resize <coordinate_transformation_mode: string = "pytorch_half_pixel", mode: string = "linear", nearest_mode: string = "floor"> (x, "", "", "_onx_concat_x::Shape:2")
}

dynamo-ir

  • inputs: #1[(T1s2x2x3x4,)]

  • shapes: dict(x:{0:Dim(batch)})

<
   ir_version: 10,
   opset_import: ["" : 18],
   producer_name: "pytorch",
   producer_version: "2.9.0.dev20250701+cu126"
>
main_graph (float[batch,2,3,4] x) => (float[batch,2,6,8] upsample_bilinear2d) 
   <float[4] val_0>
{
   [node_Constant_0] val_0 = Constant <value_floats: floats = [1, 1, 2, 2]> ()
   [node_upsample_bilinear2d] upsample_bilinear2d = Resize <keep_aspect_ratio_policy: string = "stretch", antialias: int = 0, extrapolation_value: float = 0, exclude_outside: int = 0, nearest_mode: string = "floor", coordinate_transformation_mode: string = "pytorch_half_pixel", cubic_coeff_a: float = -0.75, mode: string = "linear"> (x, "", val_0)
}

AtenNonZero

forward

def forward(self, x):
    y = torch.nonzero(x)
    return y
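
The first dimension of torch.nonzero depends on the values of x, not only on its shape, so both exporters introduce a fresh symbolic dimension for the output (NEWDIM_nonzero in the custom graph, u0 in the dynamo-ir graph). A quick sketch:

import torch

x = torch.tensor([[0.0, 1.0, 0.0, 2.0],
                  [3.0, 0.0, 0.0, 0.0],
                  [0.0, 0.0, 4.0, 0.0]])
print(torch.nonzero(x))
# tensor([[0, 1],
#         [0, 3],
#         [1, 0],
#         [2, 2]])  -> shape (4, 2), the 4 depends on the data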

custom

  • inputs: #1[(T1s3x4,)]

  • shapes: dict(x:{0:Dim(batch)})

<
   ir_version: 8,
   opset_import: ["" : 18],
   metadata_props: ["large_model": "False", "inline": "True", "external_threshold": "1024", "function_options": "FunctionOptions()", "optimize": "True", "n_initializers": "0", "n_large_initializers": "0", "size_initializers": "0", "size_large_initializers": "0", "n_nodes": "2", "n_nodes_other_domain": "0", "mask_outputs": "[True]", "input_args": "(T1s3x4[-2.22353458404541,1.8057029247283936:A0.12694389124711355],)", "input_kwargs": "None", "optimizations": "OptimizationOptions(constant_folding={'Transpose', 'Cast', 'Reshape', 'Concat'}, patterns=[BatchNormalizationPattern(), BatchNormalizationTrainingPattern(), CastLayerNormalizationCastPattern(), CastPattern(), CastCastBinaryPattern(), CastOpCastPattern(), ClipClipPattern(), ComputationCastOpCastPattern(), ConcatGatherPattern(), ConvBiasNullPattern(), DropoutPattern(), ExpandPattern(), ExpandBroadcastPattern(), ExpandSwapPattern(), GeluPattern(), IdentityPattern(), LayerNormalizationPattern(), LayerNormalizationScalePattern(), LeakyReluPattern(), MulMulMulScalarPattern(), ReduceReshapePattern(), ReduceSumNormalizePattern(), ReshapePattern(), ReshapeMatMulReshapePattern(), Reshape2Of3Pattern(), ReshapeReshapeBinaryPattern(), MatMulAddPattern(), GemmTransposePattern(), MatMulReshape2Of3Pattern(), MulMulMatMulPattern(), ReshapeReshapePattern(), RotaryConcatPartPattern(), SameChildrenPattern(), SequenceConstructAtPattern(), SliceSlicePattern(), SlicesSplitPattern(), SoftmaxCrossEntropyLossCastPattern(), SplitConcatPattern(), SqueezeAddPattern(), SqueezeUnsqueezePattern(), Sub1MulPattern(), SwitchOrderBinaryPattern(), SwitchReshapeActivationPattern(), TransposeEqualReshapePattern(), TransposeMatMulPattern(), TransposeReshapeMatMulPattern(), TransposeReshapeTransposePattern(), TransposeTransposePattern(), UnsqueezeEqualPattern(), UnsqueezeUnsqueezePattern()])", "dynamic_shapes": "{'x': {0: Dim('batch', min=0)}}", "_discovered_shape_constraints": "{'NEWDIM_nonzero': {'u0'},
 'batch': {'s77'},
 's77': {'batch'},
 'u0': {'NEWDIM_nonzero'}}", "_known_value_shapes": "{
  nonzero::Shape:1: ('NEWDIM_nonzero',),
  sym_size_int_1: NEWDIM_nonzero,
}"]
>
experiment (float[batch,4] x) => (int64[NEWDIM_nonzero,2] output_0) 
   <int64 sym_size_int_1, bool sym_constrain_range_for_size_default, int64[2,NEWDIM_nonzero] _onx_nonzero_x, int64[NEWDIM_nonzero,2] nonzero, int64[1] "nonzero::Shape:1">
{
   [nonzero] _onx_nonzero_x = NonZero (x)
   [nonzero2] output_0 = Transpose <perm: ints = [1, 0]> (_onx_nonzero_x)
}

dynamo-ir

  • inputs: #1[(T1s3x4,)]

  • shapes: dict(x:{0:Dim(batch)})

<
   ir_version: 10,
   opset_import: ["" : 18],
   producer_name: "pytorch",
   producer_version: "2.9.0.dev20250701+cu126"
>
main_graph (float[batch,4] x) => (int64[u0,2] nonzero) 
   <int64[2,?] val_0>
{
   [node_NonZero_0] val_0 = NonZero (x)
   [node_nonzero] nonzero = Transpose <perm: ints = [1, 0]> (val_0)
}

AtenNonZeroTuple

forward

def forward(self, x):
    y = torch.nonzero(x, as_tuple=True)
    return y[0], y[1]

custom

  • inputs: #1[(T1s3x4,)]

  • shapes: dict(x:{0:Dim(batch)})

<
   ir_version: 8,
   opset_import: ["" : 18],
   metadata_props: ["large_model": "False", "inline": "True", "external_threshold": "1024", "function_options": "FunctionOptions()", "optimize": "True", "n_initializers": "1", "n_large_initializers": "0", "size_initializers": "8", "size_large_initializers": "0", "n_nodes": "4", "n_nodes_other_domain": "0", "mask_outputs": "[True, True]", "input_args": "(T1s3x4[-2.1581575870513916,0.9051743745803833:A-0.11730112507939339],)", "input_kwargs": "None", "optimizations": "OptimizationOptions(constant_folding={'Transpose', 'Cast', 'Reshape', 'Concat'}, patterns=[BatchNormalizationPattern(), BatchNormalizationTrainingPattern(), CastLayerNormalizationCastPattern(), CastPattern(), CastCastBinaryPattern(), CastOpCastPattern(), ClipClipPattern(), ComputationCastOpCastPattern(), ConcatGatherPattern(), ConvBiasNullPattern(), DropoutPattern(), ExpandPattern(), ExpandBroadcastPattern(), ExpandSwapPattern(), GeluPattern(), IdentityPattern(), LayerNormalizationPattern(), LayerNormalizationScalePattern(), LeakyReluPattern(), MulMulMulScalarPattern(), ReduceReshapePattern(), ReduceSumNormalizePattern(), ReshapePattern(), ReshapeMatMulReshapePattern(), Reshape2Of3Pattern(), ReshapeReshapeBinaryPattern(), MatMulAddPattern(), GemmTransposePattern(), MatMulReshape2Of3Pattern(), MulMulMatMulPattern(), ReshapeReshapePattern(), RotaryConcatPartPattern(), SameChildrenPattern(), SequenceConstructAtPattern(), SliceSlicePattern(), SlicesSplitPattern(), SoftmaxCrossEntropyLossCastPattern(), SplitConcatPattern(), SqueezeAddPattern(), SqueezeUnsqueezePattern(), Sub1MulPattern(), SwitchOrderBinaryPattern(), SwitchReshapeActivationPattern(), TransposeEqualReshapePattern(), TransposeMatMulPattern(), TransposeReshapeMatMulPattern(), TransposeReshapeTransposePattern(), TransposeTransposePattern(), UnsqueezeEqualPattern(), UnsqueezeUnsqueezePattern()])", "dynamic_shapes": "{'x': {0: Dim('batch', min=0)}}", "_discovered_shape_constraints": "{'batch': {'s77'}, 's77': {'batch'}}", "_known_value_shapes": "{
  getitem_2::Shape:1: ('u0',),
  sym_size_int_1: u0,
}"]
>
experiment (float[batch,4] x) => (int64[u0] output_0, int64[u0] output_1) 
   <int64[1] "init7_s1_-1" =  {-1}, int64 sym_size_int_1, int64[1] "getitem_2::Shape:1", bool sym_constrain_range_for_size_default, int64[2,NEWDIM_nonzero] _onx_nonzero_x, int64[u0] getitem_1, int64[u0] getitem_2, int64[?,?] _onx_split_nonzero_x_1, int64[u0] "nonzero_numpy#0", int64[?,?] _onx_split_nonzero_x_0, int64[u0] "nonzero_numpy#1">
{
   [nonzero_numpy] _onx_nonzero_x = NonZero (x)
   [nonzero_numpy2] _onx_split_nonzero_x_0, _onx_split_nonzero_x_1 = Split <num_outputs: int = 2> (_onx_nonzero_x)
   [nonzero_numpy3] output_0 = Reshape (_onx_split_nonzero_x_0, "init7_s1_-1")
   [nonzero_numpy4] output_1 = Reshape (_onx_split_nonzero_x_1, "init7_s1_-1")
}

dynamo-ir

  • inputs: #1[(T1s3x4,)]

  • shapes: dict(x:{0:Dim(batch)})

<
   ir_version: 10,
   opset_import: ["" : 18],
   producer_name: "pytorch",
   producer_version: "2.9.0.dev20250701+cu126"
>
main_graph (float[batch,4] x) => (int64[u0] getitem, int64[u0] getitem_1) 
   <int64[2,?] val_0, int64[u0,2] nonzero, int64[u0,1] unbind_split_0, int64[u0,1] unbind_split_1, int64[1] unbind_axis>
{
   [node_NonZero_0] val_0 = NonZero (x)
   [node_nonzero] nonzero = Transpose <perm: ints = [1, 0]> (val_0)
   [node_Split_4] unbind_split_0, unbind_split_1 = Split <axis: int = 1, num_outputs: int = 2> (nonzero)
   [node_Constant_5] unbind_axis = Constant <value_ints: ints = [1]> ()
   [node_Squeeze_6] getitem = Squeeze (unbind_split_0, unbind_axis)
   [node_Squeeze_7] getitem_1 = Squeeze (unbind_split_1, unbind_axis)
}

AtenRollPos

forward

def forward(self, x):
    return torch.roll(x, 1, -1)
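
torch.roll(x, 1, -1) shifts the last dimension by one position and wraps the last element around to the front; both exporters translate it into two Slice nodes followed by a Concat. A quick sketch:

import torch

x = torch.arange(12).reshape(3, 4)
print(torch.roll(x, 1, -1))
# tensor([[ 3,  0,  1,  2],
#         [ 7,  4,  5,  6],
#         [11,  8,  9, 10]])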

custom

  • inputs: #1[(T1s2x3x4,)]

  • shapes: dict(x:{0:Dim(batch)})

<
   ir_version: 8,
   opset_import: ["" : 18],
   metadata_props: ["large_model": "False", "inline": "True", "external_threshold": "1024", "function_options": "FunctionOptions()", "optimize": "True", "n_initializers": "3", "n_large_initializers": "0", "size_initializers": "24", "size_large_initializers": "0", "n_nodes": "3", "n_nodes_other_domain": "0", "mask_outputs": "[True]", "input_args": "(T1s2x3x4[10.0,33.0:A21.5],)", "input_kwargs": "None", "optimizations": "OptimizationOptions(constant_folding={'Transpose', 'Cast', 'Reshape', 'Concat'}, patterns=[BatchNormalizationPattern(), BatchNormalizationTrainingPattern(), CastLayerNormalizationCastPattern(), CastPattern(), CastCastBinaryPattern(), CastOpCastPattern(), ClipClipPattern(), ComputationCastOpCastPattern(), ConcatGatherPattern(), ConvBiasNullPattern(), DropoutPattern(), ExpandPattern(), ExpandBroadcastPattern(), ExpandSwapPattern(), GeluPattern(), IdentityPattern(), LayerNormalizationPattern(), LayerNormalizationScalePattern(), LeakyReluPattern(), MulMulMulScalarPattern(), ReduceReshapePattern(), ReduceSumNormalizePattern(), ReshapePattern(), ReshapeMatMulReshapePattern(), Reshape2Of3Pattern(), ReshapeReshapeBinaryPattern(), MatMulAddPattern(), GemmTransposePattern(), MatMulReshape2Of3Pattern(), MulMulMatMulPattern(), ReshapeReshapePattern(), RotaryConcatPartPattern(), SameChildrenPattern(), SequenceConstructAtPattern(), SliceSlicePattern(), SlicesSplitPattern(), SoftmaxCrossEntropyLossCastPattern(), SplitConcatPattern(), SqueezeAddPattern(), SqueezeUnsqueezePattern(), Sub1MulPattern(), SwitchOrderBinaryPattern(), SwitchReshapeActivationPattern(), TransposeEqualReshapePattern(), TransposeMatMulPattern(), TransposeReshapeMatMulPattern(), TransposeReshapeTransposePattern(), TransposeTransposePattern(), UnsqueezeEqualPattern(), UnsqueezeUnsqueezePattern()])", "dynamic_shapes": "{'x': {0: Dim('batch', min=0)}}", "_discovered_shape_constraints": "{'batch': {'s77'}, 's77': {'batch'}}"]
>
experiment (float[batch,3,4] x) => (float[batch,3,4] output_0) 
   <int64[1] "init7_s1_-1" =  {-1}, int64[1] init7_s1_4 =  {4}, int64[1] init7_s1_0 =  {0}, float[batch,3,4] roll, float[batch,3,4] _onx_concat_slice_x, float[?,?,?] _onx_slice_x2, float[?,?,?] _onx_slice_x>
{
   [roll] _onx_slice_x = Slice (x, "init7_s1_-1", init7_s1_4, "init7_s1_-1")
   [roll2] _onx_slice_x2 = Slice (x, init7_s1_0, "init7_s1_-1", "init7_s1_-1")
   [roll3] output_0 = Concat <axis: int = -1> (_onx_slice_x, _onx_slice_x2)
}

dynamo-ir

  • inputs: #1[(T1s2x3x4,)]

  • shapes: dict(x:{0:Dim(batch)})

<
   ir_version: 10,
   opset_import: ["" : 18],
   producer_name: "pytorch",
   producer_version: "2.9.0.dev20250701+cu126"
>
main_graph (float[batch,3,4] x) => (float[batch,3,4] roll) 
   <int64[1] neg_1, int64[1] dim_tensor, int64[1] shift_tensor, int64[1] slice_length_3, int64[1] tmp_4, float[batch,3,3] suffix, int64 tmp_5, int64[1] tmp_6, float[?,?,?] prefix>
{
   [n0] neg_1 = Constant <value_ints: ints = [-1]> ()
   [node_Constant_1] dim_tensor = Constant <value: tensor = int64[1] dim_tensor {-1}> ()
   [node_Constant_2] shift_tensor = Constant <value: tensor = int64[1] shift_tensor {1}> ()
   [node_Constant_7] slice_length_3 = Constant <value: tensor = int64[1] slice_length_3 {3}> ()
   [n8] tmp_4 = Constant <value_ints: ints = [0]> ()
   [n9] suffix = Slice (x, tmp_4, slice_length_3, dim_tensor)
   [n10] tmp_5 = Size (x)
   [n11] tmp_6 = Reshape (tmp_5, neg_1)
   [n12] prefix = Slice (x, slice_length_3, tmp_6, dim_tensor)
   [n13] roll = Concat <axis: int = -1> (prefix, suffix)
}

AtenRollRelu

forward

def forward(self, x):
    return torch.relu(torch.roll(x, -1, -1))

custom

  • inputs: #1[(T1s2x3x4,)]

  • shapes: dict(x:{0:Dim(batch)})

<
   ir_version: 8,
   opset_import: ["" : 18],
   metadata_props: ["large_model": "False", "inline": "True", "external_threshold": "1024", "function_options": "FunctionOptions()", "optimize": "True", "n_initializers": "4", "n_large_initializers": "0", "size_initializers": "32", "size_large_initializers": "0", "n_nodes": "4", "n_nodes_other_domain": "0", "mask_outputs": "[True]", "input_args": "(T1s2x3x4[10.0,33.0:A21.5],)", "input_kwargs": "None", "optimizations": "OptimizationOptions(constant_folding={'Transpose', 'Cast', 'Reshape', 'Concat'}, patterns=[BatchNormalizationPattern(), BatchNormalizationTrainingPattern(), CastLayerNormalizationCastPattern(), CastPattern(), CastCastBinaryPattern(), CastOpCastPattern(), ClipClipPattern(), ComputationCastOpCastPattern(), ConcatGatherPattern(), ConvBiasNullPattern(), DropoutPattern(), ExpandPattern(), ExpandBroadcastPattern(), ExpandSwapPattern(), GeluPattern(), IdentityPattern(), LayerNormalizationPattern(), LayerNormalizationScalePattern(), LeakyReluPattern(), MulMulMulScalarPattern(), ReduceReshapePattern(), ReduceSumNormalizePattern(), ReshapePattern(), ReshapeMatMulReshapePattern(), Reshape2Of3Pattern(), ReshapeReshapeBinaryPattern(), MatMulAddPattern(), GemmTransposePattern(), MatMulReshape2Of3Pattern(), MulMulMatMulPattern(), ReshapeReshapePattern(), RotaryConcatPartPattern(), SameChildrenPattern(), SequenceConstructAtPattern(), SliceSlicePattern(), SlicesSplitPattern(), SoftmaxCrossEntropyLossCastPattern(), SplitConcatPattern(), SqueezeAddPattern(), SqueezeUnsqueezePattern(), Sub1MulPattern(), SwitchOrderBinaryPattern(), SwitchReshapeActivationPattern(), TransposeEqualReshapePattern(), TransposeMatMulPattern(), TransposeReshapeMatMulPattern(), TransposeReshapeTransposePattern(), TransposeTransposePattern(), UnsqueezeEqualPattern(), UnsqueezeUnsqueezePattern()])", "dynamic_shapes": "{'x': {0: Dim('batch', min=0)}}", "_discovered_shape_constraints": "{'batch': {'s77'}, 's77': {'batch'}}"]
>
experiment (float[batch,3,4] x) => (float[batch,3,4] output_0) 
   <int64[1] init7_s1_1 =  {1}, int64[1] init7_s1_4 =  {4}, int64[1] "init7_s1_-1" =  {-1}, int64[1] init7_s1_0 =  {0}, float[batch,3,4] roll, float[batch,3,4] _onx_concat_slice_x, float[batch,3,4] relu, float[?,?,?] _onx_slice_x2, float[?,?,?] _onx_slice_x>
{
   [roll] _onx_slice_x = Slice (x, init7_s1_1, init7_s1_4, "init7_s1_-1")
   [roll2] _onx_slice_x2 = Slice (x, init7_s1_0, init7_s1_1, "init7_s1_-1")
   [roll3] _onx_concat_slice_x = Concat <axis: int = -1> (_onx_slice_x, _onx_slice_x2)
   [relu] output_0 = Relu (_onx_concat_slice_x)
}

dynamo-ir

  • inputs: #1[(T1s2x3x4,)]

  • shapes: dict(x:{0:Dim(batch)})

<
   ir_version: 10,
   opset_import: ["" : 18],
   producer_name: "pytorch",
   producer_version: "2.9.0.dev20250701+cu126"
>
main_graph (float[batch,3,4] x) => (float[batch,3,4] relu) 
   <int64[1] neg_1, int64[1] dim_tensor, int64[1] slice_length_3, int64[1] tmp_4, float[batch,3,1] suffix, int64 tmp_5, int64[1] tmp_6, float[?,?,?] prefix, float[batch,3,4] roll>
{
   [n0] neg_1 = Constant <value_ints: ints = [-1]> ()
   [node_Constant_1] dim_tensor = Constant <value: tensor = int64[1] dim_tensor {-1}> ()
   [node_Constant_6] slice_length_3 = Constant <value: tensor = int64[1] slice_length_3 {1}> ()
   [n8] tmp_4 = Constant <value_ints: ints = [0]> ()
   [n9] suffix = Slice (x, tmp_4, slice_length_3, dim_tensor)
   [n10] tmp_5 = Size (x)
   [n11] tmp_6 = Reshape (tmp_5, neg_1)
   [n12] prefix = Slice (x, slice_length_3, tmp_6, dim_tensor)
   [n13] roll = Concat <axis: int = -1> (prefix, suffix)
   [node_relu] relu = Relu (roll)
}

BuildInIsInstance

forward

def forward(self, x, lx: list | torch.Tensor):
    if isinstance(lx, list):
        t = lx[0] * lx[1].sum(axis=1, keepdim=True)
        return torch.sigmoid(self.linear(x)) - self.buff + t
    return torch.sigmoid(self.linear(x)) - self.buff + lx
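
The isinstance test is a plain Python condition, so it is resolved once while tracing: the recorded inputs pass lx as a list of two tensors, only the first branch ends up in the exported program, and the ONNX models below take lx as two separate inputs lx_0 and lx_1. A minimal sketch, assuming a Linear(3, 1) layer and a one-element buffer as suggested by the graphs below:

import torch

class Model(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.linear = torch.nn.Linear(3, 1)
        self.register_buffer("buff", torch.tensor([0.5]))

    def forward(self, x, lx):
        if isinstance(lx, list):  # resolved at export time
            t = lx[0] * lx[1].sum(axis=1, keepdim=True)
            return torch.sigmoid(self.linear(x)) - self.buff + t
        return torch.sigmoid(self.linear(x)) - self.buff + lx

ep = torch.export.export(
    Model(), (torch.randn(4, 3), [torch.randn(4, 1), torch.randn(4, 2)])
)
print(ep.graph)  # only the list branch is traced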

custom

  • inputs: #2[(T1s4x3,#2[T1s4x1,T1s4x2]),(T1s8x3,#2[T1s8x1,T1s8x2])]

  • shapes: dict(x:{0:Dim(batch)},lx:#2[{0:Dim(batch)},{0:Dim(batch)}])

<
   ir_version: 8,
   opset_import: ["" : 18],
   metadata_props: ["large_model": "False", "inline": "True", "external_threshold": "1024", "function_options": "FunctionOptions()", "optimize": "True", "n_initializers": "4", "n_large_initializers": "0", "size_initializers": "28", "size_large_initializers": "0", "n_nodes": "6", "n_nodes_other_domain": "0", "mask_outputs": "[True]", "input_args": "(T1s4x3[10.0,21.0:A15.5],#2[T1s4x1[10.0,13.0:A11.5],T1s4x2[10.0,17.0:A13.5]])", "input_kwargs": "None", "optimizations": "OptimizationOptions(constant_folding={'Transpose', 'Cast', 'Reshape', 'Concat'}, patterns=[BatchNormalizationPattern(), BatchNormalizationTrainingPattern(), CastLayerNormalizationCastPattern(), CastPattern(), CastCastBinaryPattern(), CastOpCastPattern(), ClipClipPattern(), ComputationCastOpCastPattern(), ConcatGatherPattern(), ConvBiasNullPattern(), DropoutPattern(), ExpandPattern(), ExpandBroadcastPattern(), ExpandSwapPattern(), GeluPattern(), IdentityPattern(), LayerNormalizationPattern(), LayerNormalizationScalePattern(), LeakyReluPattern(), MulMulMulScalarPattern(), ReduceReshapePattern(), ReduceSumNormalizePattern(), ReshapePattern(), ReshapeMatMulReshapePattern(), Reshape2Of3Pattern(), ReshapeReshapeBinaryPattern(), MatMulAddPattern(), GemmTransposePattern(), MatMulReshape2Of3Pattern(), MulMulMatMulPattern(), ReshapeReshapePattern(), RotaryConcatPartPattern(), SameChildrenPattern(), SequenceConstructAtPattern(), SliceSlicePattern(), SlicesSplitPattern(), SoftmaxCrossEntropyLossCastPattern(), SplitConcatPattern(), SqueezeAddPattern(), SqueezeUnsqueezePattern(), Sub1MulPattern(), SwitchOrderBinaryPattern(), SwitchReshapeActivationPattern(), TransposeEqualReshapePattern(), TransposeMatMulPattern(), TransposeReshapeMatMulPattern(), TransposeReshapeTransposePattern(), TransposeTransposePattern(), UnsqueezeEqualPattern(), UnsqueezeUnsqueezePattern()])", "dynamic_shapes": "{'lx': [{0: Dim('batch', min=0)}, {0: Dim('batch', min=0)}],
 'x': {0: Dim('batch', min=0)}}", "_discovered_shape_constraints": "{'batch': {'s53', 's50', 's77'},
 's50': {'batch', 's53'},
 's53': {'batch'},
 's77': {'batch'}}"]
>
experiment (float[batch,3] x, float[batch,1] lx_0, float[batch,2] lx_1) => (float[batch,1] output_0) 
   <float[1] b_buff =  {0.5}, int64[1] init7_s1_1 =  {1}, float[1,3] "GemmTransposePattern--p_linear_weight::T10" =  {0.126205,0.342849,0.132504}, float[1] "linear.bias" =  {0.187118}, float[batch,1] _onx_matmul_x, float[batch,1] add, float[batch,1] mul, float[batch,1] sum_1, float[1,3] p_linear_weight, float[batch,1] linear, int64[2] "init7_s2_1_-1", float[batch,1] sigmoid, float[batch,1] sub, float[1,3] "linear.weight", float[1] p_linear_bias, float[3,1] "p_linear_weight::T10">
{
   [sum] sum_1 = ReduceSum <keepdims: int = 1> (lx_1, init7_s1_1)
   [mul_Tensor] mul = Mul (lx_0, sum_1)
   ["GemmTransposePattern--MatMulAddPattern--Opset2"] linear = Gemm <transB: int = 1> (x, "GemmTransposePattern--p_linear_weight::T10", "linear.bias")
   [sigmoid] sigmoid = Sigmoid (linear)
   [sub_Tensor] sub = Sub (sigmoid, b_buff)
   [add_Tensor] output_0 = Add (sub, mul)
}

dynamo-ir

  • inputs: #2[(T1s4x3,#2[T1s4x1,T1s4x2]),(T1s8x3,#2[T1s8x1,T1s8x2])]

  • shapes: dict(x:{0:Dim(batch)},lx:#2[{0:Dim(batch)},{0:Dim(batch)}])

<
   ir_version: 10,
   opset_import: ["" : 18],
   producer_name: "pytorch",
   producer_version: "2.9.0.dev20250701+cu126"
>
main_graph (float[batch,3] x, float[batch,1] lx_0, float[batch,2] lx_1) => (float[batch,1] add_15) 
   <float[1,3] "linear.weight" =  {0.518632,-0.238197,-0.215139}, float[1] "linear.bias" =  {-0.154246}, float[1] buff =  {0.5}, float[1,3] "linear.weight", float[1] "linear.bias", float[1] buff, int64[1] val_6, float[batch,1] sum_1, float[batch,1] mul_1, float[batch,1] linear, float[batch,1] sigmoid, float[batch,1] sub_4>
{
   [node_Constant_9] val_6 = Constant <value: tensor = int64[1] val_6 {1}> ()
   [node_sum_1] sum_1 = ReduceSum <noop_with_empty_axes: int = 0, keepdims: int = 1> (lx_1, val_6)
   [node_mul_1] mul_1 = Mul (lx_0, sum_1)
   [node_linear] linear = Gemm <beta: float = 1, transB: int = 1, alpha: float = 1, transA: int = 0> (x, "linear.weight", "linear.bias")
   [node_sigmoid] sigmoid = Sigmoid (linear)
   [node_sub_4] sub_4 = Sub (sigmoid, buff)
   [node_add_15] add_15 = Add (sub_4, mul_1)
}

BuildInLen

forward

def forward(self, x, lx: list):
    t = lx[0] * lx[1].sum(axis=1, keepdim=True)
    if len(lx) > 2:
        t = t + lx[2].sum(axis=1, keepdim=True)
    return torch.sigmoid(self.linear(x)) - self.buff + t
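
len(lx) is also evaluated at export time, so the graph is specialized for the two-tensor list seen during tracing and never contains the lx[2] branch; the second registered input set has three tensors, which would explain the diff.1 failures reported for both exporters below. A minimal sketch (the Linear layer and buffer are omitted):

import torch

class Model(torch.nn.Module):
    def forward(self, x, lx: list):
        t = lx[0] * lx[1].sum(axis=1, keepdim=True)
        if len(lx) > 2:  # resolved at export time
            t = t + lx[2].sum(axis=1, keepdim=True)
        return t

ep = torch.export.export(
    Model(), (torch.randn(4, 3), [torch.randn(4, 1), torch.randn(4, 2)])
)
# exported with two tensors in lx: the lx[2] branch is absent from the graph
print(ep.graph)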

custom

  • inputs: #2[(T1s4x3,#2[T1s4x1,T1s4x2]),(T1s8x3,#3[T1s8x1,T1s8x2,T1s8x3])]

  • shapes: dict(x:{0:Dim(batch)},lx:#2[{0:Dim(batch)},{0:Dim(batch)}])

<
   ir_version: 8,
   opset_import: ["" : 18],
   metadata_props: ["large_model": "False", "inline": "True", "external_threshold": "1024", "function_options": "FunctionOptions()", "optimize": "True", "n_initializers": "4", "n_large_initializers": "0", "size_initializers": "28", "size_large_initializers": "0", "n_nodes": "6", "n_nodes_other_domain": "0", "mask_outputs": "[True]", "input_args": "(T1s4x3[10.0,21.0:A15.5],#2[T1s4x1[10.0,13.0:A11.5],T1s4x2[10.0,17.0:A13.5]])", "input_kwargs": "None", "optimizations": "OptimizationOptions(constant_folding={'Transpose', 'Cast', 'Reshape', 'Concat'}, patterns=[BatchNormalizationPattern(), BatchNormalizationTrainingPattern(), CastLayerNormalizationCastPattern(), CastPattern(), CastCastBinaryPattern(), CastOpCastPattern(), ClipClipPattern(), ComputationCastOpCastPattern(), ConcatGatherPattern(), ConvBiasNullPattern(), DropoutPattern(), ExpandPattern(), ExpandBroadcastPattern(), ExpandSwapPattern(), GeluPattern(), IdentityPattern(), LayerNormalizationPattern(), LayerNormalizationScalePattern(), LeakyReluPattern(), MulMulMulScalarPattern(), ReduceReshapePattern(), ReduceSumNormalizePattern(), ReshapePattern(), ReshapeMatMulReshapePattern(), Reshape2Of3Pattern(), ReshapeReshapeBinaryPattern(), MatMulAddPattern(), GemmTransposePattern(), MatMulReshape2Of3Pattern(), MulMulMatMulPattern(), ReshapeReshapePattern(), RotaryConcatPartPattern(), SameChildrenPattern(), SequenceConstructAtPattern(), SliceSlicePattern(), SlicesSplitPattern(), SoftmaxCrossEntropyLossCastPattern(), SplitConcatPattern(), SqueezeAddPattern(), SqueezeUnsqueezePattern(), Sub1MulPattern(), SwitchOrderBinaryPattern(), SwitchReshapeActivationPattern(), TransposeEqualReshapePattern(), TransposeMatMulPattern(), TransposeReshapeMatMulPattern(), TransposeReshapeTransposePattern(), TransposeTransposePattern(), UnsqueezeEqualPattern(), UnsqueezeUnsqueezePattern()])", "dynamic_shapes": "{'lx': [{0: Dim('batch', min=0)}, {0: Dim('batch', min=0)}],
 'x': {0: Dim('batch', min=0)}}", "_discovered_shape_constraints": "{'batch': {'s53', 's50', 's77'},
 's50': {'batch', 's53'},
 's53': {'batch'},
 's77': {'batch'}}"]
>
experiment (float[batch,3] x, float[batch,1] lx_0, float[batch,2] lx_1) => (float[batch,1] output_0) 
   <float[1] b_buff =  {0.5}, int64[1] init7_s1_1 =  {1}, float[1,3] "GemmTransposePattern--p_linear_weight::T10" =  {-0.0998398,0.350802,0.130283}, float[1] "linear.bias" =  {-0.44322}, float[batch,1] _onx_matmul_x, float[batch,1] add, float[batch,1] mul, float[batch,1] sum_1, float[1,3] p_linear_weight, float[batch,1] linear, int64[2] "init7_s2_1_-1", float[batch,1] sigmoid, float[batch,1] sub, float[1,3] "linear.weight", float[1] p_linear_bias, float[3,1] "p_linear_weight::T10">
{
   [sum] sum_1 = ReduceSum <keepdims: int = 1> (lx_1, init7_s1_1)
   [mul_Tensor] mul = Mul (lx_0, sum_1)
   ["GemmTransposePattern--MatMulAddPattern--Opset2"] linear = Gemm <transB: int = 1> (x, "GemmTransposePattern--p_linear_weight::T10", "linear.bias")
   [sigmoid] sigmoid = Sigmoid (linear)
   [sub_Tensor] sub = Sub (sigmoid, b_buff)
   [add_Tensor] output_0 = Add (sub, mul)
}

FAILED

diff.1

dynamo-ir

  • inputs: #2[(T1s4x3,#2[T1s4x1,T1s4x2]),(T1s8x3,#3[T1s8x1,T1s8x2,T1s8x3])]

  • shapes: dict(x:{0:Dim(batch)},lx:#2[{0:Dim(batch)},{0:Dim(batch)}])

<
   ir_version: 10,
   opset_import: ["" : 18],
   producer_name: "pytorch",
   producer_version: "2.9.0.dev20250701+cu126"
>
main_graph (float[batch,3] x, float[batch,1] lx_0, float[batch,2] lx_1) => (float[batch,1] add_15) 
   <float[1,3] "linear.weight" =  {0.0222517,0.573125,0.478325}, float[1] "linear.bias" =  {-0.12366}, float[1] buff =  {0.5}, float[1,3] "linear.weight", float[1] "linear.bias", float[1] buff, int64[1] val_6, float[batch,1] sum_1, float[batch,1] mul_1, float[batch,1] linear, float[batch,1] sigmoid, float[batch,1] sub_4>
{
   [node_Constant_9] val_6 = Constant <value: tensor = int64[1] val_6 {1}> ()
   [node_sum_1] sum_1 = ReduceSum <noop_with_empty_axes: int = 0, keepdims: int = 1> (lx_1, val_6)
   [node_mul_1] mul_1 = Mul (lx_0, sum_1)
   [node_linear] linear = Gemm <beta: float = 1, transB: int = 1, alpha: float = 1, transA: int = 0> (x, "linear.weight", "linear.bias")
   [node_sigmoid] sigmoid = Sigmoid (linear)
   [node_sub_4] sub_4 = Sub (sigmoid, buff)
   [node_add_15] add_15 = Add (sub_4, mul_1)
}

FAILED

diff.1

ComplexPolar

forward

def forward(self, x, angle):
    return torch.polar(x, angle)
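
torch.polar builds a complex tensor x * (cos(angle) + i*sin(angle)). Standard ONNX operators do not accept complex tensors, which is the error onnxruntime raises on the custom graph below (complex64 inputs to Mul); the dynamo-ir graph instead returns the real and imaginary parts stacked in a trailing dimension of size 2, which is most likely the diff.0 mismatch reported below. A quick sketch:

import torch

x = torch.rand(4, 4)
angle = torch.rand(4, 4)
z = torch.polar(x, angle)
print(z.dtype)                      # torch.complex64
print(torch.view_as_real(z).shape)  # torch.Size([4, 4, 2])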

custom

  • inputs: #1[(T1s4x4,T1s4x4)]

  • shapes: dict(x:{0:Dim(batch)},angle:{0:Dim(batch)})

<
   ir_version: 8,
   opset_import: ["" : 18],
   metadata_props: ["large_model": "False", "inline": "True", "external_threshold": "1024", "function_options": "FunctionOptions()", "optimize": "True", "n_initializers": "1", "n_large_initializers": "0", "size_initializers": "8", "size_large_initializers": "0", "n_nodes": "8", "n_nodes_other_domain": "0", "mask_outputs": "[True]", "input_args": "(T1s4x4[0.010826051235198975,0.9339432716369629:A0.4240015111863613],T1s4x4[0.07700252532958984,0.9605667591094971:A0.4933854453265667])", "input_kwargs": "None", "optimizations": "OptimizationOptions(constant_folding={'Transpose', 'Cast', 'Reshape', 'Concat'}, patterns=[BatchNormalizationPattern(), BatchNormalizationTrainingPattern(), CastLayerNormalizationCastPattern(), CastPattern(), CastCastBinaryPattern(), CastOpCastPattern(), ClipClipPattern(), ComputationCastOpCastPattern(), ConcatGatherPattern(), ConvBiasNullPattern(), DropoutPattern(), ExpandPattern(), ExpandBroadcastPattern(), ExpandSwapPattern(), GeluPattern(), IdentityPattern(), LayerNormalizationPattern(), LayerNormalizationScalePattern(), LeakyReluPattern(), MulMulMulScalarPattern(), ReduceReshapePattern(), ReduceSumNormalizePattern(), ReshapePattern(), ReshapeMatMulReshapePattern(), Reshape2Of3Pattern(), ReshapeReshapeBinaryPattern(), MatMulAddPattern(), GemmTransposePattern(), MatMulReshape2Of3Pattern(), MulMulMatMulPattern(), ReshapeReshapePattern(), RotaryConcatPartPattern(), SameChildrenPattern(), SequenceConstructAtPattern(), SliceSlicePattern(), SlicesSplitPattern(), SoftmaxCrossEntropyLossCastPattern(), SplitConcatPattern(), SqueezeAddPattern(), SqueezeUnsqueezePattern(), Sub1MulPattern(), SwitchOrderBinaryPattern(), SwitchReshapeActivationPattern(), TransposeEqualReshapePattern(), TransposeMatMulPattern(), TransposeReshapeMatMulPattern(), TransposeReshapeTransposePattern(), TransposeTransposePattern(), UnsqueezeEqualPattern(), UnsqueezeUnsqueezePattern()])", "dynamic_shapes": "{'angle': {0: Dim('batch', min=0)}, 'x': {0: Dim('batch', min=0)}}", "_discovered_shape_constraints": "{'batch': {'s77', 's96'}, 's77': {'batch'}, 's96': {'batch'}}"]
>
experiment (float[batch,4] x, float[batch,4] angle) => (complex64[batch,4] output_0) 
   <complex64[1] init14_s1_ = ..., complex64[batch,4] "x::C14", float[batch,4] _onx_cos_angle, complex64[batch,4] "_onx_sin_angle::C14", complex64[batch,4] "_onx_mul_sin_angle::C14", complex64[batch,4] "_onx_add_cos_angle::C14", complex64[batch,4] polar, complex64[batch,4] "_onx_cos_angle::C14", float[batch,4] _onx_sin_angle>
{
   [polar] _onx_cos_angle = Cos (angle)
   [polar2] "_onx_cos_angle::C14" = Cast <to: int = 14> (_onx_cos_angle)
   [polar3] _onx_sin_angle = Sin (angle)
   [polar4] "_onx_sin_angle::C14" = Cast <to: int = 14> (_onx_sin_angle)
   [polar5] "_onx_mul_sin_angle::C14" = Mul ("_onx_sin_angle::C14", init14_s1_)
   [polar6] "x::C14" = Cast <to: int = 14> (x)
   [polar7] "_onx_add_cos_angle::C14" = Add ("_onx_cos_angle::C14", "_onx_mul_sin_angle::C14")
   [polar8] output_0 = Mul ("x::C14", "_onx_add_cos_angle::C14")
}

FAILED

[ONNXRuntimeError] : 10 : INVALID_GRAPH : This is an invalid model. Type Error: Type 'tensor(complex64)' of input parameter (_onx_sin_angle::C14) of operator (Mul) in node (polar5) is invalid.

dynamo-ir

  • inputs: #1[(T1s4x4,T1s4x4)]

  • shapes: dict(x:{0:Dim(batch)},angle:{0:Dim(batch)})

<
   ir_version: 10,
   opset_import: ["" : 18],
   producer_name: "pytorch",
   producer_version: "2.9.0.dev20250701+cu126"
>
main_graph (float[batch,4] x, float[batch,4] angle) => (float[batch,4,2] polar) 
   <float[batch,4] tmp, float[batch,4] tmp_0, int64[1] int64_m1_1d, float[batch,4,1] real, float[batch,4] tmp_1, float[batch,4] tmp_2, float[batch,4,1] imag>
{
   [n0] tmp = Cos (angle)
   [n1] tmp_0 = Mul (x, tmp)
   [n2] int64_m1_1d = Constant <value: tensor = int64[1] int64_m1_1d {-1}> ()
   [n3] real = Unsqueeze (tmp_0, int64_m1_1d)
   [n4] tmp_1 = Sin (angle)
   [n5] tmp_2 = Mul (x, tmp_1)
   [n7] imag = Unsqueeze (tmp_2, int64_m1_1d)
   [n8] polar = Concat <axis: int = -1> (real, imag)
}

FAILED

diff.0

ControlFlowCond

forward

def forward(self, x):
    def true_fn(x):
        return torch.sin(x)

    def false_fn(x):
        return torch.cos(x)

    return torch.cond(x.sum() > 0, true_fn, false_fn, [x])
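
torch.cond keeps both branches because the predicate x.sum() > 0 depends on the input values; both exporters map it to an ONNX If node with one subgraph per branch, as shown below. The eager behaviour is simply:

import torch

def true_fn(x):
    return torch.sin(x)

def false_fn(x):
    return torch.cos(x)

x = torch.randn(5, 3)
y = torch.cond(x.sum() > 0, true_fn, false_fn, [x])
# equivalent to torch.sin(x) if x.sum() > 0 else torch.cos(x)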

custom

  • inputs: #1[(T1s5x3,)]

  • shapes: dict(x:{0:Dim(batch)})

<
   ir_version: 8,
   opset_import: ["" : 18, "local_functions" : 1],
   metadata_props: ["large_model": "False", "inline": "True", "external_threshold": "1024", "function_options": "FunctionOptions()", "optimize": "True", "n_initializers": "1", "n_large_initializers": "0", "size_initializers": "4", "size_large_initializers": "0", "n_nodes": "3", "n_nodes_other_domain": "0", "mask_outputs": "[True]", "input_args": "(T1s5x3[0.130266010761261,0.9946584105491638:A0.528652552763621],)", "input_kwargs": "None", "optimizations": "OptimizationOptions(constant_folding={'Transpose', 'Cast', 'Reshape', 'Concat'}, patterns=[BatchNormalizationPattern(), BatchNormalizationTrainingPattern(), CastLayerNormalizationCastPattern(), CastPattern(), CastCastBinaryPattern(), CastOpCastPattern(), ClipClipPattern(), ComputationCastOpCastPattern(), ConcatGatherPattern(), ConvBiasNullPattern(), DropoutPattern(), ExpandPattern(), ExpandBroadcastPattern(), ExpandSwapPattern(), GeluPattern(), IdentityPattern(), LayerNormalizationPattern(), LayerNormalizationScalePattern(), LeakyReluPattern(), MulMulMulScalarPattern(), ReduceReshapePattern(), ReduceSumNormalizePattern(), ReshapePattern(), ReshapeMatMulReshapePattern(), Reshape2Of3Pattern(), ReshapeReshapeBinaryPattern(), MatMulAddPattern(), GemmTransposePattern(), MatMulReshape2Of3Pattern(), MulMulMatMulPattern(), ReshapeReshapePattern(), RotaryConcatPartPattern(), SameChildrenPattern(), SequenceConstructAtPattern(), SliceSlicePattern(), SlicesSplitPattern(), SoftmaxCrossEntropyLossCastPattern(), SplitConcatPattern(), SqueezeAddPattern(), SqueezeUnsqueezePattern(), Sub1MulPattern(), SwitchOrderBinaryPattern(), SwitchReshapeActivationPattern(), TransposeEqualReshapePattern(), TransposeMatMulPattern(), TransposeReshapeMatMulPattern(), TransposeReshapeTransposePattern(), TransposeTransposePattern(), UnsqueezeEqualPattern(), UnsqueezeUnsqueezePattern()])", "dynamic_shapes": "{'x': {0: Dim('batch', min=0)}}", "_discovered_shape_constraints": "{'batch': {'s77'}, 's77': {'batch'}}"]
>
experiment (float[batch,3] x) => (float[batch,3] output_0) 
   <float init1_s_ =  {0}, float sum_1, bool gt, float[1] "sum_1::RSh1", float[1] "init1_s_::RSh1", int64[1] init7_s1_1, float[batch,3] getitem>
{
   [sum] sum_1 = ReduceSum <keepdims: int = 0> (x)
   [gt_Scalar3] gt = Greater (sum_1, init1_s_)
   [cond] output_0 = If (gt) <else_branch: graph = experiment () => ( "cond#0") {
      [cos2] "cond#0" = Cos (x)
   }, then_branch: graph = experiment () => ( "cond#0") {
      [sin2] "cond#0" = Sin (x)
   }>
}

dynamo-ir

  • inputs: #1[(T1s5x3,)]

  • shapes: dict(x:{0:Dim(batch)})

<
   ir_version: 10,
   opset_import: ["" : 18],
   producer_name: "pytorch",
   producer_version: "2.9.0.dev20250701+cu126"
>
main_graph (float[batch,3] x) => (float[batch,3] getitem) 
   <float sum_1, float scalar_tensor_default, bool gt>
{
   [node_sum_1] sum_1 = ReduceSum <noop_with_empty_axes: int = 0, keepdims: int = 0> (x)
   [node_Constant_1] scalar_tensor_default = Constant <value: tensor = float scalar_tensor_default {0}> ()
   [node_gt] gt = Greater (sum_1, scalar_tensor_default)
   [node_cond__0] getitem = If (gt) <then_branch: graph = true_graph_0 () => (float[batch,3] sin_true_graph_0) {
      [node_sin] sin_true_graph_0 = Sin (x)
   }, else_branch: graph = false_graph_0 () => (float[batch,3] cos_false_graph_0) {
      [node_cos] cos_false_graph_0 = Cos (x)
   }>
}

ControlFlowCond2Inputs

forward

def forward(self, x, y):
    def true_fn(x, y):
        return torch.sin(x), torch.cos(x) + y

    def false_fn(x, y):
        return torch.cos(x), torch.sin(x) + y

    return torch.cond(x.sum() > 0, true_fn, false_fn, [x, y])

custom

  • inputs: #1[(T1s5x3,T1s5x3)]

  • shapes: dict(x:{0:Dim(batch)},y:{0:Dim(batch)})

<
   ir_version: 8,
   opset_import: ["" : 18, "local_functions" : 1],
   metadata_props: ["large_model": "False", "inline": "True", "external_threshold": "1024", "function_options": "FunctionOptions()", "optimize": "True", "n_initializers": "1", "n_large_initializers": "0", "size_initializers": "4", "size_large_initializers": "0", "n_nodes": "3", "n_nodes_other_domain": "0", "mask_outputs": "[True, True]", "input_args": "(T1s5x3[0.05943065881729126,0.8915067911148071:A0.5680543144543966],T1s5x3[0.016381025314331055,0.933613657951355:A0.4243027885754903])", "input_kwargs": "None", "optimizations": "OptimizationOptions(constant_folding={'Transpose', 'Cast', 'Reshape', 'Concat'}, patterns=[BatchNormalizationPattern(), BatchNormalizationTrainingPattern(), CastLayerNormalizationCastPattern(), CastPattern(), CastCastBinaryPattern(), CastOpCastPattern(), ClipClipPattern(), ComputationCastOpCastPattern(), ConcatGatherPattern(), ConvBiasNullPattern(), DropoutPattern(), ExpandPattern(), ExpandBroadcastPattern(), ExpandSwapPattern(), GeluPattern(), IdentityPattern(), LayerNormalizationPattern(), LayerNormalizationScalePattern(), LeakyReluPattern(), MulMulMulScalarPattern(), ReduceReshapePattern(), ReduceSumNormalizePattern(), ReshapePattern(), ReshapeMatMulReshapePattern(), Reshape2Of3Pattern(), ReshapeReshapeBinaryPattern(), MatMulAddPattern(), GemmTransposePattern(), MatMulReshape2Of3Pattern(), MulMulMatMulPattern(), ReshapeReshapePattern(), RotaryConcatPartPattern(), SameChildrenPattern(), SequenceConstructAtPattern(), SliceSlicePattern(), SlicesSplitPattern(), SoftmaxCrossEntropyLossCastPattern(), SplitConcatPattern(), SqueezeAddPattern(), SqueezeUnsqueezePattern(), Sub1MulPattern(), SwitchOrderBinaryPattern(), SwitchReshapeActivationPattern(), TransposeEqualReshapePattern(), TransposeMatMulPattern(), TransposeReshapeMatMulPattern(), TransposeReshapeTransposePattern(), TransposeTransposePattern(), UnsqueezeEqualPattern(), UnsqueezeUnsqueezePattern()])", "dynamic_shapes": "{'x': {0: Dim('batch', min=0)}, 'y': {0: Dim('batch', min=0)}}", "_discovered_shape_constraints": "{'batch': {'s17', 's77'}, 's17': {'batch'}, 's77': {'batch'}}"]
>
experiment (float[batch,3] x, float[batch,3] y) => (float[batch,3] output_0, float[batch,3] output_1) 
   <float init1_s_ =  {0}, float sum_1, bool gt, float[1] "sum_1::RSh1", float[1] "init1_s_::RSh1", float[batch,3] getitem_1, int64[1] init7_s1_1, float[batch,3] getitem>
{
   [sum] sum_1 = ReduceSum <keepdims: int = 0> (x)
   [gt_Scalar3] gt = Greater (sum_1, init1_s_)
   [cond] output_0, output_1 = If (gt) <else_branch: graph = experiment () => ( "cond#0",  "cond#1") {
      [cos2] "cond#0" = Cos (x)
      [sin2] sin2 = Sin (x)
      [add_Tensor2] "cond#1" = Add (sin2, y)
   }, then_branch: graph = experiment () => ( "cond#0",  "cond#1") {
      [sin32] "cond#0" = Sin (x)
      [cos32] cos2 = Cos (x)
      [add_Tensor32] "cond#1" = Add (cos2, y)
   }>
}

dynamo-ir

  • inputs: #1[(T1s5x3,T1s5x3)]

  • shapes: dict(x:{0:Dim(batch)},y:{0:Dim(batch)})

<
   ir_version: 10,
   opset_import: ["" : 18],
   producer_name: "pytorch",
   producer_version: "2.9.0.dev20250701+cu126"
>
main_graph (float[batch,3] x, float[batch,3] y) => (float[batch,3] getitem, float[batch,3] getitem_1) 
   <float sum_1, float scalar_tensor_default, bool gt>
{
   [node_sum_1] sum_1 = ReduceSum <noop_with_empty_axes: int = 0, keepdims: int = 0> (x)
   [node_Constant_1] scalar_tensor_default = Constant <value: tensor = float scalar_tensor_default {0}> ()
   [node_gt] gt = Greater (sum_1, scalar_tensor_default)
   [node_cond__1] getitem, getitem_1 = If (gt) <then_branch: graph = true_graph_0 () => (float[batch,3] sin_true_graph_0, float[batch,3] add_12_true_graph_0) 
      <float[batch,3] cos>
{
      [node_sin] sin_true_graph_0 = Sin (x)
      [node_cos] cos = Cos (x)
      [node_add_12] add_12_true_graph_0 = Add (cos, y)
   }, else_branch: graph = false_graph_0 () => (float[batch,3] cos_false_graph_0, float[batch,3] add_12_false_graph_0) 
      <float[batch,3] sin_2>
{
      [node_cos_2] cos_false_graph_0 = Cos (x)
      [node_sin_2] sin_2 = Sin (x)
      [node_add_12_2] add_12_false_graph_0 = Add (sin_2, y)
   }>
}

ControlFlowCond2Outputs

forward

def forward(self, x):
    def true_fn(x):
        return torch.sin(x), torch.cos(x)

    def false_fn(x):
        return torch.cos(x), torch.sin(x)

    return torch.cond(x.sum() > 0, true_fn, false_fn, [x])

custom

  • inputs: #1[(T1s5x3,)]

  • shapes: dict(x:{0:Dim(batch)})

<
   ir_version: 8,
   opset_import: ["" : 18, "local_functions" : 1],
   metadata_props: ["large_model": "False", "inline": "True", "external_threshold": "1024", "function_options": "FunctionOptions()", "optimize": "True", "n_initializers": "1", "n_large_initializers": "0", "size_initializers": "4", "size_large_initializers": "0", "n_nodes": "3", "n_nodes_other_domain": "0", "mask_outputs": "[True, True]", "input_args": "(T1s5x3[0.003947198390960693,0.8443840146064758:A0.48277585903803505],)", "input_kwargs": "None", "optimizations": "OptimizationOptions(constant_folding={'Transpose', 'Cast', 'Reshape', 'Concat'}, patterns=[BatchNormalizationPattern(), BatchNormalizationTrainingPattern(), CastLayerNormalizationCastPattern(), CastPattern(), CastCastBinaryPattern(), CastOpCastPattern(), ClipClipPattern(), ComputationCastOpCastPattern(), ConcatGatherPattern(), ConvBiasNullPattern(), DropoutPattern(), ExpandPattern(), ExpandBroadcastPattern(), ExpandSwapPattern(), GeluPattern(), IdentityPattern(), LayerNormalizationPattern(), LayerNormalizationScalePattern(), LeakyReluPattern(), MulMulMulScalarPattern(), ReduceReshapePattern(), ReduceSumNormalizePattern(), ReshapePattern(), ReshapeMatMulReshapePattern(), Reshape2Of3Pattern(), ReshapeReshapeBinaryPattern(), MatMulAddPattern(), GemmTransposePattern(), MatMulReshape2Of3Pattern(), MulMulMatMulPattern(), ReshapeReshapePattern(), RotaryConcatPartPattern(), SameChildrenPattern(), SequenceConstructAtPattern(), SliceSlicePattern(), SlicesSplitPattern(), SoftmaxCrossEntropyLossCastPattern(), SplitConcatPattern(), SqueezeAddPattern(), SqueezeUnsqueezePattern(), Sub1MulPattern(), SwitchOrderBinaryPattern(), SwitchReshapeActivationPattern(), TransposeEqualReshapePattern(), TransposeMatMulPattern(), TransposeReshapeMatMulPattern(), TransposeReshapeTransposePattern(), TransposeTransposePattern(), UnsqueezeEqualPattern(), UnsqueezeUnsqueezePattern()])", "dynamic_shapes": "{'x': {0: Dim('batch', min=0)}}", "_discovered_shape_constraints": "{'batch': {'s77'}, 's77': {'batch'}}"]
>
experiment (float[batch,3] x) => (float[batch,3] output_0, float[batch,3] output_1) 
   <float init1_s_ =  {0}, float sum_1, bool gt, float[1] "sum_1::RSh1", float[1] "init1_s_::RSh1", float[batch,3] getitem_1, int64[1] init7_s1_1, float[batch,3] getitem>
{
   [sum] sum_1 = ReduceSum <keepdims: int = 0> (x)
   [gt_Scalar3] gt = Greater (sum_1, init1_s_)
   [cond] output_0, output_1 = If (gt) <else_branch: graph = experiment () => ( "cond#0",  "cond#1") {
      [cos2] "cond#0" = Cos (x)
      [sin2] "cond#1" = Sin (x)
   }, then_branch: graph = experiment () => ( "cond#0",  "cond#1") {
      [sin32] "cond#0" = Sin (x)
      [cos32] "cond#1" = Cos (x)
   }>
}

dynamo-ir

  • inputs: #1[(T1s5x3,)]

  • shapes: dict(x:{0:Dim(batch)})

<
   ir_version: 10,
   opset_import: ["" : 18],
   producer_name: "pytorch",
   producer_version: "2.9.0.dev20250701+cu126"
>
main_graph (float[batch,3] x) => (float[batch,3] getitem, float[batch,3] getitem_1) 
   <float sum_1, float scalar_tensor_default, bool gt>
{
   [node_sum_1] sum_1 = ReduceSum <noop_with_empty_axes: int = 0, keepdims: int = 0> (x)
   [node_Constant_1] scalar_tensor_default = Constant <value: tensor = float scalar_tensor_default {0}> ()
   [node_gt] gt = Greater (sum_1, scalar_tensor_default)
   [node_cond__1] getitem, getitem_1 = If (gt) <then_branch: graph = true_graph_0 () => (float[batch,3] sin_true_graph_0, float[batch,3] cos_true_graph_0) {
      [node_sin] sin_true_graph_0 = Sin (x)
      [node_cos] cos_true_graph_0 = Cos (x)
   }, else_branch: graph = false_graph_0 () => (float[batch,3] cos_false_graph_0, float[batch,3] sin_false_graph_0) {
      [node_cos_2] cos_false_graph_0 = Cos (x)
      [node_sin_2] sin_false_graph_0 = Sin (x)
   }>
}

ControlFlowCondConstant

forward

def forward(self, x):
    def true_fn(x):
        return torch.sin(x) - torch.ones(x.shape, dtype=x.dtype)

    def false_fn(x):
        return torch.cos(x) + torch.ones((1, 1024), dtype=x.dtype)

    return torch.cond(x.sum() > 0, true_fn, false_fn, [x])

custom

  • inputs: #1[(T1s1024x1024,)]

  • shapes: dict(x:{0:Dim(batch)})

<
   ir_version: 8,
   opset_import: ["" : 18, "local_functions" : 1],
   metadata_props: ["large_model": "False", "inline": "True", "external_threshold": "1024", "function_options": "FunctionOptions()", "optimize": "True", "n_initializers": "1", "n_large_initializers": "0", "size_initializers": "4", "size_large_initializers": "0", "n_nodes": "3", "n_nodes_other_domain": "0", "mask_outputs": "[True]", "input_args": "(T1s1024x1024[1.7881393432617188e-07,0.9999999403953552:A0.4999965746537782],)", "input_kwargs": "None", "optimizations": "OptimizationOptions(constant_folding={'Transpose', 'Cast', 'Reshape', 'Concat'}, patterns=[BatchNormalizationPattern(), BatchNormalizationTrainingPattern(), CastLayerNormalizationCastPattern(), CastPattern(), CastCastBinaryPattern(), CastOpCastPattern(), ClipClipPattern(), ComputationCastOpCastPattern(), ConcatGatherPattern(), ConvBiasNullPattern(), DropoutPattern(), ExpandPattern(), ExpandBroadcastPattern(), ExpandSwapPattern(), GeluPattern(), IdentityPattern(), LayerNormalizationPattern(), LayerNormalizationScalePattern(), LeakyReluPattern(), MulMulMulScalarPattern(), ReduceReshapePattern(), ReduceSumNormalizePattern(), ReshapePattern(), ReshapeMatMulReshapePattern(), Reshape2Of3Pattern(), ReshapeReshapeBinaryPattern(), MatMulAddPattern(), GemmTransposePattern(), MatMulReshape2Of3Pattern(), MulMulMatMulPattern(), ReshapeReshapePattern(), RotaryConcatPartPattern(), SameChildrenPattern(), SequenceConstructAtPattern(), SliceSlicePattern(), SlicesSplitPattern(), SoftmaxCrossEntropyLossCastPattern(), SplitConcatPattern(), SqueezeAddPattern(), SqueezeUnsqueezePattern(), Sub1MulPattern(), SwitchOrderBinaryPattern(), SwitchReshapeActivationPattern(), TransposeEqualReshapePattern(), TransposeMatMulPattern(), TransposeReshapeMatMulPattern(), TransposeReshapeTransposePattern(), TransposeTransposePattern(), UnsqueezeEqualPattern(), UnsqueezeUnsqueezePattern()])", "dynamic_shapes": "{'x': {0: Dim('batch', min=0)}}", "_discovered_shape_constraints": "{'batch': {'s77'}, 's77': {'batch'}}"]
>
experiment (float[batch,1024] x) => (float[batch,1024] output_0) 
   <float init1_s_ =  {0}, float sum_1, bool gt, float[1] "sum_1::RSh1", float[1] "init1_s_::RSh1", int64[1] init7_s1_1, float[batch,1024] getitem>
{
   [sum] sum_1 = ReduceSum <keepdims: int = 0> (x)
   [gt_Scalar3] gt = Greater (sum_1, init1_s_)
   [cond] output_0 = If (gt) <else_branch: graph = experiment () => ( "cond#0") {
      [init2cst2] init7_s2_1_10242 = Constant <value: tensor = int64[2] init7_s2_1_1024 {1,1024}> ()
      [cos2] cos2 = Cos (x)
      [ones2] ones2 = ConstantOfShape <value: tensor = float[1] {1}> (init7_s2_1_10242)
      [add_Tensor2] "cond#0" = Add (cos2, ones2)
   }, then_branch: graph = experiment () => ( "cond#0") {
      [init2cst32] init7_s1_10242 = Constant <value: tensor = int64[1] init7_s1_1024 {1024}> ()
      [sin2] sin2 = Sin (x)
      [sym_size_int2] "x::Shape:12" = Shape <end: int = 1, start: int = 0> (x)
      [_mkshape_sym_size_int2] "_onx_concat_sym_size_int::UnSq02" = Concat <axis: int = 0> ("x::Shape:12", init7_s1_10242)
      [ones32] ones32 = ConstantOfShape <value: tensor = float[1] {1}> ("_onx_concat_sym_size_int::UnSq02")
      [sub_Tensor2] "cond#0" = Sub (sin2, ones32)
   }>
}

dynamo-ir

  • inputs: #1[(T1s1024x1024,)]

  • shapes: dict(x:{0:Dim(batch)})

<
   ir_version: 10,
   opset_import: ["" : 18],
   producer_name: "pytorch",
   producer_version: "2.9.0.dev20250701+cu126"
>
main_graph (float[batch,1024] x) => (float[batch,1024] getitem) 
   <float[1,1024] ones_2 =  {1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1}, float[1,1024] ones_2, float sum_1, float scalar_tensor_default, bool gt>
{
   [node_sum_1] sum_1 = ReduceSum <noop_with_empty_axes: int = 0, keepdims: int = 0> (x)
   [node_Constant_1] scalar_tensor_default = Constant <value: tensor = float scalar_tensor_default {0}> ()
   [node_gt] gt = Greater (sum_1, scalar_tensor_default)
   [node_cond__0] getitem = If (gt) <then_branch: graph = true_graph_0 () => (float[?,1024] sub_3_true_graph_0) 
      <int64[1] val_0_2, int64 sym_size_int, float[batch,1024] sin, int64[1] val_1, int64[1] val_2, int64[1] val_3, int64[2] val_4, float val_7, float[?,?] ones>
{
      [node_Shape_0] val_0_2 = Shape <end: int = 1, start: int = 0> (x)
      [node_sym_size_int] sym_size_int = Squeeze (val_0_2)
      [node_sin] sin = Sin (x)
      [node_Constant_1] val_1 = Constant <value: tensor = int64[1] {-1}> ()
      [node_Reshape_2] val_2 = Reshape <allowzero: int = 0> (sym_size_int, val_1)
      [node_Constant_3] val_3 = Constant <value: tensor = int64[1] {1024}> ()
      [node_Concat_4] val_4 = Concat <axis: int = 0> (val_2, val_3)
      [node_Constant_4] val_7 = Constant <value: tensor = float val_7 {1}> ()
      [node_ones] ones = Expand (val_7, val_4)
      [node_sub_3] sub_3_true_graph_0 = Sub (sin, ones)
   }, else_branch: graph = false_graph_0 () => (float[batch,1024] add_6_false_graph_0) 
      <float[batch,1024] cos>
{
      [node_cos] cos = Cos (x)
      [node_add_6] add_6_false_graph_0 = Add (cos, ones_2)
   }>
}

ControlFlowCondIdentity_153832

forward

def forward(self, x, y):

    def branch_cond_then_1(x):
        x = torch.abs(x) + 1
        return x

    def branch_cond_else_1(x):
        return x  # fails but succeeds with x.clone()

    x = torch.cond(x.sum() > 0, branch_cond_then_1, branch_cond_else_1, [x])
    return x + y

custom

FAILED

Cond doesn't work unless it is captured completely with torch.compile. Scroll up to find out what causes the graph break.

from user code:
   File "~/vv/this312/lib/python3.12/site-packages/torch/_higher_order_ops/cond.py", line 187, in _cond_op_wrapper
    return cond_op(*args, **kwargs)

Set TORCHDYNAMO_VERBOSE=1 for the internal stack trace (please do this especially if you're reporting a bug to PyTorch). For even more developer context, set TORCH_LOGS="+dynamo"

dynamo-ir

FAILED

Failed to export the model with torch.export. This is step 1/3 of exporting the model to ONNX. Next steps:
- Modify the model code for `torch.export.export` to succeed. Refer to https://pytorch.org/docs/stable/generated/exportdb/index.html for more information.
- Debug `torch.export.export` and submit a PR to PyTorch.
- Create an issue in the PyTorch GitHub repository against the *torch.export* component and attach the full error stack as well as reproduction scripts.

## Exception summary

<class 'torch._dynamo.exc.Unsupported'>: Encountered aliasing during higher order op tracing
  Explanation: Higher order ops do not support aliasing. Found in cond
  Hint: Replace `return input` with `return input.clone()` to avoid aliasing.
  Hint: Consider using the debug context to change user code to avoid aliasing.
  Hint: Please open an issue.

  Developer debug context: Input-to-output aliasing detected at nodes l_args_3_0_ and l_args_3_0_ in
     graph():
        %l_args_3_0_ : torch._subclasses.fake_tensor.FakeTensor [num_users=1] = placeholder[target=l_args_3_0_]
        return (l_args_3_0_,)

 For more details about this graph break, please visit: https://compile-graph-break-site.vercel.app/gb/GB0040
⬆️
<class 'torch._dynamo.exc.UncapturedHigherOrderOpError'>: Cond doesn't work unless it is captured completely with torch.compile. Scroll up to find out what causes the graph break.

from user code:
   File "~/vv/this312/lib/python3.12/site-packages/torch/_higher_order_ops/cond.py", line 187, in _cond_op_wrapper
    return cond_op(*args, **kwargs)

Set TORCHDYNAMO_VERBOSE=1 for the internal stack trace (please do this especially if you're reporting a bug to PyTorch). For even more developer context, set TORCH_LOGS="+dynamo"


(Refer to the full stack trace above for more information.)
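
The hint in the error above points at the fix: the else branch returns its input unchanged, which torch.cond rejects as input-to-output aliasing. A minimal sketch of the corrected case, assuming everything else stays the same (the comment in the original forward says as much):

import torch


class ControlFlowCondIdentityFixed(torch.nn.Module):
    def forward(self, x, y):
        def branch_cond_then_1(x):
            return torch.abs(x) + 1

        def branch_cond_else_1(x):
            # clone() breaks the input/output aliasing rejected by torch.cond
            return x.clone()

        x = torch.cond(x.sum() > 0, branch_cond_then_1, branch_cond_else_1, [x])
        return x + y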

ControlFlowCondNestedModule

forward

def forward(self, x):
    def true_fn(x):
        return self.submodule(x)

    def false_fn(x):
        return x - self.weight

    y = torch.cond(x.sum() > 0, true_fn, false_fn, [x])
    return y

custom

  • inputs: #1[(T7s2,)]

  • shapes: dict(x:{0:Dim(batch)})

<
   ir_version: 8,
   opset_import: ["" : 18, "local_functions.0" : 1, "local_functions" : 1],
   metadata_props: ["large_model": "False", "inline": "True", "external_threshold": "1024", "function_options": "FunctionOptions()", "optimize": "True", "n_initializers": "3", "n_large_initializers": "0", "size_initializers": "16", "size_large_initializers": "0", "n_nodes": "3", "n_nodes_other_domain": "0", "mask_outputs": "[True]", "input_args": "(T7s2[-1,2:A0.5],)", "input_kwargs": "None", "optimizations": "OptimizationOptions(constant_folding={'Transpose', 'Cast', 'Reshape', 'Concat'}, patterns=[BatchNormalizationPattern(), BatchNormalizationTrainingPattern(), CastLayerNormalizationCastPattern(), CastPattern(), CastCastBinaryPattern(), CastOpCastPattern(), ClipClipPattern(), ComputationCastOpCastPattern(), ConcatGatherPattern(), ConvBiasNullPattern(), DropoutPattern(), ExpandPattern(), ExpandBroadcastPattern(), ExpandSwapPattern(), GeluPattern(), IdentityPattern(), LayerNormalizationPattern(), LayerNormalizationScalePattern(), LeakyReluPattern(), MulMulMulScalarPattern(), ReduceReshapePattern(), ReduceSumNormalizePattern(), ReshapePattern(), ReshapeMatMulReshapePattern(), Reshape2Of3Pattern(), ReshapeReshapeBinaryPattern(), MatMulAddPattern(), GemmTransposePattern(), MatMulReshape2Of3Pattern(), MulMulMatMulPattern(), ReshapeReshapePattern(), RotaryConcatPartPattern(), SameChildrenPattern(), SequenceConstructAtPattern(), SliceSlicePattern(), SlicesSplitPattern(), SoftmaxCrossEntropyLossCastPattern(), SplitConcatPattern(), SqueezeAddPattern(), SqueezeUnsqueezePattern(), Sub1MulPattern(), SwitchOrderBinaryPattern(), SwitchReshapeActivationPattern(), TransposeEqualReshapePattern(), TransposeMatMulPattern(), TransposeReshapeMatMulPattern(), TransposeReshapeTransposePattern(), TransposeTransposePattern(), UnsqueezeEqualPattern(), UnsqueezeUnsqueezePattern()])", "dynamic_shapes": "{'x': {0: Dim('batch', min=0)}}", "_discovered_shape_constraints": "{'batch': {'s77'}, 's77': {'batch'}}"]
>
experiment (int64[batch] x) => (float[d_output_0_0] output_0) 
   <int64 init7_s_0 =  {0}, float[1] weight =  {42}, float[1] "submodule.weight" =  {100}, int64 sum_1, bool gt, int64[1] "sum_1::RSh1", int64[1] "init7_s_0::RSh1", int64[1] init7_s1_1, float[?] getitem, float[1] p_weight, float[1] p_submodule_weight>
{
   [sum] sum_1 = ReduceSum <keepdims: int = 0> (x)
   [gt_Scalar3] gt = Greater (sum_1, init7_s_0)
   [cond] output_0 = If (gt) <else_branch: graph = experiment () => ( "cond#0") {
      [Opset2] "x::C12" = Cast <to: int = 1> (x)
      [sub_Tensor2] "cond#0" = Sub ("x::C12", weight)
   }, then_branch: graph = experiment () => ( "cond#0") {
      [init2cst2] init7_s_1002 = Constant <value: tensor = int64 init7_s_100 {100}> ()
      [abs2] abs_12 = Abs (x)
      [sum22] sum_122 = ReduceSum <keepdims: int = 0> (abs_12)
      [gt_Scalar322] gt22 = Greater (sum_122, init7_s_1002)
      [cond22] "cond#0" = If (gt22) <else_branch: graph = experiment () => ( "cond#0") {
         [Opset32] "x::C132" = Cast <to: int = 1> (x)
         [div_Tensor2] "cond#0" = Div ("x::C132", "submodule.weight")
      }, then_branch: graph = experiment () => ( "cond#0") {
         [mul_Tensor3] "x::C142" = Cast <to: int = 1> (x)
         [mul_Tensor22] "cond#0" = Mul ("x::C142", "submodule.weight")
      }>
   }>
}

dynamo-ir

  • inputs: #1[(T7s2,)]

  • shapes: dict(x:{0:Dim(batch)})

<
   ir_version: 10,
   opset_import: ["" : 18],
   producer_name: "pytorch",
   producer_version: "2.9.0.dev20250701+cu126"
>
main_graph (int64[batch] x) => (float[batch] getitem) 
   <float[1] weight =  {42}, float[1] "submodule.weight" =  {100}, float[1] weight, float[1] "submodule.weight", int64 sum_1, int64 val_0, bool gt>
{
   [node_sum_1] sum_1 = ReduceSum <noop_with_empty_axes: int = 0, keepdims: int = 0> (x)
   [node_Constant_0] val_0 = Constant <value: tensor = int64 {0}> ()
   [node_gt] gt = Greater (sum_1, val_0)
   [node_cond__0] getitem = If (gt) <then_branch: graph = true_graph_0 () => ( getitem_true_graph_0) 
      <int64[batch] abs_1, int64 sum_1_2, int64 val_0_2, bool gt_2>
{
      [node_abs_1] abs_1 = Abs (x)
      [node_sum_1_2] sum_1_2 = ReduceSum <noop_with_empty_axes: int = 0, keepdims: int = 0> (abs_1)
      [node_Constant_0_2] val_0_2 = Constant <value: tensor = int64 {100}> ()
      [node_gt_2] gt_2 = Greater (sum_1_2, val_0_2)
      [node_cond__0_2] getitem_true_graph_0 = If (gt_2) <then_branch: graph = true_graph_0__true_graph_0 () => (float[batch] mul_1_true_graph_0__true_graph_0) 
         <float[batch] convert_element_type_default>
{
         [node_convert_element_type_default] convert_element_type_default = Cast <to: int = 1> (x)
         [node_mul_1] mul_1_true_graph_0__true_graph_0 = Mul (convert_element_type_default, "submodule.weight")
      }, else_branch: graph = true_graph_0__false_graph_0 () => (float[batch] div_true_graph_0__false_graph_0) 
         <float[batch] convert_element_type_default_2>
{
         [node_convert_element_type_default_2] convert_element_type_default_2 = Cast <to: int = 1> (x)
         [node_div] div_true_graph_0__false_graph_0 = Div (convert_element_type_default_2, "submodule.weight")
      }>
   }, else_branch: graph = false_graph_0 () => (float[batch] sub_1_false_graph_0) 
      <float[batch] convert_element_type_default_3>
{
      [node_convert_element_type_default_3] convert_element_type_default_3 = Cast <to: int = 1> (x)
      [node_sub_1] sub_1_false_graph_0 = Sub (convert_element_type_default_3, weight)
   }>
}

ControlFlowCondNonZero

forward

def forward(self, input_ids, image_features, vocab_size):
    def then_branch(input_ids, image_features, vocab_size):
        input_shape = input_ids.size()
        input_ids = input_ids.view(-1, input_shape[-1])

        condition = (input_ids < 0) & (input_ids > -int(1e9))
        positions = torch.nonzero(condition, as_tuple=True)
        input_ids = input_ids.clamp_min(0).clamp_max(vocab_size)
        return (input_ids, positions[0], positions[1])

    def else_branch(input_ids, image_features, vocab_size):
        r = torch.where(torch.zeros((1, 1), dtype=torch.bool))
        return (input_ids, r[0], r[1])

    a, b, c = torch.cond(
        image_features.numel() > 0,
        then_branch,
        else_branch,
        [input_ids, image_features, vocab_size],
    )
    return a, b, c

custom

FAILED

Expect operands to be a tuple of possibly nested dict/list/tuple that only consists of tensor leaves, but got [FakeTensor(..., size=(s72, 12), dtype=torch.int64), FakeTensor(..., size=(s28, s11)), 1025].

dynamo-ir

FAILED

Failed to export the model with torch.export. This is step 1/3 of exporting the model to ONNX. Next steps:
- Modify the model code for `torch.export.export` to succeed. Refer to https://pytorch.org/docs/stable/generated/exportdb/index.html for more information.
- Debug `torch.export.export` and submit a PR to PyTorch.
- Create an issue in the PyTorch GitHub repository against the *torch.export* component and attach the full error stack as well as reproduction scripts.

## Exception summary

<class 'RuntimeError'>: Expect operands to be a tuple of possibly nested dict/list/tuple that only consists of tensor leaves, but got [FakeTensor(..., size=(s72, 12), dtype=torch.int64), FakeTensor(..., size=(s28, s11)), 1025].

(Refer to the full stack trace above for more information.)
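
The failure comes from the operand list: torch.cond only accepts tensors as operands, and vocab_size is a plain Python integer. A sketch of a possible rewrite, assuming the branches can simply close over the integer (the data-dependent nonzero output and the unchanged input_ids returned by the else branch may still need separate handling):

def forward(self, input_ids, image_features, vocab_size):
    def then_branch(input_ids, image_features):
        input_ids = input_ids.view(-1, input_ids.shape[-1])
        condition = (input_ids < 0) & (input_ids > -int(1e9))
        positions = torch.nonzero(condition, as_tuple=True)
        # vocab_size is captured from the enclosing scope, not passed as an operand
        input_ids = input_ids.clamp_min(0).clamp_max(vocab_size)
        return (input_ids, positions[0], positions[1])

    def else_branch(input_ids, image_features):
        r = torch.where(torch.zeros((1, 1), dtype=torch.bool))
        # clone() avoids returning an operand unchanged (aliasing)
        return (input_ids.clone(), r[0], r[1])

    return torch.cond(
        image_features.numel() > 0,
        then_branch,
        else_branch,
        [input_ids, image_features],
    )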

ControlFlowNestCond

forward

def forward(self, x):
    def true_fn2(x):
        def true_fn1(x):
            return torch.sin(x)

        def false_fn1(x):
            return torch.cos(x)

        return torch.cond(x.sum() < 0, true_fn1, false_fn1, [x])

    def false_fn2(x):
        return -x

    return torch.cond(x.sum() > 0, true_fn2, false_fn2, [x])

custom

  • inputs: #1[(T1s5x3,)]

  • shapes: dict(x:{0:Dim(batch)})

<
   ir_version: 8,
   opset_import: ["" : 18, "local_functions.0" : 1, "local_functions" : 1],
   metadata_props: ["large_model": "False", "inline": "True", "external_threshold": "1024", "function_options": "FunctionOptions()", "optimize": "True", "n_initializers": "1", "n_large_initializers": "0", "size_initializers": "4", "size_large_initializers": "0", "n_nodes": "3", "n_nodes_other_domain": "0", "mask_outputs": "[True]", "input_args": "(T1s5x3[0.2221928834915161,0.9708185791969299:A0.5694047530492147],)", "input_kwargs": "None", "optimizations": "OptimizationOptions(constant_folding={'Transpose', 'Cast', 'Reshape', 'Concat'}, patterns=[BatchNormalizationPattern(), BatchNormalizationTrainingPattern(), CastLayerNormalizationCastPattern(), CastPattern(), CastCastBinaryPattern(), CastOpCastPattern(), ClipClipPattern(), ComputationCastOpCastPattern(), ConcatGatherPattern(), ConvBiasNullPattern(), DropoutPattern(), ExpandPattern(), ExpandBroadcastPattern(), ExpandSwapPattern(), GeluPattern(), IdentityPattern(), LayerNormalizationPattern(), LayerNormalizationScalePattern(), LeakyReluPattern(), MulMulMulScalarPattern(), ReduceReshapePattern(), ReduceSumNormalizePattern(), ReshapePattern(), ReshapeMatMulReshapePattern(), Reshape2Of3Pattern(), ReshapeReshapeBinaryPattern(), MatMulAddPattern(), GemmTransposePattern(), MatMulReshape2Of3Pattern(), MulMulMatMulPattern(), ReshapeReshapePattern(), RotaryConcatPartPattern(), SameChildrenPattern(), SequenceConstructAtPattern(), SliceSlicePattern(), SlicesSplitPattern(), SoftmaxCrossEntropyLossCastPattern(), SplitConcatPattern(), SqueezeAddPattern(), SqueezeUnsqueezePattern(), Sub1MulPattern(), SwitchOrderBinaryPattern(), SwitchReshapeActivationPattern(), TransposeEqualReshapePattern(), TransposeMatMulPattern(), TransposeReshapeMatMulPattern(), TransposeReshapeTransposePattern(), TransposeTransposePattern(), UnsqueezeEqualPattern(), UnsqueezeUnsqueezePattern()])", "dynamic_shapes": "{'x': {0: Dim('batch', min=0)}}", "_discovered_shape_constraints": "{'batch': {'s77'}, 's77': {'batch'}}"]
>
experiment (float[batch,3] x) => (float[batch,3] output_0) 
   <float init1_s_ =  {0}, float sum_1, bool gt, float[1] "sum_1::RSh1", float[1] "init1_s_::RSh1", int64[1] init7_s1_1, float[batch,3] getitem>
{
   [sum] sum_1 = ReduceSum <keepdims: int = 0> (x)
   [gt_Scalar3] gt = Greater (sum_1, init1_s_)
   [cond] output_0 = If (gt) <else_branch: graph = experiment () => ( "cond#0") {
      [neg2] "cond#0" = Neg (x)
   }, then_branch: graph = experiment () => ( "cond#0") {
      [init2cst2] init1_s_22 = Constant <value: tensor = float init1_s_ {0}> ()
      [sum22] sum_122 = ReduceSum <keepdims: int = 0> (x)
      [lt_Scalar32] lt2 = Less (sum_122, init1_s_22)
      [cond22] "cond#0" = If (lt2) <else_branch: graph = experiment () => ( "cond#0") {
         [cos2] "cond#0" = Cos (x)
      }, then_branch: graph = experiment () => ( "cond#0") {
         [sin2] "cond#0" = Sin (x)
      }>
   }>
}

dynamo-ir

  • inputs: #1[(T1s5x3,)]

  • shapes: dict(x:{0:Dim(batch)})

<
   ir_version: 10,
   opset_import: ["" : 18],
   producer_name: "pytorch",
   producer_version: "2.9.0.dev20250701+cu126"
>
main_graph (float[batch,3] x) => (float[batch,3] getitem) 
   <float sum_1, float scalar_tensor_default, bool gt>
{
   [node_sum_1] sum_1 = ReduceSum <noop_with_empty_axes: int = 0, keepdims: int = 0> (x)
   [node_Constant_1] scalar_tensor_default = Constant <value: tensor = float scalar_tensor_default {0}> ()
   [node_gt] gt = Greater (sum_1, scalar_tensor_default)
   [node_cond__0] getitem = If (gt) <then_branch: graph = true_graph_0 () => ( getitem_true_graph_0) 
      <float sum_1_2, float scalar_tensor_default_2, bool lt>
{
      [node_sum_1_2] sum_1_2 = ReduceSum <noop_with_empty_axes: int = 0, keepdims: int = 0> (x)
      [node_Constant_1] scalar_tensor_default_2 = Constant <value: tensor = float scalar_tensor_default_2 {0}> ()
      [node_lt] lt = Less (sum_1_2, scalar_tensor_default_2)
      [node_cond__0_2] getitem_true_graph_0 = If (lt) <then_branch: graph = true_graph_0__true_graph_0 () => (float[batch,3] sin_true_graph_0__true_graph_0) {
         [node_sin] sin_true_graph_0__true_graph_0 = Sin (x)
      }, else_branch: graph = true_graph_0__false_graph_0 () => (float[batch,3] cos_true_graph_0__false_graph_0) {
         [node_cos] cos_true_graph_0__false_graph_0 = Cos (x)
      }>
   }, else_branch: graph = false_graph_0 () => (float[batch,3] neg_false_graph_0) {
      [node_neg] neg_false_graph_0 = Neg (x)
   }>
}

ControlFlowScan

forward

def forward(self, x):
    init = torch.zeros_like(x[0])
    carry, out = torch.ops.higher_order.scan(
        ControlFlowScan.add, [init], [x], additional_inputs=[]
    )
    return carry

custom

  • inputs: #1[(T1s3x3,)]

  • shapes: dict(x:{0:Dim(batch)})

<
   ir_version: 8,
   opset_import: ["" : 18, "local_functions" : 1],
   metadata_props: ["large_model": "False", "inline": "True", "external_threshold": "1024", "function_options": "FunctionOptions()", "optimize": "True", "n_initializers": "1", "n_large_initializers": "0", "size_initializers": "8", "size_large_initializers": "0", "n_nodes": "2", "n_nodes_other_domain": "0", "mask_outputs": "[True]", "input_args": "(T1s3x3[1.0,9.0:A5.0],)", "input_kwargs": "None", "optimizations": "OptimizationOptions(constant_folding={'Transpose', 'Cast', 'Reshape', 'Concat'}, patterns=[BatchNormalizationPattern(), BatchNormalizationTrainingPattern(), CastLayerNormalizationCastPattern(), CastPattern(), CastCastBinaryPattern(), CastOpCastPattern(), ClipClipPattern(), ComputationCastOpCastPattern(), ConcatGatherPattern(), ConvBiasNullPattern(), DropoutPattern(), ExpandPattern(), ExpandBroadcastPattern(), ExpandSwapPattern(), GeluPattern(), IdentityPattern(), LayerNormalizationPattern(), LayerNormalizationScalePattern(), LeakyReluPattern(), MulMulMulScalarPattern(), ReduceReshapePattern(), ReduceSumNormalizePattern(), ReshapePattern(), ReshapeMatMulReshapePattern(), Reshape2Of3Pattern(), ReshapeReshapeBinaryPattern(), MatMulAddPattern(), GemmTransposePattern(), MatMulReshape2Of3Pattern(), MulMulMatMulPattern(), ReshapeReshapePattern(), RotaryConcatPartPattern(), SameChildrenPattern(), SequenceConstructAtPattern(), SliceSlicePattern(), SlicesSplitPattern(), SoftmaxCrossEntropyLossCastPattern(), SplitConcatPattern(), SqueezeAddPattern(), SqueezeUnsqueezePattern(), Sub1MulPattern(), SwitchOrderBinaryPattern(), SwitchReshapeActivationPattern(), TransposeEqualReshapePattern(), TransposeMatMulPattern(), TransposeReshapeMatMulPattern(), TransposeReshapeTransposePattern(), TransposeTransposePattern(), UnsqueezeEqualPattern(), UnsqueezeUnsqueezePattern()])", "dynamic_shapes": "{'x': {0: Dim('batch', min=0)}}", "_discovered_shape_constraints": "{'batch': {'s77'}, 's77': {'batch'}}"]
>
experiment (float[batch,3] x) => (float[3] output_0) 
   <int64[1] init7_s1_3 =  {3}, float[3] zeros_like, float[3] "scan#0", float[batch,3] getitem_1, float[batch,3] "scan#1", int64 init7_s_0, float[3] select, float[3] getitem>
{
   [zeros_likeD2] zeros_like = ConstantOfShape <value: tensor = float[1] {0}> (init7_s1_3)
   [scan] output_0, "scan#1" = Scan (zeros_like, x) <body: graph = experiment ( init_0_zeros_like,  scan_0_x) => ( output_0,  output_1) {
      [add_Tensor2] output_0 = Add (init_0_zeros_like, scan_0_x)
      [".output22"] output_1 = Identity (output_0)
   }, num_scan_inputs: int = 1, scan_input_directions: ints = [0], scan_output_axes: ints = [0], scan_output_directions: ints = [0]>
}

dynamo-ir

FAILED

Failed to decompose the FX graph for ONNX compatibility. This is step 2/3 of exporting the model to ONNX. Next steps:
- Create an issue in the PyTorch GitHub repository against the *torch.export* component and attach the full error stack as well as reproduction scripts.
- Create an error report with `torch.onnx.export(..., report=True)`, and save the ExportedProgram as a pt2 file. Create an issue in the PyTorch GitHub repository against the *onnx* component. Attach the error report and the pt2 model.

## Exception summary

<class 'RuntimeError'>: scan might be aliasing the input or the output!

While executing %scan : [num_users=2] = call_function[target=torch.ops.higher_order.scan](args = (%scan_combine_graph_0, [%zeros_like], [%x], ()), kwargs = {})
GraphModule: class GraphModule(torch.nn.Module):
    def forward(self, x):
        x: "f32[s77, 3][3, 1]"; 

        x, = fx_pytree.tree_flatten_spec(([x], {}), self._in_spec)
         # File: ~/github/onnx-diagnostic/onnx_diagnostic/torch_export_patches/eval/model_cases.py:386 in forward, code: init = torch.zeros_like(x[0])
        select: "f32[3][1]" = torch.ops.aten.select.int(x, 0, 0)
        zeros_like: "f32[3][1]" = torch.ops.aten.zeros_like.default(select, pin_memory = False);  select = None
    
         # File: ~/github/onnx-diagnostic/onnx_diagnostic/torch_export_patches/eval/model_cases.py:387 in forward, code: carry, out = torch.ops.higher_order.scan(
        scan_combine_graph_0 = self.scan_combine_graph_0
        scan = torch.ops.higher_order.scan(scan_combine_graph_0, [zeros_like], [x], ());  scan_combine_graph_0 = zeros_like = x = None
        getitem: "f32[3][1]" = scan[0]
        getitem_1: "f32[s77, 3][3, 1]" = scan[1];  scan = getitem_1 = None
        return pytree.tree_unflatten((getitem,), self._out_spec)
    
    class scan_combine_graph_0(torch.nn.Module):
        def forward(self, carry_1: "f32[3][1]", y_1: "f32[3][1]"):
             # File: ~/github/onnx-diagnostic/onnx_diagnostic/torch_export_patches/eval/model_cases.py:387 in forward, code: carry, out = torch.ops.higher_order.scan(
            add: "f32[3][1]" = torch.ops.aten.add.Tensor(carry_1, y_1);  carry_1 = y_1 = None
            return [add, add]
        

Original traceback:
File "~/github/onnx-diagnostic/onnx_diagnostic/torch_export_patches/eval/model_cases.py", line 387, in forward
    carry, out = torch.ops.higher_order.scan(

(Refer to the full stack trace above for more information.)
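
The decomposition rejects the scan because the combine function returns the same tensor twice (return [add, add] in the trace above), so the carry and the scan output alias each other. A minimal sketch of a workaround, assuming the combine function can clone its second return value:

class ControlFlowScanFixed(torch.nn.Module):
    @staticmethod
    def add(carry, y):
        next_carry = carry + y
        # returning a clone avoids aliasing between the carry and the scan output
        return [next_carry, next_carry.clone()]

    def forward(self, x):
        init = torch.zeros_like(x[0])
        carry, out = torch.ops.higher_order.scan(
            ControlFlowScanFixed.add, [init], [x], additional_inputs=[]
        )
        return carry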

ControlFlowScan2Carried

forward

def forward(self, x):
    init1 = torch.zeros_like(x[0])
    init2 = torch.ones_like(x[0])
    carry1, carry2, out1, out2 = torch.ops.higher_order.scan(
        ControlFlowScan2Carried.add,
        [init1, init2],
        [x, x * 2],
        # dim=0,  # 01/31/2025, not supported anymore
        additional_inputs=[],
    )
    return carry1, carry2, out1, out2

custom

  • inputs: #1[(T1s3x4,)]

  • shapes: dict(x:{0:Dim(batch)})

<
   ir_version: 8,
   opset_import: ["" : 18, "local_functions" : 1],
   metadata_props: ["large_model": "False", "inline": "True", "external_threshold": "1024", "function_options": "FunctionOptions()", "optimize": "True", "n_initializers": "2", "n_large_initializers": "0", "size_initializers": "12", "size_large_initializers": "0", "n_nodes": "4", "n_nodes_other_domain": "0", "mask_outputs": "[True, True, True, True]", "input_args": "(T1s3x4[-1.0,9.0:A3.5],)", "input_kwargs": "None", "optimizations": "OptimizationOptions(constant_folding={'Transpose', 'Cast', 'Reshape', 'Concat'}, patterns=[BatchNormalizationPattern(), BatchNormalizationTrainingPattern(), CastLayerNormalizationCastPattern(), CastPattern(), CastCastBinaryPattern(), CastOpCastPattern(), ClipClipPattern(), ComputationCastOpCastPattern(), ConcatGatherPattern(), ConvBiasNullPattern(), DropoutPattern(), ExpandPattern(), ExpandBroadcastPattern(), ExpandSwapPattern(), GeluPattern(), IdentityPattern(), LayerNormalizationPattern(), LayerNormalizationScalePattern(), LeakyReluPattern(), MulMulMulScalarPattern(), ReduceReshapePattern(), ReduceSumNormalizePattern(), ReshapePattern(), ReshapeMatMulReshapePattern(), Reshape2Of3Pattern(), ReshapeReshapeBinaryPattern(), MatMulAddPattern(), GemmTransposePattern(), MatMulReshape2Of3Pattern(), MulMulMatMulPattern(), ReshapeReshapePattern(), RotaryConcatPartPattern(), SameChildrenPattern(), SequenceConstructAtPattern(), SliceSlicePattern(), SlicesSplitPattern(), SoftmaxCrossEntropyLossCastPattern(), SplitConcatPattern(), SqueezeAddPattern(), SqueezeUnsqueezePattern(), Sub1MulPattern(), SwitchOrderBinaryPattern(), SwitchReshapeActivationPattern(), TransposeEqualReshapePattern(), TransposeMatMulPattern(), TransposeReshapeMatMulPattern(), TransposeReshapeTransposePattern(), TransposeTransposePattern(), UnsqueezeEqualPattern(), UnsqueezeUnsqueezePattern()])", "dynamic_shapes": "{'x': {0: Dim('batch', min=0)}}", "_discovered_shape_constraints": "{'batch': {'s77'}, 's77': {'batch'}}"]
>
experiment (float[batch,4] x) => (float[4] output_0, float[4] output_1, float[batch,4] output_2, float[batch,4] output_3) 
   <int64[1] init7_s1_4 =  {4}, float[1] "init1_s_::RSh1" =  {2}, float[4] getitem_1, int64[1] init7_s1_1, int64 init7_s_0, float init1_s_, float[4] select, float[batch,4] _onx_mul_x, float[4] "scan#0", float[4] "scan#1", float[batch,4] mul, float[4] zeros_like, float[batch,4] getitem_2, float[batch,4] getitem_3, float[4] ones_like, float[4] getitem, float[4] select_1, float[batch,4] "scan#3", float[batch,4] "scan#2">
{
   [zeros_likeD2] zeros_like = ConstantOfShape <value: tensor = float[1] {0}> (init7_s1_4)
   [ones_likeD2] ones_like = ConstantOfShape <value: tensor = float[1] {1}> (init7_s1_4)
   [mul_Tensor2] _onx_mul_x = Mul (x, "init1_s_::RSh1")
   [scan] output_0, output_1, output_2, output_3 = Scan (zeros_like, ones_like, x, _onx_mul_x) <body: graph = experiment ( init_0_zeros_like,  init_1_ones_like,  scan_0_x,  scan_1_mul) => ( output_0,  output_1,  output_2,  output_3) {
      [add_Tensor2] output_0 = Add (init_0_zeros_like, scan_0_x)
      [mul_Tensor42] output_1 = Mul (init_1_ones_like, scan_1_mul)
      [".output322"] output_2 = Identity (output_0)
      [".output422"] output_3 = Identity (output_1)
   }, num_scan_inputs: int = 2, scan_input_directions: ints = [0, 0], scan_output_axes: ints = [0, 0], scan_output_directions: ints = [0, 0]>
}

dynamo-ir

FAILED

Failed to decompose the FX graph for ONNX compatibility. This is step 2/3 of exporting the model to ONNX. Next steps:
- Create an issue in the PyTorch GitHub repository against the *torch.export* component and attach the full error stack as well as reproduction scripts.
- Create an error report with `torch.onnx.export(..., report=True)`, and save the ExportedProgram as a pt2 file. Create an issue in the PyTorch GitHub repository against the *onnx* component. Attach the error report and the pt2 model.

## Exception summary

<class 'RuntimeError'>: scan might be aliasing the input or the output!

While executing %scan : [num_users=4] = call_function[target=torch.ops.higher_order.scan](args = (%scan_combine_graph_0, [%zeros_like, %ones_like], [%x, %mul], ()), kwargs = {})
GraphModule: class GraphModule(torch.nn.Module):
    def forward(self, x):
        x: "f32[s77, 4][4, 1]"; 

        x, = fx_pytree.tree_flatten_spec(([x], {}), self._in_spec)
         # File: ~/github/onnx-diagnostic/onnx_diagnostic/torch_export_patches/eval/model_cases.py:404 in forward, code: init1 = torch.zeros_like(x[0])
        select: "f32[4][1]" = torch.ops.aten.select.int(x, 0, 0)
        zeros_like: "f32[4][1]" = torch.ops.aten.zeros_like.default(select, pin_memory = False);  select = None
    
         # File: ~/github/onnx-diagnostic/onnx_diagnostic/torch_export_patches/eval/model_cases.py:405 in forward, code: init2 = torch.ones_like(x[0])
        select_1: "f32[4][1]" = torch.ops.aten.select.int(x, 0, 0)
        ones_like: "f32[4][1]" = torch.ops.aten.ones_like.default(select_1, pin_memory = False);  select_1 = None
    
         # File: ~/github/onnx-diagnostic/onnx_diagnostic/torch_export_patches/eval/model_cases.py:409 in forward, code: [x, x * 2],
        mul: "f32[s77, 4][4, 1]" = torch.ops.aten.mul.Tensor(x, 2)
    
         # File: ~/github/onnx-diagnostic/onnx_diagnostic/torch_export_patches/eval/model_cases.py:406 in forward, code: carry1, carry2, out1, out2 = torch.ops.higher_order.scan(
        scan_combine_graph_0 = self.scan_combine_graph_0
        scan = torch.ops.higher_order.scan(scan_combine_graph_0, [zeros_like, ones_like], [x, mul], ());  scan_combine_graph_0 = zeros_like = ones_like = x = mul = None
        getitem: "f32[4][1]" = scan[0]
        getitem_1: "f32[4][1]" = scan[1]
        getitem_2: "f32[s77, 4][4, 1]" = scan[2]
        getitem_3: "f32[s77, 4][4, 1]" = scan[3];  scan = None
        return pytree.tree_unflatten((getitem, getitem_1, getitem_2, getitem_3), self._out_spec)
    
    class scan_combine_graph_0(torch.nn.Module):
        def forward(self, carry1_1: "f32[4][1]", carry2_1: "f32[4][1]", y1_1: "f32[4][1]", y2_1: "f32[4][1]"):
             # File: ~/github/onnx-diagnostic/onnx_diagnostic/torch_export_patches/eval/model_cases.py:406 in forward, code: carry1, carry2, out1, out2 = torch.ops.higher_order.scan(
            add: "f32[4][1]" = torch.ops.aten.add.Tensor(carry1_1, y1_1);  carry1_1 = y1_1 = None
            mul: "f32[4][1]" = torch.ops.aten.mul.Tensor(carry2_1, y2_1);  carry2_1 = y2_1 = None
            return [add, mul, add, mul]
        

Original traceback:
File "~/github/onnx-diagnostic/onnx_diagnostic/torch_export_patches/eval/model_cases.py", line 406, in forward
    carry1, carry2, out1, out2 = torch.ops.higher_order.scan(

(Refer to the full stack trace above for more information.)
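
Same aliasing as the previous case, with two carries this time (return [add, mul, add, mul] in the trace). A sketch of the combine function with the same clone() workaround applied to both scanned outputs:

@staticmethod
def add(carry1, carry2, y1, y2):
    c1 = carry1 + y1
    c2 = carry2 * y2
    # clone the outputs so they no longer alias the carries
    return [c1, c2, c1.clone(), c2.clone()]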

ControlFlowScanCDist

forward

def forward(self, x):
    carry, out = torch.ops.higher_order.scan(
        ControlFlowScanCDist.dist,
        [x],
        [x],
        # dim=0,  # 01/31/2025, not supported anymore
        additional_inputs=[],
    )
    return out

custom

  • inputs: #1[(T1s3x4,)]

  • shapes: dict(x:{0:Dim(batch)})

<
   ir_version: 8,
   opset_import: ["" : 18, "local_functions" : 1],
   metadata_props: ["large_model": "False", "inline": "True", "external_threshold": "1024", "function_options": "FunctionOptions()", "optimize": "True", "n_initializers": "0", "n_large_initializers": "0", "size_initializers": "0", "size_large_initializers": "0", "n_nodes": "1", "n_nodes_other_domain": "0", "mask_outputs": "[True]", "input_args": "(T1s3x4[-1.0,9.0:A3.5],)", "input_kwargs": "None", "optimizations": "OptimizationOptions(constant_folding={'Transpose', 'Cast', 'Reshape', 'Concat'}, patterns=[BatchNormalizationPattern(), BatchNormalizationTrainingPattern(), CastLayerNormalizationCastPattern(), CastPattern(), CastCastBinaryPattern(), CastOpCastPattern(), ClipClipPattern(), ComputationCastOpCastPattern(), ConcatGatherPattern(), ConvBiasNullPattern(), DropoutPattern(), ExpandPattern(), ExpandBroadcastPattern(), ExpandSwapPattern(), GeluPattern(), IdentityPattern(), LayerNormalizationPattern(), LayerNormalizationScalePattern(), LeakyReluPattern(), MulMulMulScalarPattern(), ReduceReshapePattern(), ReduceSumNormalizePattern(), ReshapePattern(), ReshapeMatMulReshapePattern(), Reshape2Of3Pattern(), ReshapeReshapeBinaryPattern(), MatMulAddPattern(), GemmTransposePattern(), MatMulReshape2Of3Pattern(), MulMulMatMulPattern(), ReshapeReshapePattern(), RotaryConcatPartPattern(), SameChildrenPattern(), SequenceConstructAtPattern(), SliceSlicePattern(), SlicesSplitPattern(), SoftmaxCrossEntropyLossCastPattern(), SplitConcatPattern(), SqueezeAddPattern(), SqueezeUnsqueezePattern(), Sub1MulPattern(), SwitchOrderBinaryPattern(), SwitchReshapeActivationPattern(), TransposeEqualReshapePattern(), TransposeMatMulPattern(), TransposeReshapeMatMulPattern(), TransposeReshapeTransposePattern(), TransposeTransposePattern(), UnsqueezeEqualPattern(), UnsqueezeUnsqueezePattern()])", "dynamic_shapes": "{'x': {0: Dim('batch', min=0)}}", "_discovered_shape_constraints": "{'batch': {'s77'}, 's77': {'batch'}}"]
>
experiment (float[batch,4] x) => (float[batch,batch] output_0) 
   <float[batch,4] "scan#0", float[batch,batch] getitem_1, float[batch,batch] "scan#1", float[batch,4] getitem>
{
   [scan] "scan#0", output_0 = Scan (x, x) <body: graph = experiment ( init_0_x,  scan_0_x) => ( output_0,  output_1) {
      [init2cst4] "init7_s2_1_-12" = Constant <value: tensor = int64[2] "init7_s2_1_-1" {1,-1}> ()
      [init2cst22] init7_s1_12 = Constant <value: tensor = int64[1] init7_s1_1 {1}> ()
      [init2cst32] init1_s_2 = Constant <value: tensor = float init1_s_ {0.5}> ()
      [reshape2] reshape2 = Reshape (scan_0_x, "init7_s2_1_-12")
      [sub_Tensor2] sub2 = Sub (init_0_x, reshape2)
      [mul_Tensor2] mul2 = Mul (sub2, sub2)
      [sum2] sum_12 = ReduceSum <keepdims: int = 0> (mul2, init7_s1_12)
      [pow_Tensor_Scalar2] output_1 = Pow (sum_12, init1_s_2)
      [".output22"] output_0 = Identity (init_0_x)
   }, num_scan_inputs: int = 1, scan_input_directions: ints = [0], scan_output_axes: ints = [0], scan_output_directions: ints = [0]>
}

dynamo-ir

  • inputs: #1[(T1s3x4,)]

  • shapes: dict(x:{0:Dim(batch)})

<
   ir_version: 10,
   opset_import: ["" : 18],
   producer_name: "pytorch",
   producer_version: "2.9.0.dev20250701+cu126"
>
main_graph (float[batch,4] x) => (float[batch,batch] getitem_1) 
   <float[batch,4] scan__0>
{
   [node_scan__1] scan__0, getitem_1 = Scan (x, x) <body: graph = scan_combine_graph_0 (float[s77,4] x_scan_combine_graph_0__subgraph_in, float[4] x_scan_combine_graph_0__subgraph_in) => (float[s77,4] clone_scan_combine_graph_0, float[s77] pow_1_scan_combine_graph_0) 
      <int64[2] val_1, float[1,4] view, float[s77,4] sub_1, float[s77,4] mul_4, int64[1] val_5, float[s77] sum_1, float val_6>
{
      [node_Constant_4] val_1 = Constant <value: tensor = int64[2] val_1 {1,-1}> ()
      [node_view] view = Reshape <allowzero: int = 1> (x_scan_combine_graph_0__subgraph_in, val_1)
      [node_sub_1] sub_1 = Sub (x_scan_combine_graph_0__subgraph_in, view)
      [node_mul_4] mul_4 = Mul (sub_1, sub_1)
      [node_Constant_7] val_5 = Constant <value: tensor = int64[1] val_5 {1}> ()
      [node_sum_1] sum_1 = ReduceSum <noop_with_empty_axes: int = 0, keepdims: int = 0> (mul_4, val_5)
      [node_Constant_6] val_6 = Constant <value: tensor = float {0.5}> ()
      [node_pow_1] pow_1_scan_combine_graph_0 = Pow (sum_1, val_6)
      [node_clone] clone_scan_combine_graph_0 = Identity (x_scan_combine_graph_0__subgraph_in)
   }, num_scan_inputs: int = 1, scan_input_directions: ints = [0], scan_output_directions: ints = [0]>
}

FAILED

[ONNXRuntimeError] : 1 : FAIL : Error: Duplicate definition-site for (x_scan_combine_graph_0__subgraph_in).
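
The runtime error comes from the Scan body above: both subgraph inputs are emitted with the same name, x_scan_combine_graph_0__subgraph_in. A small check like the following (using the onnx package; the file name is hypothetical) surfaces such duplicates before the model reaches a runtime:

import onnx


def check_duplicate_inputs(graph, path="main_graph"):
    names = [i.name for i in graph.input]
    dupes = {n for n in names if names.count(n) > 1}
    if dupes:
        print(f"{path}: duplicate input names {dupes}")
    # recurse into subgraphs (If branches, Scan/Loop bodies, ...)
    for node in graph.node:
        for attr in node.attribute:
            if attr.type == onnx.AttributeProto.GRAPH:
                check_duplicate_inputs(attr.g, f"{path}/{node.name}")


model = onnx.load("controlflow_scan_cdist.onnx")  # hypothetical path
check_duplicate_inputs(model.graph)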

ControlFlowScanCDist2

forward

def forward(self, x):
    z = torch.tensor([0], dtype=torch.float32)
    y = x.clone()
    out = torch.ops.higher_order.scan(
        ControlFlowScanCDist2.dist,
        [z],
        [x],
        # dim=0,  # 01/31/2025, not supported anymore
        additional_inputs=[y],
    )
    return out[1]

custom

  • inputs: #1[(T1s3x4,)]

  • shapes: dict(x:{0:Dim(batch)})

<
   ir_version: 8,
   opset_import: ["" : 18, "local_functions" : 1],
   metadata_props: ["large_model": "False", "inline": "True", "external_threshold": "1024", "function_options": "FunctionOptions()", "optimize": "True", "n_initializers": "1", "n_large_initializers": "0", "size_initializers": "4", "size_large_initializers": "0", "n_nodes": "2", "n_nodes_other_domain": "0", "mask_outputs": "[True]", "input_args": "(T1s3x4[-1.0,9.0:A3.5],)", "input_kwargs": "None", "optimizations": "OptimizationOptions(constant_folding={'Transpose', 'Cast', 'Reshape', 'Concat'}, patterns=[BatchNormalizationPattern(), BatchNormalizationTrainingPattern(), CastLayerNormalizationCastPattern(), CastPattern(), CastCastBinaryPattern(), CastOpCastPattern(), ClipClipPattern(), ComputationCastOpCastPattern(), ConcatGatherPattern(), ConvBiasNullPattern(), DropoutPattern(), ExpandPattern(), ExpandBroadcastPattern(), ExpandSwapPattern(), GeluPattern(), IdentityPattern(), LayerNormalizationPattern(), LayerNormalizationScalePattern(), LeakyReluPattern(), MulMulMulScalarPattern(), ReduceReshapePattern(), ReduceSumNormalizePattern(), ReshapePattern(), ReshapeMatMulReshapePattern(), Reshape2Of3Pattern(), ReshapeReshapeBinaryPattern(), MatMulAddPattern(), GemmTransposePattern(), MatMulReshape2Of3Pattern(), MulMulMatMulPattern(), ReshapeReshapePattern(), RotaryConcatPartPattern(), SameChildrenPattern(), SequenceConstructAtPattern(), SliceSlicePattern(), SlicesSplitPattern(), SoftmaxCrossEntropyLossCastPattern(), SplitConcatPattern(), SqueezeAddPattern(), SqueezeUnsqueezePattern(), Sub1MulPattern(), SwitchOrderBinaryPattern(), SwitchReshapeActivationPattern(), TransposeEqualReshapePattern(), TransposeMatMulPattern(), TransposeReshapeMatMulPattern(), TransposeReshapeTransposePattern(), TransposeTransposePattern(), UnsqueezeEqualPattern(), UnsqueezeUnsqueezePattern()])", "dynamic_shapes": "{'x': {0: Dim('batch', min=0)}}", "_discovered_shape_constraints": "{'batch': {'s77'}, 's77': {'batch'}}"]
>
experiment (float[batch,4] x) => (float[batch,batch] output_0) 
   <float[1] c_lifted_tensor_0 =  {0}, float[batch,4] hidden_input_scan_0_clone, float[batch,4] clone, float[1] "scan#0", float[1] detach_, float[batch,batch] getitem_1, float[batch,batch] "scan#1", float[1] getitem, float[1] lift_fresh_copy>
{
   [_DONOTREMOVE_Scan_hidden_input_0] hidden_input_scan_0_clone = Identity (x)
   [scan] "scan#0", output_0 = Scan (c_lifted_tensor_0, x) <body: graph = experiment ( init_0_detach_,  scan_0_x) => ( output_0,  output_1) {
      [init2cst3] "init7_s2_1_-12" = Constant <value: tensor = int64[2] "init7_s2_1_-1" {1,-1}> ()
      [init2cst22] init7_s1_12 = Constant <value: tensor = int64[1] init7_s1_1 {1}> ()
      [reshape2] reshape2 = Reshape (scan_0_x, "init7_s2_1_-12")
      [sub_Tensor2] sub2 = Sub (hidden_input_scan_0_clone, reshape2)
      [mul_Tensor2] mul2 = Mul (sub2, sub2)
      [sum2] sum_12 = ReduceSum <keepdims: int = 0> (mul2, init7_s1_12)
      [sqrt2] output_1 = Sqrt (sum_12)
      [".output22"] output_0 = Identity (init_0_detach_)
   }, num_scan_inputs: int = 1, scan_input_directions: ints = [0], scan_output_axes: ints = [0], scan_output_directions: ints = [0]>
}

dynamo-ir

  • inputs: #1[(T1s3x4,)]

  • shapes: dict(x:{0:Dim(batch)})

<
   ir_version: 10,
   opset_import: ["" : 18],
   producer_name: "pytorch",
   producer_version: "2.9.0.dev20250701+cu126"
>
main_graph (float[batch,4] x) => (float[batch,batch] getitem_1) 
   <float[1] clone, float[1] scan__0>
{
   [node_Constant_0] clone = Constant <value: tensor = float[1] clone {0}> ()
   [node_scan__1] scan__0, getitem_1 = Scan (clone, x) <body: graph = scan_combine_graph_0 (float[1] clone_scan_combine_graph_0__subgraph_in, float[4] x_scan_combine_graph_0__subgraph_in) => (float[1] clone_scan_combine_graph_0, float[batch] sqrt_scan_combine_graph_0) 
      <int64[2] val_1, float[1,4] view, float[batch,4] sub_1, float[batch,4] mul_4, int64[1] val_5, float[batch] sum_1>
{
      [node_Constant_4] val_1 = Constant <value: tensor = int64[2] val_1 {1,-1}> ()
      [node_view] view = Reshape <allowzero: int = 1> (x_scan_combine_graph_0__subgraph_in, val_1)
      [node_sub_1] sub_1 = Sub (x, view)
      [node_mul_4] mul_4 = Mul (sub_1, sub_1)
      [node_Constant_7] val_5 = Constant <value: tensor = int64[1] val_5 {1}> ()
      [node_sum_1] sum_1 = ReduceSum <noop_with_empty_axes: int = 0, keepdims: int = 0> (mul_4, val_5)
      [node_sqrt] sqrt_scan_combine_graph_0 = Sqrt (sum_1)
      [node_clone_2] clone_scan_combine_graph_0 = Identity (clone_scan_combine_graph_0__subgraph_in)
   }, num_scan_inputs: int = 1, scan_input_directions: ints = [0], scan_output_directions: ints = [0]>
}

ControlFlowScanCDistXY

forward

def forward(self, x, y):
    carry, out = torch.ops.higher_order.scan(
        ControlFlowScanCDistXY.dist,
        [y],
        [x],
        # dim=0,  # 01/31/2025, not supported anymore
        additional_inputs=[],
    )
    return out

custom

  • inputs: #2[(T1s3x4,T1s5x4),(T1s13x14,T1s15x14)]

  • shapes: dict(x:{0:Dim(x_rows),1:Dim(dim)},y:{0:Dim(y_rows),1:Dim(dim)})

<
   ir_version: 8,
   opset_import: ["" : 18, "local_functions" : 1],
   metadata_props: ["large_model": "False", "inline": "True", "external_threshold": "1024", "function_options": "FunctionOptions()", "optimize": "True", "n_initializers": "0", "n_large_initializers": "0", "size_initializers": "0", "size_large_initializers": "0", "n_nodes": "1", "n_nodes_other_domain": "0", "mask_outputs": "[True]", "input_args": "(T1s3x4[-1.077669382095337,1.4364134073257446:A-0.06034492305479944],T1s5x4[-2.107510566711426,2.5612235069274902:A0.26090582795441153])", "input_kwargs": "None", "optimizations": "OptimizationOptions(constant_folding={'Transpose', 'Cast', 'Reshape', 'Concat'}, patterns=[BatchNormalizationPattern(), BatchNormalizationTrainingPattern(), CastLayerNormalizationCastPattern(), CastPattern(), CastCastBinaryPattern(), CastOpCastPattern(), ClipClipPattern(), ComputationCastOpCastPattern(), ConcatGatherPattern(), ConvBiasNullPattern(), DropoutPattern(), ExpandPattern(), ExpandBroadcastPattern(), ExpandSwapPattern(), GeluPattern(), IdentityPattern(), LayerNormalizationPattern(), LayerNormalizationScalePattern(), LeakyReluPattern(), MulMulMulScalarPattern(), ReduceReshapePattern(), ReduceSumNormalizePattern(), ReshapePattern(), ReshapeMatMulReshapePattern(), Reshape2Of3Pattern(), ReshapeReshapeBinaryPattern(), MatMulAddPattern(), GemmTransposePattern(), MatMulReshape2Of3Pattern(), MulMulMatMulPattern(), ReshapeReshapePattern(), RotaryConcatPartPattern(), SameChildrenPattern(), SequenceConstructAtPattern(), SliceSlicePattern(), SlicesSplitPattern(), SoftmaxCrossEntropyLossCastPattern(), SplitConcatPattern(), SqueezeAddPattern(), SqueezeUnsqueezePattern(), Sub1MulPattern(), SwitchOrderBinaryPattern(), SwitchReshapeActivationPattern(), TransposeEqualReshapePattern(), TransposeMatMulPattern(), TransposeReshapeMatMulPattern(), TransposeReshapeTransposePattern(), TransposeTransposePattern(), UnsqueezeEqualPattern(), UnsqueezeUnsqueezePattern()])", "dynamic_shapes": "{'x': {0: Dim('x_rows', min=0), 1: Dim('dim', min=0)},
 'y': {0: Dim('y_rows', min=0), 1: Dim('dim', min=0)}}", "_discovered_shape_constraints": "{'dim': {'s94', 's27'},
 's17': {'y_rows'},
 's27': {'dim'},
 's77': {'x_rows'},
 's94': {'dim'},
 'x_rows': {'s77'},
 'y_rows': {'s17'}}"]
>
experiment (float[x_rows,dim] x, float[y_rows,dim] y) => (float[x_rows,y_rows] output_0) 
   <float[y_rows,dim] "scan#0", float[x_rows,y_rows] getitem_1, float[x_rows,y_rows] "scan#1", float[y_rows,dim] getitem>
{
   [scan] "scan#0", output_0 = Scan (y, x) <body: graph = experiment ( init_0_y,  scan_0_x) => ( output_0,  output_1) {
      [init2cst3] "init7_s2_1_-12" = Constant <value: tensor = int64[2] "init7_s2_1_-1" {1,-1}> ()
      [init2cst22] init7_s1_12 = Constant <value: tensor = int64[1] init7_s1_1 {1}> ()
      [reshape2] reshape2 = Reshape (scan_0_x, "init7_s2_1_-12")
      [sub_Tensor2] sub2 = Sub (init_0_y, reshape2)
      [mul_Tensor2] mul2 = Mul (sub2, sub2)
      [sum2] sum_12 = ReduceSum <keepdims: int = 0> (mul2, init7_s1_12)
      [sqrt2] output_1 = Sqrt (sum_12)
      [".output22"] output_0 = Identity (init_0_y)
   }, num_scan_inputs: int = 1, scan_input_directions: ints = [0], scan_output_axes: ints = [0], scan_output_directions: ints = [0]>
}

dynamo-ir

  • inputs: #2[(T1s3x4,T1s5x4),(T1s13x14,T1s15x14)]

  • shapes: dict(x:{0:Dim(x_rows),1:Dim(dim)},y:{0:Dim(y_rows),1:Dim(dim)})

<
   ir_version: 10,
   opset_import: ["" : 18],
   producer_name: "pytorch",
   producer_version: "2.9.0.dev20250701+cu126"
>
main_graph (float[x_rows,dim] x, float[y_rows,dim] y) => (float[x_rows,y_rows] getitem_1) 
   <float[y_rows,dim] scan__0>
{
   [node_scan__1] scan__0, getitem_1 = Scan (y, x) <body: graph = scan_combine_graph_0 (float[s17,s27] y_scan_combine_graph_0__subgraph_in, float[s27] x_scan_combine_graph_0__subgraph_in) => (float[s17,s27] clone_scan_combine_graph_0, float[s17] sqrt_scan_combine_graph_0) 
      <int64[2] val_1_2, float[1,?] view, float[s17,?] sub_4, float[s17,?] mul_7, int64[1] val_5, float[s17] sum_1>
{
      [node_Constant_4] val_1_2 = Constant <value: tensor = int64[2] val_1_2 {1,-1}> ()
      [node_view] view = Reshape <allowzero: int = 1> (x_scan_combine_graph_0__subgraph_in, val_1_2)
      [node_sub_4] sub_4 = Sub (y_scan_combine_graph_0__subgraph_in, view)
      [node_mul_7] mul_7 = Mul (sub_4, sub_4)
      [node_Constant_7] val_5 = Constant <value: tensor = int64[1] val_5 {1}> ()
      [node_sum_1] sum_1 = ReduceSum <noop_with_empty_axes: int = 0, keepdims: int = 0> (mul_7, val_5)
      [node_sqrt] sqrt_scan_combine_graph_0 = Sqrt (sum_1)
      [node_clone] clone_scan_combine_graph_0 = Identity (y_scan_combine_graph_0__subgraph_in)
   }, num_scan_inputs: int = 1, scan_input_directions: ints = [0], scan_output_directions: ints = [0]>
}

ControlFlowScanDecomposition_151564

forward

def forward(self, images, position):
    return self.select_when_exporting(self.dummy_loop, self.dummy_loop_with_scan)(
        images, position
    )

custom

  • inputs: #1[(T1s5x6,T7s5)]

  • shapes: dict(images:{0:DYNAMIC,1:DYNAMIC},position:{0:DYNAMIC})

<
   ir_version: 8,
   opset_import: ["" : 18, "local_functions" : 1],
   metadata_props: ["large_model": "False", "inline": "True", "external_threshold": "1024", "function_options": "FunctionOptions()", "optimize": "True", "n_initializers": "0", "n_large_initializers": "0", "size_initializers": "0", "size_large_initializers": "0", "n_nodes": "1", "n_nodes_other_domain": "0", "mask_outputs": "[True]", "input_args": "(T1s5x6[-1.8678492307662964,1.425422191619873:A-0.22642420046031475],T7s5[1,5:A3.0])", "input_kwargs": "None", "optimizations": "OptimizationOptions(constant_folding={'Transpose', 'Cast', 'Reshape', 'Concat'}, patterns=[BatchNormalizationPattern(), BatchNormalizationTrainingPattern(), CastLayerNormalizationCastPattern(), CastPattern(), CastCastBinaryPattern(), CastOpCastPattern(), ClipClipPattern(), ComputationCastOpCastPattern(), ConcatGatherPattern(), ConvBiasNullPattern(), DropoutPattern(), ExpandPattern(), ExpandBroadcastPattern(), ExpandSwapPattern(), GeluPattern(), IdentityPattern(), LayerNormalizationPattern(), LayerNormalizationScalePattern(), LeakyReluPattern(), MulMulMulScalarPattern(), ReduceReshapePattern(), ReduceSumNormalizePattern(), ReshapePattern(), ReshapeMatMulReshapePattern(), Reshape2Of3Pattern(), ReshapeReshapeBinaryPattern(), MatMulAddPattern(), GemmTransposePattern(), MatMulReshape2Of3Pattern(), MulMulMatMulPattern(), ReshapeReshapePattern(), RotaryConcatPartPattern(), SameChildrenPattern(), SequenceConstructAtPattern(), SliceSlicePattern(), SlicesSplitPattern(), SoftmaxCrossEntropyLossCastPattern(), SplitConcatPattern(), SqueezeAddPattern(), SqueezeUnsqueezePattern(), Sub1MulPattern(), SwitchOrderBinaryPattern(), SwitchReshapeActivationPattern(), TransposeEqualReshapePattern(), TransposeMatMulPattern(), TransposeReshapeMatMulPattern(), TransposeReshapeTransposePattern(), TransposeTransposePattern(), UnsqueezeEqualPattern(), UnsqueezeUnsqueezePattern()])", "dynamic_shapes": "{'images': {0: Dim('DYN0', min=0), 1: Dim('DYN1', min=0)},
 'position': {0: Dim('DYN2', min=0)}}", "_discovered_shape_constraints": "{'DYN0': {'s34'},
 'DYN1': {'s90'},
 'DYN2': {'s71'},
 's34': {'DYN0'},
 's71': {'DYN2'},
 's90': {'DYN1'}}"]
>
experiment (float[batch,channel] images, int64[batch_1] position) => (float[batch,channel] output_0) 
   <float[batch,channel] "scan#0", float[batch,channel] getitem>
{
   [scan] output_0 = Scan (images, position) <body: graph = experiment ( scan_0_images,  scan_1_position) => ( output_0) {
      [init2cst2] init7_s1_02 = Constant <value: tensor = int64[1] init7_s1_0 {0}> ()
      [get_dynamic_dimension_a_item2] "item::UnSq03" = Unsqueeze (scan_1_position, init7_s1_02)
      [sym_size_int2] "padded_1::Shape:12" = Shape <end: int = 1, start: int = 0> (scan_0_images)
      [zerosA22] zeros2 = ConstantOfShape <value: tensor = float[1] {0}> ("padded_1::Shape:12")
      [slide_Tensor2] slice_12 = Slice (scan_0_images, init7_s1_02, "item::UnSq03", init7_s1_02)
      [setitem_rk022] "zeros::Shape:2" = Shape (zeros2)
      [setitem_rk032] _onx_slice_zeros2 = Slice (zeros2, "item::UnSq03", "zeros::Shape:2", init7_s1_02)
      [setitem_rk042] output_0 = Concat <axis: int = 0> (slice_12, _onx_slice_zeros2)
   }, num_scan_inputs: int = 2, scan_input_directions: ints = [0, 0], scan_output_axes: ints = [0], scan_output_directions: ints = [0]>
}

dynamo-ir

FAILED

Failed to decompose the FX graph for ONNX compatibility. This is step 2/3 of exporting the model to ONNX. Next steps:
- Create an issue in the PyTorch GitHub repository against the *torch.export* component and attach the full error stack as well as reproduction scripts.
- Create an error report with `torch.onnx.export(..., report=True)`, and save the ExportedProgram as a pt2 file. Create an issue in the PyTorch GitHub repository against the *onnx* component. Attach the error report and the pt2 model.

## Exception summary

<class 'torch.fx.experimental.symbolic_shapes.GuardOnDataDependentSymNode'>: Could not guard on data-dependent expression u1 < 0 (unhinted: u1 < 0).  (Size-like symbols: none)

Caused by: (_decomp/decompositions.py:745 in slice_forward)
For more information, run with TORCH_LOGS="dynamic"
For extended logs when we create symbols, also add TORCHDYNAMO_EXTENDED_DEBUG_CREATE_SYMBOL="u1"
If you suspect the guard was triggered from C++, add TORCHDYNAMO_EXTENDED_DEBUG_CPP=1
For more debugging help, see https://docs.google.com/document/d/1HSuTTVvYH1pTew89Rtpeu84Ht3nQEFTYhAX3Ypa_xJs/edit?usp=sharing

For C++ stack trace, run with TORCHDYNAMO_EXTENDED_DEBUG_CPP=1

While executing %scan : [num_users=1] = call_function[target=torch.ops.higher_order.scan](args = (%scan_combine_graph_0, [], [%images, %position], ()), kwargs = {})
GraphModule: class GraphModule(torch.nn.Module):
    def forward(self, images, position):
        images: "f32[s34, s90][s90, 1]"; position: "i64[s71][1]"; 

        images, position, = fx_pytree.tree_flatten_spec(([images, position], {}), self._in_spec)
         # File: ~/github/onnx-diagnostic/onnx_diagnostic/torch_export_patches/eval/model_cases.py:568 in forward, code: return self.select_when_exporting(self.dummy_loop, self.dummy_loop_with_scan)(
        scan_combine_graph_0 = self.scan_combine_graph_0
        scan = torch.ops.higher_order.scan(scan_combine_graph_0, [], [images, position], ());  scan_combine_graph_0 = images = position = None
        getitem: "f32[s34, s90][s90, 1]" = scan[0];  scan = None
        return pytree.tree_unflatten((getitem,), self._out_spec)
    
    class scan_combine_graph_0(torch.nn.Module):
        def forward(self, padded_1: "f32[s90][1]", p_1: "i64[][]"):
             # File: ~/github/onnx-diagnostic/onnx_diagnostic/torch_export_patches/eval/model_cases.py:568 in forward, code: return self.select_when_exporting(self.dummy_loop, self.dummy_loop_with_scan)(
            sym_size_int: "Sym(s90)" = torch.ops.aten.sym_size.int(padded_1, 0)
            zeros: "f32[s90][1]" = torch.ops.aten.zeros.default([sym_size_int], device = device(type='cpu'), pin_memory = False);  sym_size_int = None
            item: "Sym(u0)" = torch.ops.aten.item.default(p_1);  p_1 = None
            slice_1: "f32[u0][1]" = torch.ops.aten.slice.Tensor(padded_1, 0, 0, item);  padded_1 = None
            slice_2: "f32[u0][1]" = torch.ops.aten.slice.Tensor(zeros, 0, 0, item);  item = None
            copy_: "f32[u0][1]" = torch.ops.aten.copy_.default(slice_2, slice_1);  slice_2 = slice_1 = copy_ = None
            return (zeros,)
        

Original traceback:
File "~/github/onnx-diagnostic/onnx_diagnostic/torch_export_patches/eval/model_cases.py", line 568, in forward
    return self.select_when_exporting(self.dummy_loop, self.dummy_loop_with_scan)(

(Refer to the full stack trace above for more information.)
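
The guard failure comes from slicing with the unbacked value produced by p_1.item() inside the scan body. Declaring that this value is a valid, non-negative size usually lets the decomposition proceed; a sketch of the body with that assumption made explicit (names are illustrative, not the ones used in model_cases.py):

def scan_body(padded, p):
    i = p.item()
    torch._check_is_size(i)              # i is a size, hence i >= 0
    torch._check(i <= padded.shape[0])   # and a valid slicing bound
    zeros = torch.zeros_like(padded)
    zeros[:i] = padded[:i]
    return (zeros,)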

ControlFlowScanInplace_153705

forward

def forward(self, x, y):
    def loop_body_1(z, iv, x, y):
        z = z.clone()
        i = iv.item()
        z[i, :] = ((x[i, :] - y) ** 2).sum(dim=-1)
        return [z, iv]

    z = torch.empty((x.shape[0], y.shape[0]))
    r = torch.ops.higher_order.scan(
        loop_body_1, [z], [torch.arange(x.shape[0], dtype=torch.int64)], [x, y]
    )
    return r[0]

custom

FAILED

only integers, slices (`:`), ellipsis (`...`), None and long or byte Variables are valid indices (got SymInt)

dynamo-ir

FAILED

Failed to decompose the FX graph for ONNX compatibility. This is step 2/3 of exporting the model to ONNX. Next steps:
- Create an issue in the PyTorch GitHub repository against the *torch.export* component and attach the full error stack as well as reproduction scripts.
- Create an error report with `torch.onnx.export(..., report=True)`, and save the ExportedProgram as a pt2 file. Create an issue in the PyTorch GitHub repository against the *onnx* component. Attach the error report and the pt2 model.

## Exception summary

<class 'RuntimeError'>: scan might be aliasing the input or the output!

While executing %scan : [num_users=2] = call_function[target=torch.ops.higher_order.scan](args = (%scan_combine_graph_0, [%empty], [%arange], (%x, %y)), kwargs = {})
GraphModule: class GraphModule(torch.nn.Module):
    def forward(self, x, y):
        x: "f32[s77, s27][s27, 1]"; y: "f32[s17, s27][s27, 1]"; 

        x, y, = fx_pytree.tree_flatten_spec(([x, y], {}), self._in_spec)
         # 
        sym_size_int_2: "Sym(s77)" = torch.ops.aten.sym_size.int(x, 0)
        sym_size_int_3: "Sym(s27)" = torch.ops.aten.sym_size.int(x, 1)
        sym_size_int_4: "Sym(s17)" = torch.ops.aten.sym_size.int(y, 0)
        sym_size_int_5: "Sym(s27)" = torch.ops.aten.sym_size.int(y, 1)
        eq: "Sym(True)" = sym_size_int_3 == sym_size_int_5;  sym_size_int_3 = sym_size_int_5 = None
        _assert_scalar_default = torch.ops.aten._assert_scalar.default(eq, "Runtime assertion failed for expression Eq(s27, s94) on node 'eq'");  eq = _assert_scalar_default = None
    
         # File: ~/github/onnx-diagnostic/onnx_diagnostic/torch_export_patches/eval/model_cases.py:517 in forward, code: z = torch.empty((x.shape[0], y.shape[0]))
        empty: "f32[s77, s17][s17, 1]" = torch.ops.aten.empty.memory_format([sym_size_int_2, sym_size_int_4], device = device(type='cpu'), pin_memory = False);  sym_size_int_4 = None
    
         # File: ~/github/onnx-diagnostic/onnx_diagnostic/torch_export_patches/eval/model_cases.py:519 in forward, code: loop_body_1, [z], [torch.arange(x.shape[0], dtype=torch.int64)], [x, y]
        arange: "i64[s77][1]" = torch.ops.aten.arange.default(sym_size_int_2, dtype = torch.int64, device = device(type='cpu'), pin_memory = False);  sym_size_int_2 = None
    
         # File: ~/github/onnx-diagnostic/onnx_diagnostic/torch_export_patches/eval/model_cases.py:518 in forward, code: r = torch.ops.higher_order.scan(
        scan_combine_graph_0 = self.scan_combine_graph_0
        scan = torch.ops.higher_order.scan(scan_combine_graph_0, [empty], [arange], (x, y));  scan_combine_graph_0 = empty = arange = x = y = None
        getitem: "f32[s77, s17][s17, 1]" = scan[0]
        getitem_1: "i64[s77][1]" = scan[1];  scan = getitem_1 = None
        return pytree.tree_unflatten((getitem,), self._out_spec)
    
    class scan_combine_graph_0(torch.nn.Module):
        def forward(self, z_1: "f32[s77, s17][s17, 1]", iv_1: "i64[][]", x_1: "f32[s77, s27][s27, 1]", y_1: "f32[s17, s27][s27, 1]"):
             # File: ~/github/onnx-diagnostic/onnx_diagnostic/torch_export_patches/eval/model_cases.py:518 in forward, code: r = torch.ops.higher_order.scan(
            clone: "f32[s77, s17][s17, 1]" = torch.ops.aten.clone.default(z_1);  z_1 = None
            select: "f32[s27][1]" = torch.ops.aten.select.int(x_1, 0, 0);  x_1 = None
            slice_1: "f32[s27][1]" = torch.ops.aten.slice.Tensor(select, 0, 0, 9223372036854775807);  select = None
            sub: "f32[s17, s27][s27, 1]" = torch.ops.aten.sub.Tensor(slice_1, y_1);  slice_1 = y_1 = None
            pow_1: "f32[s17, s27][s27, 1]" = torch.ops.aten.pow.Tensor_Scalar(sub, 2);  sub = None
            sum_1: "f32[s17][1]" = torch.ops.aten.sum.dim_IntList(pow_1, [-1]);  pow_1 = None
            select_1: "f32[s17][1]" = torch.ops.aten.select.int(clone, 0, 0)
            slice_2: "f32[s17][1]" = torch.ops.aten.slice.Tensor(select_1, 0, 0, 9223372036854775807);  select_1 = None
            copy_: "f32[s17][1]" = torch.ops.aten.copy_.default(slice_2, sum_1);  slice_2 = sum_1 = copy_ = None
            return [clone, iv_1]
        

Original traceback:
File "~/github/onnx-diagnostic/onnx_diagnostic/torch_export_patches/eval/model_cases.py", line 518, in forward
    r = torch.ops.higher_order.scan(

(Refer to the full stack trace above for more information.)

CreateFromShape

forward

def forward(self, x):
    y = torch.ones((x.shape[0], x.shape[1] + 1))
    return y
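
Both exporters handle this case: the output shape (dx, dy+1) is derived from the input's symbolic dimensions. A hedged usage sketch of exporting such a model with torch.export and dynamic shapes (the Dim names dx/dy mirror the shapes reported below; the call is illustrative, not the harness code):

import torch

class CreateFromShape(torch.nn.Module):
    def forward(self, x):
        return torch.ones((x.shape[0], x.shape[1] + 1))

# both dimensions are declared dynamic, matching dict(x:{0:Dim(dx),1:Dim(dy)})
dx, dy = torch.export.Dim("dx"), torch.export.Dim("dy")
ep = torch.export.export(
    CreateFromShape(), (torch.rand(4, 4),), dynamic_shapes={"x": {0: dx, 1: dy}}
)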

custom

  • inputs: #2[(T1s4x4,),(T1s5x5,)]

  • shapes: dict(x:{0:Dim(dx),1:Dim(dy)})

<
   ir_version: 8,
   opset_import: ["" : 18],
   metadata_props: ["large_model": "False", "inline": "True", "external_threshold": "1024", "function_options": "FunctionOptions()", "optimize": "True", "n_initializers": "2", "n_large_initializers": "0", "size_initializers": "16", "size_large_initializers": "0", "n_nodes": "7", "n_nodes_other_domain": "0", "mask_outputs": "[True]", "input_args": "(T1s4x4[0.028293073177337646,0.9380697011947632:A0.5150020532310009],)", "input_kwargs": "None", "optimizations": "OptimizationOptions(constant_folding={'Transpose', 'Cast', 'Reshape', 'Concat'}, patterns=[BatchNormalizationPattern(), BatchNormalizationTrainingPattern(), CastLayerNormalizationCastPattern(), CastPattern(), CastCastBinaryPattern(), CastOpCastPattern(), ClipClipPattern(), ComputationCastOpCastPattern(), ConcatGatherPattern(), ConvBiasNullPattern(), DropoutPattern(), ExpandPattern(), ExpandBroadcastPattern(), ExpandSwapPattern(), GeluPattern(), IdentityPattern(), LayerNormalizationPattern(), LayerNormalizationScalePattern(), LeakyReluPattern(), MulMulMulScalarPattern(), ReduceReshapePattern(), ReduceSumNormalizePattern(), ReshapePattern(), ReshapeMatMulReshapePattern(), Reshape2Of3Pattern(), ReshapeReshapeBinaryPattern(), MatMulAddPattern(), GemmTransposePattern(), MatMulReshape2Of3Pattern(), MulMulMatMulPattern(), ReshapeReshapePattern(), RotaryConcatPartPattern(), SameChildrenPattern(), SequenceConstructAtPattern(), SliceSlicePattern(), SlicesSplitPattern(), SoftmaxCrossEntropyLossCastPattern(), SplitConcatPattern(), SqueezeAddPattern(), SqueezeUnsqueezePattern(), Sub1MulPattern(), SwitchOrderBinaryPattern(), SwitchReshapeActivationPattern(), TransposeEqualReshapePattern(), TransposeMatMulPattern(), TransposeReshapeMatMulPattern(), TransposeReshapeTransposePattern(), TransposeTransposePattern(), UnsqueezeEqualPattern(), UnsqueezeUnsqueezePattern()])", "dynamic_shapes": "{'x': {0: Dim('dx', min=0), 1: Dim('dy', min=0)}}", "_discovered_shape_constraints": "{'dx': {'s77'}, 'dy': {'s27'}, 's27': {'dy'}, 's77': {'dx'}}", "_known_value_shapes": "{
  _onx_add_sym_size_int_3: dy+1,
  _onx_concat_sym_size_int_2::UnSq0: ('dx', 'add'),
  add::UnSq0: ('add',),
  sym_size_int_2: dx,
  sym_size_int_2::UnSq0: ('dx',),
  sym_size_int_3: dy,
  x::Shape1:2: ('dy',),
  x::Shape:1: ('dx',),
}"]
>
experiment (float[dx,dy] x) => (float[dx,dy+1] output_0) 
   <int64 init7_s_1 =  {1}, int64[1] init7_s1_0 =  {0}, int64[1] "init7_s_1::RSh1", int64 _onx_add_sym_size_int_3, int64 sym_size_int_3, int64[1] "sym_size_int_3::RSh1", float[dx,dy+1] ones, int64[1] "x::Shape1:2", int64[1] "add::UnSq0", int64[1] "sym_size_int_2::UnSq0", int64[1] init7_s1_1, int64 add, int64[1] "x::Shape:1", int64 sym_size_int_2>
{
   [sym_size_int] "x::Shape:1" = Shape <end: int = 1, start: int = 0> (x)
   [sym_size_int3] "x::Shape1:2" = Shape <end: int = 2, start: int = 1> (x)
   [sym_size_int4] sym_size_int_3 = Squeeze ("x::Shape1:2")
   [add3] _onx_add_sym_size_int_3 = Add (sym_size_int_3, init7_s_1)
   [_mkshape1_add] "add::UnSq0" = Unsqueeze (_onx_add_sym_size_int_3, init7_s1_0)
   [_mkshape_add] "_onx_concat_sym_size_int_2::UnSq0" = Concat <axis: int = 0> ("x::Shape:1", "add::UnSq0")
   [ones] output_0 = ConstantOfShape <value: tensor = float[1] {1}> ("_onx_concat_sym_size_int_2::UnSq0")
}

dynamo-ir

  • inputs: #2[(T1s4x4,),(T1s5x5,)]

  • shapes: dict(x:{0:Dim(dx),1:Dim(dy)})

<
   ir_version: 10,
   opset_import: ["" : 18],
   producer_name: "pytorch",
   producer_version: "2.9.0.dev20250701+cu126"
>
main_graph (float[dx,dy] x) => (float[dx,dy + 1] ones) 
   <int64[1] val_0, int64[1] val_1, int64 sym_size_int_3, int64 val_2, int64 add, int64[1] val_5, int64[1] val_6, int64[2] val_7, float val_10>
{
   [node_Shape_0] val_0 = Shape <end: int = 1, start: int = 0> (x)
   [node_Shape_1] val_1 = Shape <end: int = 2, start: int = 1> (x)
   [node_sym_size_int_3] sym_size_int_3 = Squeeze (val_1)
   [node_Constant_2] val_2 = Constant <value: tensor = int64 {1}> ()
   [node_add] add = Add (sym_size_int_3, val_2)
   [node_Constant_5] val_5 = Constant <value: tensor = int64[1] {-1}> ()
   [node_Reshape_6] val_6 = Reshape <allowzero: int = 0> (add, val_5)
   [node_Concat_7] val_7 = Concat <axis: int = 0> (val_0, val_6)
   [node_Constant_13] val_10 = Constant <value: tensor = float val_10 {1}> ()
   [node_ones] ones = Expand (val_10, val_7)
}

CreateFromShapeThroughFunction

forward

def forward(self, x):
    dy1 = CreateFromShapeThroughFunction.add_one(x.shape[1])
    y = torch.ones((x.shape[0], dy1))
    return y
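
This case differs from CreateFromShape only by routing the size arithmetic through a static helper; torch.export inlines it, which is why the graphs below are identical to the previous case. A hedged sketch of what the helper presumably looks like (its body is an assumption based on the dy+1 output dimension):

import torch

class CreateFromShapeThroughFunction(torch.nn.Module):
    @staticmethod
    def add_one(dim: int) -> int:
        # plain Python arithmetic on the (symbolic) size; inlined at export time
        return dim + 1

    def forward(self, x):
        dy1 = CreateFromShapeThroughFunction.add_one(x.shape[1])
        return torch.ones((x.shape[0], dy1))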

custom

  • inputs: #2[(T1s4x4,),(T1s5x5,)]

  • shapes: dict(x:{0:Dim(dx),1:Dim(dy)})

<
   ir_version: 8,
   opset_import: ["" : 18],
   metadata_props: ["large_model": "False", "inline": "True", "external_threshold": "1024", "function_options": "FunctionOptions()", "optimize": "True", "n_initializers": "2", "n_large_initializers": "0", "size_initializers": "16", "size_large_initializers": "0", "n_nodes": "7", "n_nodes_other_domain": "0", "mask_outputs": "[True]", "input_args": "(T1s4x4[0.021807074546813965,0.965708315372467:A0.5018488876521587],)", "input_kwargs": "None", "optimizations": "OptimizationOptions(constant_folding={'Transpose', 'Cast', 'Reshape', 'Concat'}, patterns=[BatchNormalizationPattern(), BatchNormalizationTrainingPattern(), CastLayerNormalizationCastPattern(), CastPattern(), CastCastBinaryPattern(), CastOpCastPattern(), ClipClipPattern(), ComputationCastOpCastPattern(), ConcatGatherPattern(), ConvBiasNullPattern(), DropoutPattern(), ExpandPattern(), ExpandBroadcastPattern(), ExpandSwapPattern(), GeluPattern(), IdentityPattern(), LayerNormalizationPattern(), LayerNormalizationScalePattern(), LeakyReluPattern(), MulMulMulScalarPattern(), ReduceReshapePattern(), ReduceSumNormalizePattern(), ReshapePattern(), ReshapeMatMulReshapePattern(), Reshape2Of3Pattern(), ReshapeReshapeBinaryPattern(), MatMulAddPattern(), GemmTransposePattern(), MatMulReshape2Of3Pattern(), MulMulMatMulPattern(), ReshapeReshapePattern(), RotaryConcatPartPattern(), SameChildrenPattern(), SequenceConstructAtPattern(), SliceSlicePattern(), SlicesSplitPattern(), SoftmaxCrossEntropyLossCastPattern(), SplitConcatPattern(), SqueezeAddPattern(), SqueezeUnsqueezePattern(), Sub1MulPattern(), SwitchOrderBinaryPattern(), SwitchReshapeActivationPattern(), TransposeEqualReshapePattern(), TransposeMatMulPattern(), TransposeReshapeMatMulPattern(), TransposeReshapeTransposePattern(), TransposeTransposePattern(), UnsqueezeEqualPattern(), UnsqueezeUnsqueezePattern()])", "dynamic_shapes": "{'x': {0: Dim('dx', min=0), 1: Dim('dy', min=0)}}", "_discovered_shape_constraints": "{'dx': {'s77'}, 'dy': {'s27'}, 's27': {'dy'}, 's77': {'dx'}}", "_known_value_shapes": "{
  _onx_add_sym_size_int_3: dy+1,
  _onx_concat_sym_size_int_2::UnSq0: ('dx', 'add'),
  add::UnSq0: ('add',),
  sym_size_int_2: dx,
  sym_size_int_2::UnSq0: ('dx',),
  sym_size_int_3: dy,
  x::Shape1:2: ('dy',),
  x::Shape:1: ('dx',),
}"]
>
experiment (float[dx,dy] x) => (float[dx,dy+1] output_0) 
   <int64 init7_s_1 =  {1}, int64[1] init7_s1_0 =  {0}, int64[1] "init7_s_1::RSh1", int64 _onx_add_sym_size_int_3, int64 sym_size_int_3, int64[1] "sym_size_int_3::RSh1", float[dx,dy+1] ones, int64[1] "x::Shape1:2", int64[1] "add::UnSq0", int64[1] "sym_size_int_2::UnSq0", int64[1] init7_s1_1, int64 add, int64[1] "x::Shape:1", int64 sym_size_int_2>
{
   [sym_size_int] "x::Shape:1" = Shape <end: int = 1, start: int = 0> (x)
   [sym_size_int3] "x::Shape1:2" = Shape <end: int = 2, start: int = 1> (x)
   [sym_size_int4] sym_size_int_3 = Squeeze ("x::Shape1:2")
   [add3] _onx_add_sym_size_int_3 = Add (sym_size_int_3, init7_s_1)
   [_mkshape1_add] "add::UnSq0" = Unsqueeze (_onx_add_sym_size_int_3, init7_s1_0)
   [_mkshape_add] "_onx_concat_sym_size_int_2::UnSq0" = Concat <axis: int = 0> ("x::Shape:1", "add::UnSq0")
   [ones] output_0 = ConstantOfShape <value: tensor = float[1] {1}> ("_onx_concat_sym_size_int_2::UnSq0")
}

dynamo-ir

  • inputs: #2[(T1s4x4,),(T1s5x5,)]

  • shapes: dict(x:{0:Dim(dx),1:Dim(dy)})

<
   ir_version: 10,
   opset_import: ["" : 18],
   producer_name: "pytorch",
   producer_version: "2.9.0.dev20250701+cu126"
>
main_graph (float[dx,dy] x) => (float[dx,dy + 1] ones) 
   <int64[1] val_0, int64[1] val_1, int64 sym_size_int_3, int64 val_2, int64 add, int64[1] val_5, int64[1] val_6, int64[2] val_7, float val_10>
{
   [node_Shape_0] val_0 = Shape <end: int = 1, start: int = 0> (x)
   [node_Shape_1] val_1 = Shape <end: int = 2, start: int = 1> (x)
   [node_sym_size_int_3] sym_size_int_3 = Squeeze (val_1)
   [node_Constant_2] val_2 = Constant <value: tensor = int64 {1}> ()
   [node_add] add = Add (sym_size_int_3, val_2)
   [node_Constant_5] val_5 = Constant <value: tensor = int64[1] {-1}> ()
   [node_Reshape_6] val_6 = Reshape <allowzero: int = 0> (add, val_5)
   [node_Concat_7] val_7 = Concat <axis: int = 0> (val_0, val_6)
   [node_Constant_13] val_10 = Constant <value: tensor = float val_10 {1}> ()
   [node_ones] ones = Expand (val_10, val_7)
}

CropLastDimensionWithTensorContent

forward

def forward(self, x, shape):
    return x[..., : shape[0]]
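
The custom exporter fails below because the slice bound is read from the tensor's content, which becomes a data-dependent value (u0) during export, whereas the next case, CropLastDimensionWithTensorShape, slices by a dimension size and exports. A minimal eager-mode sketch contrasting the two forms (illustrative only):

import torch

x = torch.rand(3, 4, 4)
shape = torch.tensor([2], dtype=torch.int64)
y = torch.rand(2)

by_content = x[..., : shape[0]]   # bound read from tensor data -> data-dependent at export
by_shape = x[..., : y.shape[0]]   # bound taken from a dimension -> stays symbolic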

custom

FAILED

Could not extract specialized integer from data-dependent expression u0 (unhinted: u0).  (Size-like symbols: none)

Caused by: (_export/non_strict_utils.py:1054 in __torch_function__)
For more information, run with TORCH_LOGS="dynamic"
For extended logs when we create symbols, also add TORCHDYNAMO_EXTENDED_DEBUG_CREATE_SYMBOL="u0"
If you suspect the guard was triggered from C++, add TORCHDYNAMO_EXTENDED_DEBUG_CPP=1
For more debugging help, see https://docs.google.com/document/d/1HSuTTVvYH1pTew89Rtpeu84Ht3nQEFTYhAX3Ypa_xJs/edit?usp=sharing

For C++ stack trace, run with TORCHDYNAMO_EXTENDED_DEBUG_CPP=1

The following call raised this error:
  File "~/github/onnx-diagnostic/onnx_diagnostic/torch_export_patches/eval/model_cases.py", line 818, in forward
    return x[..., : shape[0]]


The error above occurred when calling torch.export.export. If you would like to view some more information about this error, and get a list of all other errors that may occur in your export call, you can replace your `export()` call with `draft_export()`.

dynamo-ir

  • inputs: #2[(T1s3x4x4,T7s1),(T1s6x4x4,T7s1)]

  • shapes: dict(x:{0:Dim(batch)},shape:{})

<
   ir_version: 10,
   opset_import: ["" : 18],
   producer_name: "pytorch",
   producer_version: "2.9.0.dev20250701+cu126"
>
main_graph (float[batch,4,4] x, int64[1] shape) => (float[batch,4,2] slice_1) 
   <int64[1] val_7, int64[1] val_10, int64[1] val_14>
{
   [node_Constant_17] val_7 = Constant <value: tensor = int64[1] val_7 {0}> ()
   [node_Constant_20] val_10 = Constant <value: tensor = int64[1] val_10 {2}> ()
   [node_Constant_14] val_14 = Constant <value_ints: ints = [1]> ()
   [node_slice_1] slice_1 = Slice (x, val_7, val_10, val_10, val_14)
}

FAILED

diff.1

CropLastDimensionWithTensorShape

forward

def forward(self, x, y):
    return x[..., : y.shape[0]]

custom

  • inputs: #2[(T1s3x4x4,T1s2),(T1s6x4x4,T1s3)]

  • shapes: dict(x:{0:Dim(batch)},y:{0:Dim(crop)})

<
   ir_version: 8,
   opset_import: ["" : 18],
   metadata_props: ["large_model": "False", "inline": "True", "external_threshold": "1024", "function_options": "FunctionOptions()", "optimize": "True", "n_initializers": "2", "n_large_initializers": "0", "size_initializers": "16", "size_large_initializers": "0", "n_nodes": "2", "n_nodes_other_domain": "0", "mask_outputs": "[True]", "input_args": "(T1s3x4x4[0.003647148609161377,0.9746490716934204:A0.49173058196902275],T1s2[0.036011695861816406,0.23263895511627197:A0.1343253254890442])", "input_kwargs": "None", "optimizations": "OptimizationOptions(constant_folding={'Transpose', 'Cast', 'Reshape', 'Concat'}, patterns=[BatchNormalizationPattern(), BatchNormalizationTrainingPattern(), CastLayerNormalizationCastPattern(), CastPattern(), CastCastBinaryPattern(), CastOpCastPattern(), ClipClipPattern(), ComputationCastOpCastPattern(), ConcatGatherPattern(), ConvBiasNullPattern(), DropoutPattern(), ExpandPattern(), ExpandBroadcastPattern(), ExpandSwapPattern(), GeluPattern(), IdentityPattern(), LayerNormalizationPattern(), LayerNormalizationScalePattern(), LeakyReluPattern(), MulMulMulScalarPattern(), ReduceReshapePattern(), ReduceSumNormalizePattern(), ReshapePattern(), ReshapeMatMulReshapePattern(), Reshape2Of3Pattern(), ReshapeReshapeBinaryPattern(), MatMulAddPattern(), GemmTransposePattern(), MatMulReshape2Of3Pattern(), MulMulMatMulPattern(), ReshapeReshapePattern(), RotaryConcatPartPattern(), SameChildrenPattern(), SequenceConstructAtPattern(), SliceSlicePattern(), SlicesSplitPattern(), SoftmaxCrossEntropyLossCastPattern(), SplitConcatPattern(), SqueezeAddPattern(), SqueezeUnsqueezePattern(), Sub1MulPattern(), SwitchOrderBinaryPattern(), SwitchReshapeActivationPattern(), TransposeEqualReshapePattern(), TransposeMatMulPattern(), TransposeReshapeMatMulPattern(), TransposeReshapeTransposePattern(), TransposeTransposePattern(), UnsqueezeEqualPattern(), UnsqueezeUnsqueezePattern()])", "dynamic_shapes": "{'x': {0: Dim('batch', min=0)}, 'y': {0: Dim('crop', min=1, max=3)}}", "_discovered_shape_constraints": "{'batch': {'s77'}, 'crop': {'s17'}, 's17': {'crop'}, 's77': {'batch'}}", "_known_value_shapes": "{
  sym_size_int_2: crop,
  sym_size_int_2::UnSq0: ('crop',),
  y::Shape:1: ('crop',),
}"]
>
experiment (float[batch,4,4] x, float[crop] y) => (float[batch,4,crop] output_0) 
   <int64[1] init7_s1_0 =  {0}, int64[1] init7_s1_2 =  {2}, int64[1] "y::Shape:1", int64[1] "sym_size_int_2::UnSq0", float[batch,4,crop] slice_1, int64 sym_size_int_2>
{
   [sym_size_int] "y::Shape:1" = Shape <end: int = 1, start: int = 0> (y)
   [slide_Tensor] output_0 = Slice (x, init7_s1_0, "y::Shape:1", init7_s1_2)
}

dynamo-ir

  • inputs: #2[(T1s3x4x4,T1s2),(T1s6x4x4,T1s3)]

  • shapes: dict(x:{0:Dim(batch)},y:{0:Dim(crop)})

<
   ir_version: 10,
   opset_import: ["" : 18],
   producer_name: "pytorch",
   producer_version: "2.9.0.dev20250701+cu126"
>
main_graph (float[batch,4,4] x, float[crop] y) => (float[batch,4,crop] slice_1) 
   <int64[1] val_0, int64[1] val_4, int64[1] val_11, int64[1] val_12>
{
   [node_Shape_0] val_0 = Shape <end: int = 1, start: int = 0> (y)
   [node_Constant_15] val_4 = Constant <value: tensor = int64[1] val_4 {0}> ()
   [node_Constant_19] val_11 = Constant <value: tensor = int64[1] val_11 {2}> ()
   [node_Constant_12] val_12 = Constant <value_ints: ints = [1]> ()
   [node_slice_1] slice_1 = Slice (x, val_4, val_0, val_11, val_12)
}

InplaceAdd

forward

def forward(self, x):
    x += self.bias
    return x
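
In this case and the in-place variants that follow, the mutation is functionalized during export, so both ONNX graphs reduce to a single out-of-place Add. A hedged sketch of how to observe that with torch.export (the buffer registration is an assumption; the forward mirrors the one above):

import torch

class InplaceAdd(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.register_buffer("bias", torch.ones(1, 4))

    def forward(self, x):
        x += self.bias
        return x

ep = torch.export.export(InplaceAdd(), (torch.rand(3, 4),))
print(ep.graph)  # the in-place update appears as a functional aten.add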

custom

  • inputs: #2[(T1s3x4,),(T1s5x4,)]

  • shapes: dict(x:{0:Dim(batch)})

<
   ir_version: 8,
   opset_import: ["" : 18],
   metadata_props: ["large_model": "False", "inline": "True", "external_threshold": "1024", "function_options": "FunctionOptions()", "optimize": "True", "n_initializers": "1", "n_large_initializers": "0", "size_initializers": "16", "size_large_initializers": "0", "n_nodes": "1", "n_nodes_other_domain": "0", "mask_outputs": "[True]", "input_args": "(T1s3x4[6.029128551483154,6.933902263641357:A6.421988209088643],)", "input_kwargs": "None", "optimizations": "OptimizationOptions(constant_folding={'Transpose', 'Cast', 'Reshape', 'Concat'}, patterns=[BatchNormalizationPattern(), BatchNormalizationTrainingPattern(), CastLayerNormalizationCastPattern(), CastPattern(), CastCastBinaryPattern(), CastOpCastPattern(), ClipClipPattern(), ComputationCastOpCastPattern(), ConcatGatherPattern(), ConvBiasNullPattern(), DropoutPattern(), ExpandPattern(), ExpandBroadcastPattern(), ExpandSwapPattern(), GeluPattern(), IdentityPattern(), LayerNormalizationPattern(), LayerNormalizationScalePattern(), LeakyReluPattern(), MulMulMulScalarPattern(), ReduceReshapePattern(), ReduceSumNormalizePattern(), ReshapePattern(), ReshapeMatMulReshapePattern(), Reshape2Of3Pattern(), ReshapeReshapeBinaryPattern(), MatMulAddPattern(), GemmTransposePattern(), MatMulReshape2Of3Pattern(), MulMulMatMulPattern(), ReshapeReshapePattern(), RotaryConcatPartPattern(), SameChildrenPattern(), SequenceConstructAtPattern(), SliceSlicePattern(), SlicesSplitPattern(), SoftmaxCrossEntropyLossCastPattern(), SplitConcatPattern(), SqueezeAddPattern(), SqueezeUnsqueezePattern(), Sub1MulPattern(), SwitchOrderBinaryPattern(), SwitchReshapeActivationPattern(), TransposeEqualReshapePattern(), TransposeMatMulPattern(), TransposeReshapeMatMulPattern(), TransposeReshapeTransposePattern(), TransposeTransposePattern(), UnsqueezeEqualPattern(), UnsqueezeUnsqueezePattern()])", "dynamic_shapes": "{'x': {0: Dim('batch', min=0)}}", "_discovered_shape_constraints": "{'batch': {'s77'}, 's77': {'batch'}}"]
>
experiment (float[batch,4] x) => (float[batch,4] output_0) 
   <float[1,4] c_bias =  {1,1,1,1}, float[batch,4] add_>
{
   [add__Tensor] output_0 = Add (x, c_bias)
}

dynamo-ir

  • inputs: #2[(T1s3x4,),(T1s5x4,)]

  • shapes: dict(x:{0:Dim(batch)})

<
   ir_version: 10,
   opset_import: ["" : 18],
   producer_name: "pytorch",
   producer_version: "2.9.0.dev20250701+cu126"
>
main_graph (float[batch,4] x) => (float[batch,4] add_3) 
   <float[1,4] bias =  {1,1,1,1}, float[1,4] bias>
{
   [node_add_3] add_3 = Add (x, bias)
}

InplaceAdd2

forward

def forward(self, x):
    x.add_(self.bias)
    return x

custom

  • inputs: #2[(T1s3x4,),(T1s5x4,)]

  • shapes: dict(x:{0:Dim(batch)})

<
   ir_version: 8,
   opset_import: ["" : 18],
   metadata_props: ["large_model": "False", "inline": "True", "external_threshold": "1024", "function_options": "FunctionOptions()", "optimize": "True", "n_initializers": "1", "n_large_initializers": "0", "size_initializers": "16", "size_large_initializers": "0", "n_nodes": "1", "n_nodes_other_domain": "0", "mask_outputs": "[True]", "input_args": "(T1s3x4[8.156997680664062,8.949931144714355:A8.431713183720907],)", "input_kwargs": "None", "optimizations": "OptimizationOptions(constant_folding={'Transpose', 'Cast', 'Reshape', 'Concat'}, patterns=[BatchNormalizationPattern(), BatchNormalizationTrainingPattern(), CastLayerNormalizationCastPattern(), CastPattern(), CastCastBinaryPattern(), CastOpCastPattern(), ClipClipPattern(), ComputationCastOpCastPattern(), ConcatGatherPattern(), ConvBiasNullPattern(), DropoutPattern(), ExpandPattern(), ExpandBroadcastPattern(), ExpandSwapPattern(), GeluPattern(), IdentityPattern(), LayerNormalizationPattern(), LayerNormalizationScalePattern(), LeakyReluPattern(), MulMulMulScalarPattern(), ReduceReshapePattern(), ReduceSumNormalizePattern(), ReshapePattern(), ReshapeMatMulReshapePattern(), Reshape2Of3Pattern(), ReshapeReshapeBinaryPattern(), MatMulAddPattern(), GemmTransposePattern(), MatMulReshape2Of3Pattern(), MulMulMatMulPattern(), ReshapeReshapePattern(), RotaryConcatPartPattern(), SameChildrenPattern(), SequenceConstructAtPattern(), SliceSlicePattern(), SlicesSplitPattern(), SoftmaxCrossEntropyLossCastPattern(), SplitConcatPattern(), SqueezeAddPattern(), SqueezeUnsqueezePattern(), Sub1MulPattern(), SwitchOrderBinaryPattern(), SwitchReshapeActivationPattern(), TransposeEqualReshapePattern(), TransposeMatMulPattern(), TransposeReshapeMatMulPattern(), TransposeReshapeTransposePattern(), TransposeTransposePattern(), UnsqueezeEqualPattern(), UnsqueezeUnsqueezePattern()])", "dynamic_shapes": "{'x': {0: Dim('batch', min=0)}}", "_discovered_shape_constraints": "{'batch': {'s77'}, 's77': {'batch'}}"]
>
experiment (float[batch,4] x) => (float[batch,4] output_0) 
   <float[1,4] c_bias =  {1,1,1,1}, float[batch,4] add_>
{
   [add__Tensor] output_0 = Add (x, c_bias)
}

dynamo-ir

  • inputs: #2[(T1s3x4,),(T1s5x4,)]

  • shapes: dict(x:{0:Dim(batch)})

<
   ir_version: 10,
   opset_import: ["" : 18],
   producer_name: "pytorch",
   producer_version: "2.9.0.dev20250701+cu126"
>
main_graph (float[batch,4] x) => (float[batch,4] add_3) 
   <float[1,4] bias =  {1,1,1,1}, float[1,4] bias>
{
   [node_add_3] add_3 = Add (x, bias)
}

InplaceAdd_Mul

forward

def forward(self, x):
    x.add_(self.bias)
    return x * 2

custom

  • inputs: #2[(T1s3x4,),(T1s5x4,)]

  • shapes: dict(x:{0:Dim(batch)})

<
   ir_version: 8,
   opset_import: ["" : 18],
   metadata_props: ["large_model": "False", "inline": "True", "external_threshold": "1024", "function_options": "FunctionOptions()", "optimize": "True", "n_initializers": "2", "n_large_initializers": "0", "size_initializers": "20", "size_large_initializers": "0", "n_nodes": "2", "n_nodes_other_domain": "0", "mask_outputs": "[True]", "input_args": "(T1s3x4[8.001041412353516,8.841524124145508:A8.332729657491049],)", "input_kwargs": "None", "optimizations": "OptimizationOptions(constant_folding={'Transpose', 'Cast', 'Reshape', 'Concat'}, patterns=[BatchNormalizationPattern(), BatchNormalizationTrainingPattern(), CastLayerNormalizationCastPattern(), CastPattern(), CastCastBinaryPattern(), CastOpCastPattern(), ClipClipPattern(), ComputationCastOpCastPattern(), ConcatGatherPattern(), ConvBiasNullPattern(), DropoutPattern(), ExpandPattern(), ExpandBroadcastPattern(), ExpandSwapPattern(), GeluPattern(), IdentityPattern(), LayerNormalizationPattern(), LayerNormalizationScalePattern(), LeakyReluPattern(), MulMulMulScalarPattern(), ReduceReshapePattern(), ReduceSumNormalizePattern(), ReshapePattern(), ReshapeMatMulReshapePattern(), Reshape2Of3Pattern(), ReshapeReshapeBinaryPattern(), MatMulAddPattern(), GemmTransposePattern(), MatMulReshape2Of3Pattern(), MulMulMatMulPattern(), ReshapeReshapePattern(), RotaryConcatPartPattern(), SameChildrenPattern(), SequenceConstructAtPattern(), SliceSlicePattern(), SlicesSplitPattern(), SoftmaxCrossEntropyLossCastPattern(), SplitConcatPattern(), SqueezeAddPattern(), SqueezeUnsqueezePattern(), Sub1MulPattern(), SwitchOrderBinaryPattern(), SwitchReshapeActivationPattern(), TransposeEqualReshapePattern(), TransposeMatMulPattern(), TransposeReshapeMatMulPattern(), TransposeReshapeTransposePattern(), TransposeTransposePattern(), UnsqueezeEqualPattern(), UnsqueezeUnsqueezePattern()])", "dynamic_shapes": "{'x': {0: Dim('batch', min=0)}}", "_discovered_shape_constraints": "{'batch': {'s77'}, 's77': {'batch'}}"]
>
experiment (float[batch,4] x) => (float[batch,4] output_0) 
   <float[1,4] c_bias =  {1,1,1,1}, float[1] "init1_s_::RSh1" =  {2}, float[batch,4] add_, float[batch,4] _onx_mul_add_, int64[1] init7_s1_1, float init1_s_, float[batch,4] mul>
{
   [add__Tensor] add_ = Add (x, c_bias)
   [mul_Tensor2] output_0 = Mul (add_, "init1_s_::RSh1")
}

dynamo-ir

  • inputs: #2[(T1s3x4,),(T1s5x4,)]

  • shapes: dict(x:{0:Dim(batch)})

<
   ir_version: 10,
   opset_import: ["" : 18],
   producer_name: "pytorch",
   producer_version: "2.9.0.dev20250701+cu126"
>
main_graph (float[batch,4] x) => (float[batch,4] mul_4) 
   <float[1,4] bias =  {1,1,1,1}, float[1,4] bias, float[batch,4] add_3, float scalar_tensor_default>
{
   [node_add_3] add_3 = Add (x, bias)
   [node_Constant_1] scalar_tensor_default = Constant <value: tensor = float scalar_tensor_default {2}> ()
   [node_mul_4] mul_4 = Mul (add_3, scalar_tensor_default)
}

InplaceCloneAdd

forward

def forward(self, x):
    x = x.clone()
    x.add_(self.bias)
    return x

custom

  • inputs: #2[(T1s3x4,),(T1s5x4,)]

  • shapes: dict(x:{0:Dim(batch)})

<
   ir_version: 8,
   opset_import: ["" : 18],
   metadata_props: ["large_model": "False", "inline": "True", "external_threshold": "1024", "function_options": "FunctionOptions()", "optimize": "True", "n_initializers": "1", "n_large_initializers": "0", "size_initializers": "16", "size_large_initializers": "0", "n_nodes": "1", "n_nodes_other_domain": "0", "mask_outputs": "[True]", "input_args": "(T1s3x4[0.026060938835144043,0.8989080190658569:A0.3850300113360087],)", "input_kwargs": "None", "optimizations": "OptimizationOptions(constant_folding={'Transpose', 'Cast', 'Reshape', 'Concat'}, patterns=[BatchNormalizationPattern(), BatchNormalizationTrainingPattern(), CastLayerNormalizationCastPattern(), CastPattern(), CastCastBinaryPattern(), CastOpCastPattern(), ClipClipPattern(), ComputationCastOpCastPattern(), ConcatGatherPattern(), ConvBiasNullPattern(), DropoutPattern(), ExpandPattern(), ExpandBroadcastPattern(), ExpandSwapPattern(), GeluPattern(), IdentityPattern(), LayerNormalizationPattern(), LayerNormalizationScalePattern(), LeakyReluPattern(), MulMulMulScalarPattern(), ReduceReshapePattern(), ReduceSumNormalizePattern(), ReshapePattern(), ReshapeMatMulReshapePattern(), Reshape2Of3Pattern(), ReshapeReshapeBinaryPattern(), MatMulAddPattern(), GemmTransposePattern(), MatMulReshape2Of3Pattern(), MulMulMatMulPattern(), ReshapeReshapePattern(), RotaryConcatPartPattern(), SameChildrenPattern(), SequenceConstructAtPattern(), SliceSlicePattern(), SlicesSplitPattern(), SoftmaxCrossEntropyLossCastPattern(), SplitConcatPattern(), SqueezeAddPattern(), SqueezeUnsqueezePattern(), Sub1MulPattern(), SwitchOrderBinaryPattern(), SwitchReshapeActivationPattern(), TransposeEqualReshapePattern(), TransposeMatMulPattern(), TransposeReshapeMatMulPattern(), TransposeReshapeTransposePattern(), TransposeTransposePattern(), UnsqueezeEqualPattern(), UnsqueezeUnsqueezePattern()])", "dynamic_shapes": "{'x': {0: Dim('batch', min=0)}}", "_discovered_shape_constraints": "{'batch': {'s77'}, 's77': {'batch'}}"]
>
experiment (float[batch,4] x) => (float[batch,4] output_0) 
   <float[1,4] c_bias =  {1,1,1,1}, float[batch,4] clone, float[batch,4] add_>
{
   [add__Tensor] output_0 = Add (x, c_bias)
}

dynamo-ir

  • inputs: #2[(T1s3x4,),(T1s5x4,)]

  • shapes: dict(x:{0:Dim(batch)})

<
   ir_version: 10,
   opset_import: ["" : 18],
   producer_name: "pytorch",
   producer_version: "2.9.0.dev20250701+cu126"
>
main_graph (float[batch,4] x) => (float[batch,4] add_6) 
   <float[1,4] bias =  {1,1,1,1}, float[1,4] bias>
{
   [node_add_6] add_6 = Add (x, bias)
}

InplaceSetItemEllipsis_1

forward

def forward(self, index, update):
    copy = self.params.clone()
    copy[..., index] = update
    return copy

custom

FAILED

L['update'].size()[0] = 8192 is not equal to L['index'].size()[0] = 4

The error above occurred when calling torch.export.export. If you would like to view some more information about this error, and get a list of all other errors that may occur in your export call, you can replace your `export()` call with `draft_export()`.

dynamo-ir

FAILED

Failed to convert the exported program to an ONNX model. This is step 3/3 of exporting the model to ONNX. Next steps:
- If there is a missing ONNX function, implement it and register it to the registry.
- If there is an internal error during ONNX conversion, debug the error and submit a PR to PyTorch.
- Create an error report with `torch.onnx.export(..., report=True)`, and save the ExportedProgram as a pt2 file. Create an issue in the PyTorch GitHub repository against the *onnx* component. Attach the error report and the pt2 model.

## Exception summary

<class 'TypeError'>: int() argument must be a string, a bytes-like object or a real number, not 'SymbolicDim'
⬆️
<class 'torch.onnx._internal.exporter._errors.GraphConstructionError'>: Error processing Python constants for operator '::Expand'. named_inputs={'input': SymbolicTensor(name='anonymous:132313144740592', producer=anonymous_node:132316699979952, index=0), 'shape': (SymbolicDim(s5), SymbolicDim(s32))}, named_attrs={}, opset=, op_signature=''::Expand(input: T, shape: shape) -> (T) where T=UINT64 | COMPLEX64 | UINT8 | DOUBLE | UINT16 | BFLOAT16 | INT16 | STRING | FLOAT | INT32 | FLOAT16 | UINT32 | COMPLEX128 | INT8 | INT64 | BOOL, shape=INT64.
⬆️
<class 'torch.onnx._internal.exporter._errors.GraphConstructionError'>: Error calling operator 'Expand' with args (SymbolicTensor(name='anonymous:132313144740592', producer=anonymous_node:132316699979952, index=0), (SymbolicDim(s5), SymbolicDim(s32))) and kwargs {}.
⬆️
<class 'torch.onnx._internal.exporter._errors.GraphConstructionError'>: Error when calling function 'TracedOnnxFunction(<function aten_index_put at 0x7856941954e0>)' with args '[SymbolicTensor(name='clone', type=Tensor(FLOAT), shape=Shape([1, 8192, 4]), producer='node_clone', index=0), [None, None, SymbolicTensor(name='index', type=Tensor(INT64), shape=Shape([SymbolicDim(s91)]))], SymbolicTensor(name='update', type=Tensor(FLOAT), shape=Shape([SymbolicDim(s5), SymbolicDim(s32)]))]' and kwargs '{}'
⬆️
<class 'torch.onnx._internal.exporter._errors.ConversionError'>: Error when translating node %index_put : [num_users=1] = call_function[target=torch.ops.aten.index_put.default](args = (%clone, [None, None, %index], %update), kwargs = {}). See the stack trace for more information.

(Refer to the full stack trace above for more information.)

InplaceSetItemEllipsis_2

forward

def forward(self, index, update):
    copy = self.params.clone()
    copy[..., index] = update
    return copy

custom

FAILED

L['update'].size()[0] = 8192 is not equal to L['index'].size()[0] = 4

The error above occurred when calling torch.export.export. If you would like to view some more information about this error, and get a list of all other errors that may occur in your export call, you can replace your `export()` call with `draft_export()`.

dynamo-ir

FAILED

Failed to convert the exported program to an ONNX model. This is step 3/3 of exporting the model to ONNX. Next steps:
- If there is a missing ONNX function, implement it and register it to the registry.
- If there is an internal error during ONNX conversion, debug the error and submit a PR to PyTorch.
- Create an error report with `torch.onnx.export(..., report=True)`, and save the ExportedProgram as a pt2 file. Create an issue in the PyTorch GitHub repository against the *onnx* component. Attach the error report and the pt2 model.

## Exception summary

<class 'TypeError'>: int() argument must be a string, a bytes-like object or a real number, not 'SymbolicDim'
⬆️
<class 'torch.onnx._internal.exporter._errors.GraphConstructionError'>: Error processing Python constants for operator '::Expand'. named_inputs={'input': SymbolicTensor(name='anonymous:132313147898640', producer=anonymous_node:132313170417504, index=0), 'shape': (SymbolicDim(s5), SymbolicDim(s32))}, named_attrs={}, opset=, op_signature=''::Expand(input: T, shape: shape) -> (T) where T=UINT64 | COMPLEX64 | UINT8 | DOUBLE | UINT16 | BFLOAT16 | INT16 | STRING | FLOAT | INT32 | FLOAT16 | UINT32 | COMPLEX128 | INT8 | INT64 | BOOL, shape=INT64.
⬆️
<class 'torch.onnx._internal.exporter._errors.GraphConstructionError'>: Error calling operator 'Expand' with args (SymbolicTensor(name='anonymous:132313147898640', producer=anonymous_node:132313170417504, index=0), (SymbolicDim(s5), SymbolicDim(s32))) and kwargs {}.
⬆️
<class 'torch.onnx._internal.exporter._errors.GraphConstructionError'>: Error when calling function 'TracedOnnxFunction(<function aten_index_put at 0x7856941954e0>)' with args '[SymbolicTensor(name='clone', type=Tensor(FLOAT), shape=Shape([1, 8192, 6]), producer='node_clone', index=0), [None, None, SymbolicTensor(name='index', type=Tensor(INT64), shape=Shape([SymbolicDim(s91)]))], SymbolicTensor(name='update', type=Tensor(FLOAT), shape=Shape([SymbolicDim(s5), SymbolicDim(s32)]))]' and kwargs '{}'
⬆️
<class 'torch.onnx._internal.exporter._errors.ConversionError'>: Error when translating node %index_put : [num_users=1] = call_function[target=torch.ops.aten.index_put.default](args = (%clone, [None, None, %index], %update), kwargs = {}). See the stack trace for more information.

(Refer to the full stack trace above for more information.)

InplaceSetItemMask

forward

def forward(self, x):
    mask = x.to(bool)
    x[mask] = 2
    return x
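
Both exporters lower the boolean-mask assignment to a single Where node. A minimal eager-mode equivalence sketch (illustrative, not the exporters' code):

import torch

x = torch.rand(2, 3, 3)
mask = x.to(bool)
inplace = x.clone()
inplace[mask] = 2
functional = torch.where(mask, torch.tensor(2.0), x)
assert torch.equal(inplace, functional)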

custom

  • inputs: #2[(T1s2x3x3,),(T1s3x3x3,)]

  • shapes: dict(x:{0:Dim(batch)})

<
   ir_version: 8,
   opset_import: ["" : 18],
   metadata_props: ["large_model": "False", "inline": "True", "external_threshold": "1024", "function_options": "FunctionOptions()", "optimize": "True", "n_initializers": "1", "n_large_initializers": "0", "size_initializers": "4", "size_large_initializers": "0", "n_nodes": "2", "n_nodes_other_domain": "0", "mask_outputs": "[True]", "input_args": "(T1s2x3x3[2.0,2.0:A2.0],)", "input_kwargs": "None", "optimizations": "OptimizationOptions(constant_folding={'Transpose', 'Cast', 'Reshape', 'Concat'}, patterns=[BatchNormalizationPattern(), BatchNormalizationTrainingPattern(), CastLayerNormalizationCastPattern(), CastPattern(), CastCastBinaryPattern(), CastOpCastPattern(), ClipClipPattern(), ComputationCastOpCastPattern(), ConcatGatherPattern(), ConvBiasNullPattern(), DropoutPattern(), ExpandPattern(), ExpandBroadcastPattern(), ExpandSwapPattern(), GeluPattern(), IdentityPattern(), LayerNormalizationPattern(), LayerNormalizationScalePattern(), LeakyReluPattern(), MulMulMulScalarPattern(), ReduceReshapePattern(), ReduceSumNormalizePattern(), ReshapePattern(), ReshapeMatMulReshapePattern(), Reshape2Of3Pattern(), ReshapeReshapeBinaryPattern(), MatMulAddPattern(), GemmTransposePattern(), MatMulReshape2Of3Pattern(), MulMulMatMulPattern(), ReshapeReshapePattern(), RotaryConcatPartPattern(), SameChildrenPattern(), SequenceConstructAtPattern(), SliceSlicePattern(), SlicesSplitPattern(), SoftmaxCrossEntropyLossCastPattern(), SplitConcatPattern(), SqueezeAddPattern(), SqueezeUnsqueezePattern(), Sub1MulPattern(), SwitchOrderBinaryPattern(), SwitchReshapeActivationPattern(), TransposeEqualReshapePattern(), TransposeMatMulPattern(), TransposeReshapeMatMulPattern(), TransposeReshapeTransposePattern(), TransposeTransposePattern(), UnsqueezeEqualPattern(), UnsqueezeUnsqueezePattern()])", "dynamic_shapes": "{'x': {0: Dim('batch', min=0)}}", "_discovered_shape_constraints": "{'batch': {'s77'}, 's77': {'batch'}}"]
>
experiment (float[batch,3,3] x) => (float[batch,3,3] output_0) 
   <float c_lifted_tensor_0 =  {2}, bool[batch,3,3] to, float[batch,3,3] index_put_, float lift_fresh_copy>
{
   [to_dtype] to = Cast <to: int = 9> (x)
   [index_put_1b__where] output_0 = Where (to, c_lifted_tensor_0, x)
}

dynamo-ir

  • inputs: #2[(T1s2x3x3,),(T1s3x3x3,)]

  • shapes: dict(x:{0:Dim(batch)})

<
   ir_version: 10,
   opset_import: ["" : 18],
   producer_name: "pytorch",
   producer_version: "2.9.0.dev20250701+cu126"
>
main_graph (float[batch,3,3] x) => (float[batch,3,3] index_put) 
   <bool[batch,3,3] _to_copy, float clone>
{
   [node__to_copy] _to_copy = Cast <to: int = 9> (x)
   [node_Constant_0] clone = Constant <value: tensor = float clone {2}> ()
   [node_index_put] index_put = Where (_to_copy, clone, x)
}

InplaceSetItemSquare

forward

def forward(self, x):
    x[:2, :3] = 1
    return x
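
The slice assignment decomposes into slice_scatter operations, which is what the Transpose/ScatterND sequences in both graphs below implement. A hedged eager-mode sketch of the same decomposition (using torch.slice_scatter; not the exporters' exact lowering):

import torch

x = torch.rand(5, 5)
src = torch.ones(2, 3)
rows = torch.slice_scatter(x[:2], src, dim=1, start=0, end=3)  # fill columns 0..2 of the first two rows
out = torch.slice_scatter(x, rows, dim=0, start=0, end=2)      # write those rows back into x

ref = x.clone()
ref[:2, :3] = 1
assert torch.equal(out, ref)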

custom

  • inputs: #2[(T1s5x5,),(T1s7x5,)]

  • shapes: dict(x:{0:Dim(batch)})

<
   ir_version: 8,
   opset_import: ["" : 18],
   metadata_props: ["large_model": "False", "inline": "True", "external_threshold": "1024", "function_options": "FunctionOptions()", "optimize": "True", "n_initializers": "6", "n_large_initializers": "0", "size_initializers": "88", "size_large_initializers": "0", "n_nodes": "11", "n_nodes_other_domain": "0", "mask_outputs": "[False, True]", "input_args": "(T1s5x5[0.04932647943496704,1.0:A0.5737357687950134],)", "input_kwargs": "None", "optimizations": "OptimizationOptions(constant_folding={'Transpose', 'Cast', 'Reshape', 'Concat'}, patterns=[BatchNormalizationPattern(), BatchNormalizationTrainingPattern(), CastLayerNormalizationCastPattern(), CastPattern(), CastCastBinaryPattern(), CastOpCastPattern(), ClipClipPattern(), ComputationCastOpCastPattern(), ConcatGatherPattern(), ConvBiasNullPattern(), DropoutPattern(), ExpandPattern(), ExpandBroadcastPattern(), ExpandSwapPattern(), GeluPattern(), IdentityPattern(), LayerNormalizationPattern(), LayerNormalizationScalePattern(), LeakyReluPattern(), MulMulMulScalarPattern(), ReduceReshapePattern(), ReduceSumNormalizePattern(), ReshapePattern(), ReshapeMatMulReshapePattern(), Reshape2Of3Pattern(), ReshapeReshapeBinaryPattern(), MatMulAddPattern(), GemmTransposePattern(), MatMulReshape2Of3Pattern(), MulMulMatMulPattern(), ReshapeReshapePattern(), RotaryConcatPartPattern(), SameChildrenPattern(), SequenceConstructAtPattern(), SliceSlicePattern(), SlicesSplitPattern(), SoftmaxCrossEntropyLossCastPattern(), SplitConcatPattern(), SqueezeAddPattern(), SqueezeUnsqueezePattern(), Sub1MulPattern(), SwitchOrderBinaryPattern(), SwitchReshapeActivationPattern(), TransposeEqualReshapePattern(), TransposeMatMulPattern(), TransposeReshapeMatMulPattern(), TransposeReshapeTransposePattern(), TransposeTransposePattern(), UnsqueezeEqualPattern(), UnsqueezeUnsqueezePattern()])", "dynamic_shapes": "{'x': {0: Dim('batch', min=0)}}", "_discovered_shape_constraints": "{'batch': {'s77'}, 's77': {'batch'}}", "_known_value_shapes": "{
  _onx_gather_x::Shape:: ('batch',),
  slice_2::Shape:: (2, 3),
  x::Shape:: ('batch', 5),
}"]
>
experiment (float[batch,5] x) => (float[batch,5] output_1) 
   <int64[1] init7_s1_0 =  {0}, int64[1] init7_s1_2 =  {2}, int64[1] init7_s1_1 =  {1}, int64[2] "init7_s2_-1_1" =  {-1,1}, int64[3,1] "init7_s3_0_1_2::RSh-1x1" =  {0,1,2}, float[3,2] "fill::T10" =  {1,1,1,1,1,1}, float[batch,5] output_0, float[5,2] "slice_3::T10", int64[2] "x::Shape:", float[5,2] "_onx_scatternd_slice_3::T10", int64[?] _onx_range_init7_s1_0, float c_lifted_tensor_0, float clone, int64[1] "_onx_gather_x::Shape:", int64[?,?] "_onx_slice_range_init7_s1_0::RSh-1x1", float[2,5] slice_scatter, float[2,5] slice_3, int64[3] init7_s3_0_1_2, int64[1] init7_s1_3, float[batch,5] slice_scatter_1, float[2,3] fill, float[2,3] slice_2, int64[2] "slice_2::Shape:", int64[?] _onx_slice_range_init7_s1_0, float[2,5] slice_1>
{
   [slide_Tensor3] slice_3 = Slice (x, init7_s1_0, init7_s1_2, init7_s1_0)
   [slice_scatter_static2] "slice_3::T10" = Transpose <perm: ints = [1, 0]> (slice_3)
   [slice_scatter_static4] "_onx_scatternd_slice_3::T10" = ScatterND ("slice_3::T10", "init7_s3_0_1_2::RSh-1x1", "fill::T10")
   [slice_scatter_static5] slice_scatter = Transpose <perm: ints = [1, 0]> ("_onx_scatternd_slice_3::T10")
   [slice_scatter_dynamic] "x::Shape:" = Shape (x)
   [slice_scatter_dynamic2] "_onx_gather_x::Shape:" = Gather ("x::Shape:", init7_s1_0)
   [slice_scatter_dynamic3] _onx_range_init7_s1_0 = Range (init7_s1_0, "_onx_gather_x::Shape:", init7_s1_1)
   [slice_scatter_dynamic4] _onx_slice_range_init7_s1_0 = Slice (_onx_range_init7_s1_0, init7_s1_0, init7_s1_2, init7_s1_0, init7_s1_1)
   [slice_scatter_dynamic5] "_onx_slice_range_init7_s1_0::RSh-1x1" = Reshape (_onx_slice_range_init7_s1_0, "init7_s2_-1_1")
   [slice_scatter_dynamic6] output_0 = ScatterND (x, "_onx_slice_range_init7_s1_0::RSh-1x1", slice_scatter)
   [".output2"] output_1 = Identity (output_0)
}

dynamo-ir

  • inputs: #2[(T1s5x5,),(T1s7x5,)]

  • shapes: dict(x:{0:Dim(batch)})

<
   ir_version: 10,
   opset_import: ["" : 18],
   producer_name: "pytorch",
   producer_version: "2.9.0.dev20250701+cu126"
>
main_graph (float[batch,5] x) => (float[batch,5] slice_scatter_1) 
   <int64 val_0, int64 val_19, int64[1] val_26, int64[1] val_29, int64[1] val_33, float[2,5] slice_3, int64[1] val_42, int64[3,1] val_43, float[3,2] val_44, float[5,2] val_45, float[5,2] val_46, float[2,5] slice_scatter, int64[1] val_47, int64[2] val_48, int64 val_49, int64[?] val_50, int64[1] val_53, int64[?] val_54, int64[?,1] val_55>
{
   [node_Constant_0] val_0 = Constant <value: tensor = int64 {0}> ()
   [node_Constant_19] val_19 = Constant <value: tensor = int64 {1}> ()
   [node_Constant_81] val_26 = Constant <value: tensor = int64[1] val_26 {0}> ()
   [node_Constant_84] val_29 = Constant <value: tensor = int64[1] val_29 {2}> ()
   [node_Constant_33] val_33 = Constant <value_ints: ints = [1]> ()
   [node_slice_3] slice_3 = Slice (x, val_26, val_29, val_26, val_33)
   [node_Constant_42] val_42 = Constant <value: tensor = int64[1] {-1}> ()
   [node_Constant_95] val_43 = Constant <value: tensor = int64[3,1] val_43 {0,1,2}> ()
   [node_Constant_96] val_44 = Constant <value: tensor = float[3,2] val_44 {1,1,1,1,1,1}> ()
   [node_Transpose_45] val_45 = Transpose <perm: ints = [1, 0]> (slice_3)
   [node_ScatterND_46] val_46 = ScatterND <reduction: string = "none"> (val_45, val_43, val_44)
   [node_slice_scatter] slice_scatter = Transpose <perm: ints = [1, 0]> (val_46)
   [node_Constant_47] val_47 = Constant <value_ints: ints = [0]> ()
   [node_Shape_48] val_48 = Shape <start: int = 0> (x)
   [node_Gather_49] val_49 = Gather <axis: int = 0> (val_48, val_0)
   [node_Range_50] val_50 = Range (val_0, val_49, val_19)
   [node_Constant_99] val_53 = Constant <value: tensor = int64[1] val_53 {1}> ()
   [node_Slice_54] val_54 = Slice (val_50, val_26, val_29, val_47, val_53)
   [node_Unsqueeze_55] val_55 = Unsqueeze (val_54, val_42)
   [node_slice_scatter_1] slice_scatter_1 = ScatterND <reduction: string = "none"> (x, val_55, slice_scatter)
}

InplaceSetItemSquareAdd

forward

def forward(self, x):
    x[:2, :3] = 1
    return x + 2

custom

  • inputs: #2[(T1s5x5,),(T1s7x5,)]

  • shapes: dict(x:{0:Dim(batch)})

<
   ir_version: 8,
   opset_import: ["" : 18],
   metadata_props: ["large_model": "False", "inline": "True", "external_threshold": "1024", "function_options": "FunctionOptions()", "optimize": "True", "n_initializers": "7", "n_large_initializers": "0", "size_initializers": "92", "size_large_initializers": "0", "n_nodes": "11", "n_nodes_other_domain": "0", "mask_outputs": "[False, True]", "input_args": "(T1s5x5[0.06415653228759766,1.0:A0.6122418832778931],)", "input_kwargs": "None", "optimizations": "OptimizationOptions(constant_folding={'Transpose', 'Cast', 'Reshape', 'Concat'}, patterns=[BatchNormalizationPattern(), BatchNormalizationTrainingPattern(), CastLayerNormalizationCastPattern(), CastPattern(), CastCastBinaryPattern(), CastOpCastPattern(), ClipClipPattern(), ComputationCastOpCastPattern(), ConcatGatherPattern(), ConvBiasNullPattern(), DropoutPattern(), ExpandPattern(), ExpandBroadcastPattern(), ExpandSwapPattern(), GeluPattern(), IdentityPattern(), LayerNormalizationPattern(), LayerNormalizationScalePattern(), LeakyReluPattern(), MulMulMulScalarPattern(), ReduceReshapePattern(), ReduceSumNormalizePattern(), ReshapePattern(), ReshapeMatMulReshapePattern(), Reshape2Of3Pattern(), ReshapeReshapeBinaryPattern(), MatMulAddPattern(), GemmTransposePattern(), MatMulReshape2Of3Pattern(), MulMulMatMulPattern(), ReshapeReshapePattern(), RotaryConcatPartPattern(), SameChildrenPattern(), SequenceConstructAtPattern(), SliceSlicePattern(), SlicesSplitPattern(), SoftmaxCrossEntropyLossCastPattern(), SplitConcatPattern(), SqueezeAddPattern(), SqueezeUnsqueezePattern(), Sub1MulPattern(), SwitchOrderBinaryPattern(), SwitchReshapeActivationPattern(), TransposeEqualReshapePattern(), TransposeMatMulPattern(), TransposeReshapeMatMulPattern(), TransposeReshapeTransposePattern(), TransposeTransposePattern(), UnsqueezeEqualPattern(), UnsqueezeUnsqueezePattern()])", "dynamic_shapes": "{'x': {0: Dim('batch', min=0)}}", "_discovered_shape_constraints": "{'batch': {'s77'}, 's77': {'batch'}}", "_known_value_shapes": "{
  _onx_gather_x::Shape:: ('batch',),
  slice_2::Shape:: (2, 3),
  x::Shape:: ('batch', 5),
}"]
>
experiment (float[batch,5] x) => (float[batch,5] output_1) 
   <int64[1] init7_s1_0 =  {0}, int64[1] init7_s1_2 =  {2}, int64[1] init7_s1_1 =  {1}, int64[2] "init7_s2_-1_1" =  {-1,1}, int64[3,1] "init7_s3_0_1_2::RSh-1x1" =  {0,1,2}, float[3,2] "fill::T10" =  {1,1,1,1,1,1}, float[1] "init1_s_::RSh1" =  {2}, float[batch,5] output_0, float[5,2] "slice_3::T10", float init1_s_, int64[2] "x::Shape:", float[5,2] "_onx_scatternd_slice_3::T10", int64[?] _onx_range_init7_s1_0, float c_lifted_tensor_0, float clone, int64[1] "_onx_gather_x::Shape:", int64[?,?] "_onx_slice_range_init7_s1_0::RSh-1x1", float[2,5] slice_scatter, float[2,5] slice_3, float[batch,5] add, int64[3] init7_s3_0_1_2, int64[1] init7_s1_3, float[batch,5] slice_scatter_1, float[2,3] fill, float[2,3] slice_2, int64[2] "slice_2::Shape:", int64[?] _onx_slice_range_init7_s1_0, float[2,5] slice_1>
{
   [slide_Tensor3] slice_3 = Slice (x, init7_s1_0, init7_s1_2, init7_s1_0)
   [slice_scatter_static2] "slice_3::T10" = Transpose <perm: ints = [1, 0]> (slice_3)
   [slice_scatter_static4] "_onx_scatternd_slice_3::T10" = ScatterND ("slice_3::T10", "init7_s3_0_1_2::RSh-1x1", "fill::T10")
   [slice_scatter_static5] slice_scatter = Transpose <perm: ints = [1, 0]> ("_onx_scatternd_slice_3::T10")
   [slice_scatter_dynamic] "x::Shape:" = Shape (x)
   [slice_scatter_dynamic2] "_onx_gather_x::Shape:" = Gather ("x::Shape:", init7_s1_0)
   [slice_scatter_dynamic3] _onx_range_init7_s1_0 = Range (init7_s1_0, "_onx_gather_x::Shape:", init7_s1_1)
   [slice_scatter_dynamic4] _onx_slice_range_init7_s1_0 = Slice (_onx_range_init7_s1_0, init7_s1_0, init7_s1_2, init7_s1_0, init7_s1_1)
   [slice_scatter_dynamic5] "_onx_slice_range_init7_s1_0::RSh-1x1" = Reshape (_onx_slice_range_init7_s1_0, "init7_s2_-1_1")
   [slice_scatter_dynamic6] output_0 = ScatterND (x, "_onx_slice_range_init7_s1_0::RSh-1x1", slice_scatter)
   [add_Tensor] output_1 = Add (output_0, "init1_s_::RSh1")
}

dynamo-ir

  • inputs: #2[(T1s5x5,),(T1s7x5,)]

  • shapes: dict(x:{0:Dim(batch)})

<
   ir_version: 10,
   opset_import: ["" : 18],
   producer_name: "pytorch",
   producer_version: "2.9.0.dev20250701+cu126"
>
main_graph (float[batch,5] x) => (float[batch,5] add) 
   <int64 val_0, int64 val_19, int64[1] val_26, int64[1] val_29, int64[1] val_33, float[2,5] slice_3, int64[1] val_42, int64[3,1] val_43, float[3,2] val_44, float[5,2] val_45, float[5,2] val_46, float[2,5] slice_scatter, int64[1] val_47, int64[2] val_48, int64 val_49, int64[?] val_50, int64[1] val_53, int64[?] val_54, int64[?,1] val_55, float[batch,5] slice_scatter_1, float scalar_tensor_default>
{
   [node_Constant_0] val_0 = Constant <value: tensor = int64 {0}> ()
   [node_Constant_19] val_19 = Constant <value: tensor = int64 {1}> ()
   [node_Constant_81] val_26 = Constant <value: tensor = int64[1] val_26 {0}> ()
   [node_Constant_84] val_29 = Constant <value: tensor = int64[1] val_29 {2}> ()
   [node_Constant_33] val_33 = Constant <value_ints: ints = [1]> ()
   [node_slice_3] slice_3 = Slice (x, val_26, val_29, val_26, val_33)
   [node_Constant_42] val_42 = Constant <value: tensor = int64[1] {-1}> ()
   [node_Constant_95] val_43 = Constant <value: tensor = int64[3,1] val_43 {0,1,2}> ()
   [node_Constant_96] val_44 = Constant <value: tensor = float[3,2] val_44 {1,1,1,1,1,1}> ()
   [node_Transpose_45] val_45 = Transpose <perm: ints = [1, 0]> (slice_3)
   [node_ScatterND_46] val_46 = ScatterND <reduction: string = "none"> (val_45, val_43, val_44)
   [node_slice_scatter] slice_scatter = Transpose <perm: ints = [1, 0]> (val_46)
   [node_Constant_47] val_47 = Constant <value_ints: ints = [0]> ()
   [node_Shape_48] val_48 = Shape <start: int = 0> (x)
   [node_Gather_49] val_49 = Gather <axis: int = 0> (val_48, val_0)
   [node_Range_50] val_50 = Range (val_0, val_49, val_19)
   [node_Constant_99] val_53 = Constant <value: tensor = int64[1] val_53 {1}> ()
   [node_Slice_54] val_54 = Slice (val_50, val_26, val_29, val_47, val_53)
   [node_Unsqueeze_55] val_55 = Unsqueeze (val_54, val_42)
   [node_slice_scatter_1] slice_scatter_1 = ScatterND <reduction: string = "none"> (x, val_55, slice_scatter)
   [node_Constant_100] scalar_tensor_default = Constant <value: tensor = float scalar_tensor_default {2}> ()
   [node_add] add = Add (slice_scatter_1, scalar_tensor_default)
}

InplaceSetItemSquareAdd2

forward

def forward(self, x):
    x[:2, :3] = 1
    return x + 2, x + 3

custom

  • inputs: #2[(T1s5x5,),(T1s7x5,)]

  • shapes: dict(x:{0:Dim(batch)})

<
   ir_version: 8,
   opset_import: ["" : 18],
   metadata_props: ["large_model": "False", "inline": "True", "external_threshold": "1024", "function_options": "FunctionOptions()", "optimize": "True", "n_initializers": "8", "n_large_initializers": "0", "size_initializers": "96", "size_large_initializers": "0", "n_nodes": "12", "n_nodes_other_domain": "0", "mask_outputs": "[False, True, True]", "input_args": "(T1s5x5[0.011756062507629395,1.0:A0.6521409487724305],)", "input_kwargs": "None", "optimizations": "OptimizationOptions(constant_folding={'Transpose', 'Cast', 'Reshape', 'Concat'}, patterns=[BatchNormalizationPattern(), BatchNormalizationTrainingPattern(), CastLayerNormalizationCastPattern(), CastPattern(), CastCastBinaryPattern(), CastOpCastPattern(), ClipClipPattern(), ComputationCastOpCastPattern(), ConcatGatherPattern(), ConvBiasNullPattern(), DropoutPattern(), ExpandPattern(), ExpandBroadcastPattern(), ExpandSwapPattern(), GeluPattern(), IdentityPattern(), LayerNormalizationPattern(), LayerNormalizationScalePattern(), LeakyReluPattern(), MulMulMulScalarPattern(), ReduceReshapePattern(), ReduceSumNormalizePattern(), ReshapePattern(), ReshapeMatMulReshapePattern(), Reshape2Of3Pattern(), ReshapeReshapeBinaryPattern(), MatMulAddPattern(), GemmTransposePattern(), MatMulReshape2Of3Pattern(), MulMulMatMulPattern(), ReshapeReshapePattern(), RotaryConcatPartPattern(), SameChildrenPattern(), SequenceConstructAtPattern(), SliceSlicePattern(), SlicesSplitPattern(), SoftmaxCrossEntropyLossCastPattern(), SplitConcatPattern(), SqueezeAddPattern(), SqueezeUnsqueezePattern(), Sub1MulPattern(), SwitchOrderBinaryPattern(), SwitchReshapeActivationPattern(), TransposeEqualReshapePattern(), TransposeMatMulPattern(), TransposeReshapeMatMulPattern(), TransposeReshapeTransposePattern(), TransposeTransposePattern(), UnsqueezeEqualPattern(), UnsqueezeUnsqueezePattern()])", "dynamic_shapes": "{'x': {0: Dim('batch', min=0)}}", "_discovered_shape_constraints": "{'batch': {'s77'}, 's77': {'batch'}}", "_known_value_shapes": "{
  _onx_gather_x::Shape:: ('batch',),
  slice_2::Shape:: (2, 3),
  x::Shape:: ('batch', 5),
}"]
>
experiment (float[batch,5] x) => (float[batch,5] output_1, float[batch,5] output_2) 
   <int64[1] init7_s1_0 =  {0}, int64[1] init7_s1_2 =  {2}, int64[1] init7_s1_1 =  {1}, int64[2] "init7_s2_-1_1" =  {-1,1}, int64[3,1] "init7_s3_0_1_2::RSh-1x1" =  {0,1,2}, float[3,2] "fill::T10" =  {1,1,1,1,1,1}, float[1] "init1_s_::RSh1" =  {2}, float[1] "init1_s_2::RSh1" =  {3}, float[batch,5] output_0, float[5,2] "slice_3::T10", float init1_s_, int64[2] "x::Shape:", float[5,2] "_onx_scatternd_slice_3::T10", int64[?] _onx_range_init7_s1_0, float c_lifted_tensor_0, float clone, int64[1] "_onx_gather_x::Shape:", int64[?,?] "_onx_slice_range_init7_s1_0::RSh-1x1", float[2,5] slice_scatter, float[2,5] slice_3, float[batch,5] add, float[batch,5] add_4, int64[3] init7_s3_0_1_2, int64[1] init7_s1_3, float[batch,5] slice_scatter_1, float[2,3] fill, float[2,3] slice_2, int64[2] "slice_2::Shape:", int64[?] _onx_slice_range_init7_s1_0, float init1_s_2, float[2,5] slice_1>
{
   [slide_Tensor3] slice_3 = Slice (x, init7_s1_0, init7_s1_2, init7_s1_0)
   [slice_scatter_static2] "slice_3::T10" = Transpose <perm: ints = [1, 0]> (slice_3)
   [slice_scatter_static4] "_onx_scatternd_slice_3::T10" = ScatterND ("slice_3::T10", "init7_s3_0_1_2::RSh-1x1", "fill::T10")
   [slice_scatter_static5] slice_scatter = Transpose <perm: ints = [1, 0]> ("_onx_scatternd_slice_3::T10")
   [slice_scatter_dynamic] "x::Shape:" = Shape (x)
   [slice_scatter_dynamic2] "_onx_gather_x::Shape:" = Gather ("x::Shape:", init7_s1_0)
   [slice_scatter_dynamic3] _onx_range_init7_s1_0 = Range (init7_s1_0, "_onx_gather_x::Shape:", init7_s1_1)
   [slice_scatter_dynamic4] _onx_slice_range_init7_s1_0 = Slice (_onx_range_init7_s1_0, init7_s1_0, init7_s1_2, init7_s1_0, init7_s1_1)
   [slice_scatter_dynamic5] "_onx_slice_range_init7_s1_0::RSh-1x1" = Reshape (_onx_slice_range_init7_s1_0, "init7_s2_-1_1")
   [slice_scatter_dynamic6] output_0 = ScatterND (x, "_onx_slice_range_init7_s1_0::RSh-1x1", slice_scatter)
   [add_Tensor] output_1 = Add (output_0, "init1_s_::RSh1")
   [add_Tensor2] output_2 = Add (output_0, "init1_s_2::RSh1")
}

dynamo-ir

  • inputs: #2[(T1s5x5,),(T1s7x5,)]

  • shapes: dict(x:{0:Dim(batch)})

<
   ir_version: 10,
   opset_import: ["" : 18],
   producer_name: "pytorch",
   producer_version: "2.9.0.dev20250701+cu126"
>
main_graph (float[batch,5] x) => (float[batch,5] add, float[batch,5] add_4) 
   <int64 val_0, int64 val_19, int64[1] val_26, int64[1] val_29, int64[1] val_33, float[2,5] slice_3, int64[1] val_42, int64[3,1] val_43, float[3,2] val_44, float[5,2] val_45, float[5,2] val_46, float[2,5] slice_scatter, int64[1] val_47, int64[2] val_48, int64 val_49, int64[?] val_50, int64[1] val_53, int64[?] val_54, int64[?,1] val_55, float[batch,5] slice_scatter_1, float scalar_tensor_default, float scalar_tensor_default_1>
{
   [node_Constant_0] val_0 = Constant <value: tensor = int64 {0}> ()
   [node_Constant_19] val_19 = Constant <value: tensor = int64 {1}> ()
   [node_Constant_81] val_26 = Constant <value: tensor = int64[1] val_26 {0}> ()
   [node_Constant_84] val_29 = Constant <value: tensor = int64[1] val_29 {2}> ()
   [node_Constant_33] val_33 = Constant <value_ints: ints = [1]> ()
   [node_slice_3] slice_3 = Slice (x, val_26, val_29, val_26, val_33)
   [node_Constant_42] val_42 = Constant <value: tensor = int64[1] {-1}> ()
   [node_Constant_95] val_43 = Constant <value: tensor = int64[3,1] val_43 {0,1,2}> ()
   [node_Constant_96] val_44 = Constant <value: tensor = float[3,2] val_44 {1,1,1,1,1,1}> ()
   [node_Transpose_45] val_45 = Transpose <perm: ints = [1, 0]> (slice_3)
   [node_ScatterND_46] val_46 = ScatterND <reduction: string = "none"> (val_45, val_43, val_44)
   [node_slice_scatter] slice_scatter = Transpose <perm: ints = [1, 0]> (val_46)
   [node_Constant_47] val_47 = Constant <value_ints: ints = [0]> ()
   [node_Shape_48] val_48 = Shape <start: int = 0> (x)
   [node_Gather_49] val_49 = Gather <axis: int = 0> (val_48, val_0)
   [node_Range_50] val_50 = Range (val_0, val_49, val_19)
   [node_Constant_99] val_53 = Constant <value: tensor = int64[1] val_53 {1}> ()
   [node_Slice_54] val_54 = Slice (val_50, val_26, val_29, val_47, val_53)
   [node_Unsqueeze_55] val_55 = Unsqueeze (val_54, val_42)
   [node_slice_scatter_1] slice_scatter_1 = ScatterND <reduction: string = "none"> (x, val_55, slice_scatter)
   [node_Constant_100] scalar_tensor_default = Constant <value: tensor = float scalar_tensor_default {2}> ()
   [node_add] add = Add (slice_scatter_1, scalar_tensor_default)
   [node_Constant_101] scalar_tensor_default_1 = Constant <value: tensor = float scalar_tensor_default_1 {3}> ()
   [node_add_4] add_4 = Add (slice_scatter_1, scalar_tensor_default_1)
}

SignatureFloat1

forward

def forward(self, x, alpha: float = 2.0):
    return torch.sigmoid(self.linear(x)) - self.buff * alpha
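
Scalar arguments such as alpha are specialized at export time: the value used for tracing is folded into the graph (the 0.75 constant below is self.buff scaled by the traced alpha), so validating with a different alpha afterwards produces the discrepancy failures reported for both exporters. A hedged sketch of that specialization (illustrative module, not the case's own class):

import torch

class M(torch.nn.Module):
    def forward(self, x, alpha: float = 2.0):
        return x * alpha

# the non-tensor argument is captured as a constant, not a runtime input
ep = torch.export.export(M(), (torch.rand(2, 3), 1.5))
print(ep.graph)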

custom

  • inputs: #2[(T1s4x3,float),(T1s8x3,float)]

  • shapes: ({0:Dim(batch)},None)

<
   ir_version: 8,
   opset_import: ["" : 18],
   metadata_props: ["large_model": "False", "inline": "True", "external_threshold": "1024", "function_options": "FunctionOptions()", "optimize": "True", "n_initializers": "3", "n_large_initializers": "0", "size_initializers": "20", "size_large_initializers": "0", "n_nodes": "3", "n_nodes_other_domain": "0", "mask_outputs": "[True]", "input_args": "(T1s4x3[10.0,21.0:A15.5],float=1.5)", "input_kwargs": "None", "optimizations": "OptimizationOptions(constant_folding={'Transpose', 'Cast', 'Reshape', 'Concat'}, patterns=[BatchNormalizationPattern(), BatchNormalizationTrainingPattern(), CastLayerNormalizationCastPattern(), CastPattern(), CastCastBinaryPattern(), CastOpCastPattern(), ClipClipPattern(), ComputationCastOpCastPattern(), ConcatGatherPattern(), ConvBiasNullPattern(), DropoutPattern(), ExpandPattern(), ExpandBroadcastPattern(), ExpandSwapPattern(), GeluPattern(), IdentityPattern(), LayerNormalizationPattern(), LayerNormalizationScalePattern(), LeakyReluPattern(), MulMulMulScalarPattern(), ReduceReshapePattern(), ReduceSumNormalizePattern(), ReshapePattern(), ReshapeMatMulReshapePattern(), Reshape2Of3Pattern(), ReshapeReshapeBinaryPattern(), MatMulAddPattern(), GemmTransposePattern(), MatMulReshape2Of3Pattern(), MulMulMatMulPattern(), ReshapeReshapePattern(), RotaryConcatPartPattern(), SameChildrenPattern(), SequenceConstructAtPattern(), SliceSlicePattern(), SlicesSplitPattern(), SoftmaxCrossEntropyLossCastPattern(), SplitConcatPattern(), SqueezeAddPattern(), SqueezeUnsqueezePattern(), Sub1MulPattern(), SwitchOrderBinaryPattern(), SwitchReshapeActivationPattern(), TransposeEqualReshapePattern(), TransposeMatMulPattern(), TransposeReshapeMatMulPattern(), TransposeReshapeTransposePattern(), TransposeTransposePattern(), UnsqueezeEqualPattern(), UnsqueezeUnsqueezePattern()])", "dynamic_shapes": "({0: Dim('batch', min=1, max=1024)}, None)", "_discovered_shape_constraints": "{'batch': {'s77'}, 's77': {'batch'}}"]
>
experiment (float[batch,3] x, float alpha) => (float[batch,1] output_0) 
   <float[1] mul =  {0.75}, float[1,3] "GemmTransposePattern--p_linear_weight::T10" =  {-0.488475,0.0411319,-0.2934}, float[1] "linear.bias" =  {0.322306}, float[batch,1] _onx_matmul_x, int64[1] init7_s1_1, float init1_s_, float[1] b_buff, float[1,3] p_linear_weight, float[batch,1] linear, float[1] "init1_s_::RSh1", int64[2] "init7_s2_1_-1", float[batch,1] sigmoid, float[batch,1] sub, float[1,3] "linear.weight", float[1] _onx_mul_b_buff, float[1] p_linear_bias, float[3,1] "p_linear_weight::T10">
{
   ["GemmTransposePattern--MatMulAddPattern--Opset2"] linear = Gemm <transB: int = 1> (x, "GemmTransposePattern--p_linear_weight::T10", "linear.bias")
   [sigmoid] sigmoid = Sigmoid (linear)
   [sub_Tensor] output_0 = Sub (sigmoid, mul)
}

FAILED

diff.1

dynamo-ir

  • inputs: #2[(T1s4x3,float),(T1s8x3,float)]

  • shapes: ({0:Dim(batch)},None)

<
   ir_version: 10,
   opset_import: ["" : 18],
   producer_name: "pytorch",
   producer_version: "2.9.0.dev20250701+cu126"
>
main_graph (float[s77,3] x) => (float[s77,1] sub_2) 
   <float[1,3] "linear.weight" =  {0.442887,-0.540267,0.247224}, float[1] "linear.bias" =  {-0.530752}, float[1,3] "linear.weight", float[1] "linear.bias", float[s77,1] linear, float[s77,1] sigmoid, float[1] mul_2>
{
   [node_linear] linear = Gemm <beta: float = 1, transB: int = 1, alpha: float = 1, transA: int = 0> (x, "linear.weight", "linear.bias")
   [node_sigmoid] sigmoid = Sigmoid (linear)
   [node_Constant_1] mul_2 = Constant <value: tensor = float[1] mul_2 {0.75}> ()
   [node_sub_2] sub_2 = Sub (sigmoid, mul_2)
}

FAILED

Input mismatch, inputs[0]=(T1r2,float) but names=['x'], model=SignatureFloat1, export='dynamo-ir'
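
Both failures trace back to `alpha` being a plain Python float: `torch.export` specializes it at trace time, so `self.buff * alpha` is folded into the constant `0.75` visible above (presumably a 0.5 buffer times the traced value `float=1.5` shown in the metadata), and the dynamo-ir signature keeps only `x`. That is why the custom export reports a discrepancy (`diff.1`) on the second input pair and dynamo-ir reports an input mismatch. A minimal sketch of the effect, assuming a re-created module with the same layout (Linear(3, 1) and a 0.5 buffer), not the original source:

import torch

class SignatureFloat1(torch.nn.Module):  # assumed re-creation, not the original source
    def __init__(self):
        super().__init__()
        self.linear = torch.nn.Linear(3, 1)
        self.register_buffer("buff", torch.tensor([0.5]))

    def forward(self, x, alpha: float = 2.0):
        return torch.sigmoid(self.linear(x)) - self.buff * alpha

ep = torch.export.export(
    SignatureFloat1(),
    (torch.randn(4, 3), 1.5),
    dynamic_shapes=({0: torch.export.Dim("batch")}, None),
)
# As in the dump above, buff * 1.5 = 0.75 ends up as a constant in the graph;
# a different alpha passed at run time is not reflected in the exported model.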

SignatureInt1

forward

def forward(self, x, i: int = 2):
    return torch.sigmoid(self.linear(x)) - self.buff + x[:, i : i + 1]

custom

  • inputs: #2[(T1s4x3,int),(T1s8x3,int)]

  • shapes: ({0:Dim(batch)},None)

<
   ir_version: 8,
   opset_import: ["" : 18],
   metadata_props: ["large_model": "False", "inline": "True", "external_threshold": "1024", "function_options": "FunctionOptions()", "optimize": "True", "n_initializers": "5", "n_large_initializers": "0", "size_initializers": "36", "size_large_initializers": "0", "n_nodes": "5", "n_nodes_other_domain": "0", "mask_outputs": "[True]", "input_args": "(T1s4x3[10.0,21.0:A15.5],int=1)", "input_kwargs": "None", "optimizations": "OptimizationOptions(constant_folding={'Transpose', 'Cast', 'Reshape', 'Concat'}, patterns=[BatchNormalizationPattern(), BatchNormalizationTrainingPattern(), CastLayerNormalizationCastPattern(), CastPattern(), CastCastBinaryPattern(), CastOpCastPattern(), ClipClipPattern(), ComputationCastOpCastPattern(), ConcatGatherPattern(), ConvBiasNullPattern(), DropoutPattern(), ExpandPattern(), ExpandBroadcastPattern(), ExpandSwapPattern(), GeluPattern(), IdentityPattern(), LayerNormalizationPattern(), LayerNormalizationScalePattern(), LeakyReluPattern(), MulMulMulScalarPattern(), ReduceReshapePattern(), ReduceSumNormalizePattern(), ReshapePattern(), ReshapeMatMulReshapePattern(), Reshape2Of3Pattern(), ReshapeReshapeBinaryPattern(), MatMulAddPattern(), GemmTransposePattern(), MatMulReshape2Of3Pattern(), MulMulMatMulPattern(), ReshapeReshapePattern(), RotaryConcatPartPattern(), SameChildrenPattern(), SequenceConstructAtPattern(), SliceSlicePattern(), SlicesSplitPattern(), SoftmaxCrossEntropyLossCastPattern(), SplitConcatPattern(), SqueezeAddPattern(), SqueezeUnsqueezePattern(), Sub1MulPattern(), SwitchOrderBinaryPattern(), SwitchReshapeActivationPattern(), TransposeEqualReshapePattern(), TransposeMatMulPattern(), TransposeReshapeMatMulPattern(), TransposeReshapeTransposePattern(), TransposeTransposePattern(), UnsqueezeEqualPattern(), UnsqueezeUnsqueezePattern()])", "dynamic_shapes": "({0: Dim('batch', min=1, max=1024)}, None)", "_discovered_shape_constraints": "{'batch': {'s77'}, 's77': {'batch'}}"]
>
experiment (float[batch,3] x, int64 i) => (float[batch,1] output_0) 
   <float[1] b_buff =  {0.5}, int64[1] init7_s1_1 =  {1}, int64[1] init7_s1_2 =  {2}, float[1,3] "GemmTransposePattern--p_linear_weight::T10" =  {-0.287984,0.108205,0.00799918}, float[1] "linear.bias" =  {-0.153188}, float[batch,1] _onx_matmul_x, float[batch,1] add, float[1,3] p_linear_weight, float[batch,1] linear, int64[2] "init7_s2_1_-1", float[batch,1] sigmoid, float[batch,1] slice_2, float[batch,1] sub, float[1,3] "linear.weight", float[1] p_linear_bias, float[3,1] "p_linear_weight::T10", float[batch,3] slice_1>
{
   ["GemmTransposePattern--MatMulAddPattern--Opset2"] linear = Gemm <transB: int = 1> (x, "GemmTransposePattern--p_linear_weight::T10", "linear.bias")
   [sigmoid] sigmoid = Sigmoid (linear)
   [sub_Tensor] sub = Sub (sigmoid, b_buff)
   [slide_Tensor] slice_2 = Slice (x, init7_s1_1, init7_s1_2, init7_s1_1)
   [add_Tensor] output_0 = Add (sub, slice_2)
}

FAILED

diff.1

dynamo-ir

  • inputs: #2[(T1s4x3,int),(T1s8x3,int)]

  • shapes: ({0:Dim(batch)},None)

<
   ir_version: 10,
   opset_import: ["" : 18],
   producer_name: "pytorch",
   producer_version: "2.9.0.dev20250701+cu126"
>
main_graph (float[s77,3] x) => (float[s77,1] add_15) 
   <float[1,3] "linear.weight" =  {0.303618,-0.0922485,-0.57155}, float[1] "linear.bias" =  {0.0983027}, float[1] buff =  {0.5}, float[1,3] "linear.weight", float[1] "linear.bias", float[1] buff, float[s77,1] linear, float[s77,1] sigmoid, float[s77,1] sub_2, int64[1] val_10, int64[1] val_14, int64[1] val_18, float[s77,1] slice_2>
{
   [node_linear] linear = Gemm <beta: float = 1, transB: int = 1, alpha: float = 1, transA: int = 0> (x, "linear.weight", "linear.bias")
   [node_sigmoid] sigmoid = Sigmoid (linear)
   [node_sub_2] sub_2 = Sub (sigmoid, buff)
   [node_Constant_24] val_10 = Constant <value: tensor = int64[1] val_10 {1}> ()
   [node_Constant_27] val_14 = Constant <value: tensor = int64[1] val_14 {2}> ()
   [node_Constant_18] val_18 = Constant <value_ints: ints = [1]> ()
   [node_slice_2] slice_2 = Slice (x, val_10, val_14, val_10, val_18)
   [node_add_15] add_15 = Add (sub_2, slice_2)
}

FAILED

Input mismatch, inputs[0]=(T1r2,int) but names=['x'], model=SignatureInt1, export='dynamo-ir'
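
The same specialization happens with the integer default: the traced value `i=1` (see `input_args` in the metadata) is baked into the graph, which is why both exporters emit the fixed slice bounds 1 and 2 and why dynamo-ir drops `i` from the signature. A small check of what the frozen slice computes:

import torch

x = torch.randn(4, 3)
i = 1  # value captured at export time
# The exported Slice(x, start=1, end=2, axis=1) is equivalent to this static
# indexing; any other i passed at run time is ignored by the ONNX graph.
assert torch.equal(x[:, i : i + 1], x[:, 1:2])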

SignatureInt2

forward

def forward(self, x, i: int = 2):
    return torch.sigmoid(self.linear(x)) - self.buff + x[:, i]

custom

  • inputs: #1[(T1s4x3,int)]

  • shapes: dict(x:{0:Dim(batch)},i:None)

<
   ir_version: 8,
   opset_import: ["" : 18],
   metadata_props: ["large_model": "False", "inline": "True", "external_threshold": "1024", "function_options": "FunctionOptions()", "optimize": "True", "n_initializers": "4", "n_large_initializers": "0", "size_initializers": "28", "size_large_initializers": "0", "n_nodes": "5", "n_nodes_other_domain": "0", "mask_outputs": "[True]", "input_args": "(T1s4x3[10.0,21.0:A15.5],int=1)", "input_kwargs": "None", "optimizations": "OptimizationOptions(constant_folding={'Transpose', 'Cast', 'Reshape', 'Concat'}, patterns=[BatchNormalizationPattern(), BatchNormalizationTrainingPattern(), CastLayerNormalizationCastPattern(), CastPattern(), CastCastBinaryPattern(), CastOpCastPattern(), ClipClipPattern(), ComputationCastOpCastPattern(), ConcatGatherPattern(), ConvBiasNullPattern(), DropoutPattern(), ExpandPattern(), ExpandBroadcastPattern(), ExpandSwapPattern(), GeluPattern(), IdentityPattern(), LayerNormalizationPattern(), LayerNormalizationScalePattern(), LeakyReluPattern(), MulMulMulScalarPattern(), ReduceReshapePattern(), ReduceSumNormalizePattern(), ReshapePattern(), ReshapeMatMulReshapePattern(), Reshape2Of3Pattern(), ReshapeReshapeBinaryPattern(), MatMulAddPattern(), GemmTransposePattern(), MatMulReshape2Of3Pattern(), MulMulMatMulPattern(), ReshapeReshapePattern(), RotaryConcatPartPattern(), SameChildrenPattern(), SequenceConstructAtPattern(), SliceSlicePattern(), SlicesSplitPattern(), SoftmaxCrossEntropyLossCastPattern(), SplitConcatPattern(), SqueezeAddPattern(), SqueezeUnsqueezePattern(), Sub1MulPattern(), SwitchOrderBinaryPattern(), SwitchReshapeActivationPattern(), TransposeEqualReshapePattern(), TransposeMatMulPattern(), TransposeReshapeMatMulPattern(), TransposeReshapeTransposePattern(), TransposeTransposePattern(), UnsqueezeEqualPattern(), UnsqueezeUnsqueezePattern()])", "dynamic_shapes": "{'i': None, 'x': {0: Dim('batch', min=0)}}", "_discovered_shape_constraints": "{'batch': {'s77'}, 's77': {'batch'}}"]
>
experiment (float[batch,3] x, int64 i) => (float[batch,batch] output_0) 
   <float[1] b_buff =  {0.5}, int64 init7_s_1 =  {1}, float[1,3] "GemmTransposePattern--p_linear_weight::T10" =  {-0.430331,0.400767,0.422815}, float[1] "linear.bias" =  {-0.433012}, float[batch,1] _onx_matmul_x, float[batch] select, float[batch,batch] add, float[1,3] p_linear_weight, float[batch,1] linear, int64[2] "init7_s2_1_-1", float[batch,1] sigmoid, float[batch,1] sub, float[1,3] "linear.weight", float[1] p_linear_bias, float[3,1] "p_linear_weight::T10", float[batch,3] slice_1>
{
   ["GemmTransposePattern--MatMulAddPattern--Opset2"] linear = Gemm <transB: int = 1> (x, "GemmTransposePattern--p_linear_weight::T10", "linear.bias")
   [sigmoid] sigmoid = Sigmoid (linear)
   [sub_Tensor] sub = Sub (sigmoid, b_buff)
   [select_int] select = Gather <axis: int = 1> (x, init7_s_1)
   [add_Tensor] output_0 = Add (sub, select)
}

dynamo-ir

  • inputs: #1[(T1s4x3,int)]

  • shapes: dict(x:{0:Dim(batch)},i:None)

<
   ir_version: 10,
   opset_import: ["" : 18],
   producer_name: "pytorch",
   producer_version: "2.9.0.dev20250701+cu126"
>
main_graph (float[s77,3] x) => (float[s77,s77] add_14) 
   <float[1,3] "linear.weight" =  {-0.47754,-0.0452508,-0.016264}, float[1] "linear.bias" =  {-0.496291}, float[1] buff =  {0.5}, float[1,3] "linear.weight", float[1] "linear.bias", float[1] buff, float[s77,1] linear, float[s77,1] sigmoid, float[s77,1] sub_2, int64 val_7, float[s77] select>
{
   [node_linear] linear = Gemm <beta: float = 1, transB: int = 1, alpha: float = 1, transA: int = 0> (x, "linear.weight", "linear.bias")
   [node_sigmoid] sigmoid = Sigmoid (linear)
   [node_sub_2] sub_2 = Sub (sigmoid, buff)
   [node_Constant_7] val_7 = Constant <value: tensor = int64 {1}> ()
   [node_select] select = Gather <axis: int = 1> (x, val_7)
   [node_add_14] add_14 = Add (sub_2, select)
}

FAILED

Input mismatch, inputs[0]=(T1r2,int) but names=['x'], model=SignatureInt2, export='dynamo-ir'
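
Note the output type float[batch,batch] in both graphs: `x[:, i]` drops the column dimension, so adding the resulting (batch,) vector to the (batch, 1) intermediate broadcasts to (batch, batch). A quick check of that broadcasting with the same shapes as above:

import torch

batch = 4
sub = torch.randn(batch, 1)        # sigmoid(linear(x)) - buff
col = torch.randn(batch, 3)[:, 1]  # x[:, i] -> shape (batch,)
assert (sub + col).shape == (batch, batch)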

SignatureListFixedLength

forward

def forward(self, x, lx: list):
    return (
        torch.sigmoid(self.linear(x)) - self.buff + lx[0] * lx[1].sum(axis=1, keepdim=True)
    )

custom

  • inputs: #2[(T1s4x3,#2[T1s4x1,T1s4x2]),(T1s8x3,#2[T1s8x1,T1s8x2])]

  • shapes: dict(x:{0:Dim(batch)},lx:#2[{0:Dim(batch)},{0:Dim(batch)}])

<
   ir_version: 8,
   opset_import: ["" : 18],
   metadata_props: ["large_model": "False", "inline": "True", "external_threshold": "1024", "function_options": "FunctionOptions()", "optimize": "True", "n_initializers": "4", "n_large_initializers": "0", "size_initializers": "28", "size_large_initializers": "0", "n_nodes": "6", "n_nodes_other_domain": "0", "mask_outputs": "[True]", "input_args": "(T1s4x3[10.0,21.0:A15.5],#2[T1s4x1[10.0,13.0:A11.5],T1s4x2[10.0,17.0:A13.5]])", "input_kwargs": "None", "optimizations": "OptimizationOptions(constant_folding={'Transpose', 'Cast', 'Reshape', 'Concat'}, patterns=[BatchNormalizationPattern(), BatchNormalizationTrainingPattern(), CastLayerNormalizationCastPattern(), CastPattern(), CastCastBinaryPattern(), CastOpCastPattern(), ClipClipPattern(), ComputationCastOpCastPattern(), ConcatGatherPattern(), ConvBiasNullPattern(), DropoutPattern(), ExpandPattern(), ExpandBroadcastPattern(), ExpandSwapPattern(), GeluPattern(), IdentityPattern(), LayerNormalizationPattern(), LayerNormalizationScalePattern(), LeakyReluPattern(), MulMulMulScalarPattern(), ReduceReshapePattern(), ReduceSumNormalizePattern(), ReshapePattern(), ReshapeMatMulReshapePattern(), Reshape2Of3Pattern(), ReshapeReshapeBinaryPattern(), MatMulAddPattern(), GemmTransposePattern(), MatMulReshape2Of3Pattern(), MulMulMatMulPattern(), ReshapeReshapePattern(), RotaryConcatPartPattern(), SameChildrenPattern(), SequenceConstructAtPattern(), SliceSlicePattern(), SlicesSplitPattern(), SoftmaxCrossEntropyLossCastPattern(), SplitConcatPattern(), SqueezeAddPattern(), SqueezeUnsqueezePattern(), Sub1MulPattern(), SwitchOrderBinaryPattern(), SwitchReshapeActivationPattern(), TransposeEqualReshapePattern(), TransposeMatMulPattern(), TransposeReshapeMatMulPattern(), TransposeReshapeTransposePattern(), TransposeTransposePattern(), UnsqueezeEqualPattern(), UnsqueezeUnsqueezePattern()])", "dynamic_shapes": "{'lx': [{0: Dim('batch', min=0)}, {0: Dim('batch', min=0)}],
 'x': {0: Dim('batch', min=0)}}", "_discovered_shape_constraints": "{'batch': {'s53', 's50', 's77'},
 's50': {'batch', 's53'},
 's53': {'batch'},
 's77': {'batch'}}"]
>
experiment (float[batch,3] x, float[batch,1] lx_0, float[batch,2] lx_1) => (float[batch,1] output_0) 
   <float[1] b_buff =  {0.5}, int64[1] init7_s1_1 =  {1}, float[1,3] "GemmTransposePattern--p_linear_weight::T10" =  {-0.405241,-0.17025,-0.320338}, float[1] "linear.bias" =  {0.380358}, float[batch,1] _onx_matmul_x, float[batch,1] add, float[batch,1] mul, float[batch,1] sum_1, float[1,3] p_linear_weight, float[batch,1] linear, int64[2] "init7_s2_1_-1", float[batch,1] sigmoid, float[batch,1] sub, float[1,3] "linear.weight", float[1] p_linear_bias, float[3,1] "p_linear_weight::T10">
{
   ["GemmTransposePattern--MatMulAddPattern--Opset2"] linear = Gemm <transB: int = 1> (x, "GemmTransposePattern--p_linear_weight::T10", "linear.bias")
   [sigmoid] sigmoid = Sigmoid (linear)
   [sub_Tensor] sub = Sub (sigmoid, b_buff)
   [sum] sum_1 = ReduceSum <keepdims: int = 1> (lx_1, init7_s1_1)
   [mul_Tensor] mul = Mul (lx_0, sum_1)
   [add_Tensor] output_0 = Add (sub, mul)
}

dynamo-ir

  • inputs: #2[(T1s4x3,#2[T1s4x1,T1s4x2]),(T1s8x3,#2[T1s8x1,T1s8x2])]

  • shapes: dict(x:{0:Dim(batch)},lx:#2[{0:Dim(batch)},{0:Dim(batch)}])

<
   ir_version: 10,
   opset_import: ["" : 18],
   producer_name: "pytorch",
   producer_version: "2.9.0.dev20250701+cu126"
>
main_graph (float[batch,3] x, float[batch,1] lx_0, float[batch,2] lx_1) => (float[batch,1] add_15) 
   <float[1,3] "linear.weight" =  {0.296968,0.325305,0.272611}, float[1] "linear.bias" =  {0.254532}, float[1] buff =  {0.5}, float[1,3] "linear.weight", float[1] "linear.bias", float[1] buff, float[batch,1] linear, float[batch,1] sigmoid, float[batch,1] sub_2, int64[1] val_6, float[batch,1] sum_1, float[batch,1] mul_4>
{
   [node_linear] linear = Gemm <beta: float = 1, transB: int = 1, alpha: float = 1, transA: int = 0> (x, "linear.weight", "linear.bias")
   [node_sigmoid] sigmoid = Sigmoid (linear)
   [node_sub_2] sub_2 = Sub (sigmoid, buff)
   [node_Constant_9] val_6 = Constant <value: tensor = int64[1] val_6 {1}> ()
   [node_sum_1] sum_1 = ReduceSum <noop_with_empty_axes: int = 0, keepdims: int = 1> (lx_1, val_6)
   [node_mul_4] mul_4 = Mul (lx_0, sum_1)
   [node_add_15] add_15 = Add (sub_2, mul_4)
}
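
Both exporters succeed here; the Python list `lx` is flattened into two separate graph inputs, `lx_0` and `lx_1`. A hedged usage sketch, assuming the dynamo-ir model above has been saved to a hypothetical file `signature_list_fixed_length.onnx`:

import numpy as np
import onnxruntime as ort

sess = ort.InferenceSession(
    "signature_list_fixed_length.onnx",  # hypothetical file name
    providers=["CPUExecutionProvider"],
)
x = np.random.rand(8, 3).astype(np.float32)
lx = [np.random.rand(8, 1).astype(np.float32),
      np.random.rand(8, 2).astype(np.float32)]
# The list is fed element by element through the flattened input names.
got = sess.run(None, {"x": x, "lx_0": lx[0], "lx_1": lx[1]})[0]
print(got.shape)  # (8, 1)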

SignatureListFixedWithNone

forward

def forward(self, lx):
    x = lx[0]
    if lx[1] is not None:
        x += lx[1]
    if lx[2] is not None:
        x += lx[2]
    return x

custom

FAILED

Detected mismatch between the structure of `inputs` and `dynamic_shapes`: `inputs['lx']` has 3 elements, but `dynamic_shapes['lx']` has 2 elements
For more information about this error, see: https://pytorch.org/docs/main/generated/exportdb/index.html#dynamic-shapes-validation

The error above occurred when calling torch.export.export. If you would like to view some more information about this error, and get a list of all other errors that may occur in your export call, you can replace your `export()` call with `draft_export()`.

dynamo-ir

FAILED

Failed to export the model with torch.export. This is step 1/3 of exporting the model to ONNX. Next steps:
- Modify the model code for `torch.export.export` to succeed. Refer to https://pytorch.org/docs/stable/generated/exportdb/index.html for more information.
- Debug `torch.export.export` and submit a PR to PyTorch.
- Create an issue in the PyTorch GitHub repository against the *torch.export* component and attach the full error stack as well as reproduction scripts.

## Exception summary

<class 'torch._dynamo.exc.UserError'>: Detected mismatch between the structure of `inputs` and `dynamic_shapes`: `inputs['lx']` has 3 elements, but `dynamic_shapes['lx']` has 2 elements
For more information about this error, see: https://pytorch.org/docs/main/generated/exportdb/index.html#dynamic-shapes-validation

The error above occurred when calling torch.export.export. If you would like to view some more information about this error, and get a list of all other errors that may occur in your export call, you can replace your `export()` call with `draft_export()`.

(Refer to the full stack trace above for more information.)
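
The structure mismatch is the same for both exporters: the list passed as input has three elements (one of them None) while the `dynamic_shapes` spec for `lx` has only two. A hedged sketch of a spec whose structure matches, with None mirroring the None element (an assumption, not verified against this harness):

import torch

batch = torch.export.Dim("batch")
inputs = ([torch.randn(4, 4), torch.randn(4, 4), None],)
# One spec entry per list element, None for the element that is itself None.
dynamic_shapes = {"lx": [{0: batch}, {0: batch}, None]}
# torch.export.export(model, inputs, dynamic_shapes=dynamic_shapes) would then
# see matching structures for inputs['lx'] and dynamic_shapes['lx'].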

SignatureListVariableLength

forward

def forward(self, x, lx: list):
    t = torch.cat(lx, dim=1).sum(axis=1, keepdim=True)
    return torch.sigmoid(self.linear(x)) - self.buff + t

custom

  • inputs: #2[(T1s4x3,#2[T1s4x1,T1s4x2]),(T1s8x3,#3[T1s8x1,T1s8x2,T1s8x3])]

  • shapes: dict(x:{0:Dim(batch)},lx:#2[{0:Dim(batch)},{0:Dim(batch)}])

<
   ir_version: 8,
   opset_import: ["" : 18],
   metadata_props: ["large_model": "False", "inline": "True", "external_threshold": "1024", "function_options": "FunctionOptions()", "optimize": "True", "n_initializers": "4", "n_large_initializers": "0", "size_initializers": "28", "size_large_initializers": "0", "n_nodes": "6", "n_nodes_other_domain": "0", "mask_outputs": "[True]", "input_args": "(T1s4x3[10.0,21.0:A15.5],#2[T1s4x1[10.0,13.0:A11.5],T1s4x2[10.0,17.0:A13.5]])", "input_kwargs": "None", "optimizations": "OptimizationOptions(constant_folding={'Transpose', 'Cast', 'Reshape', 'Concat'}, patterns=[BatchNormalizationPattern(), BatchNormalizationTrainingPattern(), CastLayerNormalizationCastPattern(), CastPattern(), CastCastBinaryPattern(), CastOpCastPattern(), ClipClipPattern(), ComputationCastOpCastPattern(), ConcatGatherPattern(), ConvBiasNullPattern(), DropoutPattern(), ExpandPattern(), ExpandBroadcastPattern(), ExpandSwapPattern(), GeluPattern(), IdentityPattern(), LayerNormalizationPattern(), LayerNormalizationScalePattern(), LeakyReluPattern(), MulMulMulScalarPattern(), ReduceReshapePattern(), ReduceSumNormalizePattern(), ReshapePattern(), ReshapeMatMulReshapePattern(), Reshape2Of3Pattern(), ReshapeReshapeBinaryPattern(), MatMulAddPattern(), GemmTransposePattern(), MatMulReshape2Of3Pattern(), MulMulMatMulPattern(), ReshapeReshapePattern(), RotaryConcatPartPattern(), SameChildrenPattern(), SequenceConstructAtPattern(), SliceSlicePattern(), SlicesSplitPattern(), SoftmaxCrossEntropyLossCastPattern(), SplitConcatPattern(), SqueezeAddPattern(), SqueezeUnsqueezePattern(), Sub1MulPattern(), SwitchOrderBinaryPattern(), SwitchReshapeActivationPattern(), TransposeEqualReshapePattern(), TransposeMatMulPattern(), TransposeReshapeMatMulPattern(), TransposeReshapeTransposePattern(), TransposeTransposePattern(), UnsqueezeEqualPattern(), UnsqueezeUnsqueezePattern()])", "dynamic_shapes": "{'lx': [{0: Dim('batch', min=0)}, {0: Dim('batch', min=0)}],
 'x': {0: Dim('batch', min=0)}}", "_discovered_shape_constraints": "{'batch': {'s53', 's50', 's77'},
 's50': {'batch'},
 's53': {'batch'},
 's77': {'batch'}}"]
>
experiment (float[batch,3] x, float[batch,1] lx_0, float[batch,2] lx_1) => (float[batch,1] output_0) 
   <float[1] b_buff =  {0.5}, int64[1] init7_s1_1 =  {1}, float[1,3] "GemmTransposePattern--p_linear_weight::T10" =  {0.0289861,-0.0736461,0.263717}, float[1] "linear.bias" =  {-0.320988}, float[batch,1] _onx_matmul_x, float[batch,1] add, float[batch,1] sum_1, float[1,3] p_linear_weight, float[batch,1] linear, int64[2] "init7_s2_1_-1", float[batch,1] sigmoid, float[batch,3] cat, float[batch,1] sub, float[1,3] "linear.weight", float[1] p_linear_bias, float[3,1] "p_linear_weight::T10">
{
   [cat] cat = Concat <axis: int = 1> (lx_0, lx_1)
   [sum] sum_1 = ReduceSum <keepdims: int = 1> (cat, init7_s1_1)
   ["GemmTransposePattern--MatMulAddPattern--Opset2"] linear = Gemm <transB: int = 1> (x, "GemmTransposePattern--p_linear_weight::T10", "linear.bias")
   [sigmoid] sigmoid = Sigmoid (linear)
   [sub_Tensor] sub = Sub (sigmoid, b_buff)
   [add_Tensor] output_0 = Add (sub, sum_1)
}

FAILED

diff.1

dynamo-ir

  • inputs: #2[(T1s4x3,#2[T1s4x1,T1s4x2]),(T1s8x3,#3[T1s8x1,T1s8x2,T1s8x3])]

  • shapes: dict(x:{0:Dim(batch)},lx:#2[{0:Dim(batch)},{0:Dim(batch)}])

<
   ir_version: 10,
   opset_import: ["" : 18],
   producer_name: "pytorch",
   producer_version: "2.9.0.dev20250701+cu126"
>
main_graph (float[batch,3] x, float[batch,1] lx_0, float[batch,2] lx_1) => (float[batch,1] add_15) 
   <float[1,3] "linear.weight" =  {-0.155996,-0.00216091,-0.157249}, float[1] "linear.bias" =  {-0.312999}, float[1] buff =  {0.5}, float[1,3] "linear.weight", float[1] "linear.bias", float[1] buff, float[batch,3] cat, int64[1] val_6, float[batch,1] sum_1, float[batch,1] linear, float[batch,1] sigmoid, float[batch,1] sub_4>
{
   [node_cat] cat = Concat <axis: int = 1> (lx_0, lx_1)
   [node_Constant_9] val_6 = Constant <value: tensor = int64[1] val_6 {1}> ()
   [node_sum_1] sum_1 = ReduceSum <noop_with_empty_axes: int = 0, keepdims: int = 1> (cat, val_6)
   [node_linear] linear = Gemm <beta: float = 1, transB: int = 1, alpha: float = 1, transA: int = 0> (x, "linear.weight", "linear.bias")
   [node_sigmoid] sigmoid = Sigmoid (linear)
   [node_sub_4] sub_4 = Sub (sigmoid, buff)
   [node_add_15] add_15 = Add (sub_4, sum_1)
}

FAILED

diff.1
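
The `diff.1` failures come from the list length being frozen at export time: the traced call used two elements, so both graphs expose exactly `lx_0` and `lx_1`, and the second input set with three tensors cannot be fed. A hedged workaround sketch (an assumption, not a recommendation from the library): concatenate the variable-length list outside the model so only one tensor with a dynamic width reaches the graph.

import torch

class ConcatFirst(torch.nn.Module):  # assumed re-creation of the model body
    def __init__(self):
        super().__init__()
        self.linear = torch.nn.Linear(3, 1)
        self.register_buffer("buff", torch.tensor([0.5]))

    def forward(self, x, cat):  # cat = torch.cat(lx, dim=1), built by the caller
        t = cat.sum(axis=1, keepdim=True)
        return torch.sigmoid(self.linear(x)) - self.buff + t

batch, width = torch.export.Dim("batch"), torch.export.Dim("width")
ep = torch.export.export(
    ConcatFirst(),
    (torch.randn(4, 3), torch.randn(4, 3)),
    dynamic_shapes={"x": {0: batch}, "cat": {0: batch, 1: width}},
)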

SignatureShapeAsIndex

forward

def forward(self, x, y):
    t = torch.sigmoid(self.linear(x)) + x
    return t[:, : y.shape[1]]

custom

  • inputs: #1[(T1s4x3,T1s4x2)]

  • shapes: dict(x:{0:Dim(batch)},y:{0:Dim(batch),1:Dim(length)})

<
   ir_version: 8,
   opset_import: ["" : 18],
   metadata_props: ["large_model": "False", "inline": "True", "external_threshold": "1024", "function_options": "FunctionOptions()", "optimize": "True", "n_initializers": "4", "n_large_initializers": "0", "size_initializers": "32", "size_large_initializers": "0", "n_nodes": "5", "n_nodes_other_domain": "0", "mask_outputs": "[True]", "input_args": "(T1s4x3[10.0,21.0:A15.5],T1s4x2[10.0,17.0:A13.5])", "input_kwargs": "None", "optimizations": "OptimizationOptions(constant_folding={'Transpose', 'Cast', 'Reshape', 'Concat'}, patterns=[BatchNormalizationPattern(), BatchNormalizationTrainingPattern(), CastLayerNormalizationCastPattern(), CastPattern(), CastCastBinaryPattern(), CastOpCastPattern(), ClipClipPattern(), ComputationCastOpCastPattern(), ConcatGatherPattern(), ConvBiasNullPattern(), DropoutPattern(), ExpandPattern(), ExpandBroadcastPattern(), ExpandSwapPattern(), GeluPattern(), IdentityPattern(), LayerNormalizationPattern(), LayerNormalizationScalePattern(), LeakyReluPattern(), MulMulMulScalarPattern(), ReduceReshapePattern(), ReduceSumNormalizePattern(), ReshapePattern(), ReshapeMatMulReshapePattern(), Reshape2Of3Pattern(), ReshapeReshapeBinaryPattern(), MatMulAddPattern(), GemmTransposePattern(), MatMulReshape2Of3Pattern(), MulMulMatMulPattern(), ReshapeReshapePattern(), RotaryConcatPartPattern(), SameChildrenPattern(), SequenceConstructAtPattern(), SliceSlicePattern(), SlicesSplitPattern(), SoftmaxCrossEntropyLossCastPattern(), SplitConcatPattern(), SqueezeAddPattern(), SqueezeUnsqueezePattern(), Sub1MulPattern(), SwitchOrderBinaryPattern(), SwitchReshapeActivationPattern(), TransposeEqualReshapePattern(), TransposeMatMulPattern(), TransposeReshapeMatMulPattern(), TransposeReshapeTransposePattern(), TransposeTransposePattern(), UnsqueezeEqualPattern(), UnsqueezeUnsqueezePattern()])", "dynamic_shapes": "{'x': {0: Dim('batch', min=0, max=1024)},
 'y': {0: Dim('batch', min=0, max=1024), 1: Dim('length', min=0, max=2)}}", "_discovered_shape_constraints": "{'batch': {'s17', 's77'},
 'length': {'s94'},
 's17': {'batch'},
 's77': {'batch'},
 's94': {'length'}}", "_known_value_shapes": "{
  sym_size_int_3: length,
  sym_size_int_3::UnSq0: ('length',),
  y::Shape1:2: ('length',),
}"]
>
experiment (float[batch,3] x, float[batch,length] y) => (float[batch,length] output_0) 
   <int64[1] init7_s1_0 =  {0}, int64[1] init7_s1_1 =  {1}, float[1,3] "GemmTransposePattern--p_linear_weight::T10" =  {-0.226019,-0.247899,0.459164}, float[1] "linear.bias" =  {-0.5307}, float[batch,1] _onx_matmul_x, int64 sym_size_int_3, int64[1] "sym_size_int_3::UnSq0", float[batch,3] add, float[1] b_buff, float[1,3] p_linear_weight, float[batch,1] linear, int64[2] "init7_s2_1_-1", int64[1] "y::Shape1:2", float[batch,1] sigmoid, float[batch,length] slice_2, float[1,3] "linear.weight", float[1] p_linear_bias, float[3,1] "p_linear_weight::T10", float[batch,3] slice_1>
{
   [sym_size_int] "y::Shape1:2" = Shape <end: int = 2, start: int = 1> (y)
   ["GemmTransposePattern--MatMulAddPattern--Opset2"] linear = Gemm <transB: int = 1> (x, "GemmTransposePattern--p_linear_weight::T10", "linear.bias")
   [sigmoid] sigmoid = Sigmoid (linear)
   [add_Tensor] add = Add (sigmoid, x)
   [slide_Tensor] output_0 = Slice (add, init7_s1_0, "y::Shape1:2", init7_s1_1)
}

dynamo-ir

  • inputs: #1[(T1s4x3,T1s4x2)]

  • shapes: dict(x:{0:Dim(batch)},y:{0:Dim(batch),1:Dim(length)})

<
   ir_version: 10,
   opset_import: ["" : 18],
   producer_name: "pytorch",
   producer_version: "2.9.0.dev20250701+cu126"
>
main_graph (float[batch,3] x, float[batch,length] y) => (float[batch,length] slice_2) 
   <float[1,3] "linear.weight" =  {0.242793,0.552866,0.0156701}, float[1] "linear.bias" =  {-0.41803}, float[1,3] "linear.weight", float[1] "linear.bias", int64[1] val_0, float[batch,1] linear, float[batch,1] sigmoid, float[batch,3] add_6, int64[1] val_8, int64[1] val_15, int64[1] val_16>
{
   [node_Shape_0] val_0 = Shape <end: int = 2, start: int = 1> (y)
   [node_linear] linear = Gemm <beta: float = 1, transB: int = 1, alpha: float = 1, transA: int = 0> (x, "linear.weight", "linear.bias")
   [node_sigmoid] sigmoid = Sigmoid (linear)
   [node_add_6] add_6 = Add (sigmoid, x)
   [node_Constant_8] val_8 = Constant <value_ints: ints = [0]> ()
   [node_Constant_23] val_15 = Constant <value: tensor = int64[1] val_15 {1}> ()
   [node_Constant_16] val_16 = Constant <value_ints: ints = [1]> ()
   [node_slice_2] slice_2 = Slice (add_6, val_8, val_0, val_15, val_16)
}
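
Both exporters handle this case: `y.shape[1]` is read through a Shape node feeding the Slice end, so the `length` dimension stays dynamic in the graph. A hedged check with onnxruntime, assuming the dynamo-ir model above has been saved as a hypothetical `signature_shape_as_index.onnx`:

import numpy as np
import onnxruntime as ort

sess = ort.InferenceSession(
    "signature_shape_as_index.onnx",  # hypothetical file name
    providers=["CPUExecutionProvider"],
)
x = np.random.rand(5, 3).astype(np.float32)
for length in (1, 2):
    y = np.random.rand(5, length).astype(np.float32)
    out = sess.run(None, {"x": x, "y": y})[0]
    assert out.shape == (5, length)  # the slice width follows y at run time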

TypeBFloat16

forward

def forward(self, x):
    xb = x.to(torch.bfloat16)
    return (xb + xb).to(torch.float32)

custom

  • inputs: #1[(T1s4x4,)]

  • shapes: dict(x:{0:Dim(batch)})

<
   ir_version: 8,
   opset_import: ["" : 18],
   metadata_props: ["large_model": "False", "inline": "True", "external_threshold": "1024", "function_options": "FunctionOptions()", "optimize": "True", "n_initializers": "0", "n_large_initializers": "0", "size_initializers": "0", "size_large_initializers": "0", "n_nodes": "3", "n_nodes_other_domain": "0", "mask_outputs": "[True]", "input_args": "(T1s4x4[0.17821824550628662,0.9225718379020691:A0.5503441654145718],)", "input_kwargs": "None", "optimizations": "OptimizationOptions(constant_folding={'Transpose', 'Cast', 'Reshape', 'Concat'}, patterns=[BatchNormalizationPattern(), BatchNormalizationTrainingPattern(), CastLayerNormalizationCastPattern(), CastPattern(), CastCastBinaryPattern(), CastOpCastPattern(), ClipClipPattern(), ComputationCastOpCastPattern(), ConcatGatherPattern(), ConvBiasNullPattern(), DropoutPattern(), ExpandPattern(), ExpandBroadcastPattern(), ExpandSwapPattern(), GeluPattern(), IdentityPattern(), LayerNormalizationPattern(), LayerNormalizationScalePattern(), LeakyReluPattern(), MulMulMulScalarPattern(), ReduceReshapePattern(), ReduceSumNormalizePattern(), ReshapePattern(), ReshapeMatMulReshapePattern(), Reshape2Of3Pattern(), ReshapeReshapeBinaryPattern(), MatMulAddPattern(), GemmTransposePattern(), MatMulReshape2Of3Pattern(), MulMulMatMulPattern(), ReshapeReshapePattern(), RotaryConcatPartPattern(), SameChildrenPattern(), SequenceConstructAtPattern(), SliceSlicePattern(), SlicesSplitPattern(), SoftmaxCrossEntropyLossCastPattern(), SplitConcatPattern(), SqueezeAddPattern(), SqueezeUnsqueezePattern(), Sub1MulPattern(), SwitchOrderBinaryPattern(), SwitchReshapeActivationPattern(), TransposeEqualReshapePattern(), TransposeMatMulPattern(), TransposeReshapeMatMulPattern(), TransposeReshapeTransposePattern(), TransposeTransposePattern(), UnsqueezeEqualPattern(), UnsqueezeUnsqueezePattern()])", "dynamic_shapes": "{'x': {0: Dim('batch', min=0)}}", "_discovered_shape_constraints": "{'batch': {'s77'}, 's77': {'batch'}}"]
>
experiment (float[batch,4] x) => (float[batch,4] output_0) 
   <float[batch,4] "add-x", bfloat16[batch,4] to, float[batch,4] to_1, bfloat16[batch,4] add>
{
   ["CastCastBinaryPattern--add_Tensor"] "add-x" = Add (x, x)
   ["CastCastBinaryPattern--add_Tensor2"] add = Cast <to: int = 16> ("add-x")
   [to_dtype2] output_0 = Cast <to: int = 1> (add)
}

dynamo-ir

  • inputs: #1[(T1s4x4,)]

  • shapes: dict(x:{0:Dim(batch)})

<
   ir_version: 10,
   opset_import: ["" : 18],
   producer_name: "pytorch",
   producer_version: "2.9.0.dev20250701+cu126"
>
main_graph (float[batch,4] x) => (float[batch,4] _to_copy_1) 
   <bfloat16[batch,4] _to_copy, bfloat16[batch,4] add_3>
{
   [node__to_copy] _to_copy = Cast <to: int = 16> (x)
   [node_add_3] add_3 = Add (_to_copy, _to_copy)
   [node__to_copy_1] _to_copy_1 = Cast <to: int = 1> (add_3)
}

FAILED

[ONNXRuntimeError] : 9 : NOT_IMPLEMENTED : Could not find an implementation for Add(14) node with name 'node_add_3'
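
The custom exporter succeeds because its CastCastBinaryPattern moves the addition before the cast, so the Add runs in float32 and only the bfloat16 round-trip remains; dynamo-ir keeps the Add in bfloat16, which the onnxruntime build used here does not implement, hence NOT_IMPLEMENTED. A hedged illustration of why that rewrite is numerically equivalent for this model:

import torch

def forward_reference(x):   # original order: add computed in bfloat16
    xb = x.to(torch.bfloat16)
    return (xb + xb).to(torch.float32)

def forward_rewritten(x):   # pattern result: add in float32, cast afterwards
    return (x + x).to(torch.bfloat16).to(torch.float32)

x = torch.rand(4, 4)
# Doubling only shifts the exponent, so both orders round the same way here.
assert torch.allclose(forward_reference(x), forward_rewritten(x))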

Vmap

forward

def forward(self, x, y):
    f = lambda x, y: x * y + 1  # noqa: E731
    return torch.vmap(f)(x, y)

custom

FAILED

# no error found for the failure

dynamo-ir

FAILED

Failed to export the model with torch.export. This is step 1/3 of exporting the model to ONNX. Next steps:
- Modify the model code for `torch.export.export` to succeed. Refer to https://pytorch.org/docs/stable/generated/exportdb/index.html for more information.
- Debug `torch.export.export` and submit a PR to PyTorch.
- Create an issue in the PyTorch GitHub repository against the *torch.export* component and attach the full error stack as well as reproduction scripts.

## Exception summary

<class 'AssertionError'>: 

(Refer to the full stack trace above for more information.)
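
Neither exporter supports `torch.vmap` in this case. For this purely elementwise function the mapping adds nothing over broadcasting, so a hedged workaround (an assumption about this specific case, not a general rule) is to export the body directly:

import torch

class NoVmap(torch.nn.Module):  # same computation without the vmap wrapper
    def forward(self, x, y):
        return x * y + 1

x, y = torch.rand(3, 4), torch.rand(3, 4)
assert torch.equal(NoVmap()(x, y), torch.vmap(lambda a, b: a * b + 1)(x, y))
ep = torch.export.export(NoVmap(), (x, y))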

VmapPython

forward

def forward(self, x, y):
    f = lambda x, y: x * y + 1  # noqa: E731
    return patched_vmap(f)(x, y)

custom

FAILED

object of type 'Node' has no len()

dynamo-ir

FAILED

Failed to export the model with torch.export. This is step 1/3 of exporting the model to ONNX. Next steps:
- Modify the model code for `torch.export.export` to succeed. Refer to https://pytorch.org/docs/stable/generated/exportdb/index.html for more information.
- Debug `torch.export.export` and submit a PR to PyTorch.
- Create an issue in the PyTorch GitHub repository against the *torch.export* component and attach the full error stack as well as reproduction scripts.

## Exception summary

<class 'TypeError'>: object of type 'Node' has no len()

(Refer to the full stack trace above for more information.)

Summary