Overview of Exportability Comparison#

The following script shows the exported program for many short cases to retrieve an ONNX model equivalent to the original model. Go to Bottom of the page to see a table summarizing the results. The summary explicitly includes the new-tracing exporter next to yobx, dynamo-ir, and tracing.

<<<

import warnings

warnings.filterwarnings("ignore")
import inspect
import textwrap
import pandas
from yobx.helpers import string_type
from yobx.helpers.onnx_helper import pretty_onnx
from yobx.torch.testing.model_eval_cases import discover, run_exporter
from yobx.ext_test_case import unit_test_going

cases = discover()
print()
print(":ref:`Summary <ledx-summary-exported-program>`")
print()
sorted_cases = sorted(cases.items())
if unit_test_going():
    sorted_cases = sorted_cases[:3]
for name, cls_model in sorted_cases:
    print(f"* :ref:`{name} <ledx-model-case-export-{name}>`")
print()
print()

obs = []
for name, cls_model in sorted(cases.items()):
    print()
    print(f".. _ledx-model-case-export-{name}:")
    print()
    print(name)
    print("=" * len(name))
    print()
    print(f"code: :class:`yobx.torch.testing._model_eval_cases.{name}`")
    print()
    print("forward")
    print("+++++++")
    print()
    print(".. code-block:: python")
    print()
    src = inspect.getsource(cls_model.forward)
    if src:
        print(textwrap.indent(textwrap.dedent(src), "    "))
    else:
        print("    # code is missing")
    print()
    print()
    for exporter in ("yobx", "dynamo-ir", "tracing", "new-tracing"):
        expname = exporter.replace("export-", "")
        print()
        print(expname)
        print("+" * len(expname))
        print()
        res = run_exporter(exporter, cls_model, True, quiet=True)
        case_ref = f":ref:`{name} <ledx-model-case-export-{name}>`"
        expo = exporter
        if "inputs" in res:
            print(f"* **inputs:** ``{string_type(res['inputs'], with_shape=True)}``")
        if "dynamic_shapes" in res:
            print(f"* **shapes:** ``{string_type(res['dynamic_shapes'])}``")
        print()
        print()
        if "onx" in res:
            print(".. code-block:: text")
            print()
            print(textwrap.indent(pretty_onnx(res["onx"]), "    "))
            print()
            print()
            if "error" not in res:
                obs.append(
                    dict(
                        case=case_ref, n_nodes=len(res["onx"].graph.node), exporter=expo
                    )
                )
        if "error" in res:
            print("**FAILED**")
            print()
            print(".. code-block:: text")
            print()
            err = str(res["error"])
            if err:
                print(textwrap.indent(err, "    "))
            else:
                print("    # no error found for the failure")
            print()
            print()
            obs.append(dict(case=case_ref, n_nodes="FAIL", exporter=expo))

print()
print(".. _ledx-summary-exported-program:")
print()
print("Summary")
print("+++++++")
print()
df = pandas.DataFrame(obs)
piv = df.pivot(index="case", columns="exporter", values="n_nodes")
print(piv.to_markdown(tablefmt="rst"))
print()

>>>

Summary

AtenAsStrided
AtenInterpolate
AtenNnFunctionalBilinear
AtenNonZero
AtenNonZeroTuple
AtenRollPos
AtenRollRelu
BuildInIsInstance
BuildInLen
ComplexPolar
ControlFlowCond
ControlFlowCond2Inputs
ControlFlowCond2Outputs
ControlFlowCondConstant
ControlFlowCondIdentity_153832
ControlFlowCondNestedModule
ControlFlowCondNonZero
ControlFlowIndirectRanks
ControlFlowIndirectRanksCat
ControlFlowNestCond
ControlFlowNumelZero1
ControlFlowNumelZero2
ControlFlowNumelZero3
ControlFlowNumelZero4
ControlFlowNumelZero5
ControlFlowRanks
ControlFlowRanksType
ControlFlowScan
ControlFlowScan2Carried
ControlFlowScanCDist
ControlFlowScanCDist2
ControlFlowScanCDistXY
ControlFlowScanDecomposition_151564
ControlFlowScanInplace_153705
ControlFlowShapeCheck
ControlFlowWhile
ControlFlowWhileDec
ControlFlowWhileInc
CreateFromShape
CreateFromShapeThroughFunction
CropLastDimensionWithTensorContent
CropLastDimensionWithTensorShape
DynamicCacheInput
DynamicCacheInputMixedLayers
ExportWithDimension0
ExportWithDimension1
ExportWithNewConstant
ExportWithNewConstantTo
InplaceAdd
InplaceAdd2
InplaceAdd_Mul
InplaceCloneAdd_
InplaceSetItemEllipsis_1
InplaceSetItemEllipsis_2
InplaceSetItemExp
InplaceSetItemMask
InplaceSetItemSquare
InplaceSetItemSquareAdd
InplaceSetItemSquareAdd2
LayerNorm
ShapeAndTypeAndDeviceBased
ShapeAndTypeBased
ShapeBased
SignatureFloat1
SignatureInt1
SignatureInt2
SignatureListFixedLength
SignatureListFixedWithNone
SignatureListVariableLength
SignatureShapeAsIndex
TinyLLM
TinyLLMfp16
TypeBFloat16
Vmap
VmapPython

AtenAsStrided#

code: yobx.torch.testing._model_eval_cases.AtenAsStrided

forward#

def forward(self, x):
    y = torch.as_strided(x, (2, 2, 8, 4), (128, 8, 16, 1))
    return y

yobx#

inputs: #1[(T1s2x2x8x8,)]
shapes: dict(x:{0:Dim(batch)})

opset: domain='' version=21
input: name='x' type=dtype('float32') shape=['batch', 2, 8, 8]
init: name='init7_s1_-1' type=int64 shape=(1,) -- array([-1])         -- Opset.make_node.1/Shape
init: name='init7_s128_' type=int64 shape=(128,)                      -- Opset.make_node.0
init: name='init7_s4_2_2_8_4' type=int64 shape=(4,) -- array([2, 2, 8, 4])-- Opset.make_node.1/Shape
Reshape(x, init7_s1_-1) -> x::RSh-1
  Gather(x::RSh-1, init7_s128_) -> _onx_gather_x::RSh-1
    Reshape(_onx_gather_x::RSh-1, init7_s4_2_2_8_4) -> output_0
output: name='output_0' type=dtype('float32') shape=[2, 2, 8, 4]

dynamo-ir#

inputs: #1[(T1s2x2x8x8,)]
shapes: dict(x:{0:Dim(batch)})

opset: domain='' version=20
input: name='x' type=dtype('float32') shape=['batch', 2, 8, 8]
init: name='rank_tensor' type=int64 shape=(1,) -- array([4])
init: name='val_0' type=int64 shape=(4,) -- array([2, 2, 8, 4])
init: name='val_1' type=int64 shape=(4,) -- array([128,   8,  16,   1])
init: name='neg_1' type=int64 shape=(1,) -- array([-1])
init: name='indices' type=int64 shape=() -- array([0])
init: name='rank_0' type=int64 shape=() -- array([4])
init: name='int64_1_cast' type=int64 shape=() -- array([1])
init: name='tmp_14' type=float32 shape=(1,) -- array([1.], dtype=float32)
Reshape(x, neg_1) -> self_flatten
SequenceEmpty() -> one_seq
Loop(rank_0, , indices, one_seq, body=G1) -> indices_16, one_seq_17
  CastLike(indices, indices_16) -> storage_offset_cast
  Add(indices_16, storage_offset_cast) -> indices_19
  Gather(self_flatten, indices_19) -> as_strided
output: name='as_strided' type=dtype('float32') shape=[2, 2, 8, 4]
----- subgraph ---- Loop - n6_2 - att.body=G1 -- level=1 -- i,cond_in,indices_1,one_seq_2 -> cond_out,indices_13,one_seq_15
input: name='i' type=dtype('int64') shape=None
input: name='cond_in' type=dtype('bool') shape=None
input: name='indices_1' type='NOTENSOR' shape=None
input: name='one_seq_2' type='NOTENSOR' shape=None
Equal(i, indices) -> cond
Sub(rank_0, i) -> tmp
Sub(tmp, int64_1_cast) -> j
Reshape(j, neg_1) -> j_tensor
Gather(val_0, j_tensor, axis=0) -> size_dim_j
Range(indices, size_dim_j, int64_1_cast) -> tmp_6
Slice(val_0, j_tensor, rank_tensor) -> size_after_j
  Expand(indices_1, size_after_j) -> indices_4
Gather(val_1, j_tensor, axis=0) -> stride_dim_j
  Mul(tmp_6, stride_dim_j) -> add_value
If(cond, then_branch=G2, else_branch=G3) -> shape_11
  Reshape(add_value, shape_11) -> add_value_12
    Add(indices_4, add_value_12) -> indices_13
SequenceInsert(one_seq_2, tmp_14) -> one_seq_15
Identity(cond_in) -> cond_out
output: name='cond_out' type=dtype('bool') shape=None
output: name='indices_13' type='NOTENSOR' shape=None
output: name='one_seq_15' type='NOTENSOR' shape=None
----- subgraph ---- If - n20 - att.then_branch=G2 -- level=2 --  -> shape
Identity(size_dim_j) -> shape
output: name='shape' type=dtype('int64') shape=[1]
----- subgraph ---- If - n20 - att.else_branch=G3 -- level=2 --  -> shape_10
Cast(size_dim_j, to=1) -> tmp_8
ConcatFromSequence(one_seq_2, axis=0) -> ones
  Concat(tmp_8, ones, axis=0) -> shape_9
    Cast(shape_9, to=7) -> shape_10
output: name='shape_10' type=dtype('int64') shape=None
----- subgraph ---- If - n20 - att.then_branch=G2 -- level=1 --  -> shape
Identity(size_dim_j) -> shape
output: name='shape' type=dtype('int64') shape=[1]
----- subgraph ---- If - n20 - att.else_branch=G3 -- level=1 --  -> shape_10
Cast(size_dim_j, to=1) -> tmp_8
ConcatFromSequence(one_seq_2, axis=0) -> ones
  Concat(tmp_8, ones, axis=0) -> shape_9
    Cast(shape_9, to=7) -> shape_10
output: name='shape_10' type=dtype('int64') shape=None

tracing#

inputs: #1[(T1s2x2x8x8,)]
shapes: dict(x:{0:Dim(batch)})

opset: domain='' version=21
input: name='x' type=dtype('float32') shape=['batch', 2, 8, 8]
init: name='init7_s1_-1' type=int64 shape=(1,) -- array([-1])         -- Opset.make_node.1/Shape
init: name='init7_s128_' type=int64 shape=(128,)                      -- Opset.make_node.0
init: name='init7_s4_2_2_8_4' type=int64 shape=(4,) -- array([2, 2, 8, 4])-- Opset.make_node.1/Shape
Reshape(x, init7_s1_-1) -> x::RSh-1
  Gather(x::RSh-1, init7_s128_) -> _onx_gather_x::RSh-1
    Reshape(_onx_gather_x::RSh-1, init7_s4_2_2_8_4) -> output
output: name='output' type=dtype('float32') shape=[2, 2, 8, 4]

new-tracing#

inputs: #1[(T1s2x2x8x8,)]
shapes: dict(x:{0:Dim(batch)})

opset: domain='' version=21
input: name='x' type=dtype('float32') shape=['batch', 2, 8, 8]
init: name='init7_s1_-1' type=int64 shape=(1,) -- array([-1])         -- Opset.make_node.1/Shape
init: name='init7_s128_' type=int64 shape=(128,)                      -- Opset.make_node.0
init: name='init7_s4_2_2_8_4' type=int64 shape=(4,) -- array([2, 2, 8, 4])-- Opset.make_node.1/Shape
Reshape(x, init7_s1_-1) -> x::RSh-1
  Gather(x::RSh-1, init7_s128_) -> _onx_gather_x::RSh-1
    Reshape(_onx_gather_x::RSh-1, init7_s4_2_2_8_4) -> output
output: name='output' type=dtype('float32') shape=[2, 2, 8, 4]

AtenInterpolate#

code: yobx.torch.testing._model_eval_cases.AtenInterpolate

forward#

def forward(self, x):
    y = torch.nn.functional.interpolate(
        x, scale_factor=2.0, mode="bilinear", recompute_scale_factor=False
    )
    return y

yobx#

inputs: #1[(T1s2x2x3x4,)]
shapes: dict(x:{0:Dim(batch)})

opset: domain='' version=21
input: name='x' type=dtype('float32') shape=['batch', 2, 3, 4]
init: name='init7_s2_6_8' type=int64 shape=(2,) -- array([6, 8])      -- _aten_upsample_output_size.rsize
Shape(x, end=2, start=0) -> x::Shape:2
  Concat(x::Shape:2, init7_s2_6_8, axis=0) -> _onx_concat_x::Shape:2
Resize(x, , , _onx_concat_x::Shape:2, coordinate_transformation_mode=b'pytorch_half_pixel', mode=b'linear', nearest_mode=b'floor') -> output_0
output: name='output_0' type=dtype('float32') shape=['batch', 2, 6, 8]

dynamo-ir#

inputs: #1[(T1s2x2x3x4,)]
shapes: dict(x:{0:Dim(batch)})

opset: domain='' version=20
input: name='x' type=dtype('float32') shape=['batch', 2, 3, 4]
init: name='val_0' type=float32 shape=(4,) -- array([1., 1., 2., 2.], dtype=float32)
Resize(x, , val_0, keep_aspect_ratio_policy=b'stretch', antialias=0, extrapolation_value=0.00, exclude_outside=0, nearest_mode=b'floor', coordinate_transformation_mode=b'pytorch_half_pixel', cubic_coeff_a=-0.75, mode=b'linear') -> upsample_bilinear2d
output: name='upsample_bilinear2d' type=dtype('float32') shape=['batch', 2, 6, 8]

tracing#

inputs: #1[(T1s2x2x3x4,)]
shapes: dict(x:{0:Dim(batch)})

opset: domain='' version=21
input: name='x' type=dtype('float32') shape=['batch', 2, 3, 4]
init: name='init7_s2_6_8' type=int64 shape=(2,) -- array([6, 8])      -- _aten_upsample_output_size.rsize
Shape(x, end=2, start=0) -> x::Shape:2
  Concat(x::Shape:2, init7_s2_6_8, axis=0) -> _onx_concat_x::Shape:2
Resize(x, , , _onx_concat_x::Shape:2, coordinate_transformation_mode=b'pytorch_half_pixel', mode=b'linear', nearest_mode=b'floor') -> output
output: name='output' type=dtype('float32') shape=['d_output_0', 'd_output_1', 'd_output_2', 'd_output_3']

new-tracing#

inputs: #1[(T1s2x2x3x4,)]
shapes: dict(x:{0:Dim(batch)})

opset: domain='' version=21
input: name='x' type=dtype('float32') shape=['batch', 2, 3, 4]
init: name='init7_s2_6_8' type=int64 shape=(2,) -- array([6, 8])      -- _aten_upsample_output_size.rsize
Shape(x, end=2, start=0) -> x::Shape:2
  Concat(x::Shape:2, init7_s2_6_8, axis=0) -> _onx_concat_x::Shape:2
Resize(x, , , _onx_concat_x::Shape:2, coordinate_transformation_mode=b'pytorch_half_pixel', mode=b'linear', nearest_mode=b'floor') -> output
output: name='output' type=dtype('float32') shape=['batch', 2, 6, 8]

AtenNnFunctionalBilinear#

code: yobx.torch.testing._model_eval_cases.AtenNnFunctionalBilinear

forward#

def forward(self, x1, x2):
    return torch.nn.functional.bilinear(x1, x2, self.weight, self.bias)

yobx#

inputs: #1[(T1s2x3,T1s2x4)]
shapes: dict(x1:{0:Dim(batch)},x2:{0:Dim(batch)})

opset: domain='' version=21
input: name='x1' type=dtype('float32') shape=['batch', 3]
input: name='x2' type=dtype('float32') shape=['batch', 4]
init: name='init7_s1_-1' type=int64 shape=(1,) -- array([-1])         -- Opset.make_node.1/Shape##Opset.make_node.1/Shape##ConcatReshapePattern.m1
init: name='init7_s1_-2' type=int64 shape=(1,) -- array([-2])         -- Opset.make_node.1/Shape
init: name='p_weight::T102::RSh3x20' type=float32 shape=(3, 20)       -- GraphBuilder.constant_folding.from/fold(_onx_concat_slice_p_weight::Shape:,p_weight::T102)##p_weight::T102/GraphBuilder.constant_folding.from/fold(p_weight)##p_weight/DynamoInterpret.placeholder.1/P(weight)##_onx_concat_slice_p_weight::Shape:/GraphBuilder.constant_folding.from/fold(_onx_mul_gather_p_weight::Shape:::UnSq0,_onx_slice_p_weight::Shape:)##_onx_slice_p_weight::Shape:/##_onx_mul_gather_p_weight::Shape:::UnSq0/GraphBuilder.constant_folding.from/fold(_onx_mul_gather_p_weight::Shape:,init7_s1_0)##_onx_mul_gather_p_weight::Shape:/GraphBuilder.constant_folding.from/fold(_onx_gather_p_weight::Shape:,_onx_gather_p_weight::Shape:2)##_onx_gather_p_weight::Shape:/##_onx_gather_p_weight::Shape:2/##init7_s1_0/Opset.make_node.1/Shape##Opset.make_node.1/Shape##Opset.make_node.1/Shape##Opset.make_node.1/Shape
init: name='init7_s3_0_5_4' type=int64 shape=(3,) -- array([0, 5, 4]) -- EditDistanceReshapePattern.m1
init: name='bias' type=float32 shape=(5,)                             -- DynamoInterpret.placeholder.1/P(bias)
MatMul(x1, p_weight::T102::RSh3x20) -> _onx_matmul_x1
  Reshape(_onx_matmul_x1, init7_s3_0_5_4) -> _onx_matmul_x1::RSh
Unsqueeze(x2, init7_s1_-2) -> x2::UnSq-2
  Mul(_onx_matmul_x1::RSh, x2::UnSq-2) -> _onx_mul_matmul_x1::RSh
    ReduceSum(_onx_mul_matmul_x1::RSh, init7_s1_-1, keepdims=0) -> _onx_reducesum_mul_matmul_x1::RSh
      Add(_onx_reducesum_mul_matmul_x1::RSh, bias) -> output_0
output: name='output_0' type=dtype('float32') shape=['batch', 5]

dynamo-ir#

inputs: #1[(T1s2x3,T1s2x4)]
shapes: dict(x1:{0:Dim(batch)},x2:{0:Dim(batch)})

opset: domain='' version=20
input: name='x1' type=dtype('float32') shape=['batch', 3]
input: name='x2' type=dtype('float32') shape=['batch', 4]
init: name='bias' type=float32 shape=(5,)
init: name='val_8' type=float32 shape=(3, 20)
init: name='val_2' type=int64 shape=(1,) -- array([4])
init: name='val_3' type=int64 shape=(1,) -- array([5])
init: name='val_4' type=int64 shape=(1,) -- array([-1])
MatMul(x1, val_8) -> val_9
Shape(x1, end=-1, start=0) -> val_0
  Concat(val_0, val_3, val_2, axis=0) -> val_10
  Reshape(val_9, val_10, allowzero=0) -> val_11
Unsqueeze(x2, val_4) -> val_12
  MatMul(val_11, val_12) -> val_13
    Squeeze(val_13, val_4) -> val_14
      Add(val_14, bias) -> bilinear
output: name='bilinear' type=dtype('float32') shape=['batch', 5]

tracing#

inputs: #1[(T1s2x3,T1s2x4)]
shapes: dict(x1:{0:Dim(batch)},x2:{0:Dim(batch)})

opset: domain='' version=21
input: name='x1' type=dtype('float32') shape=['batch', 3]
input: name='x2' type=dtype('float32') shape=['batch', 4]
init: name='bias' type=float32 shape=(5,)                             -- DynamoInterpret.get_attr.1/P(bias)
init: name='init7_s1_-1' type=int64 shape=(1,) -- array([-1])         -- Opset.make_node.1/Shape##Opset.make_node.1/Shape##ConcatReshapePattern.m1
init: name='init7_s1_-2' type=int64 shape=(1,) -- array([-2])         -- Opset.make_node.1/Shape
init: name='weight::T102::RSh3x20' type=float32 shape=(3, 20)         -- GraphBuilder.constant_folding.from/fold(_onx_concat_slice_weight::Shape:,weight::T102)##weight::T102/GraphBuilder.constant_folding.from/fold(weight)##weight/DynamoInterpret.get_attr.1/P(weight)##_onx_concat_slice_weight::Shape:/GraphBuilder.constant_folding.from/fold(_onx_mul_gather_weight::Shape:::UnSq0,_onx_slice_weight::Shape:)##_onx_slice_weight::Shape:/##_onx_mul_gather_weight::Shape:::UnSq0/GraphBuilder.constant_folding.from/fold(_onx_mul_gather_weight::Shape:,init7_s1_0)##_onx_mul_gather_weight::Shape:/GraphBuilder.constant_folding.from/fold(_onx_gather_weight::Shape:,_onx_gather_weight::Shape:2)##_onx_gather_weight::Shape:/##_onx_gather_weight::Shape:2/##init7_s1_0/Opset.make_node.1/Shape##Opset.make_node.1/Shape##Opset.make_node.1/Shape##Opset.make_node.1/Shape
init: name='init7_s3_0_5_4' type=int64 shape=(3,) -- array([0, 5, 4]) -- EditDistanceReshapePattern.m1
MatMul(x1, weight::T102::RSh3x20) -> _onx_matmul_x1
  Reshape(_onx_matmul_x1, init7_s3_0_5_4) -> _onx_matmul_x1::RSh
Unsqueeze(x2, init7_s1_-2) -> x2::UnSq-2
  Mul(_onx_matmul_x1::RSh, x2::UnSq-2) -> _onx_mul_matmul_x1::RSh
    ReduceSum(_onx_mul_matmul_x1::RSh, init7_s1_-1, keepdims=0) -> _onx_reducesum_mul_matmul_x1::RSh
      Add(_onx_reducesum_mul_matmul_x1::RSh, bias) -> output
output: name='output' type=dtype('float32') shape=['batch', 5]

new-tracing#

inputs: #1[(T1s2x3,T1s2x4)]
shapes: dict(x1:{0:Dim(batch)},x2:{0:Dim(batch)})

opset: domain='' version=21
input: name='x1' type=dtype('float32') shape=['batch', 3]
input: name='x2' type=dtype('float32') shape=['batch', 4]
init: name='bias' type=float32 shape=(5,)                             -- DynamoInterpret.placeholder.1/P(bias)
init: name='param_1' type=float32 shape=(3, 20)                       -- DynamoInterpret.placeholder.0
init: name='init7_s3_-1_5_4' type=int64 shape=(3,) -- array([-1,  5,  4])-- Opset.make_node.1/Shape
init: name='init7_s1_-1' type=int64 shape=(1,) -- array([-1])         -- Opset.make_node.1/Shape
init: name='init7_s1_1' type=int64 shape=(1,) -- array([1])           -- ReshapeIsSqueezePattern.m1
MatMul(x1, param_1) -> mm_default
  Reshape(mm_default, init7_s3_-1_5_4) -> view_default
Unsqueeze(x2, init7_s1_1) -> view_default_1
  Mul(view_default, view_default_1) -> mul_tensor
    ReduceSum(mul_tensor, init7_s1_-1, keepdims=0) -> sum_dim_int_list
      Add(sum_dim_int_list, bias) -> output
output: name='output' type=dtype('float32') shape=['batch', 5]

AtenNonZero#

code: yobx.torch.testing._model_eval_cases.AtenNonZero

forward#

def forward(self, x):
    y = torch.nonzero(x)
    return y

yobx#

inputs: #1[(T1s3x4,)]
shapes: dict(x:{0:Dim(batch)})

opset: domain='' version=21
input: name='x' type=dtype('float32') shape=['batch', 4]
NonZero(x) -> _onx_nonzero_x
  Transpose(_onx_nonzero_x, perm=[1,0]) -> output_0
output: name='output_0' type=dtype('int64') shape=['NEWDIM_nonzero', 2]

dynamo-ir#

inputs: #1[(T1s3x4,)]
shapes: dict(x:{0:Dim(batch)})

opset: domain='' version=20
input: name='x' type=dtype('float32') shape=['batch', 4]
NonZero(x) -> val_0
  Transpose(val_0, perm=[1,0]) -> nonzero
output: name='nonzero' type=dtype('int64') shape=['u0', 2]

tracing#

inputs: #1[(T1s3x4,)]
shapes: dict(x:{0:Dim(batch)})

opset: domain='' version=21
input: name='x' type=dtype('float32') shape=['batch', 4]
NonZero(x) -> _onx_nonzero_x
  Transpose(_onx_nonzero_x, perm=[1,0]) -> output
output: name='output' type=dtype('int64') shape=['NEWDIM_nonzero', 2]

new-tracing#

inputs: #1[(T1s3x4,)]
shapes: dict(x:{0:Dim(batch)})

opset: domain='' version=21
input: name='x' type=dtype('float32') shape=['batch', 4]
NonZero(x) -> _onx_nonzero_x
  Transpose(_onx_nonzero_x, perm=[1,0]) -> output
output: name='output' type=dtype('int64') shape=['NEWDIM_nonzero', 2]

AtenNonZeroTuple#

code: yobx.torch.testing._model_eval_cases.AtenNonZeroTuple

forward#

def forward(self, x):
    y = torch.nonzero(x, as_tuple=True)
    return y[0], y[1]

yobx#

inputs: #1[(T1s3x4,)]
shapes: dict(x:{0:Dim(batch)})

opset: domain='' version=21
input: name='x' type=dtype('float32') shape=['batch', 4]
init: name='init7_s1_0' type=int64 shape=(1,) -- array([0])           -- ReshapeIsSqueezePattern.m1##ReshapeIsSqueezePattern.m1
NonZero(x) -> _onx_nonzero_x
  Split(_onx_nonzero_x, num_outputs=2) -> _onx_split_nonzero_x_0, _onx_split_nonzero_x_1
    Squeeze(_onx_split_nonzero_x_0, init7_s1_0) -> output_0
Squeeze(_onx_split_nonzero_x_1, init7_s1_0) -> output_1
output: name='output_0' type=dtype('int64') shape=['u0']
output: name='output_1' type=dtype('int64') shape=['u0']

dynamo-ir#

inputs: #1[(T1s3x4,)]
shapes: dict(x:{0:Dim(batch)})

opset: domain='' version=20
input: name='x' type=dtype('float32') shape=['batch', 4]
init: name='val_3' type=int64 shape=(1,) -- array([0])
init: name='val_4' type=int64 shape=(1,) -- array([1])
init: name='val_9' type=int64 shape=(1,) -- array([2])
NonZero(x) -> val_0
  Transpose(val_0, perm=[1,0]) -> nonzero
    Slice(nonzero, val_3, val_4, val_4) -> val_6
      Squeeze(val_6, val_4) -> getitem
    Slice(nonzero, val_4, val_9, val_4) -> val_11
      Squeeze(val_11, val_4) -> getitem_1
output: name='getitem' type=dtype('int64') shape=['u0']
output: name='getitem_1' type=dtype('int64') shape=['u0']

tracing#

inputs: #1[(T1s3x4,)]
shapes: dict(x:{0:Dim(batch)})

opset: domain='' version=21
input: name='x' type=dtype('float32') shape=['batch', 4]
init: name='init7_s1_0' type=int64 shape=(1,) -- array([0])           -- ReshapeIsSqueezePattern.m1##ReshapeIsSqueezePattern.m1
NonZero(x) -> _onx_nonzero_x
  Split(_onx_nonzero_x, num_outputs=2) -> _onx_split_nonzero_x_0, _onx_split_nonzero_x_1
    Squeeze(_onx_split_nonzero_x_0, init7_s1_0) -> output_0
Squeeze(_onx_split_nonzero_x_1, init7_s1_0) -> output_1
output: name='output_0' type=dtype('int64') shape=['NEWDIM_nonzero']
output: name='output_1' type=dtype('int64') shape=['NEWDIM_nonzero']

new-tracing#

inputs: #1[(T1s3x4,)]
shapes: dict(x:{0:Dim(batch)})

opset: domain='' version=21
input: name='x' type=dtype('float32') shape=['batch', 4]
init: name='init7_s1_1' type=int64 shape=(1,) -- array([1])           -- aten_unbind_int.dim_np
NonZero(x) -> _onx_nonzero_x
  Transpose(_onx_nonzero_x, perm=[1,0]) -> nonzero_default
    Split(nonzero_default, axis=1, num_outputs=2) -> unbind_int_u0, unbind_int_u1
      Squeeze(unbind_int_u0, init7_s1_1) -> output_0
Squeeze(unbind_int_u1, init7_s1_1) -> output_1
output: name='output_0' type=dtype('int64') shape=['NEWDIM_nonzero']
output: name='output_1' type=dtype('int64') shape=['NEWDIM_nonzero']

AtenRollPos#

code: yobx.torch.testing._model_eval_cases.AtenRollPos

forward#

def forward(self, x):
    return torch.roll(x, 1, -1)

yobx#

inputs: #1[(T1s2x3x4,)]
shapes: dict(x:{0:Dim(batch)})

opset: domain='' version=21
input: name='x' type=dtype('float32') shape=['batch', 3, 4]
init: name='init7_s1_-1' type=int64 shape=(1,) -- array([-1])         -- Opset.make_node.1/Shape##Opset.make_node.1/Shape##Opset.make_node.1/Shape##Opset.make_node.1/Shape
init: name='init7_s1_4' type=int64 shape=(1,) -- array([4])           -- Opset.make_node.1/Shape
init: name='init7_s1_0' type=int64 shape=(1,) -- array([0])           -- Opset.make_node.1/Shape
Slice(x, init7_s1_-1, init7_s1_4, init7_s1_-1) -> _onx_slice_x
Slice(x, init7_s1_0, init7_s1_-1, init7_s1_-1) -> _onx_slice_x2
  Concat(_onx_slice_x, _onx_slice_x2, axis=-1) -> output_0
output: name='output_0' type=dtype('float32') shape=['batch', 3, 4]

dynamo-ir#

FAILED

An error occurred when running the '<onnx_ir.passes.PassManager object at 0x78532ef60e90>' pass after the following passes: ['<onnx_ir.passes.common.inliner.InlinePass object at 0x7853342732c0>']

tracing#

inputs: #1[(T1s2x3x4,)]
shapes: dict(x:{0:Dim(batch)})

opset: domain='' version=21
input: name='x' type=dtype('float32') shape=['batch', 3, 4]
init: name='init7_s1_-1' type=int64 shape=(1,) -- array([-1])         -- Opset.make_node.1/Shape##Opset.make_node.1/Shape##Opset.make_node.1/Shape##Opset.make_node.1/Shape
init: name='init7_s1_4' type=int64 shape=(1,) -- array([4])           -- Opset.make_node.1/Shape
init: name='init7_s1_0' type=int64 shape=(1,) -- array([0])           -- Opset.make_node.1/Shape
Slice(x, init7_s1_-1, init7_s1_4, init7_s1_-1) -> _onx_slice_x
Slice(x, init7_s1_0, init7_s1_-1, init7_s1_-1) -> _onx_slice_x2
  Concat(_onx_slice_x, _onx_slice_x2, axis=-1) -> output
output: name='output' type=dtype('float32') shape=['batch', 3, 4]

new-tracing#

inputs: #1[(T1s2x3x4,)]
shapes: dict(x:{0:Dim(batch)})

opset: domain='' version=21
input: name='x' type=dtype('float32') shape=['batch', 3, 4]
init: name='init7_s1_-1' type=int64 shape=(1,) -- array([-1])         -- Opset.make_node.1/Shape##Opset.make_node.1/Shape##Opset.make_node.1/Shape##Opset.make_node.1/Shape
init: name='init7_s1_4' type=int64 shape=(1,) -- array([4])           -- Opset.make_node.1/Shape
init: name='init7_s1_0' type=int64 shape=(1,) -- array([0])           -- Opset.make_node.1/Shape
Slice(x, init7_s1_-1, init7_s1_4, init7_s1_-1) -> _onx_slice_x
Slice(x, init7_s1_0, init7_s1_-1, init7_s1_-1) -> _onx_slice_x2
  Concat(_onx_slice_x, _onx_slice_x2, axis=-1) -> output
output: name='output' type=dtype('float32') shape=['batch', 3, 4]

AtenRollRelu#

code: yobx.torch.testing._model_eval_cases.AtenRollRelu

forward#

def forward(self, x):
    return torch.relu(torch.roll(x, -1, -1))

yobx#

inputs: #1[(T1s2x3x4,)]
shapes: dict(x:{0:Dim(batch)})

opset: domain='' version=21
input: name='x' type=dtype('float32') shape=['batch', 3, 4]
init: name='init7_s1_1' type=int64 shape=(1,) -- array([1])           -- Opset.make_node.1/Shape##Opset.make_node.1/Shape
init: name='init7_s1_4' type=int64 shape=(1,) -- array([4])           -- Opset.make_node.1/Shape
init: name='init7_s1_-1' type=int64 shape=(1,) -- array([-1])         -- Opset.make_node.1/Shape##Opset.make_node.1/Shape
init: name='init7_s1_0' type=int64 shape=(1,) -- array([0])           -- Opset.make_node.1/Shape
Slice(x, init7_s1_1, init7_s1_4, init7_s1_-1) -> _onx_slice_x
Slice(x, init7_s1_0, init7_s1_1, init7_s1_-1) -> _onx_slice_x2
  Concat(_onx_slice_x, _onx_slice_x2, axis=-1) -> _onx_concat_slice_x
    Relu(_onx_concat_slice_x) -> output_0
output: name='output_0' type=dtype('float32') shape=['batch', 3, 4]

dynamo-ir#

inputs: #1[(T1s2x3x4,)]
shapes: dict(x:{0:Dim(batch)})

opset: domain='' version=20
input: name='x' type=dtype('float32') shape=['batch', 3, 4]
init: name='val_0' type=int64 shape=(1,) -- array([-1])
init: name='val_2' type=int64 shape=(1,) -- array([1])
init: name='val_3' type=int64 shape=(1,) -- array([0])
Size(x) -> val_5
  Reshape(val_5, val_0, allowzero=0) -> val_6
    Slice(x, val_2, val_6, val_0) -> val_7
Slice(x, val_3, val_2, val_0) -> val_4
  Concat(val_7, val_4, axis=-1) -> roll
    Relu(roll) -> relu
output: name='relu' type=dtype('float32') shape=['batch', 3, 4]

tracing#

inputs: #1[(T1s2x3x4,)]
shapes: dict(x:{0:Dim(batch)})

opset: domain='' version=21
input: name='x' type=dtype('float32') shape=['batch', 3, 4]
init: name='init7_s1_1' type=int64 shape=(1,) -- array([1])           -- Opset.make_node.1/Shape##Opset.make_node.1/Shape
init: name='init7_s1_4' type=int64 shape=(1,) -- array([4])           -- Opset.make_node.1/Shape
init: name='init7_s1_-1' type=int64 shape=(1,) -- array([-1])         -- Opset.make_node.1/Shape##Opset.make_node.1/Shape
init: name='init7_s1_0' type=int64 shape=(1,) -- array([0])           -- Opset.make_node.1/Shape
Slice(x, init7_s1_1, init7_s1_4, init7_s1_-1) -> _onx_slice_x
Slice(x, init7_s1_0, init7_s1_1, init7_s1_-1) -> _onx_slice_x2
  Concat(_onx_slice_x, _onx_slice_x2, axis=-1) -> _onx_concat_slice_x
    Relu(_onx_concat_slice_x) -> output
output: name='output' type=dtype('float32') shape=['batch', 3, 4]

new-tracing#

inputs: #1[(T1s2x3x4,)]
shapes: dict(x:{0:Dim(batch)})

opset: domain='' version=21
input: name='x' type=dtype('float32') shape=['batch', 3, 4]
init: name='init7_s1_1' type=int64 shape=(1,) -- array([1])           -- Opset.make_node.1/Shape##Opset.make_node.1/Shape
init: name='init7_s1_4' type=int64 shape=(1,) -- array([4])           -- Opset.make_node.1/Shape
init: name='init7_s1_-1' type=int64 shape=(1,) -- array([-1])         -- Opset.make_node.1/Shape##Opset.make_node.1/Shape
init: name='init7_s1_0' type=int64 shape=(1,) -- array([0])           -- Opset.make_node.1/Shape
Slice(x, init7_s1_1, init7_s1_4, init7_s1_-1) -> _onx_slice_x
Slice(x, init7_s1_0, init7_s1_1, init7_s1_-1) -> _onx_slice_x2
  Concat(_onx_slice_x, _onx_slice_x2, axis=-1) -> _onx_concat_slice_x
    Relu(_onx_concat_slice_x) -> output
output: name='output' type=dtype('float32') shape=['batch', 3, 4]

BuildInIsInstance#

code: yobx.torch.testing._model_eval_cases.BuildInIsInstance

forward#

def forward(self, x, lx: list | torch.Tensor):
    if isinstance(lx, list):
        t = lx[0] * lx[1].sum(axis=1, keepdim=True)
        return torch.sigmoid(self.linear(x)) - self.buff + t
    return torch.sigmoid(self.linear(x)) - self.buff + lx

yobx#

inputs: #2[(T1s4x3,#2[T1s4x1,T1s4x2]),(T1s8x3,#2[T1s8x1,T1s8x2])]
shapes: dict(x:{0:Dim(batch)},lx:#2[{0:Dim(batch)},{0:Dim(batch)}])

opset: domain='' version=21
input: name='x' type=dtype('float32') shape=['batch', 3]
input: name='lx_0' type=dtype('float32') shape=['batch', 1]
input: name='lx_1' type=dtype('float32') shape=['batch', 2]
init: name='b_buff' type=float32 shape=(1,) -- array([0.5], dtype=float32)-- DynamoInterpret.placeholder.0
init: name='init7_s1_1' type=int64 shape=(1,) -- array([1])           -- Opset.make_node.1/Shape
init: name='GemmTransposePattern--p_linear_weight::T10' type=float32 shape=(1, 3) -- array([ 0.5522241 ,  0.20047867, -0.23895475], dtype=float32)-- GraphBuilder.constant_folding.from/fold(init7_s2_1_3,p_linear_weight::T10)##p_linear_weight::T10/GraphBuilder.constant_folding.from/fold(p_linear_weight)##p_linear_weight/DynamoInterpret.placeholder.1/P(linear.weight)##init7_s2_1_3/TransposeEqualReshapePattern.apply.new_shape
init: name='linear.bias' type=float32 shape=(1,) -- array([0.24268675], dtype=float32)-- DynamoInterpret.placeholder.1/P(linear.bias)
Gemm(x, GemmTransposePattern--p_linear_weight::T10, linear.bias, transB=1) -> _onx_add_matmul_x
  Sigmoid(_onx_add_matmul_x) -> sigmoid
    Sub(sigmoid, b_buff) -> sub
ReduceSum(lx_1, init7_s1_1, keepdims=1) -> sum_1
  Mul(lx_0, sum_1) -> mul
    Add(sub, mul) -> output_0
output: name='output_0' type=dtype('float32') shape=['batch', 1]

dynamo-ir#

inputs: #2[(T1s4x3,#2[T1s4x1,T1s4x2]),(T1s8x3,#2[T1s8x1,T1s8x2])]
shapes: dict(x:{0:Dim(batch)},lx:#2[{0:Dim(batch)},{0:Dim(batch)}])

opset: domain='' version=20
input: name='x' type=dtype('float32') shape=['batch', 3]
input: name='lx_0' type=dtype('float32') shape=['batch', 1]
input: name='lx_1' type=dtype('float32') shape=['batch', 2]
init: name='linear.weight' type=float32 shape=(1, 3) -- array([ 0.35791367, -0.5074354 , -0.01255095], dtype=float32)
init: name='linear.bias' type=float32 shape=(1,) -- array([-0.29197648], dtype=float32)
init: name='buff' type=float32 shape=(1,) -- array([0.5], dtype=float32)
init: name='val_3' type=int64 shape=(1,) -- array([1])
Gemm(x, linear.weight, linear.bias, beta=1.00, transB=1, alpha=1.00, transA=0) -> linear
  Sigmoid(linear) -> sigmoid
    Sub(sigmoid, buff) -> sub_4
ReduceSum(lx_1, val_3, noop_with_empty_axes=0, keepdims=1) -> sum_1
  Mul(lx_0, sum_1) -> mul_1
    Add(sub_4, mul_1) -> add_15
output: name='add_15' type=dtype('float32') shape=['batch', 1]

tracing#

inputs: #2[(T1s4x3,#2[T1s4x1,T1s4x2]),(T1s8x3,#2[T1s8x1,T1s8x2])]
shapes: dict(x:{0:Dim(batch)},lx:#2[{0:Dim(batch)},{0:Dim(batch)}])

opset: domain='' version=21
input: name='x' type=dtype('float32') shape=['batch', 3]
input: name='lx_0' type=dtype('float32') shape=['batch', 1]
input: name='lx_1' type=dtype('float32') shape=['batch', 2]
init: name='init7_s1_1' type=int64 shape=(1,) -- array([1])           -- Opset.make_node.1/Shape
init: name='_traced_m2.linear.bias' type=float32 shape=(1,) -- array([-0.5341333], dtype=float32)-- GraphBuilder.make_nodes/from_traced_m2.linear.bias##DynamoInterpret.get_attr.1/P(_traced_m2.linear.bias)
init: name='_traced_m2_buff' type=float32 shape=(1,) -- array([0.5], dtype=float32)-- DynamoInterpret.get_attr.0
init: name='GemmTransposePattern--_sub_ime___traced_m2_linear_weight::T10' type=float32 shape=(1, 3) -- array([-0.13798325, -0.4779298 , -0.4526842 ], dtype=float32)-- GraphBuilder.constant_folding.from/fold(_sub_ime___traced_m2_linear_weight::T10,init7_s2_1_3)##_sub_ime___traced_m2_linear_weight::T10/GraphBuilder.constant_folding.from/fold(_traced_m2.linear.weight)##_traced_m2.linear.weight/GraphBuilder.make_nodes/from_traced_m2.linear.weight##DynamoInterpret.get_attr.1/P(_traced_m2.linear.weight)##init7_s2_1_3/TransposeEqualReshapePattern.apply.new_shape
Gemm(x, GemmTransposePattern--_sub_ime___traced_m2_linear_weight::T10, _traced_m2.linear.bias, transB=1) -> _sub_ime___traced_m2_linear__onx_add_matmul_input_1
  Sigmoid(_sub_ime___traced_m2_linear__onx_add_matmul_input_1) -> sigmoid
    Sub(sigmoid, _traced_m2_buff) -> sub
ReduceSum(lx_1, init7_s1_1, keepdims=1) -> sum_1
  Mul(lx_0, sum_1) -> mul
    Add(sub, mul) -> output
output: name='output' type=dtype('float32') shape=['batch', 1]

new-tracing#

inputs: #2[(T1s4x3,#2[T1s4x1,T1s4x2]),(T1s8x3,#2[T1s8x1,T1s8x2])]
shapes: dict(x:{0:Dim(batch)},lx:#2[{0:Dim(batch)},{0:Dim(batch)}])

opset: domain='' version=21
input: name='x' type=dtype('float32') shape=['batch', 3]
input: name='lx_0' type=dtype('float32') shape=['batch', 1]
input: name='lx_1' type=dtype('float32') shape=['batch', 2]
init: name='buff' type=float32 shape=(1,) -- array([0.5], dtype=float32)-- DynamoInterpret.placeholder.0
init: name='init7_s1_1' type=int64 shape=(1,) -- array([1])           -- Opset.make_node.1/Shape
init: name='GemmTransposePattern--param_1' type=float32 shape=(1, 3) -- array([-0.47557998, -0.25769773, -0.10815234], dtype=float32)-- GraphBuilder.constant_folding.from/fold(init7_s2_1_3,param_1)##param_1/DynamoInterpret.placeholder.0##init7_s2_1_3/TransposeEqualReshapePattern.apply.new_shape
init: name='linear.bias' type=float32 shape=(1,) -- array([0.08265848], dtype=float32)-- DynamoInterpret.placeholder.1/P(linear.bias)
Gemm(x, GemmTransposePattern--param_1, linear.bias, transB=1) -> addmm_default
  Sigmoid(addmm_default) -> sigmoid_default
    Sub(sigmoid_default, buff) -> sub_tensor
ReduceSum(lx_1, init7_s1_1, keepdims=1) -> sum_dim_int_list
  Mul(lx_0, sum_dim_int_list) -> mul_tensor
    Add(sub_tensor, mul_tensor) -> output
output: name='output' type=dtype('float32') shape=['batch', 1]

BuildInLen#

code: yobx.torch.testing._model_eval_cases.BuildInLen

forward#

def forward(self, x, lx: list):
    t = lx[0] * lx[1].sum(axis=1, keepdim=True)
    if len(lx) > 2:
        t = t + lx[2].sum(axis=1, keepdim=True)
    return torch.sigmoid(self.linear(x)) - self.buff + t

yobx#

inputs: #2[(T1s4x3,#2[T1s4x1,T1s4x2]),(T1s8x3,#3[T1s8x1,T1s8x2,T1s8x3])]
shapes: dict(x:{0:Dim(batch)},lx:#2[{0:Dim(batch)},{0:Dim(batch)}])

opset: domain='' version=21
input: name='x' type=dtype('float32') shape=['batch', 3]
input: name='lx_0' type=dtype('float32') shape=['batch', 1]
input: name='lx_1' type=dtype('float32') shape=['batch', 2]
init: name='b_buff' type=float32 shape=(1,) -- array([0.5], dtype=float32)-- DynamoInterpret.placeholder.0
init: name='init7_s1_1' type=int64 shape=(1,) -- array([1])           -- Opset.make_node.1/Shape
init: name='GemmTransposePattern--p_linear_weight::T10' type=float32 shape=(1, 3) -- array([-0.2177556 , -0.11069345, -0.38715976], dtype=float32)-- GraphBuilder.constant_folding.from/fold(init7_s2_1_3,p_linear_weight::T10)##p_linear_weight::T10/GraphBuilder.constant_folding.from/fold(p_linear_weight)##p_linear_weight/DynamoInterpret.placeholder.1/P(linear.weight)##init7_s2_1_3/TransposeEqualReshapePattern.apply.new_shape
init: name='linear.bias' type=float32 shape=(1,) -- array([-0.10824601], dtype=float32)-- DynamoInterpret.placeholder.1/P(linear.bias)
Gemm(x, GemmTransposePattern--p_linear_weight::T10, linear.bias, transB=1) -> _onx_add_matmul_x
  Sigmoid(_onx_add_matmul_x) -> sigmoid
    Sub(sigmoid, b_buff) -> sub
ReduceSum(lx_1, init7_s1_1, keepdims=1) -> sum_1
  Mul(lx_0, sum_1) -> mul
    Add(sub, mul) -> output_0
output: name='output_0' type=dtype('float32') shape=['batch', 1]

FAILED

diff.1

dynamo-ir#

inputs: #2[(T1s4x3,#2[T1s4x1,T1s4x2]),(T1s8x3,#3[T1s8x1,T1s8x2,T1s8x3])]
shapes: dict(x:{0:Dim(batch)},lx:#2[{0:Dim(batch)},{0:Dim(batch)}])

opset: domain='' version=20
input: name='x' type=dtype('float32') shape=['batch', 3]
input: name='lx_0' type=dtype('float32') shape=['batch', 1]
input: name='lx_1' type=dtype('float32') shape=['batch', 2]
init: name='linear.weight' type=float32 shape=(1, 3) -- array([ 0.32544634, -0.17253661, -0.16184795], dtype=float32)
init: name='linear.bias' type=float32 shape=(1,) -- array([0.05756009], dtype=float32)
init: name='buff' type=float32 shape=(1,) -- array([0.5], dtype=float32)
init: name='val_3' type=int64 shape=(1,) -- array([1])
Gemm(x, linear.weight, linear.bias, beta=1.00, transB=1, alpha=1.00, transA=0) -> linear
  Sigmoid(linear) -> sigmoid
    Sub(sigmoid, buff) -> sub_4
ReduceSum(lx_1, val_3, noop_with_empty_axes=0, keepdims=1) -> sum_1
  Mul(lx_0, sum_1) -> mul_1
    Add(sub_4, mul_1) -> add_15
output: name='add_15' type=dtype('float32') shape=['batch', 1]

FAILED

diff.1

tracing#

inputs: #2[(T1s4x3,#2[T1s4x1,T1s4x2]),(T1s8x3,#3[T1s8x1,T1s8x2,T1s8x3])]
shapes: dict(x:{0:Dim(batch)},lx:#2[{0:Dim(batch)},{0:Dim(batch)}])

opset: domain='' version=21
input: name='x' type=dtype('float32') shape=['batch', 3]
input: name='lx_0' type=dtype('float32') shape=['batch', 1]
input: name='lx_1' type=dtype('float32') shape=['batch', 2]
init: name='init7_s1_1' type=int64 shape=(1,) -- array([1])           -- Opset.make_node.1/Shape
init: name='_traced_m2.linear.bias' type=float32 shape=(1,) -- array([0.49734473], dtype=float32)-- GraphBuilder.make_nodes/from_traced_m2.linear.bias##DynamoInterpret.get_attr.1/P(_traced_m2.linear.bias)
init: name='_traced_m2_buff' type=float32 shape=(1,) -- array([0.5], dtype=float32)-- DynamoInterpret.get_attr.0
init: name='GemmTransposePattern--_sub_ime___traced_m2_linear_weight::T10' type=float32 shape=(1, 3) -- array([-0.15868135,  0.5614927 , -0.02161906], dtype=float32)-- GraphBuilder.constant_folding.from/fold(_sub_ime___traced_m2_linear_weight::T10,init7_s2_1_3)##_sub_ime___traced_m2_linear_weight::T10/GraphBuilder.constant_folding.from/fold(_traced_m2.linear.weight)##_traced_m2.linear.weight/GraphBuilder.make_nodes/from_traced_m2.linear.weight##DynamoInterpret.get_attr.1/P(_traced_m2.linear.weight)##init7_s2_1_3/TransposeEqualReshapePattern.apply.new_shape
Gemm(x, GemmTransposePattern--_sub_ime___traced_m2_linear_weight::T10, _traced_m2.linear.bias, transB=1) -> _sub_ime___traced_m2_linear__onx_add_matmul_input_1
  Sigmoid(_sub_ime___traced_m2_linear__onx_add_matmul_input_1) -> sigmoid
    Sub(sigmoid, _traced_m2_buff) -> sub
ReduceSum(lx_1, init7_s1_1, keepdims=1) -> sum_1
  Mul(lx_0, sum_1) -> mul
    Add(sub, mul) -> output
output: name='output' type=dtype('float32') shape=['batch', 1]

FAILED

diff.1

new-tracing#

inputs: #2[(T1s4x3,#2[T1s4x1,T1s4x2]),(T1s8x3,#3[T1s8x1,T1s8x2,T1s8x3])]
shapes: dict(x:{0:Dim(batch)},lx:#2[{0:Dim(batch)},{0:Dim(batch)}])

opset: domain='' version=21
input: name='x' type=dtype('float32') shape=['batch', 3]
input: name='lx_0' type=dtype('float32') shape=['batch', 1]
input: name='lx_1' type=dtype('float32') shape=['batch', 2]
init: name='buff' type=float32 shape=(1,) -- array([0.5], dtype=float32)-- DynamoInterpret.placeholder.0
init: name='init7_s1_1' type=int64 shape=(1,) -- array([1])           -- Opset.make_node.1/Shape
init: name='GemmTransposePattern--param_1' type=float32 shape=(1, 3) -- array([ 0.05076363, -0.20016325,  0.53222644], dtype=float32)-- GraphBuilder.constant_folding.from/fold(init7_s2_1_3,param_1)##param_1/DynamoInterpret.placeholder.0##init7_s2_1_3/TransposeEqualReshapePattern.apply.new_shape
init: name='linear.bias' type=float32 shape=(1,) -- array([-0.49662384], dtype=float32)-- DynamoInterpret.placeholder.1/P(linear.bias)
Gemm(x, GemmTransposePattern--param_1, linear.bias, transB=1) -> addmm_default
  Sigmoid(addmm_default) -> sigmoid_default
    Sub(sigmoid_default, buff) -> sub_tensor
ReduceSum(lx_1, init7_s1_1, keepdims=1) -> sum_dim_int_list
  Mul(lx_0, sum_dim_int_list) -> mul_tensor
    Add(sub_tensor, mul_tensor) -> output
output: name='output' type=dtype('float32') shape=['batch', 1]

FAILED

diff.1

ComplexPolar#

code: yobx.torch.testing._model_eval_cases.ComplexPolar

forward#

def forward(self, x, angle):
    return torch.polar(x, angle)

yobx#

inputs: #1[(T1s4x4,T1s4x4)]
shapes: dict(x:{0:Dim(batch)},angle:{0:Dim(batch)})

opset: domain='' version=21
input: name='x' type=dtype('float32') shape=['batch', 4]
input: name='angle' type=dtype('float32') shape=['batch', 4]
init: name='init14_s1_' type=complex64 shape=(1,) -- array([0.+1.j], dtype=complex64)-- Opset.make_node.1/Small
Cast(x, to=14) -> x::C14
Cos(angle) -> _onx_cos_angle
  Cast(_onx_cos_angle, to=14) -> _onx_cos_angle::C14
Sin(angle) -> _onx_sin_angle
  Cast(_onx_sin_angle, to=14) -> _onx_sin_angle::C14
    Mul(_onx_sin_angle::C14, init14_s1_) -> _onx_mul_sin_angle::C14
    Add(_onx_cos_angle::C14, _onx_mul_sin_angle::C14) -> _onx_add_cos_angle::C14
  Mul(x::C14, _onx_add_cos_angle::C14) -> output_0
output: name='output_0' type=dtype('complex64') shape=['batch', 4]

FAILED

[ONNXRuntimeError] : 10 : INVALID_GRAPH : This is an invalid model. Type Error: Type 'tensor(complex64)' of input parameter (_onx_sin_angle::C14) of operator (Mul) in node (polar5) is invalid.

dynamo-ir#

inputs: #1[(T1s4x4,T1s4x4)]
shapes: dict(x:{0:Dim(batch)},angle:{0:Dim(batch)})

opset: domain='' version=20
input: name='x' type=dtype('float32') shape=['batch', 4]
input: name='angle' type=dtype('float32') shape=['batch', 4]
init: name='int64_m1_1d' type=int64 shape=(1,) -- array([-1])
Cos(angle) -> tmp
  Mul(x, tmp) -> tmp_0
    Unsqueeze(tmp_0, int64_m1_1d) -> real
Sin(angle) -> tmp_1
  Mul(x, tmp_1) -> tmp_2
    Unsqueeze(tmp_2, int64_m1_1d) -> imag
      Concat(real, imag, axis=-1) -> polar
output: name='polar' type=dtype('float32') shape=['batch', 4, 2]

FAILED

diff.0

tracing#

inputs: #1[(T1s4x4,T1s4x4)]
shapes: dict(x:{0:Dim(batch)},angle:{0:Dim(batch)})

opset: domain='' version=21
input: name='x' type=dtype('float32') shape=['batch', 4]
input: name='angle' type=dtype('float32') shape=['batch', 4]
init: name='init14_s1_' type=complex64 shape=(1,) -- array([0.+1.j], dtype=complex64)-- Opset.make_node.1/Small
Cast(x, to=14) -> x::C14
Cos(angle) -> _onx_cos_angle
  Cast(_onx_cos_angle, to=14) -> _onx_cos_angle::C14
Sin(angle) -> _onx_sin_angle
  Cast(_onx_sin_angle, to=14) -> _onx_sin_angle::C14
    Mul(_onx_sin_angle::C14, init14_s1_) -> _onx_mul_sin_angle::C14
    Add(_onx_cos_angle::C14, _onx_mul_sin_angle::C14) -> _onx_add_cos_angle::C14
  Mul(x::C14, _onx_add_cos_angle::C14) -> output
output: name='output' type=dtype('complex64') shape=['batch', 4]

FAILED

[ONNXRuntimeError] : 10 : INVALID_GRAPH : This is an invalid model. Type Error: Type 'tensor(complex64)' of input parameter (_onx_sin_angle::C14) of operator (Mul) in node (polar5) is invalid.

new-tracing#

FAILED

Could not extract specialized integer from data-dependent expression u0 (unhinted: u0).  (Size-like symbols: none)


Caused by: (utils/_stats.py:29 in wrapper)
For more information, run with TORCH_LOGS="dynamic"
For extended logs when we create symbols, also add TORCHDYNAMO_EXTENDED_DEBUG_CREATE_SYMBOL="u0"
If you suspect the guard was triggered from C++, add TORCHDYNAMO_EXTENDED_DEBUG_CPP=1
For more debugging help, see https://docs.google.com/document/d/1HSuTTVvYH1pTew89Rtpeu84Ht3nQEFTYhAX3Ypa_xJs/edit?usp=sharing

For C++ stack trace, run with TORCHDYNAMO_EXTENDED_DEBUG_CPP=1

ControlFlowCond#

code: yobx.torch.testing._model_eval_cases.ControlFlowCond

forward#

def forward(self, x):
    def true_fn(x):
        return torch.sin(x)

    def false_fn(x):
        return torch.cos(x)

    return torch.cond(x.sum() > 0, true_fn, false_fn, [x])

yobx#

inputs: #1[(T1s5x3,)]
shapes: dict(x:{0:Dim(batch)})

opset: domain='' version=21
opset: domain='local_functions' version=1
input: name='x' type=dtype('float32') shape=['batch', 3]
init: name='init1_s_' type=float32 shape=() -- array([0.], dtype=float32)-- shape_type_compute._cast_inputs.1(gt_Scalar)
ReduceSum(x, keepdims=0) -> sum_1
  Greater(sum_1, init1_s_) -> gt
    If(gt, else_branch=G1, then_branch=G2) -> output_0
output: name='output_0' type=dtype('float32') shape=['batch', 3]
----- subgraph ---- If - cond - att.else_branch=G1 -- level=1 --  -> cond#0
Cos(x) -> cond#0
output: name='cond#0' type='NOTENSOR' shape=None
----- subgraph ---- If - cond - att.then_branch=G2 -- level=1 --  -> cond#0
Sin(x) -> cond#0
output: name='cond#0' type='NOTENSOR' shape=None

dynamo-ir#

inputs: #1[(T1s5x3,)]
shapes: dict(x:{0:Dim(batch)})

opset: domain='' version=20
input: name='x' type=dtype('float32') shape=['batch', 3]
init: name='scalar_tensor_default' type=float32 shape=() -- array([0.], dtype=float32)
ReduceSum(x, noop_with_empty_axes=0, keepdims=0) -> sum_1
  Greater(sum_1, scalar_tensor_default) -> gt
    If(gt, then_branch=G1, else_branch=G2) -> getitem
output: name='getitem' type=dtype('float32') shape=['batch', 3]
----- subgraph ---- If - node_cond__0 - att.then_branch=G1 -- level=1 --  -> sin_true_graph_0
Sin(x) -> sin_true_graph_0
output: name='sin_true_graph_0' type=dtype('float32') shape=['batch', 3]
----- subgraph ---- If - node_cond__0 - att.else_branch=G2 -- level=1 --  -> cos_false_graph_0
Cos(x) -> cos_false_graph_0
output: name='cos_false_graph_0' type=dtype('float32') shape=['batch', 3]

tracing#

inputs: #1[(T1s5x3,)]
shapes: dict(x:{0:Dim(batch)})

opset: domain='' version=21
opset: domain='local_functions' version=1
input: name='x' type=dtype('float32') shape=['batch', 3]
init: name='init1_s_' type=float32 shape=() -- array([0.], dtype=float32)-- shape_type_compute._cast_inputs.1(gt)
ReduceSum(x, keepdims=0) -> sum_1
  Greater(sum_1, init1_s_) -> gt
    If(gt, else_branch=G1, then_branch=G2) -> output
output: name='output' type=dtype('float32') shape=['batch', 3]
----- subgraph ---- If - cond - att.else_branch=G1 -- level=1 --  -> condcc
Cos(x) -> condcc
output: name='condcc' type='NOTENSOR' shape=None
----- subgraph ---- If - cond - att.then_branch=G2 -- level=1 --  -> condcc
Sin(x) -> condcc
output: name='condcc' type='NOTENSOR' shape=None

new-tracing#

inputs: #1[(T1s5x3,)]
shapes: dict(x:{0:Dim(batch)})

opset: domain='' version=21
opset: domain='local_functions' version=1
input: name='x' type=dtype('float32') shape=['batch', 3]
init: name='init1_s_' type=float32 shape=() -- array([0.], dtype=float32)-- shape_type_compute._cast_inputs.1(gt_Scalar)
ReduceSum(x, keepdims=0) -> sum_default
  Greater(sum_default, init1_s_) -> gt_scalar
    If(gt_scalar, else_branch=G1, then_branch=G2) -> output
output: name='output' type=dtype('float32') shape=['batch', 3]
----- subgraph ---- If - cond - att.else_branch=G1 -- level=1 --  -> cond
Cos(x) -> cond
output: name='cond' type='NOTENSOR' shape=None
----- subgraph ---- If - cond - att.then_branch=G2 -- level=1 --  -> cond
Sin(x) -> cond
output: name='cond' type='NOTENSOR' shape=None

ControlFlowCond2Inputs#

code: yobx.torch.testing._model_eval_cases.ControlFlowCond2Inputs

forward#

def forward(self, x, y):
    def true_fn(x, y):
        return torch.sin(x), torch.cos(x) + y

    def false_fn(x, y):
        return torch.cos(x), torch.sin(x) + y

    return torch.cond(x.sum() > 0, true_fn, false_fn, [x, y])

yobx#

inputs: #1[(T1s5x3,T1s5x3)]
shapes: dict(x:{0:Dim(batch)},y:{0:Dim(batch)})

opset: domain='' version=21
opset: domain='local_functions' version=1
input: name='x' type=dtype('float32') shape=['batch', 3]
input: name='y' type=dtype('float32') shape=['batch', 3]
init: name='init1_s_' type=float32 shape=() -- array([0.], dtype=float32)-- shape_type_compute._cast_inputs.1(gt_Scalar)
ReduceSum(x, keepdims=0) -> sum_1
  Greater(sum_1, init1_s_) -> gt
    If(gt, else_branch=G1, then_branch=G2) -> output_0, output_1
output: name='output_0' type=dtype('float32') shape=['batch', 3]
output: name='output_1' type=dtype('float32') shape=['batch', 3]
----- subgraph ---- If - cond - att.else_branch=G1 -- level=1 --  -> cond#0,cond#1
Cos(x) -> cond#0
Sin(x) -> sin2
Add(sin2, y) -> cond#1
output: name='cond#0' type='NOTENSOR' shape=None
output: name='cond#1' type='NOTENSOR' shape=None
----- subgraph ---- If - cond - att.then_branch=G2 -- level=1 --  -> cond#0,cond#1
Cos(x) -> cos2
Add(cos2, y) -> cond#1
Sin(x) -> cond#0
output: name='cond#0' type='NOTENSOR' shape=None
output: name='cond#1' type='NOTENSOR' shape=None

dynamo-ir#

inputs: #1[(T1s5x3,T1s5x3)]
shapes: dict(x:{0:Dim(batch)},y:{0:Dim(batch)})

opset: domain='' version=20
input: name='x' type=dtype('float32') shape=['batch', 3]
input: name='y' type=dtype('float32') shape=['batch', 3]
init: name='scalar_tensor_default' type=float32 shape=() -- array([0.], dtype=float32)
ReduceSum(x, noop_with_empty_axes=0, keepdims=0) -> sum_1
  Greater(sum_1, scalar_tensor_default) -> gt
    If(gt, then_branch=G1, else_branch=G2) -> getitem, getitem_1
output: name='getitem' type=dtype('float32') shape=['batch', 3]
output: name='getitem_1' type=dtype('float32') shape=['batch', 3]
----- subgraph ---- If - node_cond__1 - att.then_branch=G1 -- level=1 --  -> sin_true_graph_0,add_12_true_graph_0
Cos(x) -> cos
Add(cos, y) -> add_12_true_graph_0
Sin(x) -> sin_true_graph_0
output: name='sin_true_graph_0' type=dtype('float32') shape=['batch', 3]
output: name='add_12_true_graph_0' type=dtype('float32') shape=['batch', 3]
----- subgraph ---- If - node_cond__1 - att.else_branch=G2 -- level=1 --  -> cos_false_graph_0,add_12_false_graph_0
Cos(x) -> cos_false_graph_0
Sin(x) -> sin_2
Add(sin_2, y) -> add_12_false_graph_0
output: name='cos_false_graph_0' type=dtype('float32') shape=['batch', 3]
output: name='add_12_false_graph_0' type=dtype('float32') shape=['batch', 3]

tracing#

inputs: #1[(T1s5x3,T1s5x3)]
shapes: dict(x:{0:Dim(batch)},y:{0:Dim(batch)})

opset: domain='' version=21
opset: domain='local_functions' version=1
input: name='x' type=dtype('float32') shape=['batch', 3]
input: name='y' type=dtype('float32') shape=['batch', 3]
init: name='init1_s_' type=float32 shape=() -- array([0.], dtype=float32)-- shape_type_compute._cast_inputs.1(gt)
ReduceSum(x, keepdims=0) -> sum_1
  Greater(sum_1, init1_s_) -> gt
    If(gt, else_branch=G1, then_branch=G2) -> output_0, output_1
output: name='output_0' type=dtype('float32') shape=['batch', 3]
output: name='output_1' type=dtype('float32') shape=['batch', 3]
----- subgraph ---- If - cond - att.else_branch=G1 -- level=1 --  -> condcc#0,condcc#1
Cos(x) -> condcc#0
Sin(x) -> sin2
Add(sin2, y) -> condcc#1
output: name='condcc#0' type='NOTENSOR' shape=None
output: name='condcc#1' type='NOTENSOR' shape=None
----- subgraph ---- If - cond - att.then_branch=G2 -- level=1 --  -> condcc#0,condcc#1
Cos(x) -> cos2
Add(cos2, y) -> condcc#1
Sin(x) -> condcc#0
output: name='condcc#0' type='NOTENSOR' shape=None
output: name='condcc#1' type='NOTENSOR' shape=None

new-tracing#

inputs: #1[(T1s5x3,T1s5x3)]
shapes: dict(x:{0:Dim(batch)},y:{0:Dim(batch)})

opset: domain='' version=21
opset: domain='local_functions' version=1
input: name='x' type=dtype('float32') shape=['batch', 3]
input: name='y' type=dtype('float32') shape=['batch', 3]
init: name='init1_s_' type=float32 shape=() -- array([0.], dtype=float32)-- shape_type_compute._cast_inputs.1(gt_Scalar)
ReduceSum(x, keepdims=0) -> sum_default
  Greater(sum_default, init1_s_) -> gt_scalar
    If(gt_scalar, else_branch=G1, then_branch=G2) -> output_0, output_1
output: name='output_0' type=dtype('float32') shape=['batch', 3]
output: name='output_1' type=dtype('float32') shape=['batch', 3]
----- subgraph ---- If - cond - att.else_branch=G1 -- level=1 --  -> cond#0,cond#1
Cos(x) -> cond#0
Sin(x) -> sin2
Add(sin2, y) -> cond#1
output: name='cond#0' type='NOTENSOR' shape=None
output: name='cond#1' type='NOTENSOR' shape=None
----- subgraph ---- If - cond - att.then_branch=G2 -- level=1 --  -> cond#0,cond#1
Cos(x) -> cos2
Add(cos2, y) -> cond#1
Sin(x) -> cond#0
output: name='cond#0' type='NOTENSOR' shape=None
output: name='cond#1' type='NOTENSOR' shape=None

ControlFlowCond2Outputs#

code: yobx.torch.testing._model_eval_cases.ControlFlowCond2Outputs

forward#

def forward(self, x):
    def true_fn(x):
        return torch.sin(x), torch.cos(x)

    def false_fn(x):
        return torch.cos(x), torch.sin(x)

    return torch.cond(x.sum() > 0, true_fn, false_fn, [x])

yobx#

inputs: #1[(T1s5x3,)]
shapes: dict(x:{0:Dim(batch)})

opset: domain='' version=21
opset: domain='local_functions' version=1
input: name='x' type=dtype('float32') shape=['batch', 3]
init: name='init1_s_' type=float32 shape=() -- array([0.], dtype=float32)-- shape_type_compute._cast_inputs.1(gt_Scalar)
ReduceSum(x, keepdims=0) -> sum_1
  Greater(sum_1, init1_s_) -> gt
    If(gt, else_branch=G1, then_branch=G2) -> output_0, output_1
output: name='output_0' type=dtype('float32') shape=['batch', 3]
output: name='output_1' type=dtype('float32') shape=['batch', 3]
----- subgraph ---- If - cond - att.else_branch=G1 -- level=1 --  -> cond#0,cond#1
Cos(x) -> cond#0
Sin(x) -> cond#1
output: name='cond#0' type='NOTENSOR' shape=None
output: name='cond#1' type='NOTENSOR' shape=None
----- subgraph ---- If - cond - att.then_branch=G2 -- level=1 --  -> cond#0,cond#1
Cos(x) -> cond#1
Sin(x) -> cond#0
output: name='cond#0' type='NOTENSOR' shape=None
output: name='cond#1' type='NOTENSOR' shape=None

dynamo-ir#

inputs: #1[(T1s5x3,)]
shapes: dict(x:{0:Dim(batch)})

opset: domain='' version=20
input: name='x' type=dtype('float32') shape=['batch', 3]
init: name='scalar_tensor_default' type=float32 shape=() -- array([0.], dtype=float32)
ReduceSum(x, noop_with_empty_axes=0, keepdims=0) -> sum_1
  Greater(sum_1, scalar_tensor_default) -> gt
    If(gt, then_branch=G1, else_branch=G2) -> getitem, getitem_1
output: name='getitem' type=dtype('float32') shape=['batch', 3]
output: name='getitem_1' type=dtype('float32') shape=['batch', 3]
----- subgraph ---- If - node_cond__1 - att.then_branch=G1 -- level=1 --  -> sin_true_graph_0,cos_true_graph_0
Cos(x) -> cos_true_graph_0
Sin(x) -> sin_true_graph_0
output: name='sin_true_graph_0' type=dtype('float32') shape=['batch', 3]
output: name='cos_true_graph_0' type=dtype('float32') shape=['batch', 3]
----- subgraph ---- If - node_cond__1 - att.else_branch=G2 -- level=1 --  -> cos_false_graph_0,sin_false_graph_0
Cos(x) -> cos_false_graph_0
Sin(x) -> sin_false_graph_0
output: name='cos_false_graph_0' type=dtype('float32') shape=['batch', 3]
output: name='sin_false_graph_0' type=dtype('float32') shape=['batch', 3]

tracing#

inputs: #1[(T1s5x3,)]
shapes: dict(x:{0:Dim(batch)})

opset: domain='' version=21
opset: domain='local_functions' version=1
input: name='x' type=dtype('float32') shape=['batch', 3]
init: name='init1_s_' type=float32 shape=() -- array([0.], dtype=float32)-- shape_type_compute._cast_inputs.1(gt)
ReduceSum(x, keepdims=0) -> sum_1
  Greater(sum_1, init1_s_) -> gt
    If(gt, else_branch=G1, then_branch=G2) -> output_0, output_1
output: name='output_0' type=dtype('float32') shape=['batch', 3]
output: name='output_1' type=dtype('float32') shape=['batch', 3]
----- subgraph ---- If - cond - att.else_branch=G1 -- level=1 --  -> condcc#0,condcc#1
Cos(x) -> condcc#0
Sin(x) -> condcc#1
output: name='condcc#0' type='NOTENSOR' shape=None
output: name='condcc#1' type='NOTENSOR' shape=None
----- subgraph ---- If - cond - att.then_branch=G2 -- level=1 --  -> condcc#0,condcc#1
Cos(x) -> condcc#1
Sin(x) -> condcc#0
output: name='condcc#0' type='NOTENSOR' shape=None
output: name='condcc#1' type='NOTENSOR' shape=None

new-tracing#

inputs: #1[(T1s5x3,)]
shapes: dict(x:{0:Dim(batch)})

opset: domain='' version=21
opset: domain='local_functions' version=1
input: name='x' type=dtype('float32') shape=['batch', 3]
init: name='init1_s_' type=float32 shape=() -- array([0.], dtype=float32)-- shape_type_compute._cast_inputs.1(gt_Scalar)
ReduceSum(x, keepdims=0) -> sum_default
  Greater(sum_default, init1_s_) -> gt_scalar
    If(gt_scalar, else_branch=G1, then_branch=G2) -> output_0, output_1
output: name='output_0' type=dtype('float32') shape=['batch', 3]
output: name='output_1' type=dtype('float32') shape=['batch', 3]
----- subgraph ---- If - cond - att.else_branch=G1 -- level=1 --  -> cond#0,cond#1
Cos(x) -> cond#0
Sin(x) -> cond#1
output: name='cond#0' type='NOTENSOR' shape=None
output: name='cond#1' type='NOTENSOR' shape=None
----- subgraph ---- If - cond - att.then_branch=G2 -- level=1 --  -> cond#0,cond#1
Cos(x) -> cond#1
Sin(x) -> cond#0
output: name='cond#0' type='NOTENSOR' shape=None
output: name='cond#1' type='NOTENSOR' shape=None

ControlFlowCondConstant#

code: yobx.torch.testing._model_eval_cases.ControlFlowCondConstant

forward#

def forward(self, x):
    def true_fn(x):
        return torch.sin(x) - torch.ones(x.shape, dtype=x.dtype, device=x.device)

    def false_fn(x):
        return torch.cos(x) + torch.ones((1, 1024), dtype=x.dtype, device=x.device)

    return torch.cond(x.sum() > 0, true_fn, false_fn, [x])

yobx#

inputs: #1[(T1s1024x1024,)]
shapes: dict(x:{0:Dim(batch)})

opset: domain='' version=21
opset: domain='local_functions' version=1
input: name='x' type=dtype('float32') shape=['batch', 1024]
init: name='init1_s_' type=float32 shape=() -- array([0.], dtype=float32)-- shape_type_compute._cast_inputs.1(gt_Scalar)
init: name='init7_s2_1_10242_cst2init' type=int64 shape=(2,) -- array([   1, 1024])-- GraphBuilderPatternOptimization.make_initializer.1/Shape
init: name='init7_s1_10242_cst2init' type=int64 shape=(1,) -- array([1024])-- GraphBuilderPatternOptimization.make_initializer.1/Shape
ReduceSum(x, keepdims=0) -> sum_1
  Greater(sum_1, init1_s_) -> gt
    If(gt, else_branch=G1, then_branch=G2) -> output_0
output: name='output_0' type=dtype('float32') shape=['batch', 1024]
----- subgraph ---- If - cond - att.else_branch=G1 -- level=1 --  -> cond#0
ConstantOfShape(init7_s2_1_10242_cst2init, value=[1.0]) -> ones2
Cos(x) -> cos2
  Add(cos2, ones2) -> cond#0
output: name='cond#0' type='NOTENSOR' shape=None
----- subgraph ---- If - cond - att.then_branch=G2 -- level=1 --  -> cond#0
Shape(x, end=1, start=0) -> x::Shape:12
Concat(x::Shape:12, init7_s1_10242_cst2init, axis=0) -> _onx_concat_sym_size_int_1::UnSq02
  ConstantOfShape(_onx_concat_sym_size_int_1::UnSq02, value=[1.0]) -> ones32
Sin(x) -> sin2
  Sub(sin2, ones32) -> cond#0
output: name='cond#0' type='NOTENSOR' shape=None

dynamo-ir#

inputs: #1[(T1s1024x1024,)]
shapes: dict(x:{0:Dim(batch)})

opset: domain='' version=20
input: name='x' type=dtype('float32') shape=['batch', 1024]
init: name='scalar_tensor_default' type=float32 shape=() -- array([0.], dtype=float32)
init: name='val_7' type=float32 shape=() -- array([1.], dtype=float32)
init: name='val_3' type=int64 shape=(1,) -- array([1024])
init: name='ones_2' type=float32 shape=(1, 1024)
ReduceSum(x, noop_with_empty_axes=0, keepdims=0) -> sum_1
  Greater(sum_1, scalar_tensor_default) -> gt
    If(gt, then_branch=G1, else_branch=G2) -> getitem
output: name='getitem' type=dtype('float32') shape=['batch', 1024]
----- subgraph ---- If - node_cond__0 - att.then_branch=G1 -- level=1 --  -> sub_3_true_graph_0
Shape(x, end=1, start=0) -> val_0_2
Concat(val_0_2, val_3, axis=0) -> val_4
Expand(val_7, val_4) -> ones
Sin(x) -> sin
  Sub(sin, ones) -> sub_3_true_graph_0
output: name='sub_3_true_graph_0' type=dtype('float32') shape=['batch', 1024]
----- subgraph ---- If - node_cond__0 - att.else_branch=G2 -- level=1 --  -> add_6_false_graph_0
Cos(x) -> cos
Add(cos, ones_2) -> add_6_false_graph_0
output: name='add_6_false_graph_0' type=dtype('float32') shape=['batch', 1024]

tracing#

inputs: #1[(T1s1024x1024,)]
shapes: dict(x:{0:Dim(batch)})

opset: domain='' version=21
opset: domain='local_functions' version=1
input: name='x' type=dtype('float32') shape=['batch', 1024]
init: name='init1_s_' type=float32 shape=() -- array([0.], dtype=float32)-- shape_type_compute._cast_inputs.1(gt)
init: name='init7_s2_1_10242_cst2init' type=int64 shape=(2,) -- array([   1, 1024])-- GraphBuilderPatternOptimization.make_initializer.1/Shape
ReduceSum(x, keepdims=0) -> sum_1
  Greater(sum_1, init1_s_) -> gt
    If(gt, else_branch=G1, then_branch=G2) -> output
output: name='output' type=dtype('float32') shape=['d_output_0', 'd_output_1']
----- subgraph ---- If - cond - att.else_branch=G1 -- level=1 --  -> condcc
ConstantOfShape(init7_s2_1_10242_cst2init, value=[1.0]) -> ones2
Cos(x) -> cos2
  Add(cos2, ones2) -> condcc
output: name='condcc' type='NOTENSOR' shape=None
----- subgraph ---- If - cond - att.then_branch=G2 -- level=1 --  -> condcc
Shape(x) -> size2
  ConstantOfShape(size2, value=[1.0]) -> ones32
Sin(x) -> sin2
  Sub(sin2, ones32) -> condcc
output: name='condcc' type='NOTENSOR' shape=None

new-tracing#

inputs: #1[(T1s1024x1024,)]
shapes: dict(x:{0:Dim(batch)})

opset: domain='' version=21
opset: domain='local_functions' version=1
input: name='x' type=dtype('float32') shape=['batch', 1024]
init: name='init1_s_' type=float32 shape=() -- array([0.], dtype=float32)-- shape_type_compute._cast_inputs.1(gt_Scalar)
init: name='init7_s2_1_10242_cst2init' type=int64 shape=(2,) -- array([   1, 1024])-- GraphBuilderPatternOptimization.make_initializer.1/Shape
ReduceSum(x, keepdims=0) -> sum_default
  Greater(sum_default, init1_s_) -> gt_scalar
    If(gt_scalar, else_branch=G1, then_branch=G2) -> output
output: name='output' type=dtype('float32') shape=['batch', 'd_output_1']
----- subgraph ---- If - cond - att.else_branch=G1 -- level=1 --  -> cond
ConstantOfShape(init7_s2_1_10242_cst2init, value=[1.0]) -> ones2
Cos(x) -> cos2
  Add(cos2, ones2) -> cond
output: name='cond' type='NOTENSOR' shape=None
----- subgraph ---- If - cond - att.then_branch=G2 -- level=1 --  -> cond
Shape(x) -> size2
  ConstantOfShape(size2, value=[1.0]) -> ones32
Sin(x) -> sin2
  Sub(sin2, ones32) -> cond
output: name='cond' type='NOTENSOR' shape=None

ControlFlowCondIdentity_153832#

code: yobx.torch.testing._model_eval_cases.ControlFlowCondIdentity_153832

forward#

def forward(self, x, y):
    def branch_cond_then_1(x):
        x = torch.abs(x) + 1
        return x

    def branch_cond_else_1(x):
        return x  # fails but succeeds with x.clone()

    x = torch.cond(x.sum() > 0, branch_cond_then_1, branch_cond_else_1, [x])
    return x + y

yobx#

FAILED

This higher order operator doesn't work unless it is captured completely with torch.compile. Got graph break/error:

Encountered aliasing during higher order op tracing
  Higher Order Operator: torch.cond
  Explanation: Higher order ops do not support aliasing. Found in <bound method HigherOrderOperator.name of <torch._higher_order_ops.cond.CondOp object at 0x78535821a6c0>>
  Hint: Replace `return input` with `return input.clone()` to avoid aliasing.
  Hint: Consider using the debug context to change user code to avoid aliasing.
  Hint: Please open an issue.

  Developer debug context: Input-to-output aliasing detected at nodes l_args_3_0_ and l_args_3_0_ in
     graph():
        %l_args_3_0_ : torch._subclasses.fake_tensor.FakeTensor [num_users=1] = placeholder[target=l_args_3_0_]
        return (l_args_3_0_,)

 For more details about this graph break, please visit: https://meta-pytorch.github.io/compile-graph-break-site/gb/gb0040.html

from user code:
   File "~/vv/this312/lib/python3.12/site-packages/torch/_higher_order_ops/cond.py", line 242, in _cond_op_wrapper
    return cond_op(*args, **kwargs)
  File "~/vv/this312/lib/python3.12/site-packages/torch/_export/non_strict_utils.py", line 1154, in __torch_function__
    return func(*args, **kwargs)

Set TORCHDYNAMO_VERBOSE=1 for the internal stack trace (please do this especially if you're reporting a bug to PyTorch). For even more developer context, set TORCH_LOGS="+dynamo"

dynamo-ir#

FAILED

Failed to export the model with torch.export. [96mThis is step 1/3[0m of exporting the model to ONNX. Next steps:
- Modify the model code for `torch.export.export` to succeed. Refer to https://pytorch.org/docs/stable/generated/exportdb/index.html for more information.
- Debug `torch.export.export` and submit a PR to PyTorch.
- Create an issue in the PyTorch GitHub repository against the [96m*torch.export*[0m component and attach the full error stack as well as reproduction scripts.

## Exception summary

<class 'torch._dynamo.exc.UncapturedHigherOrderOpError'>: This higher order operator doesn't work unless it is captured completely with torch.compile. Got graph break/error:

Encountered aliasing during higher order op tracing
  Higher Order Operator: torch.cond
  Explanation: Higher order ops do not support aliasing. Found in <bound method HigherOrderOperator.name of <torch._higher_order_ops.cond.CondOp object at 0x78535821a6c0>>
  Hint: Replace `return input` with `return input.clone()` to avoid aliasing.
  Hint: Consider using the debug context to change user code to avoid aliasing.
  Hint: Please open an issue.

  Developer debug context: Input-to-output aliasing detected at nodes l_args_3_0_ and l_args_3_0_ in
     graph():
        %l_args_3_0_ : torch._subclasses.fake_tensor.FakeTensor [num_users=1] = placeholder[target=l_args_3_0_]
        return (l_args_3_0_,)

 For more details about this graph break, please visit: https://meta-pytorch.github.io/compile-graph-break-site/gb/gb0040.html

from user code:
   File "~/vv/this312/lib/python3.12/site-packages/torch/_higher_order_ops/cond.py", line 242, in _cond_op_wrapper
    return cond_op(*args, **kwargs)
  File "~/vv/this312/lib/python3.12/site-packages/torch/_export/non_strict_utils.py", line 1154, in __torch_function__
    return func(*args, **kwargs)

Set TORCHDYNAMO_VERBOSE=1 for the internal stack trace (please do this especially if you're reporting a bug to PyTorch). For even more developer context, set TORCH_LOGS="+dynamo"


(Refer to the full stack trace above for more information.)

tracing#

FAILED

This higher order operator doesn't work unless it is captured completely with torch.compile. Got graph break/error:

Encountered aliasing during higher order op tracing
  Higher Order Operator: torch.cond
  Explanation: Higher order ops do not support aliasing. Found in <bound method HigherOrderOperator.name of <torch._higher_order_ops.cond.CondOp object at 0x78535821a6c0>>
  Hint: Replace `return input` with `return input.clone()` to avoid aliasing.
  Hint: Consider using the debug context to change user code to avoid aliasing.
  Hint: Please open an issue.

  Developer debug context: Input-to-output aliasing detected at nodes l_args_3_0_ and l_args_3_0_ in
     graph():
        %l_args_3_0_ : torch._subclasses.fake_tensor.FakeTensor [num_users=1] = placeholder[target=l_args_3_0_]
        return (l_args_3_0_,)

 For more details about this graph break, please visit: https://meta-pytorch.github.io/compile-graph-break-site/gb/gb0040.html

from user code:
   File "~/vv/this312/lib/python3.12/site-packages/torch/_higher_order_ops/cond.py", line 242, in _cond_op_wrapper
    return cond_op(*args, **kwargs)

Set TORCHDYNAMO_VERBOSE=1 for the internal stack trace (please do this especially if you're reporting a bug to PyTorch). For even more developer context, set TORCH_LOGS="+dynamo"

new-tracing#

FAILED

This higher order operator doesn't work unless it is captured completely with torch.compile. Got graph break/error:

Encountered aliasing during higher order op tracing
  Higher Order Operator: torch.cond
  Explanation: Higher order ops do not support aliasing. Found in <bound method HigherOrderOperator.name of <torch._higher_order_ops.cond.CondOp object at 0x78535821a6c0>>
  Hint: Replace `return input` with `return input.clone()` to avoid aliasing.
  Hint: Consider using the debug context to change user code to avoid aliasing.
  Hint: Please open an issue.

  Developer debug context: Input-to-output aliasing detected at nodes l_args_3_0_ and l_args_3_0_ in
     graph():
        %l_args_3_0_ : torch._subclasses.fake_tensor.FakeTensor [num_users=1] = placeholder[target=l_args_3_0_]
        return (l_args_3_0_,)

 For more details about this graph break, please visit: https://meta-pytorch.github.io/compile-graph-break-site/gb/gb0040.html

from user code:
   File "~/vv/this312/lib/python3.12/site-packages/torch/_higher_order_ops/cond.py", line 242, in _cond_op_wrapper
    return cond_op(*args, **kwargs)

Set TORCHDYNAMO_VERBOSE=1 for the internal stack trace (please do this especially if you're reporting a bug to PyTorch). For even more developer context, set TORCH_LOGS="+dynamo"

ControlFlowCondNestedModule#

code: yobx.torch.testing._model_eval_cases.ControlFlowCondNestedModule

forward#

def forward(self, x):
    def true_fn(x):
        return self.submodule(x)

    def false_fn(x):
        return x - self.weight

    y = torch.cond(x.sum() > 0, true_fn, false_fn, [x])
    return y

yobx#

inputs: #1[(T7s2,)]
shapes: dict(x:{0:Dim(batch)})

opset: domain='' version=21
opset: domain='local_functions.0' version=1
opset: domain='local_functions' version=1
input: name='x' type=dtype('int64') shape=['batch']
init: name='init7_s_0' type=int64 shape=() -- array([0])              -- shape_type_compute._cast_inputs.1(gt_Scalar)
init: name='init7_s_1002_cst2init' type=int64 shape=() -- array([100])-- GraphBuilderPatternOptimization.make_initializer.1/Shape
init: name='weight' type=float32 shape=(1,) -- array([42.], dtype=float32)-- DynamoInterpret.placeholder.1/P(weight)
init: name='submodule.weight' type=float32 shape=(1,) -- array([100.], dtype=float32)-- DynamoInterpret.placeholder.1/P(submodule.weight)
ReduceSum(x, keepdims=0) -> sum_1
  Greater(sum_1, init7_s_0) -> gt
    If(gt, else_branch=G1, then_branch=G2) -> output_0
output: name='output_0' type=dtype('float32') shape=['batch']
----- subgraph ---- If - cond - att.else_branch=G1 -- level=1 --  -> cond#0
Cast(x, to=1) -> x::C12
Sub(x::C12, weight) -> cond#0
output: name='cond#0' type='NOTENSOR' shape=None
----- subgraph ---- If - cond - att.then_branch=G2 -- level=1 --  -> cond#0
Abs(x) -> abs_12
  ReduceSum(abs_12, keepdims=0) -> sum_122
Greater(sum_122, init7_s_1002_cst2init) -> gt22
  If(gt22, else_branch=G3, then_branch=G4) -> cond#0
output: name='cond#0' type='NOTENSOR' shape=None
----- subgraph ---- If - cond22 - att.else_branch=G3 -- level=2 --  -> cond#0
Cast(x, to=1) -> x::C132
Div(x::C132, submodule.weight) -> cond#0
output: name='cond#0' type='NOTENSOR' shape=None
----- subgraph ---- If - cond22 - att.then_branch=G4 -- level=2 --  -> cond#0
Cast(x, to=1) -> x::C142
Mul(x::C142, submodule.weight) -> cond#0
output: name='cond#0' type='NOTENSOR' shape=None
----- subgraph ---- If - cond22 - att.else_branch=G3 -- level=1 --  -> cond#0
Cast(x, to=1) -> x::C132
Div(x::C132, submodule.weight) -> cond#0
output: name='cond#0' type='NOTENSOR' shape=None
----- subgraph ---- If - cond22 - att.then_branch=G4 -- level=1 --  -> cond#0
Cast(x, to=1) -> x::C142
Mul(x::C142, submodule.weight) -> cond#0
output: name='cond#0' type='NOTENSOR' shape=None

dynamo-ir#

inputs: #1[(T7s2,)]
shapes: dict(x:{0:Dim(batch)})

opset: domain='' version=20
input: name='x' type=dtype('int64') shape=['batch']
init: name='weight' type=float32 shape=(1,) -- array([42.], dtype=float32)
init: name='submodule.weight' type=float32 shape=(1,) -- array([100.], dtype=float32)
init: name='val_0' type=int64 shape=() -- array([0])
init: name='val_0_2' type=int64 shape=() -- array([100])
ReduceSum(x, noop_with_empty_axes=0, keepdims=0) -> sum_1
  Greater(sum_1, val_0) -> gt
    If(gt, then_branch=G1, else_branch=G2) -> getitem
output: name='getitem' type=dtype('float32') shape=['batch']
----- subgraph ---- If - node_cond__0 - att.then_branch=G1 -- level=1 --  -> getitem_true_graph_0
Abs(x) -> abs_1
  ReduceSum(abs_1, noop_with_empty_axes=0, keepdims=0) -> sum_1_2
Greater(sum_1_2, val_0_2) -> gt_2
  If(gt_2, then_branch=G3, else_branch=G4) -> getitem_true_graph_0
output: name='getitem_true_graph_0' type=dtype('float32') shape=['batch']
----- subgraph ---- If - node_cond__0_2 - att.then_branch=G3 -- level=2 --  -> mul_1_true_graph_0__true_graph_0
Cast(x, to=1) -> convert_element_type_default
Mul(convert_element_type_default, submodule.weight) -> mul_1_true_graph_0__true_graph_0
output: name='mul_1_true_graph_0__true_graph_0' type=dtype('float32') shape=['batch']
----- subgraph ---- If - node_cond__0_2 - att.else_branch=G4 -- level=2 --  -> div_true_graph_0__false_graph_0
Cast(x, to=1) -> convert_element_type_default_2
Div(convert_element_type_default_2, submodule.weight) -> div_true_graph_0__false_graph_0
output: name='div_true_graph_0__false_graph_0' type=dtype('float32') shape=['batch']
----- subgraph ---- If - node_cond__0_2 - att.then_branch=G3 -- level=1 --  -> mul_1_true_graph_0__true_graph_0
Cast(x, to=1) -> convert_element_type_default
Mul(convert_element_type_default, submodule.weight) -> mul_1_true_graph_0__true_graph_0
output: name='mul_1_true_graph_0__true_graph_0' type=dtype('float32') shape=['batch']
----- subgraph ---- If - node_cond__0_2 - att.else_branch=G4 -- level=1 --  -> div_true_graph_0__false_graph_0
Cast(x, to=1) -> convert_element_type_default_2
Div(convert_element_type_default_2, submodule.weight) -> div_true_graph_0__false_graph_0
output: name='div_true_graph_0__false_graph_0' type=dtype('float32') shape=['batch']
----- subgraph ---- If - node_cond__0 - att.else_branch=G2 -- level=1 --  -> sub_1_false_graph_0
Cast(x, to=1) -> convert_element_type_default_3
Sub(convert_element_type_default_3, weight) -> sub_1_false_graph_0
output: name='sub_1_false_graph_0' type=dtype('float32') shape=['batch']

tracing#

inputs: #1[(T7s2,)]
shapes: dict(x:{0:Dim(batch)})

opset: domain='' version=21
opset: domain='local_functions.0' version=1
opset: domain='local_functions' version=1
input: name='x' type=dtype('int64') shape=['batch']
init: name='init7_s_0' type=int64 shape=() -- array([0])              -- shape_type_compute._cast_inputs.1(gt)
init: name='weight2_cst2init' type=float32 shape=(1,) -- array([42.], dtype=float32)-- GraphBuilderPatternOptimization.make_initializer.1/Small
init: name='weight32_cst2init' type=float32 shape=(1,) -- array([100.], dtype=float32)-- GraphBuilderPatternOptimization.make_initializer.1/Small
init: name='init7_s_1002_cst2init' type=int64 shape=() -- array([100])-- GraphBuilderPatternOptimization.make_initializer.1/Shape
ReduceSum(x, keepdims=0) -> sum_1
  Greater(sum_1, init7_s_0) -> gt
    If(gt, else_branch=G1, then_branch=G2) -> output
output: name='output' type=dtype('float32') shape=['batch']
----- subgraph ---- If - cond - att.else_branch=G1 -- level=1 --  -> condcc
Cast(x, to=1) -> arg0::C12
Sub(arg0::C12, weight2_cst2init) -> condcc
output: name='condcc' type='NOTENSOR' shape=None
----- subgraph ---- If - cond - att.then_branch=G2 -- level=1 --  -> condcc
Abs(x) -> abs_12
  ReduceSum(abs_12, keepdims=0) -> sum_122
Greater(sum_122, init7_s_1002_cst2init) -> gt22
  If(gt22, else_branch=G3, then_branch=G4) -> condcc
output: name='condcc' type='NOTENSOR' shape=None
----- subgraph ---- If - cond22 - att.else_branch=G3 -- level=2 --  -> condcc
Cast(x, to=1) -> arg0::C132
Div(arg0::C132, weight32_cst2init) -> condcc
output: name='condcc' type='NOTENSOR' shape=None
----- subgraph ---- If - cond22 - att.then_branch=G4 -- level=2 --  -> condcc
Cast(x, to=1) -> arg0::C142
Mul(arg0::C142, weight32_cst2init) -> condcc
output: name='condcc' type='NOTENSOR' shape=None
----- subgraph ---- If - cond22 - att.else_branch=G3 -- level=1 --  -> condcc
Cast(x, to=1) -> arg0::C132
Div(arg0::C132, weight32_cst2init) -> condcc
output: name='condcc' type='NOTENSOR' shape=None
----- subgraph ---- If - cond22 - att.then_branch=G4 -- level=1 --  -> condcc
Cast(x, to=1) -> arg0::C142
Mul(arg0::C142, weight32_cst2init) -> condcc
output: name='condcc' type='NOTENSOR' shape=None

new-tracing#

inputs: #1[(T7s2,)]
shapes: dict(x:{0:Dim(batch)})

opset: domain='' version=21
opset: domain='local_functions.0' version=1
opset: domain='local_functions' version=1
input: name='x' type=dtype('int64') shape=['batch']
init: name='init7_s_0' type=int64 shape=() -- array([0])              -- shape_type_compute._cast_inputs.1(gt_Scalar)
init: name='weight22_cst2init' type=float32 shape=(1,) -- array([42.], dtype=float32)-- GraphBuilderPatternOptimization.make_initializer.1/Small
init: name='weight32_cst2init' type=float32 shape=(1,) -- array([100.], dtype=float32)-- GraphBuilderPatternOptimization.make_initializer.1/Small
init: name='init7_s_1002_cst2init' type=int64 shape=() -- array([100])-- GraphBuilderPatternOptimization.make_initializer.1/Shape
ReduceSum(x, keepdims=0) -> sum_default
  Greater(sum_default, init7_s_0) -> gt_scalar
    If(gt_scalar, else_branch=G1, then_branch=G2) -> output
output: name='output' type=dtype('float32') shape=['batch']
----- subgraph ---- If - cond - att.else_branch=G1 -- level=1 --  -> cond
Cast(x, to=1) -> arg0::C12
Sub(arg0::C12, weight22_cst2init) -> cond
output: name='cond' type='NOTENSOR' shape=None
----- subgraph ---- If - cond - att.then_branch=G2 -- level=1 --  -> cond
Abs(x) -> abs_12
  ReduceSum(abs_12, keepdims=0) -> sum_12
Greater(sum_12, init7_s_1002_cst2init) -> gt2
  If(gt2, else_branch=G3, then_branch=G4) -> cond
output: name='cond' type='NOTENSOR' shape=None
----- subgraph ---- If - cond22 - att.else_branch=G3 -- level=2 --  -> condcc
Cast(x, to=1) -> arg0::C132
Div(arg0::C132, weight32_cst2init) -> condcc
output: name='condcc' type='NOTENSOR' shape=None
----- subgraph ---- If - cond22 - att.then_branch=G4 -- level=2 --  -> condcc
Cast(x, to=1) -> arg0::C142
Mul(arg0::C142, weight32_cst2init) -> condcc
output: name='condcc' type='NOTENSOR' shape=None
----- subgraph ---- If - cond22 - att.else_branch=G3 -- level=1 --  -> condcc
Cast(x, to=1) -> arg0::C132
Div(arg0::C132, weight32_cst2init) -> condcc
output: name='condcc' type='NOTENSOR' shape=None
----- subgraph ---- If - cond22 - att.then_branch=G4 -- level=1 --  -> condcc
Cast(x, to=1) -> arg0::C142
Mul(arg0::C142, weight32_cst2init) -> condcc
output: name='condcc' type='NOTENSOR' shape=None

ControlFlowCondNonZero#

code: yobx.torch.testing._model_eval_cases.ControlFlowCondNonZero

forward#

def forward(self, input_ids, image_features, vocab_size):
    def then_branch(input_ids, image_features, vocab_size):
        input_shape = input_ids.size()
        input_ids = input_ids.view(-1, input_shape[-1])

        condition = (input_ids < 0) & (input_ids > -int(1e9))
        positions = torch.nonzero(condition, as_tuple=True)
        input_ids = input_ids.clamp_min(0).clamp_max(vocab_size)
        return (input_ids, positions[0], positions[1])

    def else_branch(input_ids, image_features, vocab_size):
        r = torch.where(torch.zeros((1, 1), dtype=torch.bool))
        return (input_ids, r[0], r[1])

    a, b, c = torch.cond(
        image_features.numel() > 0,
        then_branch,
        else_branch,
        [input_ids, image_features, vocab_size],
    )
    return a, b, c

yobx#

FAILED

Expect operands to be a tuple of possibly nested dict/list/tuple that only consists of tensor leaves, but got [FakeTensor(..., size=(s72, 12), dtype=torch.int64), FakeTensor(..., size=(s28, s11)), 1025].

dynamo-ir#

FAILED

Failed to export the model with torch.export. [96mThis is step 1/3[0m of exporting the model to ONNX. Next steps:
- Modify the model code for `torch.export.export` to succeed. Refer to https://pytorch.org/docs/stable/generated/exportdb/index.html for more information.
- Debug `torch.export.export` and submit a PR to PyTorch.
- Create an issue in the PyTorch GitHub repository against the [96m*torch.export*[0m component and attach the full error stack as well as reproduction scripts.

## Exception summary

<class 'RuntimeError'>: Expect operands to be a tuple of possibly nested dict/list/tuple that only consists of tensor leaves, but got [FakeTensor(..., size=(s72, 12), dtype=torch.int64), FakeTensor(..., size=(s28, s11)), 1025].

(Refer to the full stack trace above for more information.)

tracing#

inputs: #2[(T7s2x12,T1s2x16,int),(T7s2x12,T1s2x0,int)]
shapes: ({0:Dim(batch)},{0:Dim(batch),1:Dim(seq_length)},None)

opset: domain='' version=21
opset: domain='local_functions' version=1
input: name='input_ids' type=dtype('int64') shape=['batch', 12]
input: name='image_features' type=dtype('float32') shape=['batch', 'seq_length']
input: name='vocab_size' type=dtype('int64') shape=None
init: name='init7_s_0' type=int64 shape=() -- array([0])              -- shape_type_compute._cast_inputs.1(gt)
init: name='init7_s1_02_cst2init' type=int64 shape=(1,) -- array([0]) -- GraphBuilderPatternOptimization.make_initializer.1/Shape
init: name='init7_s_-1000000000::RSh12_cst2init' type=int64 shape=(1,) -- array([-1000000000])-- GraphBuilderPatternOptimization.make_initializer.1/Shape
Size(image_features) -> numel
  Greater(numel, init7_s_0) -> gt
    If(gt, else_branch=G1, then_branch=G2) -> output_0, output_1, output_2
output: name='output_0' type=dtype('int64') shape=['batch', 12]
output: name='output_1' type=dtype('int64') shape=['NEWDIM_nonzero']
output: name='output_2' type=dtype('int64') shape=['NEWDIM_nonzero']
----- subgraph ---- If - cond - att.else_branch=G1 -- level=1 --  -> condcc#0,condcc#1,condcc#2
Constant(value=[]) -> condcc#1
  Identity(condcc#1) -> condcc#2
Identity(input_ids) -> condcc#0
output: name='condcc#0' type='NOTENSOR' shape=None
output: name='condcc#1' type='NOTENSOR' shape=None
output: name='condcc#2' type='NOTENSOR' shape=None
----- subgraph ---- If - cond - att.then_branch=G2 -- level=1 --  -> condcc#0,condcc#1,condcc#2
Clip(input_ids, init7_s1_02_cst2init) -> clamp_min2
Min(clamp_min2, vocab_size) -> condcc#0
Less(input_ids, init7_s1_02_cst2init) -> lt2
Greater(input_ids, init7_s_-1000000000::RSh12_cst2init) -> gt22
  And(lt2, gt22) -> and_2
    NonZero(and_2) -> _onx_nonzero_and_2
      Split(_onx_nonzero_and_2, num_outputs=2) -> _onx_split_nonzero_and__02, _onx_split_nonzero_and__12
Squeeze(_onx_split_nonzero_and__02, init7_s1_02_cst2init) -> condcc#1
Squeeze(_onx_split_nonzero_and__12, init7_s1_02_cst2init) -> condcc#2
output: name='condcc#0' type='NOTENSOR' shape=None
output: name='condcc#1' type='NOTENSOR' shape=None
output: name='condcc#2' type='NOTENSOR' shape=None

new-tracing#

inputs: #2[(T7s2x12,T1s2x16,int),(T7s2x12,T1s2x0,int)]
shapes: ({0:Dim(batch)},{0:Dim(batch),1:Dim(seq_length)},None)

opset: domain='' version=21
opset: domain='local_functions' version=1
input: name='input_ids' type=dtype('int64') shape=['batch', 12]
input: name='image_features' type=dtype('float32') shape=['batch', 'seq_length']
init: name='init7_s_0' type=int64 shape=() -- array([0])              -- shape_type_compute._cast_inputs.1(gt_Scalar)
init: name='cst_scalar_int' type=int64 shape=() -- array([1025])      -- aten_cond_scalar
init: name='init7_s1_02_cst2init' type=int64 shape=(1,) -- array([0]) -- GraphBuilderPatternOptimization.make_initializer.1/Shape
init: name='init7_s_-10000000002_cst2init' type=int64 shape=() -- array([-1000000000])-- GraphBuilderPatternOptimization.make_initializer.1/Shape
Shape(image_features, end=1, start=0) -> image_features::Shape:1
  Squeeze(image_features::Shape:1) -> sym_size_int
Shape(image_features, end=2, start=1) -> image_features::Shape1:2
  Squeeze(image_features::Shape1:2) -> sym_size_int_1
    Mul(sym_size_int, sym_size_int_1) -> mul_tensor
      Greater(mul_tensor, init7_s_0) -> gt_scalar
        If(gt_scalar, else_branch=G1, then_branch=G2) -> output_0, output_1, output_2
output: name='output_0' type=dtype('int64') shape=['batch', 12]
output: name='output_1' type=dtype('int64') shape=['u2']
output: name='output_2' type=dtype('int64') shape=['u2']
----- subgraph ---- If - cond - att.else_branch=G1 -- level=1 --  -> cond#0,cond#1,cond#2
Constant(value=[]) -> cond#1
  Identity(cond#1) -> cond#2
Identity(input_ids) -> cond#0
output: name='cond#0' type='NOTENSOR' shape=None
output: name='cond#1' type='NOTENSOR' shape=None
output: name='cond#2' type='NOTENSOR' shape=None
----- subgraph ---- If - cond - att.then_branch=G2 -- level=1 --  -> cond#0,cond#1,cond#2
CastLike(init7_s1_02_cst2init, input_ids) -> _onx_castlike_init7_s1_02
Clip(input_ids, _onx_castlike_init7_s1_02) -> clamp_min2
Min(clamp_min2, cst_scalar_int) -> cond#0
Less(input_ids, init7_s_0) -> lt2
Greater(input_ids, init7_s_-10000000002_cst2init) -> gt2
  And(lt2, gt2) -> and_2
    NonZero(and_2) -> _onx_nonzero_and_2
      Split(_onx_nonzero_and_2, num_outputs=2) -> _onx_split_nonzero_and__02, _onx_split_nonzero_and__12
Squeeze(_onx_split_nonzero_and__02, init7_s1_02_cst2init) -> cond#1
Squeeze(_onx_split_nonzero_and__12, init7_s1_02_cst2init) -> cond#2
output: name='cond#0' type='NOTENSOR' shape=None
output: name='cond#1' type='NOTENSOR' shape=None
output: name='cond#2' type='NOTENSOR' shape=None

ControlFlowIndirectRanks#

code: yobx.torch.testing._model_eval_cases.ControlFlowIndirectRanks

forward#

def forward(self, x):
    x1 = x + 1
    if x1.ndim == 2:
        return x1.clone()
    return x / x1.ndim

yobx#

inputs: #2[(T1s3x4,),(T1s5x4,)]
shapes: dict(x:{0:Dim(batch)})

opset: domain='' version=21
input: name='x' type=dtype('float32') shape=['batch', 4]
init: name='init1_s_::RSh1' type=float32 shape=(1,) -- array([1.], dtype=float32)-- GraphBuilder.constant_folding.from/fold(init1_s_,init7_s1_1)##init1_s_/shape_type_compute._cast_inputs.0##init7_s1_1/Opset.make_node.1/Shape
Add(x, init1_s_::RSh1) -> output_0
output: name='output_0' type=dtype('float32') shape=['batch', 4]

dynamo-ir#

inputs: #2[(T1s3x4,),(T1s5x4,)]
shapes: dict(x:{0:Dim(batch)})

opset: domain='' version=20
input: name='x' type=dtype('float32') shape=['batch', 4]
init: name='scalar_tensor_default' type=float32 shape=() -- array([1.], dtype=float32)
Add(x, scalar_tensor_default) -> clone
output: name='clone' type=dtype('float32') shape=['batch', 4]

tracing#

inputs: #2[(T1s3x4,),(T1s5x4,)]
shapes: dict(x:{0:Dim(batch)})

opset: domain='' version=21
input: name='x' type=dtype('float32') shape=['batch', 4]
init: name='init1_s_::RSh1' type=float32 shape=(1,) -- array([1.], dtype=float32)-- GraphBuilder.constant_folding.from/fold(init1_s_,init7_s1_1)##init1_s_/shape_type_compute._cast_inputs.1(add)##init7_s1_1/Opset.make_node.1/Shape
Add(x, init1_s_::RSh1) -> output
output: name='output' type=dtype('float32') shape=['batch', 4]

new-tracing#

inputs: #2[(T1s3x4,),(T1s5x4,)]
shapes: dict(x:{0:Dim(batch)})

opset: domain='' version=21
input: name='x' type=dtype('float32') shape=['batch', 4]
init: name='init1_s_::RSh1' type=float32 shape=(1,) -- array([1.], dtype=float32)-- GraphBuilder.constant_folding.from/fold(init1_s_,init7_s1_1)##init1_s_/shape_type_compute._cast_inputs.0##init7_s1_1/Opset.make_node.1/Shape
Add(x, init1_s_::RSh1) -> output
output: name='output' type=dtype('float32') shape=['batch', 4]

ControlFlowIndirectRanksCat#

code: yobx.torch.testing._model_eval_cases.ControlFlowIndirectRanksCat

forward#

def forward(self, x, y):
    x1 = x + 1
    y1 = y + 2
    cat = torch.cat([x1, y1], dim=1)
    if cat.ndim == 2:
        return cat.clone()
    return cat / cat.ndim

yobx#

inputs: #2[(T1s3x4,T1s3x4),(T1s5x4,T1s5x2)]
shapes: dict(x:{0:Dim(batch)},y:{0:Dim(batch),1:Dim(seq)})

opset: domain='' version=21
input: name='x' type=dtype('float32') shape=['batch', 4]
input: name='y' type=dtype('float32') shape=['batch', 'seq']
init: name='init1_s_::RSh1' type=float32 shape=(1,) -- array([1.], dtype=float32)-- GraphBuilder.constant_folding.from/fold(init1_s_,init7_s1_1)##init1_s_/shape_type_compute._cast_inputs.0##init7_s1_1/Opset.make_node.1/Shape##Opset.make_node.1/Shape
init: name='init1_s_2::RSh1' type=float32 shape=(1,) -- array([2.], dtype=float32)-- GraphBuilder.constant_folding.from/fold(init1_s_2,init7_s1_1)##init1_s_2/shape_type_compute._cast_inputs.0##init7_s1_1/Opset.make_node.1/Shape##Opset.make_node.1/Shape
Add(x, init1_s_::RSh1) -> add
Add(y, init1_s_2::RSh1) -> add_1
  Concat(add, add_1, axis=1) -> output_0
output: name='output_0' type=dtype('float32') shape=['batch', 'seq+4']

dynamo-ir#

inputs: #2[(T1s3x4,T1s3x4),(T1s5x4,T1s5x2)]
shapes: dict(x:{0:Dim(batch)},y:{0:Dim(batch),1:Dim(seq)})

opset: domain='' version=20
input: name='x' type=dtype('float32') shape=['batch', 4]
input: name='y' type=dtype('float32') shape=['batch', 'seq']
init: name='scalar_tensor_default' type=float32 shape=() -- array([1.], dtype=float32)
init: name='scalar_tensor_default_1' type=float32 shape=() -- array([2.], dtype=float32)
Add(x, scalar_tensor_default) -> add
Add(y, scalar_tensor_default_1) -> add_4
  Concat(add, add_4, axis=1) -> clone
output: name='clone' type=dtype('float32') shape=['batch', 'seq + 4']

tracing#

inputs: #2[(T1s3x4,T1s3x4),(T1s5x4,T1s5x2)]
shapes: dict(x:{0:Dim(batch)},y:{0:Dim(batch),1:Dim(seq)})

opset: domain='' version=21
input: name='x' type=dtype('float32') shape=['batch', 4]
input: name='y' type=dtype('float32') shape=['batch', 'seq']
init: name='init1_s_::RSh1' type=float32 shape=(1,) -- array([1.], dtype=float32)-- GraphBuilder.constant_folding.from/fold(init1_s_,init7_s1_1)##init1_s_/shape_type_compute._cast_inputs.1(add)##init7_s1_1/Opset.make_node.1/Shape##Opset.make_node.1/Shape
init: name='init1_s_2::RSh1' type=float32 shape=(1,) -- array([2.], dtype=float32)-- GraphBuilder.constant_folding.from/fold(init1_s_2,init7_s1_1)##init1_s_2/shape_type_compute._cast_inputs.1(add)##init7_s1_1/Opset.make_node.1/Shape##Opset.make_node.1/Shape
Add(x, init1_s_::RSh1) -> _onx_add_x
Add(y, init1_s_2::RSh1) -> _onx_add_y
  Concat(_onx_add_x, _onx_add_y, axis=1) -> output
output: name='output' type=dtype('float32') shape=['batch', 'seq+4']

new-tracing#

inputs: #2[(T1s3x4,T1s3x4),(T1s5x4,T1s5x2)]
shapes: dict(x:{0:Dim(batch)},y:{0:Dim(batch),1:Dim(seq)})

opset: domain='' version=21
input: name='x' type=dtype('float32') shape=['batch', 4]
input: name='y' type=dtype('float32') shape=['batch', 'seq']
init: name='init1_s_::RSh1' type=float32 shape=(1,) -- array([1.], dtype=float32)-- GraphBuilder.constant_folding.from/fold(init1_s_,init7_s1_1)##init1_s_/shape_type_compute._cast_inputs.0##init7_s1_1/Opset.make_node.1/Shape##Opset.make_node.1/Shape
init: name='init1_s_2::RSh1' type=float32 shape=(1,) -- array([2.], dtype=float32)-- GraphBuilder.constant_folding.from/fold(init1_s_2,init7_s1_1)##init1_s_2/shape_type_compute._cast_inputs.0##init7_s1_1/Opset.make_node.1/Shape##Opset.make_node.1/Shape
Add(x, init1_s_::RSh1) -> add_tensor
Add(y, init1_s_2::RSh1) -> add_tensor_1
  Concat(add_tensor, add_tensor_1, axis=1) -> output
output: name='output' type=dtype('float32') shape=['batch', 'seq+4']

ControlFlowNestCond#

code: yobx.torch.testing._model_eval_cases.ControlFlowNestCond

forward#

def forward(self, x):
    def true_fn2(x):
        def true_fn1(x):
            return torch.sin(x)

        def false_fn1(x):
            return torch.cos(x)

        return torch.cond(x.sum() < 0, true_fn1, false_fn1, [x])

    def false_fn2(x):
        return -x

    return torch.cond(x.sum() > 0, true_fn2, false_fn2, [x])

yobx#

inputs: #1[(T1s5x3,)]
shapes: dict(x:{0:Dim(batch)})

opset: domain='' version=21
opset: domain='local_functions.0' version=1
opset: domain='local_functions' version=1
input: name='x' type=dtype('float32') shape=['batch', 3]
init: name='init1_s_' type=float32 shape=() -- array([0.], dtype=float32)-- shape_type_compute._cast_inputs.1(gt_Scalar)
ReduceSum(x, keepdims=0) -> sum_1
  Greater(sum_1, init1_s_) -> gt
    If(gt, else_branch=G1, then_branch=G2) -> output_0
output: name='output_0' type=dtype('float32') shape=['batch', 3]
----- subgraph ---- If - cond - att.else_branch=G1 -- level=1 --  -> cond#0
Neg(x) -> cond#0
output: name='cond#0' type='NOTENSOR' shape=None
----- subgraph ---- If - cond - att.then_branch=G2 -- level=1 --  -> cond#0
ReduceSum(x, keepdims=0) -> sum_122
Less(sum_122, init1_s_) -> lt2
  If(lt2, else_branch=G3, then_branch=G4) -> cond#0
output: name='cond#0' type='NOTENSOR' shape=None
----- subgraph ---- If - cond22 - att.else_branch=G3 -- level=2 --  -> cond#0
Cos(x) -> cond#0
output: name='cond#0' type='NOTENSOR' shape=None
----- subgraph ---- If - cond22 - att.then_branch=G4 -- level=2 --  -> cond#0
Sin(x) -> cond#0
output: name='cond#0' type='NOTENSOR' shape=None
----- subgraph ---- If - cond22 - att.else_branch=G3 -- level=1 --  -> cond#0
Cos(x) -> cond#0
output: name='cond#0' type='NOTENSOR' shape=None
----- subgraph ---- If - cond22 - att.then_branch=G4 -- level=1 --  -> cond#0
Sin(x) -> cond#0
output: name='cond#0' type='NOTENSOR' shape=None

dynamo-ir#

inputs: #1[(T1s5x3,)]
shapes: dict(x:{0:Dim(batch)})

opset: domain='' version=20
input: name='x' type=dtype('float32') shape=['batch', 3]
init: name='scalar_tensor_default' type=float32 shape=() -- array([0.], dtype=float32)
ReduceSum(x, noop_with_empty_axes=0, keepdims=0) -> sum_1
  Greater(sum_1, scalar_tensor_default) -> gt
    If(gt, then_branch=G1, else_branch=G2) -> getitem
output: name='getitem' type=dtype('float32') shape=['batch', 3]
----- subgraph ---- If - node_cond__0 - att.then_branch=G1 -- level=1 --  -> getitem_true_graph_0
ReduceSum(x, noop_with_empty_axes=0, keepdims=0) -> sum_1_2
Less(sum_1_2, scalar_tensor_default) -> lt
  If(lt, then_branch=G3, else_branch=G4) -> getitem_true_graph_0
output: name='getitem_true_graph_0' type=dtype('float32') shape=['batch', 3]
----- subgraph ---- If - node_cond__0_2 - att.then_branch=G3 -- level=2 --  -> sin_true_graph_0__true_graph_0
Sin(x) -> sin_true_graph_0__true_graph_0
output: name='sin_true_graph_0__true_graph_0' type=dtype('float32') shape=['batch', 3]
----- subgraph ---- If - node_cond__0_2 - att.else_branch=G4 -- level=2 --  -> cos_true_graph_0__false_graph_0
Cos(x) -> cos_true_graph_0__false_graph_0
output: name='cos_true_graph_0__false_graph_0' type=dtype('float32') shape=['batch', 3]
----- subgraph ---- If - node_cond__0_2 - att.then_branch=G3 -- level=1 --  -> sin_true_graph_0__true_graph_0
Sin(x) -> sin_true_graph_0__true_graph_0
output: name='sin_true_graph_0__true_graph_0' type=dtype('float32') shape=['batch', 3]
----- subgraph ---- If - node_cond__0_2 - att.else_branch=G4 -- level=1 --  -> cos_true_graph_0__false_graph_0
Cos(x) -> cos_true_graph_0__false_graph_0
output: name='cos_true_graph_0__false_graph_0' type=dtype('float32') shape=['batch', 3]
----- subgraph ---- If - node_cond__0 - att.else_branch=G2 -- level=1 --  -> neg_false_graph_0
Neg(x) -> neg_false_graph_0
output: name='neg_false_graph_0' type=dtype('float32') shape=['batch', 3]

tracing#

inputs: #1[(T1s5x3,)]
shapes: dict(x:{0:Dim(batch)})

opset: domain='' version=21
opset: domain='local_functions.0' version=1
opset: domain='local_functions' version=1
input: name='x' type=dtype('float32') shape=['batch', 3]
init: name='init1_s_' type=float32 shape=() -- array([0.], dtype=float32)-- shape_type_compute._cast_inputs.1(gt)
ReduceSum(x, keepdims=0) -> sum_1
  Greater(sum_1, init1_s_) -> gt
    If(gt, else_branch=G1, then_branch=G2) -> output
output: name='output' type=dtype('float32') shape=['batch', 3]
----- subgraph ---- If - cond - att.else_branch=G1 -- level=1 --  -> condcc
Neg(x) -> condcc
output: name='condcc' type='NOTENSOR' shape=None
----- subgraph ---- If - cond - att.then_branch=G2 -- level=1 --  -> condcc
ReduceSum(x, keepdims=0) -> sum_122
Less(sum_122, init1_s_) -> lt2
  If(lt2, else_branch=G3, then_branch=G4) -> condcc
output: name='condcc' type='NOTENSOR' shape=None
----- subgraph ---- If - cond22 - att.else_branch=G3 -- level=2 --  -> condcc
Cos(x) -> condcc
output: name='condcc' type='NOTENSOR' shape=None
----- subgraph ---- If - cond22 - att.then_branch=G4 -- level=2 --  -> condcc
Sin(x) -> condcc
output: name='condcc' type='NOTENSOR' shape=None
----- subgraph ---- If - cond22 - att.else_branch=G3 -- level=1 --  -> condcc
Cos(x) -> condcc
output: name='condcc' type='NOTENSOR' shape=None
----- subgraph ---- If - cond22 - att.then_branch=G4 -- level=1 --  -> condcc
Sin(x) -> condcc
output: name='condcc' type='NOTENSOR' shape=None

new-tracing#

inputs: #1[(T1s5x3,)]
shapes: dict(x:{0:Dim(batch)})

opset: domain='' version=21
opset: domain='local_functions.0' version=1
opset: domain='local_functions' version=1
input: name='x' type=dtype('float32') shape=['batch', 3]
init: name='init1_s_' type=float32 shape=() -- array([0.], dtype=float32)-- shape_type_compute._cast_inputs.1(gt_Scalar)
ReduceSum(x, keepdims=0) -> sum_default
  Greater(sum_default, init1_s_) -> gt_scalar
    If(gt_scalar, else_branch=G1, then_branch=G2) -> output
output: name='output' type=dtype('float32') shape=['batch', 3]
----- subgraph ---- If - cond - att.else_branch=G1 -- level=1 --  -> cond
Neg(x) -> cond
output: name='cond' type='NOTENSOR' shape=None
----- subgraph ---- If - cond - att.then_branch=G2 -- level=1 --  -> cond
ReduceSum(x, keepdims=0) -> sum_12
Less(sum_12, init1_s_) -> lt2
  If(lt2, else_branch=G3, then_branch=G4) -> cond
output: name='cond' type='NOTENSOR' shape=None
----- subgraph ---- If - cond22 - att.else_branch=G3 -- level=2 --  -> condcc
Cos(x) -> condcc
output: name='condcc' type='NOTENSOR' shape=None
----- subgraph ---- If - cond22 - att.then_branch=G4 -- level=2 --  -> condcc
Sin(x) -> condcc
output: name='condcc' type='NOTENSOR' shape=None
----- subgraph ---- If - cond22 - att.else_branch=G3 -- level=1 --  -> condcc
Cos(x) -> condcc
output: name='condcc' type='NOTENSOR' shape=None
----- subgraph ---- If - cond22 - att.then_branch=G4 -- level=1 --  -> condcc
Sin(x) -> condcc
output: name='condcc' type='NOTENSOR' shape=None

ControlFlowNumelZero1#

code: yobx.torch.testing._model_eval_cases.ControlFlowNumelZero1

forward#

def forward(self, x):
    def empty_cache(x):
        return x.shape[-2]

    size = (empty_cache(x), 1)
    return torch.full(size, fill_value=2)

yobx#

inputs: #3[(T1s3x2x2x5,),(T1s3x2x1x5,),(T1s3x2x0x5,)]
shapes: dict(x:{0:DYNAMIC,2:DYNAMIC})

opset: domain='' version=21
input: name='x' type=dtype('float32') shape=['batch', 2, 'D0', 5]
init: name='init7_s1_1' type=int64 shape=(1,) -- array([1])           -- GraphBuilder.make_shape_from_results.conc
Shape(x, end=3, start=2) -> x::Shape2:3
  Concat(x::Shape2:3, init7_s1_1, axis=0) -> _onx_concat_sym_size_int_2::UnSq0
    ConstantOfShape(_onx_concat_sym_size_int_2::UnSq0, value=[2]) -> output_0
output: name='output_0' type=dtype('int64') shape=['D0', 1]

dynamo-ir#

inputs: #3[(T1s3x2x2x5,),(T1s3x2x1x5,),(T1s3x2x0x5,)]
shapes: dict(x:{0:DYNAMIC,2:DYNAMIC})

opset: domain='' version=20
input: name='x' type=dtype('float32') shape=['s77', 2, 's53', 5]
init: name='val_2' type=float32 shape=() -- array([2.], dtype=float32)
init: name='val_5' type=int64 shape=(1,) -- array([1])
Shape(x, end=3, start=2) -> val_0
  Concat(val_0, val_5, axis=0) -> val_6
    Expand(val_2, val_6) -> full
output: name='full' type=dtype('float32') shape=['s53', 1]

tracing#

inputs: #3[(T1s3x2x2x5,),(T1s3x2x1x5,),(T1s3x2x0x5,)]
shapes: dict(x:{0:DYNAMIC,2:DYNAMIC})

opset: domain='' version=21
input: name='x' type=dtype('float32') shape=['batch', 2, 'D0', 5]
init: name='init7_s1_1' type=int64 shape=(1,) -- array([1])           -- Opset.make_node.1/Shape##GraphBuilder.make_shape_from_results.conc
Shape(x, end=3, start=2) -> _onx_gather_size3
  Concat(_onx_gather_size3, init7_s1_1, axis=0) -> _onx_concat_getitem_2::UnSq0
    ConstantOfShape(_onx_concat_getitem_2::UnSq0, value=[2]) -> output
output: name='output' type=dtype('int64') shape=['D0', 1]

new-tracing#

inputs: #3[(T1s3x2x2x5,),(T1s3x2x1x5,),(T1s3x2x0x5,)]
shapes: dict(x:{0:DYNAMIC,2:DYNAMIC})

opset: domain='' version=21
input: name='x' type=dtype('float32') shape=['batch', 2, 'D0', 5]
init: name='init7_s1_1' type=int64 shape=(1,) -- array([1])           -- GraphBuilder.make_shape_from_results.conc
Shape(x, end=3, start=2) -> x::Shape2:3
  Concat(x::Shape2:3, init7_s1_1, axis=0) -> _onx_concat_sym_size_int::UnSq0
    ConstantOfShape(_onx_concat_sym_size_int::UnSq0, value=[2]) -> output
output: name='output' type=dtype('int64') shape=['D0', 1]

ControlFlowNumelZero2#

code: yobx.torch.testing._model_eval_cases.ControlFlowNumelZero2

forward#

def forward(self, x):
    def empty_cache(x):
        torch._check(x.numel() != 0)
        if x.numel() == 0:
            return 0
        return x.shape[-2]

    size = (empty_cache(x), 1)
    return torch.full(size, fill_value=2)

yobx#

inputs: #2[(T1s3x2x2x5,),(T1s3x2x1x5,)]
shapes: dict(x:{0:DYNAMIC,2:DYNAMIC})

opset: domain='' version=21
input: name='x' type=dtype('float32') shape=['batch', 2, 'D0', 5]
init: name='init7_s1_1' type=int64 shape=(1,) -- array([1])           -- GraphBuilder.make_shape_from_results.conc
Shape(x, end=3, start=2) -> x::Shape2:3
  Concat(x::Shape2:3, init7_s1_1, axis=0) -> _onx_concat_sym_size_int_2::UnSq0
    ConstantOfShape(_onx_concat_sym_size_int_2::UnSq0, value=[2]) -> output_0
output: name='output_0' type=dtype('int64') shape=['D0', 1]

dynamo-ir#

inputs: #2[(T1s3x2x2x5,),(T1s3x2x1x5,)]
shapes: dict(x:{0:DYNAMIC,2:DYNAMIC})

opset: domain='' version=20
input: name='x' type=dtype('float32') shape=['s77', 2, 's53', 5]
init: name='val_2' type=float32 shape=() -- array([2.], dtype=float32)
init: name='val_5' type=int64 shape=(1,) -- array([1])
Shape(x, end=3, start=2) -> val_0
  Concat(val_0, val_5, axis=0) -> val_6
    Expand(val_2, val_6) -> full
output: name='full' type=dtype('float32') shape=['s53', 1]

tracing#

inputs: #2[(T1s3x2x2x5,),(T1s3x2x1x5,)]
shapes: dict(x:{0:DYNAMIC,2:DYNAMIC})

opset: domain='' version=21
input: name='x' type=dtype('float32') shape=['batch', 2, 'D0', 5]
init: name='init7_s1_1' type=int64 shape=(1,) -- array([1])           -- Opset.make_node.1/Shape##GraphBuilder.make_shape_from_results.conc
Shape(x, end=3, start=2) -> _onx_gather_size3
  Concat(_onx_gather_size3, init7_s1_1, axis=0) -> _onx_concat_getitem_2::UnSq0
    ConstantOfShape(_onx_concat_getitem_2::UnSq0, value=[2]) -> output
output: name='output' type=dtype('int64') shape=['D0', 1]

new-tracing#

inputs: #2[(T1s3x2x2x5,),(T1s3x2x1x5,)]
shapes: dict(x:{0:DYNAMIC,2:DYNAMIC})

opset: domain='' version=21
input: name='x' type=dtype('float32') shape=['batch', 2, 'D0', 5]
init: name='init7_s1_1' type=int64 shape=(1,) -- array([1])           -- GraphBuilder.make_shape_from_results.conc
Shape(x, end=3, start=2) -> x::Shape2:3
  Concat(x::Shape2:3, init7_s1_1, axis=0) -> _onx_concat_sym_size_int_8::UnSq0
    ConstantOfShape(_onx_concat_sym_size_int_8::UnSq0, value=[2]) -> output
output: name='output' type=dtype('int64') shape=['D0', 1]

ControlFlowNumelZero3#

code: yobx.torch.testing._model_eval_cases.ControlFlowNumelZero3

forward#

def forward(self, x):
    def empty_cache(x):
        torch._check(x.shape[0] > 0)
        torch._check(x.shape[2] > 0)
        if x.numel() == 0:
            return 0
        return x.shape[-2]

    size = (empty_cache(x), 1)
    return torch.full(size, fill_value=2)

yobx#

inputs: #2[(T1s3x2x2x5,),(T1s3x2x1x5,)]
shapes: dict(x:{0:DYNAMIC,2:DYNAMIC})

opset: domain='' version=21
input: name='x' type=dtype('float32') shape=['batch', 2, 'D0', 5]
init: name='init7_s1_1' type=int64 shape=(1,) -- array([1])           -- GraphBuilder.make_shape_from_results.conc
Shape(x, end=3, start=2) -> x::Shape2:3
  Concat(x::Shape2:3, init7_s1_1, axis=0) -> _onx_concat_sym_size_int_3::UnSq0
    ConstantOfShape(_onx_concat_sym_size_int_3::UnSq0, value=[2]) -> output_0
output: name='output_0' type=dtype('int64') shape=['D0', 1]

dynamo-ir#

inputs: #2[(T1s3x2x2x5,),(T1s3x2x1x5,)]
shapes: dict(x:{0:DYNAMIC,2:DYNAMIC})

opset: domain='' version=20
input: name='x' type=dtype('float32') shape=['s77', 2, 's53', 5]
init: name='val_2' type=float32 shape=() -- array([2.], dtype=float32)
init: name='val_5' type=int64 shape=(1,) -- array([1])
Shape(x, end=3, start=2) -> val_0
  Concat(val_0, val_5, axis=0) -> val_6
    Expand(val_2, val_6) -> full
output: name='full' type=dtype('float32') shape=['s53', 1]

tracing#

inputs: #2[(T1s3x2x2x5,),(T1s3x2x1x5,)]
shapes: dict(x:{0:DYNAMIC,2:DYNAMIC})

opset: domain='' version=21
input: name='x' type=dtype('float32') shape=['batch', 2, 'D0', 5]
init: name='init7_s1_1' type=int64 shape=(1,) -- array([1])           -- Opset.make_node.1/Shape##Opset.make_node.1/Shape##Opset.make_node.1/Shape##GraphBuilder.make_shape_from_results.conc
Shape(x, end=3, start=2) -> _onx_gather_size_23
  Concat(_onx_gather_size_23, init7_s1_1, axis=0) -> _onx_concat_getitem_10::UnSq0
    ConstantOfShape(_onx_concat_getitem_10::UnSq0, value=[2]) -> output
output: name='output' type=dtype('int64') shape=['D0', 1]

new-tracing#

inputs: #2[(T1s3x2x2x5,),(T1s3x2x1x5,)]
shapes: dict(x:{0:DYNAMIC,2:DYNAMIC})

opset: domain='' version=21
input: name='x' type=dtype('float32') shape=['batch', 2, 'D0', 5]
init: name='init7_s1_1' type=int64 shape=(1,) -- array([1])           -- GraphBuilder.make_shape_from_results.conc
Shape(x, end=3, start=2) -> x::Shape2:3
  Concat(x::Shape2:3, init7_s1_1, axis=0) -> _onx_concat_sym_size_int_4::UnSq0
    ConstantOfShape(_onx_concat_sym_size_int_4::UnSq0, value=[2]) -> output
output: name='output' type=dtype('int64') shape=['D0', 1]

ControlFlowNumelZero4#

code: yobx.torch.testing._model_eval_cases.ControlFlowNumelZero4

forward#

def forward(self, x):
    def empty_cache(x):
        torch._check(x.shape[0] > 0 and x.shape[2] > 0)
        if x.numel() == 0:
            return 0
        return x.shape[-2]

    size = (empty_cache(x), 1)
    return torch.full(size, fill_value=2)

yobx#

inputs: #2[(T1s3x2x2x5,),(T1s3x2x1x5,)]
shapes: dict(x:{0:DYNAMIC,2:DYNAMIC})

opset: domain='' version=21
input: name='x' type=dtype('float32') shape=['batch', 2, 'D0', 5]
init: name='init7_s1_1' type=int64 shape=(1,) -- array([1])           -- GraphBuilder.make_shape_from_results.conc
Shape(x, end=3, start=2) -> x::Shape2:3
  Concat(x::Shape2:3, init7_s1_1, axis=0) -> _onx_concat_sym_size_int_3::UnSq0
    ConstantOfShape(_onx_concat_sym_size_int_3::UnSq0, value=[2]) -> output_0
output: name='output_0' type=dtype('int64') shape=['D0', 1]

dynamo-ir#

inputs: #2[(T1s3x2x2x5,),(T1s3x2x1x5,)]
shapes: dict(x:{0:DYNAMIC,2:DYNAMIC})

opset: domain='' version=20
input: name='x' type=dtype('float32') shape=['s77', 2, 's53', 5]
init: name='val_2' type=float32 shape=() -- array([2.], dtype=float32)
init: name='val_5' type=int64 shape=(1,) -- array([1])
Shape(x, end=3, start=2) -> val_0
  Concat(val_0, val_5, axis=0) -> val_6
    Expand(val_2, val_6) -> full
output: name='full' type=dtype('float32') shape=['s53', 1]

tracing#

inputs: #2[(T1s3x2x2x5,),(T1s3x2x1x5,)]
shapes: dict(x:{0:DYNAMIC,2:DYNAMIC})

opset: domain='' version=21
input: name='x' type=dtype('float32') shape=['batch', 2, 'D0', 5]
init: name='init7_s1_1' type=int64 shape=(1,) -- array([1])           -- Opset.make_node.1/Shape##Opset.make_node.1/Shape##Opset.make_node.1/Shape##GraphBuilder.make_shape_from_results.conc
Shape(x, end=3, start=2) -> _onx_gather_size_23
  Concat(_onx_gather_size_23, init7_s1_1, axis=0) -> _onx_concat_getitem_10::UnSq0
    ConstantOfShape(_onx_concat_getitem_10::UnSq0, value=[2]) -> output
output: name='output' type=dtype('int64') shape=['D0', 1]

new-tracing#

inputs: #2[(T1s3x2x2x5,),(T1s3x2x1x5,)]
shapes: dict(x:{0:DYNAMIC,2:DYNAMIC})

opset: domain='' version=21
input: name='x' type=dtype('float32') shape=['batch', 2, 'D0', 5]
init: name='init7_s1_1' type=int64 shape=(1,) -- array([1])           -- GraphBuilder.make_shape_from_results.conc
Shape(x, end=3, start=2) -> x::Shape2:3
  Concat(x::Shape2:3, init7_s1_1, axis=0) -> _onx_concat_sym_size_int_4::UnSq0
    ConstantOfShape(_onx_concat_sym_size_int_4::UnSq0, value=[2]) -> output
output: name='output' type=dtype('int64') shape=['D0', 1]

ControlFlowNumelZero5#

code: yobx.torch.testing._model_eval_cases.ControlFlowNumelZero5

forward#

def forward(self, x):
    def empty_cache(x):
        torch._check(x.numel() != 0)
        if x.shape[0] != 0 and x.shape[2] != 0:
            return 0
        return x.shape[-2]

    size = (empty_cache(x), 1)
    return torch.full(size, fill_value=2)

yobx#

inputs: #2[(T1s3x2x2x5,),(T1s3x2x1x5,)]
shapes: dict(x:{0:DYNAMIC,2:DYNAMIC})

opset: domain='' version=21
input: name='x' type=dtype('float32') shape=['DYN0', 2, 'DYN1', 5]
init: name='init7_s2_0_1' type=int64 shape=(2,) -- array([0, 1])      -- Opset.make_node.1/Shape
ConstantOfShape(init7_s2_0_1, value=[2]) -> output_0
output: name='output_0' type=dtype('int64') shape=['', 1]

dynamo-ir#

inputs: #2[(T1s3x2x2x5,),(T1s3x2x1x5,)]
shapes: dict(x:{0:DYNAMIC,2:DYNAMIC})

opset: domain='' version=20
input: name='x' type=dtype('float32') shape=['s77', 2, 's53', 5]
init: name='full' type=float32 shape=(0, 1) -- array([], dtype=float32)
output: name='full' type=dtype('float32') shape=['', 1]

tracing#

inputs: #2[(T1s3x2x2x5,),(T1s3x2x1x5,)]
shapes: dict(x:{0:DYNAMIC,2:DYNAMIC})

opset: domain='' version=21
input: name='x' type=dtype('float32') shape=['batch', 2, 'D0', 5]
init: name='_tensor_constant0' type=int64 shape=(0, 1) -- array([], dtype=int64)-- DynamoInterpret.get_attr.0
Identity(_tensor_constant0) -> output
output: name='output' type=dtype('int64') shape=['', 1]

new-tracing#

inputs: #2[(T1s3x2x2x5,),(T1s3x2x1x5,)]
shapes: dict(x:{0:DYNAMIC,2:DYNAMIC})

opset: domain='' version=21
input: name='x' type=dtype('float32') shape=['DYN0', 2, 'DYN1', 5]
init: name='init7_s2_0_1' type=int64 shape=(2,) -- array([0, 1])      -- Opset.make_node.1/Shape
ConstantOfShape(init7_s2_0_1, value=[2]) -> output
output: name='output' type=dtype('int64') shape=['', 1]

ControlFlowRanks#

code: yobx.torch.testing._model_eval_cases.ControlFlowRanks

forward#

def forward(self, x):
    if x.ndim == 2:
        return x.clone()
    return x / x.ndim

yobx#

inputs: #2[(T1s3x4,),(T1s5x4,)]
shapes: dict(x:{0:Dim(batch)})

opset: domain='' version=21
input: name='x' type=dtype('float32') shape=['batch', 4]
Identity(x) -> output_0
output: name='output_0' type=dtype('float32') shape=['batch', 4]

dynamo-ir#

inputs: #2[(T1s3x4,),(T1s5x4,)]
shapes: dict(x:{0:Dim(batch)})

opset: domain='' version=20
input: name='x' type=dtype('float32') shape=['batch', 4]
Identity(x) -> clone
output: name='clone' type=dtype('float32') shape=['batch', 4]

tracing#

inputs: #2[(T1s3x4,),(T1s5x4,)]
shapes: dict(x:{0:Dim(batch)})

opset: domain='' version=21
input: name='x' type=dtype('float32') shape=['batch', 4]
Identity(x) -> output
output: name='output' type=dtype('float32') shape=['batch', 4]

new-tracing#

inputs: #2[(T1s3x4,),(T1s5x4,)]
shapes: dict(x:{0:Dim(batch)})

opset: domain='' version=21
input: name='x' type=dtype('float32') shape=['batch', 4]
Identity(x) -> output
output: name='output' type=dtype('float32') shape=['batch', 4]

ControlFlowRanksType#

code: yobx.torch.testing._model_eval_cases.ControlFlowRanksType

forward#

def forward(self, x=None):
    if (
        x is not None
        and (x.dtype == torch.float32 or x.dtype == torch.float16)
        and x.ndim == 2
    ):
        return x.clone()
    torch._check(x is not None)
    return (x / x.ndim).to(torch.float32)  # type: ignore

yobx#

inputs: #2[(T1s3x4,),(T1s5x4,)]
shapes: dict(x:{0:Dim(batch)})

opset: domain='' version=21
input: name='x' type=dtype('float32') shape=['batch', 4]
Identity(x) -> output_0
output: name='output_0' type=dtype('float32') shape=['batch', 4]

dynamo-ir#

inputs: #2[(T1s3x4,),(T1s5x4,)]
shapes: dict(x:{0:Dim(batch)})

opset: domain='' version=20
input: name='x' type=dtype('float32') shape=['batch', 4]
Identity(x) -> clone
output: name='clone' type=dtype('float32') shape=['batch', 4]

tracing#

inputs: #2[(T1s3x4,),(T1s5x4,)]
shapes: dict(x:{0:Dim(batch)})

opset: domain='' version=21
input: name='x' type=dtype('float32') shape=['batch', 4]
Identity(x) -> output
output: name='output' type=dtype('float32') shape=['batch', 4]

new-tracing#

inputs: #2[(T1s3x4,),(T1s5x4,)]
shapes: dict(x:{0:Dim(batch)})

opset: domain='' version=21
input: name='x' type=dtype('float32') shape=['batch', 4]
Identity(x) -> output
output: name='output' type=dtype('float32') shape=['batch', 4]

ControlFlowScan#

code: yobx.torch.testing._model_eval_cases.ControlFlowScan

forward#

def forward(self, x):
    def add(carry: torch.Tensor, y: torch.Tensor):
        next_carry = carry + y
        return [next_carry, next_carry]

    init = x.new_zeros(x.shape[1:])
    carry, _out = torch.ops.higher_order.scan(add, [init], [x], additional_inputs=[])
    return carry

yobx#

inputs: #1[(T1s3x3,)]
shapes: dict(x:{0:Dim(batch)})

opset: domain='' version=21
opset: domain='local_functions' version=1
input: name='x' type=dtype('float32') shape=['batch', 3]
init: name='init7_s1_3' type=int64 shape=(1,) -- array([3])           -- Opset.make_node.1/Shape
ConstantOfShape(init7_s1_3, value=[0.0]) -> new_zeros
  Scan(new_zeros, x, body=G1, num_scan_inputs=1, scan_input_directions=[0], scan_output_axes=[0], scan_output_directions=[0]) -> output_0, scan#1
output: name='output_0' type=dtype('float32') shape=[3]
----- subgraph ---- Scan - scan - att.body=G1 -- level=1 -- init_0_new_zeros,scan_0_x -> output_0,output_1
input: name='init_0_new_zeros' type=dtype('float32') shape=None
input: name='scan_0_x' type=dtype('float32') shape=None
Add(init_0_new_zeros, scan_0_x) -> output_0
  Identity(output_0) -> output_1
output: name='output_0' type=dtype('float32') shape=None
output: name='output_1' type=dtype('float32') shape=None

dynamo-ir#

FAILED

Failed to decompose the FX graph for ONNX compatibility. [96mThis is step 2/3[0m of exporting the model to ONNX. Next steps:
- Create an issue in the PyTorch GitHub repository against the [96m*torch.export*[0m component and attach the full error stack as well as reproduction scripts.
- Create an error report with `torch.onnx.export(..., report=True)`, and save the ExportedProgram as a pt2 file. Create an issue in the PyTorch GitHub repository against the [96m*onnx*[0m component. Attach the error report and the pt2 model.

## Exception summary

<class 'RuntimeError'>: scan might be aliasing the input or the output!

While executing %scan : [num_users=2] = call_function[target=torch.ops.higher_order.scan](args = (%scan_combine_graph_0, [%new_zeros], [%x], ()), kwargs = {})
Original traceback:
File "~/github/yet-another-onnx-builder/yobx/torch/testing/_model_eval_cases.py", line 668, in forward
    carry, _out = torch.ops.higher_order.scan(add, [init], [x], additional_inputs=[])
Use tlparse to see full graph. (https://github.com/pytorch/tlparse?tab=readme-ov-file#tlparse-parse-structured-pt2-logs)

(Refer to the full stack trace above for more information.)

tracing#

inputs: #1[(T1s3x3,)]
shapes: dict(x:{0:Dim(batch)})

opset: domain='' version=21
opset: domain='local_functions' version=1
input: name='x' type=dtype('float32') shape=['batch', 3]
init: name='init7_s1_3' type=int64 shape=(1,) -- array([3])           -- Opset.make_node.1/Shape
ConstantOfShape(init7_s1_3, value=[0.0]) -> new_zeros
  Scan(new_zeros, x, body=G1, num_scan_inputs=1, scan_input_directions=[0], scan_output_axes=[0], scan_output_directions=[0]) -> output, scancc#1
output: name='output' type=dtype('float32') shape=[3]
----- subgraph ---- Scan - scan - att.body=G1 -- level=1 -- init_0_new_zeros,scan_0_x -> output_0,output_1
input: name='init_0_new_zeros' type=dtype('float32') shape=None
input: name='scan_0_x' type=dtype('float32') shape=None
Add(init_0_new_zeros, scan_0_x) -> output_0
  Identity(output_0) -> output_1
output: name='output_0' type=dtype('float32') shape=None
output: name='output_1' type=dtype('float32') shape=None

new-tracing#

inputs: #1[(T1s3x3,)]
shapes: dict(x:{0:Dim(batch)})

opset: domain='' version=21
opset: domain='local_functions' version=1
input: name='x' type=dtype('float32') shape=['batch', 3]
init: name='init7_s1_3' type=int64 shape=(1,) -- array([3])           -- Opset.make_node.1/Shape
ConstantOfShape(init7_s1_3, value=[0.0]) -> new_zeros_default
  Scan(new_zeros_default, x, body=G1, num_scan_inputs=1, scan_input_directions=[0], scan_output_axes=[0], scan_output_directions=[0]) -> output, scan#1
output: name='output' type=dtype('float32') shape=[3]
----- subgraph ---- Scan - scan - att.body=G1 -- level=1 -- init_0_new_zeros_default,scan_0_x -> output_0,output_1
input: name='init_0_new_zeros_default' type=dtype('float32') shape=None
input: name='scan_0_x' type=dtype('float32') shape=None
Add(init_0_new_zeros_default, scan_0_x) -> output_0
  Identity(output_0) -> output_1
output: name='output_0' type=dtype('float32') shape=None
output: name='output_1' type=dtype('float32') shape=None

ControlFlowScan2Carried#

code: yobx.torch.testing._model_eval_cases.ControlFlowScan2Carried

forward#

def forward(self, x):
    def add(carry1: torch.Tensor, carry2: torch.Tensor, y1: torch.Tensor, y2: torch.Tensor):
        next_carry1 = carry1 + y1
        next_carry2 = carry2 * y2
        return [next_carry1, next_carry2, next_carry1, next_carry2]

    init1 = torch.zeros_like(x[0])
    init2 = torch.ones_like(x[0])
    carry1, carry2, out1, out2 = torch.ops.higher_order.scan(
        add,
        [init1, init2],
        [x, x * 2],
        # dim=0,  # 01/31/2025, not supported anymore
        additional_inputs=[],
    )
    return carry1, carry2, out1, out2

yobx#

inputs: #1[(T1s3x4,)]
shapes: dict(x:{0:Dim(batch)})

opset: domain='' version=21
opset: domain='local_functions' version=1
input: name='x' type=dtype('float32') shape=['batch', 4]
init: name='init7_s1_4' type=int64 shape=(1,) -- array([4])           -- Opset.make_node.1/Shape##Opset.make_node.1/Shape
init: name='init1_s_::RSh1' type=float32 shape=(1,) -- array([2.], dtype=float32)-- GraphBuilder.constant_folding.from/fold(init1_s_,init7_s1_1)##init1_s_/shape_type_compute._cast_inputs.1(mul_Tensor)##init7_s1_1/Opset.make_node.1/Shape
ConstantOfShape(init7_s1_4, value=[1.0]) -> ones_like
ConstantOfShape(init7_s1_4, value=[0.0]) -> zeros_like
Mul(x, init1_s_::RSh1) -> _onx_mul_x
  Scan(zeros_like, ones_like, x, _onx_mul_x, body=G1, num_scan_inputs=2, scan_input_directions=[0,0], scan_output_axes=[0,0], scan_output_directions=[0,0]) -> output_0, output_1, output_2, output_3
output: name='output_0' type=dtype('float32') shape=[4]
output: name='output_1' type=dtype('float32') shape=[4]
output: name='output_2' type=dtype('float32') shape=['batch', 4]
output: name='output_3' type=dtype('float32') shape=['batch', 4]
----- subgraph ---- Scan - scan - att.body=G1 -- level=1 -- init_0_zeros_like,init_1_ones_like,scan_0_x,scan_1_mul -> output_0,output_1,output_2,output_3
input: name='init_0_zeros_like' type=dtype('float32') shape=None
input: name='init_1_ones_like' type=dtype('float32') shape=None
input: name='scan_0_x' type=dtype('float32') shape=None
input: name='scan_1_mul' type=dtype('float32') shape=None
Add(init_0_zeros_like, scan_0_x) -> output_0
  Identity(output_0) -> output_2
Mul(init_1_ones_like, scan_1_mul) -> output_1
  Identity(output_1) -> output_3
output: name='output_0' type=dtype('float32') shape=None
output: name='output_1' type=dtype('float32') shape=None
output: name='output_2' type=dtype('float32') shape=None
output: name='output_3' type=dtype('float32') shape=None

dynamo-ir#

FAILED

Failed to decompose the FX graph for ONNX compatibility. [96mThis is step 2/3[0m of exporting the model to ONNX. Next steps:
- Create an issue in the PyTorch GitHub repository against the [96m*torch.export*[0m component and attach the full error stack as well as reproduction scripts.
- Create an error report with `torch.onnx.export(..., report=True)`, and save the ExportedProgram as a pt2 file. Create an issue in the PyTorch GitHub repository against the [96m*onnx*[0m component. Attach the error report and the pt2 model.

## Exception summary

<class 'RuntimeError'>: scan might be aliasing the input or the output!

While executing %scan : [num_users=4] = call_function[target=torch.ops.higher_order.scan](args = (%scan_combine_graph_0, [%zeros_like, %ones_like], [%x, %mul], ()), kwargs = {})
Original traceback:
File "~/github/yet-another-onnx-builder/yobx/torch/testing/_model_eval_cases.py", line 684, in forward
    carry1, carry2, out1, out2 = torch.ops.higher_order.scan(
Use tlparse to see full graph. (https://github.com/pytorch/tlparse?tab=readme-ov-file#tlparse-parse-structured-pt2-logs)

(Refer to the full stack trace above for more information.)

tracing#

inputs: #1[(T1s3x4,)]
shapes: dict(x:{0:Dim(batch)})

opset: domain='' version=21
opset: domain='local_functions' version=1
input: name='x' type=dtype('float32') shape=['batch', 4]
init: name='init7_s1_4' type=int64 shape=(1,) -- array([4])           -- Opset.make_node.1/Shape##Opset.make_node.1/Shape
init: name='init1_s_::RSh1' type=float32 shape=(1,) -- array([2.], dtype=float32)-- GraphBuilder.constant_folding.from/fold(init1_s_,init7_s1_1)##init1_s_/shape_type_compute._cast_inputs.1(mul)##init7_s1_1/Opset.make_node.1/Shape
ConstantOfShape(init7_s1_4, value=[1.0]) -> ones_like
ConstantOfShape(init7_s1_4, value=[0.0]) -> zeros_like
Mul(x, init1_s_::RSh1) -> _onx_mul_x
  Scan(zeros_like, ones_like, x, _onx_mul_x, body=G1, num_scan_inputs=2, scan_input_directions=[0,0], scan_output_axes=[0,0], scan_output_directions=[0,0]) -> output_0, output_1, output_2, output_3
output: name='output_0' type=dtype('float32') shape=[4]
output: name='output_1' type=dtype('float32') shape=[4]
output: name='output_2' type=dtype('float32') shape=['d_output_2_0', 'd_output_2_1']
output: name='output_3' type=dtype('float32') shape=['d_output_3_0', 'd_output_3_1']
----- subgraph ---- Scan - scan - att.body=G1 -- level=1 -- init_0_zeros_like,init_1_ones_like,scan_0_x,scan_1_mul -> output_0,output_1,output_2,output_3
input: name='init_0_zeros_like' type=dtype('float32') shape=None
input: name='init_1_ones_like' type=dtype('float32') shape=None
input: name='scan_0_x' type=dtype('float32') shape=None
input: name='scan_1_mul' type=dtype('float32') shape=None
Add(init_0_zeros_like, scan_0_x) -> output_0
  Identity(output_0) -> output_2
Mul(init_1_ones_like, scan_1_mul) -> output_1
  Identity(output_1) -> output_3
output: name='output_0' type=dtype('float32') shape=None
output: name='output_1' type=dtype('float32') shape=None
output: name='output_2' type=dtype('float32') shape=None
output: name='output_3' type=dtype('float32') shape=None

new-tracing#

inputs: #1[(T1s3x4,)]
shapes: dict(x:{0:Dim(batch)})

opset: domain='' version=21
opset: domain='local_functions' version=1
input: name='x' type=dtype('float32') shape=['batch', 4]
init: name='init7_s1_4' type=int64 shape=(1,) -- array([4])           -- Opset.make_node.1/Shape##Opset.make_node.1/Shape
init: name='init1_s_::RSh1' type=float32 shape=(1,) -- array([2.], dtype=float32)-- GraphBuilder.constant_folding.from/fold(init1_s_,init7_s1_1)##init1_s_/shape_type_compute._cast_inputs.1(mul_Tensor)##init7_s1_1/Opset.make_node.1/Shape
ConstantOfShape(init7_s1_4, value=[1.0]) -> ones_like_default
ConstantOfShape(init7_s1_4, value=[0.0]) -> zeros_like_default
Mul(x, init1_s_::RSh1) -> _onx_mul_x
  Scan(zeros_like_default, ones_like_default, x, _onx_mul_x, body=G1, num_scan_inputs=2, scan_input_directions=[0,0], scan_output_axes=[0,0], scan_output_directions=[0,0]) -> output_0, output_1, output_2, output_3
output: name='output_0' type=dtype('float32') shape=[4]
output: name='output_1' type=dtype('float32') shape=[4]
output: name='output_2' type=dtype('float32') shape=['batch', 4]
output: name='output_3' type=dtype('float32') shape=['batch', 4]
----- subgraph ---- Scan - scan - att.body=G1 -- level=1 -- init_0_detach_default,init_1_detach_default_1,scan_0_x,scan_1_mul_tensor -> output_0,output_1,output_2,output_3
input: name='init_0_detach_default' type=dtype('float32') shape=None
input: name='init_1_detach_default_1' type=dtype('float32') shape=None
input: name='scan_0_x' type=dtype('float32') shape=None
input: name='scan_1_mul_tensor' type=dtype('float32') shape=None
Add(init_0_detach_default, scan_0_x) -> output_0
  Identity(output_0) -> output_2
Mul(init_1_detach_default_1, scan_1_mul_tensor) -> output_1
  Identity(output_1) -> output_3
output: name='output_0' type=dtype('float32') shape=None
output: name='output_1' type=dtype('float32') shape=None
output: name='output_2' type=dtype('float32') shape=None
output: name='output_3' type=dtype('float32') shape=None

ControlFlowScanCDist#

code: yobx.torch.testing._model_eval_cases.ControlFlowScanCDist

forward#

def forward(self, x):
    def dist(carry: torch.Tensor, x: torch.Tensor):
        sub = carry - x.reshape((1, -1))
        sq = sub * sub
        rd = sq.sum(dim=1) ** 0.5
        # clone --> UnsupportedAliasMutationException:
        # Combine_fn might be aliasing the input!
        return [carry.clone(), rd]

    _carry, out = torch.ops.higher_order.scan(
        dist,
        [x],
        [x],
        # dim=0,  # 01/31/2025, not supported anymore
        additional_inputs=[],
    )
    return out

yobx#

inputs: #1[(T1s3x4,)]
shapes: dict(x:{0:Dim(batch)})

opset: domain='' version=21
opset: domain='local_functions' version=1
input: name='x' type=dtype('float32') shape=['batch', 4]
init: name='init7_s1_12_cst2init' type=int64 shape=(1,) -- array([1]) -- GraphBuilderPatternOptimization.make_initializer.1/Shape
init: name='init1_s_2_cst2init' type=float32 shape=() -- array([0.5], dtype=float32)-- GraphBuilderPatternOptimization.make_initializer.1/Small
init: name='init7_s1_02_cst2init' type=int64 shape=(1,) -- array([0]) -- GraphBuilderPatternOptimization.make_initializer.1/Shape
Scan(x, x, body=G1, num_scan_inputs=1, scan_input_directions=[0], scan_output_axes=[0], scan_output_directions=[0]) -> scan#0, output_0
output: name='output_0' type=dtype('float32') shape=['batch', 'batch']
----- subgraph ---- Scan - scan - att.body=G1 -- level=1 -- init_0_x,scan_0_x -> output_0,output_1
input: name='init_0_x' type=dtype('float32') shape=None
input: name='scan_0_x' type=dtype('float32') shape=None
Identity(init_0_x) -> output_0
Unsqueeze(scan_0_x, init7_s1_02_cst2init) -> reshape2
  Sub(init_0_x, reshape2) -> sub2
    Mul(sub2, sub2) -> mul2
ReduceSum(mul2, init7_s1_12_cst2init, keepdims=0) -> sum_12
Pow(sum_12, init1_s_2_cst2init) -> output_1
output: name='output_0' type=dtype('float32') shape=None
output: name='output_1' type=dtype('float32') shape=None

dynamo-ir#

inputs: #1[(T1s3x4,)]
shapes: dict(x:{0:Dim(batch)})

opset: domain='' version=20
input: name='x' type=dtype('float32') shape=['batch', 4]
init: name='val_3' type=int64 shape=(2,) -- array([ 1, -1])
init: name='val_4' type=int64 shape=(1,) -- array([1])
init: name='val_5' type=float32 shape=() -- array([0.5], dtype=float32)
Scan(x, x, body=G1, num_scan_inputs=1, scan_input_directions=[0], scan_output_directions=[0]) -> scan__0, getitem_1
output: name='getitem_1' type=dtype('float32') shape=['batch', 'batch']
----- subgraph ---- Scan - node_scan__1 - att.body=G1 -- level=1 -- x_scan_combine_graph_0__subgraph_in,x_scan_combine_graph_0__subgraph_in_1 -> clone_scan_combine_graph_0,pow_1_scan_combine_graph_0
input: name='x_scan_combine_graph_0__subgraph_in' type=dtype('float32') shape=['s77', 4]
input: name='x_scan_combine_graph_0__subgraph_in_1' type=dtype('float32') shape=[4]
Identity(x_scan_combine_graph_0__subgraph_in) -> clone_scan_combine_graph_0
Reshape(x_scan_combine_graph_0__subgraph_in_1, val_3, allowzero=1) -> view
  Sub(x_scan_combine_graph_0__subgraph_in, view) -> sub_1
    Mul(sub_1, sub_1) -> mul_4
ReduceSum(mul_4, val_4, noop_with_empty_axes=0, keepdims=0) -> sum_1
Pow(sum_1, val_5) -> pow_1_scan_combine_graph_0
output: name='clone_scan_combine_graph_0' type=dtype('float32') shape=['batch', 4]
output: name='pow_1_scan_combine_graph_0' type=dtype('float32') shape=['batch']

tracing#

inputs: #1[(T1s3x4,)]
shapes: dict(x:{0:Dim(batch)})

opset: domain='' version=21
opset: domain='local_functions' version=1
input: name='x' type=dtype('float32') shape=['batch', 4]
init: name='init7_s1_12_cst2init' type=int64 shape=(1,) -- array([1]) -- GraphBuilderPatternOptimization.make_initializer.1/Shape
init: name='init7_s1_02_cst2init' type=int64 shape=(1,) -- array([0]) -- GraphBuilderPatternOptimization.make_initializer.1/Shape
Scan(x, x, body=G1, num_scan_inputs=1, scan_input_directions=[0], scan_output_axes=[0], scan_output_directions=[0]) -> scancc#0, output
output: name='output' type=dtype('float32') shape=['d_output_0', 'd_output_1']
----- subgraph ---- Scan - scan - att.body=G1 -- level=1 -- init_0_x,scan_0_x -> output_0,output_1
input: name='init_0_x' type=dtype('float32') shape=None
input: name='scan_0_x' type=dtype('float32') shape=None
Identity(init_0_x) -> output_0
Unsqueeze(scan_0_x, init7_s1_02_cst2init) -> reshape2
  Sub(init_0_x, reshape2) -> sub2
    Mul(sub2, sub2) -> mul2
ReduceSum(mul2, init7_s1_12_cst2init, keepdims=0) -> sum_12
  Sqrt(sum_12) -> output_1
output: name='output_0' type=dtype('float32') shape=None
output: name='output_1' type=dtype('float32') shape=None

new-tracing#

inputs: #1[(T1s3x4,)]
shapes: dict(x:{0:Dim(batch)})

opset: domain='' version=21
opset: domain='local_functions' version=1
input: name='x' type=dtype('float32') shape=['batch', 4]
init: name='init7_s1_12_cst2init' type=int64 shape=(1,) -- array([1]) -- GraphBuilderPatternOptimization.make_initializer.1/Shape
init: name='init1_s_2_cst2init' type=float32 shape=() -- array([0.5], dtype=float32)-- GraphBuilderPatternOptimization.make_initializer.1/Small
init: name='init7_s1_02_cst2init' type=int64 shape=(1,) -- array([0]) -- GraphBuilderPatternOptimization.make_initializer.1/Shape
Scan(x, x, body=G1, num_scan_inputs=1, scan_input_directions=[0], scan_output_axes=[0], scan_output_directions=[0]) -> scan#0, output
output: name='output' type=dtype('float32') shape=['batch', 'batch']
----- subgraph ---- Scan - scan - att.body=G1 -- level=1 -- init_0_x,scan_0_x -> output_0,output_1
input: name='init_0_x' type=dtype('float32') shape=None
input: name='scan_0_x' type=dtype('float32') shape=None
Identity(init_0_x) -> output_0
Unsqueeze(scan_0_x, init7_s1_02_cst2init) -> view_default2
  Sub(init_0_x, view_default2) -> sub_tensor2
    Mul(sub_tensor2, sub_tensor2) -> mul_tensor2
ReduceSum(mul_tensor2, init7_s1_12_cst2init, keepdims=0) -> sum_dim_int_list2
Pow(sum_dim_int_list2, init1_s_2_cst2init) -> output_1
output: name='output_0' type=dtype('float32') shape=None
output: name='output_1' type=dtype('float32') shape=None

ControlFlowScanCDist2#

code: yobx.torch.testing._model_eval_cases.ControlFlowScanCDist2

forward#

def forward(self, x):
    def dist(unused: torch.Tensor, x: torch.Tensor, samex: torch.Tensor):
        sub = samex - x.reshape((1, -1))
        sq = sub * sub
        rd = torch.sqrt(sq.sum(dim=1))
        # clone --> UnsupportedAliasMutationException:
        # Combine_fn might be aliasing the input!
        return [unused.clone(), rd]

    z = torch.tensor([0], dtype=torch.float32)
    y = x.clone()
    out = torch.ops.higher_order.scan(
        dist,
        [z],
        [x],
        # dim=0,  # 01/31/2025, not supported anymore
        additional_inputs=[y],
    )
    return out[1]

yobx#

inputs: #1[(T1s3x4,)]
shapes: dict(x:{0:Dim(batch)})

opset: domain='' version=21
opset: domain='local_functions' version=1
input: name='x' type=dtype('float32') shape=['batch', 4]
init: name='c_lifted_tensor_0' type=float32 shape=(1,) -- array([0.], dtype=float32)-- DynamoInterpret.placeholder.0
init: name='init7_s1_12_cst2init' type=int64 shape=(1,) -- array([1]) -- GraphBuilderPatternOptimization.make_initializer.1/Shape
init: name='init7_s1_02_cst2init' type=int64 shape=(1,) -- array([0]) -- GraphBuilderPatternOptimization.make_initializer.1/Shape
Identity(x) -> hidden_input_scan_0_clone
Scan(c_lifted_tensor_0, x, body=G1, num_scan_inputs=1, scan_input_directions=[0], scan_output_axes=[0], scan_output_directions=[0]) -> scan#0, output_0
output: name='output_0' type=dtype('float32') shape=['batch', 'batch']
----- subgraph ---- Scan - scan - att.body=G1 -- level=1 -- init_0_detach_,scan_0_x -> output_0,output_1
input: name='init_0_detach_' type=dtype('float32') shape=None
input: name='scan_0_x' type=dtype('float32') shape=None
Identity(init_0_detach_) -> output_0
Unsqueeze(scan_0_x, init7_s1_02_cst2init) -> reshape2
Sub(hidden_input_scan_0_clone, reshape2) -> sub2
  Mul(sub2, sub2) -> mul2
ReduceSum(mul2, init7_s1_12_cst2init, keepdims=0) -> sum_12
  Sqrt(sum_12) -> output_1
output: name='output_0' type=dtype('float32') shape=None
output: name='output_1' type=dtype('float32') shape=None

dynamo-ir#

inputs: #1[(T1s3x4,)]
shapes: dict(x:{0:Dim(batch)})

opset: domain='' version=20
input: name='x' type=dtype('float32') shape=['batch', 4]
init: name='clone' type=float32 shape=(1,) -- array([0.], dtype=float32)
init: name='val_3' type=int64 shape=(2,) -- array([ 1, -1])
init: name='val_4' type=int64 shape=(1,) -- array([1])
Scan(clone, x, body=G1, num_scan_inputs=1, scan_input_directions=[0], scan_output_directions=[0]) -> scan__0, getitem_1
output: name='getitem_1' type=dtype('float32') shape=['batch', 'batch']
----- subgraph ---- Scan - node_scan__1 - att.body=G1 -- level=1 -- clone_scan_combine_graph_0__subgraph_in,x_scan_combine_graph_0__subgraph_in -> clone_scan_combine_graph_0,sqrt_scan_combine_graph_0
input: name='clone_scan_combine_graph_0__subgraph_in' type=dtype('float32') shape=[1]
input: name='x_scan_combine_graph_0__subgraph_in' type=dtype('float32') shape=[4]
Identity(clone_scan_combine_graph_0__subgraph_in) -> clone_scan_combine_graph_0
Reshape(x_scan_combine_graph_0__subgraph_in, val_3, allowzero=1) -> view
Sub(x, view) -> sub_1
  Mul(sub_1, sub_1) -> mul_4
ReduceSum(mul_4, val_4, noop_with_empty_axes=0, keepdims=0) -> sum_1
  Sqrt(sum_1) -> sqrt_scan_combine_graph_0
output: name='clone_scan_combine_graph_0' type=dtype('float32') shape=[1]
output: name='sqrt_scan_combine_graph_0' type=dtype('float32') shape=['batch']

tracing#

inputs: #1[(T1s3x4,)]
shapes: dict(x:{0:Dim(batch)})

opset: domain='' version=21
opset: domain='local_functions' version=1
input: name='x' type=dtype('float32') shape=['batch', 4]
init: name='cst' type=float32 shape=(1,) -- array([0.], dtype=float32)-- _process_arg
init: name='init7_s1_12_cst2init' type=int64 shape=(1,) -- array([1]) -- GraphBuilderPatternOptimization.make_initializer.1/Shape
init: name='init7_s1_02_cst2init' type=int64 shape=(1,) -- array([0]) -- GraphBuilderPatternOptimization.make_initializer.1/Shape
Identity(x) -> hidden_input_scan_0_clone
Scan(cst, x, body=G1, num_scan_inputs=1, scan_input_directions=[0], scan_output_axes=[0], scan_output_directions=[0]) -> scancc#0, output
output: name='output' type=dtype('float32') shape=['d_output_0', 'd_output_1']
----- subgraph ---- Scan - scan - att.body=G1 -- level=1 -- init_0_cst,scan_0_x -> output_0,output_1
input: name='init_0_cst' type=dtype('float32') shape=None
input: name='scan_0_x' type=dtype('float32') shape=None
Identity(init_0_cst) -> output_0
Unsqueeze(scan_0_x, init7_s1_02_cst2init) -> reshape2
Sub(hidden_input_scan_0_clone, reshape2) -> sub2
  Mul(sub2, sub2) -> mul2
ReduceSum(mul2, init7_s1_12_cst2init, keepdims=0) -> sum_12
  Sqrt(sum_12) -> output_1
output: name='output_0' type=dtype('float32') shape=None
output: name='output_1' type=dtype('float32') shape=None

new-tracing#

inputs: #1[(T1s3x4,)]
shapes: dict(x:{0:Dim(batch)})

opset: domain='' version=21
opset: domain='local_functions' version=1
input: name='x' type=dtype('float32') shape=['batch', 4]
init: name='cst' type=float32 shape=(1,) -- array([0.], dtype=float32)-- _process_arg
init: name='init7_s1_12_cst2init' type=int64 shape=(1,) -- array([1]) -- GraphBuilderPatternOptimization.make_initializer.1/Shape
init: name='init7_s1_02_cst2init' type=int64 shape=(1,) -- array([0]) -- GraphBuilderPatternOptimization.make_initializer.1/Shape
Identity(x) -> hidden_input_scan_0_clone_default
Scan(cst, x, body=G1, num_scan_inputs=1, scan_input_directions=[0], scan_output_axes=[0], scan_output_directions=[0]) -> scan#0, output
output: name='output' type=dtype('float32') shape=['batch', 'batch']
----- subgraph ---- Scan - scan - att.body=G1 -- level=1 -- init_0_cst,scan_0_x -> output_0,output_1
input: name='init_0_cst' type=dtype('float32') shape=None
input: name='scan_0_x' type=dtype('float32') shape=None
Identity(init_0_cst) -> output_0
Unsqueeze(scan_0_x, init7_s1_02_cst2init) -> view_default2
Sub(hidden_input_scan_0_clone_default, view_default2) -> sub_tensor2
  Mul(sub_tensor2, sub_tensor2) -> mul_tensor2
ReduceSum(mul_tensor2, init7_s1_12_cst2init, keepdims=0) -> sum_dim_int_list2
  Sqrt(sum_dim_int_list2) -> output_1
output: name='output_0' type=dtype('float32') shape=None
output: name='output_1' type=dtype('float32') shape=None

ControlFlowScanCDistXY#

code: yobx.torch.testing._model_eval_cases.ControlFlowScanCDistXY

forward#

def forward(self, x, y):
    def dist(y: torch.Tensor, scanned_x: torch.Tensor):
        sub = y - scanned_x.reshape((1, -1))
        sq = sub * sub
        rd = torch.sqrt(sq.sum(dim=1))
        # clone --> UnsupportedAliasMutationException:
        # Combine_fn might be aliasing the input!
        return [y.clone(), rd]

    _carry, out = torch.ops.higher_order.scan(
        dist,
        [y],
        [x],
        # dim=0,  # 01/31/2025, not supported anymore
        additional_inputs=[],
    )
    return out

yobx#

inputs: #2[(T1s3x4,T1s5x4),(T1s13x14,T1s15x14)]
shapes: dict(x:{0:Dim(x_rows),1:Dim(dim)},y:{0:Dim(y_rows),1:Dim(dim)})

opset: domain='' version=21
opset: domain='local_functions' version=1
input: name='x' type=dtype('float32') shape=['x_rows', 'dim']
input: name='y' type=dtype('float32') shape=['y_rows', 'dim']
init: name='init7_s1_12_cst2init' type=int64 shape=(1,) -- array([1]) -- GraphBuilderPatternOptimization.make_initializer.1/Shape
init: name='init7_s1_02_cst2init' type=int64 shape=(1,) -- array([0]) -- GraphBuilderPatternOptimization.make_initializer.1/Shape
Scan(y, x, body=G1, num_scan_inputs=1, scan_input_directions=[0], scan_output_axes=[0], scan_output_directions=[0]) -> scan#0, output_0
output: name='output_0' type=dtype('float32') shape=['x_rows', 'y_rows']
----- subgraph ---- Scan - scan - att.body=G1 -- level=1 -- init_0_y,scan_0_x -> output_0,output_1
input: name='init_0_y' type=dtype('float32') shape=None
input: name='scan_0_x' type=dtype('float32') shape=None
Identity(init_0_y) -> output_0
Unsqueeze(scan_0_x, init7_s1_02_cst2init) -> reshape2
  Sub(init_0_y, reshape2) -> sub2
    Mul(sub2, sub2) -> mul2
ReduceSum(mul2, init7_s1_12_cst2init, keepdims=0) -> sum_12
  Sqrt(sum_12) -> output_1
output: name='output_0' type=dtype('float32') shape=None
output: name='output_1' type=dtype('float32') shape=None

dynamo-ir#

inputs: #2[(T1s3x4,T1s5x4),(T1s13x14,T1s15x14)]
shapes: dict(x:{0:Dim(x_rows),1:Dim(dim)},y:{0:Dim(y_rows),1:Dim(dim)})

opset: domain='' version=20
input: name='x' type=dtype('float32') shape=['x_rows', 'dim']
input: name='y' type=dtype('float32') shape=['y_rows', 'dim']
init: name='val_5' type=int64 shape=(2,) -- array([ 1, -1])
init: name='val_6' type=int64 shape=(1,) -- array([1])
Scan(y, x, body=G1, num_scan_inputs=1, scan_input_directions=[0], scan_output_directions=[0]) -> scan__0, getitem_1
output: name='getitem_1' type=dtype('float32') shape=['x_rows', 'y_rows']
----- subgraph ---- Scan - node_scan__1 - att.body=G1 -- level=1 -- y_scan_combine_graph_0__subgraph_in,x_scan_combine_graph_0__subgraph_in -> clone_scan_combine_graph_0,sqrt_scan_combine_graph_0
input: name='y_scan_combine_graph_0__subgraph_in' type=dtype('float32') shape=['s17', 's27']
input: name='x_scan_combine_graph_0__subgraph_in' type=dtype('float32') shape=['s27']
Identity(y_scan_combine_graph_0__subgraph_in) -> clone_scan_combine_graph_0
Reshape(x_scan_combine_graph_0__subgraph_in, val_5, allowzero=1) -> view
  Sub(y_scan_combine_graph_0__subgraph_in, view) -> sub_4
    Mul(sub_4, sub_4) -> mul_7
ReduceSum(mul_7, val_6, noop_with_empty_axes=0, keepdims=0) -> sum_1
  Sqrt(sum_1) -> sqrt_scan_combine_graph_0
output: name='clone_scan_combine_graph_0' type=dtype('float32') shape=['y_rows', 'dim']
output: name='sqrt_scan_combine_graph_0' type=dtype('float32') shape=['y_rows']

tracing#

inputs: #2[(T1s3x4,T1s5x4),(T1s13x14,T1s15x14)]
shapes: dict(x:{0:Dim(x_rows),1:Dim(dim)},y:{0:Dim(y_rows),1:Dim(dim)})

opset: domain='' version=21
opset: domain='local_functions' version=1
input: name='x' type=dtype('float32') shape=['x_rows', 'dim']
input: name='y' type=dtype('float32') shape=['y_rows', 'dim']
init: name='init7_s1_12_cst2init' type=int64 shape=(1,) -- array([1]) -- GraphBuilderPatternOptimization.make_initializer.1/Shape
init: name='init7_s1_02_cst2init' type=int64 shape=(1,) -- array([0]) -- GraphBuilderPatternOptimization.make_initializer.1/Shape
Scan(y, x, body=G1, num_scan_inputs=1, scan_input_directions=[0], scan_output_axes=[0], scan_output_directions=[0]) -> scancc#0, output
output: name='output' type=dtype('float32') shape=['d_output_0', 'd_output_1']
----- subgraph ---- Scan - scan - att.body=G1 -- level=1 -- init_0_y,scan_0_x -> output_0,output_1
input: name='init_0_y' type=dtype('float32') shape=None
input: name='scan_0_x' type=dtype('float32') shape=None
Identity(init_0_y) -> output_0
Unsqueeze(scan_0_x, init7_s1_02_cst2init) -> reshape2
  Sub(init_0_y, reshape2) -> sub2
    Mul(sub2, sub2) -> mul2
ReduceSum(mul2, init7_s1_12_cst2init, keepdims=0) -> sum_12
  Sqrt(sum_12) -> output_1
output: name='output_0' type=dtype('float32') shape=None
output: name='output_1' type=dtype('float32') shape=None

new-tracing#

inputs: #2[(T1s3x4,T1s5x4),(T1s13x14,T1s15x14)]
shapes: dict(x:{0:Dim(x_rows),1:Dim(dim)},y:{0:Dim(y_rows),1:Dim(dim)})

opset: domain='' version=21
opset: domain='local_functions' version=1
input: name='x' type=dtype('float32') shape=['x_rows', 'dim']
input: name='y' type=dtype('float32') shape=['y_rows', 'dim']
init: name='init7_s1_12_cst2init' type=int64 shape=(1,) -- array([1]) -- GraphBuilderPatternOptimization.make_initializer.1/Shape
init: name='init7_s1_02_cst2init' type=int64 shape=(1,) -- array([0]) -- GraphBuilderPatternOptimization.make_initializer.1/Shape
Scan(y, x, body=G1, num_scan_inputs=1, scan_input_directions=[0], scan_output_axes=[0], scan_output_directions=[0]) -> scan#0, output
output: name='output' type=dtype('float32') shape=['x_rows', 'y_rows']
----- subgraph ---- Scan - scan - att.body=G1 -- level=1 -- init_0_y,scan_0_x -> output_0,output_1
input: name='init_0_y' type=dtype('float32') shape=None
input: name='scan_0_x' type=dtype('float32') shape=None
Identity(init_0_y) -> output_0
Unsqueeze(scan_0_x, init7_s1_02_cst2init) -> view_default2
  Sub(init_0_y, view_default2) -> sub_tensor2
    Mul(sub_tensor2, sub_tensor2) -> mul_tensor2
ReduceSum(mul_tensor2, init7_s1_12_cst2init, keepdims=0) -> sum_dim_int_list2
  Sqrt(sum_dim_int_list2) -> output_1
output: name='output_0' type=dtype('float32') shape=None
output: name='output_1' type=dtype('float32') shape=None

ControlFlowScanDecomposition_151564#

code: yobx.torch.testing._model_eval_cases.ControlFlowScanDecomposition_151564

forward#

def forward(self, images, position):
    def dummy_loop(padded: torch.Tensor, pos: torch.Tensor):
        copy = torch.zeros(padded.shape)
        for i in range(pos.shape[0]):
            p = pos[i]
            copy[i, :p] = padded[i, :p]
        return copy

    def dummy_loop_with_scan(padded: torch.Tensor, pos: torch.Tensor):
        def pad_row(padded, p):
            row = torch.zeros((padded.shape[0],))
            torch._check(p.item() > 0)
            torch._check(p.item() < padded.shape[0])
            # this check is not always true, we add it anyway to make this dimension >= 2
            # and avoid raising an exception about dynamic dimension in {0, 1}
            if torch.compiler.is_exporting():
                torch._check(p.item() > 1)
            row[: p.item()] = padded[: p.item()]
            return (row,)

        return torch.ops.higher_order.scan(pad_row, [], [padded, pos], [])

    def select_when_exporting(f, f_scan):
        return f_scan if torch.compiler.is_exporting() else f

    return select_when_exporting(dummy_loop, dummy_loop_with_scan)(images, position)

yobx#

inputs: #1[(T1s5x6,T7s5)]
shapes: dict(images:{0:DYNAMIC,1:DYNAMIC},position:{0:DYNAMIC})

opset: domain='' version=21
opset: domain='aten' version=1
opset: domain='local_functions' version=1
input: name='images' type=dtype('float32') shape=['batch', 'channel']
input: name='position' type=dtype('int64') shape=['batch_2']
init: name='init7_s1_02_cst2init' type=int64 shape=(1,) -- array([0]) -- GraphBuilderPatternOptimization.make_initializer.1/Shape
Scan(images, position, body=G1, num_scan_inputs=2, scan_input_directions=[0,0], scan_output_axes=[0], scan_output_directions=[0]) -> output_0
output: name='output_0' type=dtype('float32') shape=['batch', 'channel']
----- subgraph ---- Scan - scan - att.body=G1 -- level=1 -- scan_0_images,scan_1_position -> output_0
input: name='scan_0_images' type=dtype('float32') shape=None
input: name='scan_1_position' type=dtype('int64') shape=None
Shape(scan_0_images, end=1, start=0) -> padded_1::Shape:12
  ConstantOfShape(padded_1::Shape:12, value=[0.0]) -> zeros2
Unsqueeze(scan_1_position, init7_s1_02_cst2init) -> item::UnSq02
Slice(scan_0_images, init7_s1_02_cst2init, item::UnSq02, init7_s1_02_cst2init) -> slice_12
  aten_setitem[aten](zeros2, scan_1_position, slice_12) -> output_0
output: name='output_0' type=dtype('float32') shape=None
----- function name=aten_setitem domain=aten
----- doc_string: -- function_options=FunctionOptions(export_as_function=...
opset: domain='' version=21
input: 'zeros'
input: 'item'
input: 'slice_1'
Constant(value=[0]) -> init7_s1_0
  Unsqueeze(item, init7_s1_0) -> item::UnSq0
Shape(zeros) -> zeros::Shape:
  Slice(zeros, item::UnSq0, zeros::Shape:, init7_s1_0) -> _onx_slice_zeros
    Concat(slice_1, _onx_slice_zeros, axis=0) -> setitem
output: name='setitem' type=? shape=?

dynamo-ir#

inputs: #1[(T1s5x6,T7s5)]
shapes: dict(images:{0:DYNAMIC,1:DYNAMIC},position:{0:DYNAMIC})

opset: domain='' version=20
input: name='images' type=dtype('float32') shape=['s34', 's90']
input: name='position' type=dtype('int64') shape=['s71']
init: name='val_13' type=int64 shape=(1,) -- array([0])
init: name='val_37' type=int64 shape=(1,) -- array([1])
init: name='val_1' type=float32 shape=() -- array([0.], dtype=float32)
init: name='val_5' type=int64 shape=(1,) -- array([-1])
init: name='val_7' type=int64 shape=() -- array([0])
init: name='val_10' type=int64 shape=() -- array([1])
Scan(images, position, body=G1, num_scan_inputs=2, scan_input_directions=[0,0], scan_output_directions=[0]) -> getitem
output: name='getitem' type=dtype('float32') shape=['s34', 's90']
----- subgraph ---- Scan - node_scan__0 - att.body=G1 -- level=1 -- images_scan_combine_graph_0__subgraph_in,position_scan_combine_graph_0__subgraph_in -> slice_scatter_scan_combine_graph_0
input: name='images_scan_combine_graph_0__subgraph_in' type=dtype('float32') shape=['s90']
input: name='position_scan_combine_graph_0__subgraph_in' type=dtype('int64') shape=None
Reshape(position_scan_combine_graph_0__subgraph_in, val_5, allowzero=0) -> val_6
Gather(val_6, val_7, axis=0) -> val_8
Reshape(val_8, val_5, allowzero=0) -> val_16
Slice(images_scan_combine_graph_0__subgraph_in, val_13, val_16, val_13, val_37) -> copy
Shape(images_scan_combine_graph_0__subgraph_in, end=1, start=0) -> val_0
Expand(val_1, val_0) -> zeros
  Shape(zeros, start=0) -> val_32
Gather(val_32, val_7, axis=0) -> val_33
Range(val_7, val_33, val_10) -> val_34
Unsqueeze(val_8, val_13) -> val_36
Slice(val_34, val_13, val_36, val_13, val_37) -> val_38
Unsqueeze(val_38, val_5) -> val_39
  ScatterND(zeros, val_39, copy, reduction=b'none') -> slice_scatter_scan_combine_graph_0
output: name='slice_scatter_scan_combine_graph_0' type=dtype('float32') shape=['s90']

tracing#

inputs: #1[(T1s5x6,T7s5)]
shapes: dict(images:{0:DYNAMIC,1:DYNAMIC},position:{0:DYNAMIC})

opset: domain='' version=21
opset: domain='aten' version=1
opset: domain='local_functions' version=1
input: name='images' type=dtype('float32') shape=['batch', 'channel']
input: name='position' type=dtype('int64') shape=['batch_2']
init: name='init7_s1_02_cst2init' type=int64 shape=(1,) -- array([0]) -- GraphBuilderPatternOptimization.make_initializer.1/Shape
init: name='getitem_2_step2_cst2init' type=int64 shape=(1,) -- array([1])-- GraphBuilderPatternOptimization.make_initializer.1/Shape
Scan(images, position, body=G1, num_scan_inputs=2, scan_input_directions=[0,0], scan_output_axes=[0], scan_output_directions=[0]) -> output
output: name='output' type=dtype('float32') shape=['d_output_0', 'd_output_1']
----- subgraph ---- Scan - scan - att.body=G1 -- level=1 -- scan_0_images,scan_1_position -> output_0
input: name='scan_0_images' type=dtype('float32') shape=None
input: name='scan_1_position' type=dtype('int64') shape=None
Shape(scan_0_images, end=1, start=0) -> _onx_gather_size2
  ConstantOfShape(_onx_gather_size2, value=[0.0]) -> zeros2
Unsqueeze(scan_1_position, init7_s1_02_cst2init) -> item_3::UnSq02
Slice(scan_0_images, init7_s1_02_cst2init, item_3::UnSq02, init7_s1_02_cst2init, getitem_2_step2_cst2init) -> getitem_22
  aten_setitem[aten](zeros2, scan_1_position, getitem_22) -> output_0
output: name='output_0' type=dtype('float32') shape=None
----- function name=aten_setitem domain=aten
----- doc_string: -- function_options=FunctionOptions(export_as_function=...
opset: domain='' version=21
input: 'zeros'
input: 'item_4'
input: 'getitem_2'
Constant(value=[0]) -> init7_s1_0
  Unsqueeze(item_4, init7_s1_0) -> item_4::UnSq0
Shape(zeros) -> zeros::Shape:
  Slice(zeros, item_4::UnSq0, zeros::Shape:, init7_s1_0) -> _onx_slice_zeros
    Concat(getitem_2, _onx_slice_zeros, axis=0) -> setitem
output: name='setitem' type=? shape=?

new-tracing#

inputs: #1[(T1s5x6,T7s5)]
shapes: dict(images:{0:DYNAMIC,1:DYNAMIC},position:{0:DYNAMIC})

opset: domain='' version=21
opset: domain='aten' version=1
opset: domain='local_functions' version=1
input: name='images' type=dtype('float32') shape=['batch', 'channel']
input: name='position' type=dtype('int64') shape=['batch_2']
init: name='init7_s1_02_cst2init' type=int64 shape=(1,) -- array([0]) -- GraphBuilderPatternOptimization.make_initializer.1/Shape
Scan(images, position, body=G1, num_scan_inputs=2, scan_input_directions=[0,0], scan_output_axes=[0], scan_output_directions=[0]) -> output_0
output: name='output_0' type=dtype('float32') shape=['batch', 'channel']
----- subgraph ---- Scan - scan - att.body=G1 -- level=1 -- scan_0_images,scan_1_position -> output_0
input: name='scan_0_images' type=dtype('float32') shape=None
input: name='scan_1_position' type=dtype('int64') shape=None
Shape(scan_0_images, end=1, start=0) -> scan_elem_0::Shape:12
  ConstantOfShape(scan_elem_0::Shape:12, value=[0.0]) -> zeros2
Unsqueeze(scan_1_position, init7_s1_02_cst2init) -> _local_scalar_dense_default_3::UnSq02
Slice(scan_0_images, init7_s1_02_cst2init, _local_scalar_dense_default_3::UnSq02, init7_s1_02_cst2init) -> slice_tensor2
  aten_setitem[aten](zeros2, scan_1_position, slice_tensor2) -> output_0
output: name='output_0' type=dtype('float32') shape=None
----- function name=aten_setitem domain=aten
----- doc_string: -- function_options=FunctionOptions(export_as_function=...
opset: domain='' version=21
input: 'zeros'
input: '_local_scalar_dense_default_4'
input: 'slice_tensor'
Constant(value=[0]) -> init7_s1_0
  Unsqueeze(_local_scalar_dense_default_4, init7_s1_0) -> _local_scalar_dense_default_4::UnSq0
Shape(zeros) -> zeros::Shape:
  Slice(zeros, _local_scalar_dense_default_4::UnSq0, zeros::Shape:, init7_s1_0) -> _onx_slice_zeros
    Concat(slice_tensor, _onx_slice_zeros, axis=0) -> setitem
output: name='setitem' type=? shape=?

ControlFlowScanInplace_153705#

code: yobx.torch.testing._model_eval_cases.ControlFlowScanInplace_153705

forward#

def forward(self, x, y):
    def loop_body_1(z, iv, x, y):
        z = z.clone()
        i = iv.item()
        z[i, :] = ((x[i, :] - y) ** 2).sum(dim=-1)
        return [z, iv]

    z = torch.empty((x.shape[0], y.shape[0]))
    r = torch.ops.higher_order.scan(
        loop_body_1, [z], [torch.arange(x.shape[0], dtype=torch.int64)], [x, y]
    )
    return r[0]

yobx#

FAILED

only integers, slices (`:`), ellipsis (`...`), None and long or byte Variables are valid indices (got SymInt)

dynamo-ir#

FAILED

Failed to export the model with torch.export. [96mThis is step 1/3[0m of exporting the model to ONNX. Next steps:
- Modify the model code for `torch.export.export` to succeed. Refer to https://pytorch.org/docs/stable/generated/exportdb/index.html for more information.
- Debug `torch.export.export` and submit a PR to PyTorch.
- Create an issue in the PyTorch GitHub repository against the [96m*torch.export*[0m component and attach the full error stack as well as reproduction scripts.

## Exception summary

<class 'IndexError'>: only integers, slices (`:`), ellipsis (`...`), None and long or byte Variables are valid indices (got SymInt)

(Refer to the full stack trace above for more information.)

tracing#

FAILED

Unable to interpret function <class 'builtin_function_or_method'>: <built-in method empty of type object at 0x7853d1560000>, searched for ['transformers_empty', '_VariableFunctionsClass_empty', 'empty'] and attributes ['__qualname__', '__name__'], args=((getitem, getitem_2),), kwargs={}, dispatcher=None
--DEBUG--
-- to print the exported program: PRINT_EXPORTED_PROGRAM=1

[GraphBuilder-TBC] Message starts, there are 2 initializers, 10 nodes, 2 inputs, 2 outputs.
input_names=['x', 'y']
output_names=[]
--CONSTRAINTS--
    DYN0 = {'s26'}
    DYN1 = {'s49'}
    DYN2 = {'s93'}
    DYN3 = {'s70'}
    s26 = {'DYN0'}
    s49 = {'DYN1'}
    s70 = {'DYN3'}
    s93 = {'DYN2'}
--SHAPE--
_dynamic_examples=
dynamic_objects=
   DYN0 = 'DYN0'
   DYN1 = 'DYN1'
   DYN2 = 'DYN2'
   DYN3 = 'DYN3'
   s26 = 's26'
   s49 = 's49'
   s70 = 's70'
   s93 = 's93'
dynamic_objects_rev=
   'DYN0' = <class 'list'>
     tuple
       'DYN0'
       ERR**: <class 'torch.SymInt'>:'DYN0'
   'DYN1' = <class 'list'>
     tuple
       'DYN1'
       ERR**: <class 'torch.SymInt'>:'DYN1'
   'DYN2' = <class 'list'>
     tuple
       'DYN2'
       ERR**: <class 'torch.SymInt'>:'DYN2'
   'DYN3' = <class 'list'>
     tuple
       'DYN3'
       ERR**: <class 'torch.SymInt'>:'DYN3'
dynamic_dimensions_source={'DYN0': [{'axis': 0, 'input_name': 'x'}],
 'DYN1': [{'axis': 1, 'input_name': 'x'}],
 'DYN2': [{'axis': 0, 'input_name': 'y'}],
 'DYN3': [{'axis': 1, 'input_name': 'y'}]}
dynamic_dimensions_source_flat=['x', 'y']
output_dynamic_dimensions_source_flat=None
dynamic_alias={'s26': 'DYN0', 's49': 'DYN1', 's70': 'DYN3', 's93': 'DYN2'}
dynamic_shapes={'x': {0: Dim('DYN0', min=0), 1: Dim('DYN1', min=0)},
 'y': {0: Dim('DYN2', min=0), 1: Dim('DYN3', min=0)}}
_known_shapes={'_onx_gather_size': (1,),
 '_onx_gather_size2': (1,),
 '_onx_gather_size_1': (1,),
 '_onx_gather_size_12': (1,),
 'getitem': (),
 'getitem_1': (),
 'getitem_2': (),
 'getitem_3': (),
 'init7_s1_0': (1,),
 'init7_s1_1': (1,),
 'size': (2,),
 'size_1': (2,),
 'x': ('DYN0', 'DYN1'),
 'y': ('DYN2', 'DYN3')}
_known_types={'_onx_gather_size': 7,
 '_onx_gather_size2': 7,
 '_onx_gather_size_1': 7,
 '_onx_gather_size_12': 7,
 'getitem': 7,
 'getitem_1': 7,
 'getitem_2': 7,
 'getitem_3': 7,
 'init7_s1_0': 7,
 'init7_s1_1': 7,
 'size': 7,
 'size_1': 7,
 'x': 1,
 'y': 1}
_known_devices={'_onx_gather_size': -1,
 '_onx_gather_size2': -1,
 '_onx_gather_size_1': -1,
 '_onx_gather_size_12': -1,
 'getitem': -1,
 'getitem_1': -1,
 'getitem_2': -1,
 'getitem_3': -1,
 'size': -1,
 'size_1': -1,
 'x': -1,
 'y': -1}
_context=[]
_known_value_shape={'_onx_gather_size': ('DYN0',),
 '_onx_gather_size2': ('DYN1',),
 '_onx_gather_size_1': ('DYN2',),
 '_onx_gather_size_12': ('DYN3',),
 'getitem': 'DYN0',
 'getitem_1': 'DYN1',
 'getitem_2': 'DYN2',
 'getitem_3': 'DYN3',
 'init7_s1_0': (0,),
 'init7_s1_1': (1,),
 'size': ('DYN0', 'DYN1'),
 'size_1': ('DYN2', 'DYN3')}
_known_constants=['init7_s1_0', 'init7_s1_1']
_known_ranks (with no shape)={}
--PARAMETERS--
_parameter_renaming=
--TORCH-USERS--
    empty -> {scancc}
    getitem -> {empty}
    getitem_1 -> set()
    getitem_2 -> {empty}
    getitem_3 -> set()
    size -> {getitem, getitem_1}
    size_1 -> {getitem_2, getitem_3}
    x -> {size_2, scancc, size}
    y -> {size_1, scancc}
--TORCH-SHAPES--
    x: ('run_node', (('example_value', torch.float32, torch.Size([3, 4])), ('val', torch.float32, torch.Size([s26, s49])))) --- 1:2:('DYN0', 'DYN1'):
    y: ('run_node', (('example_value', torch.float32, torch.Size([5, 4])), ('val', torch.float32, torch.Size([s93, s70])))) --- 1:2:('DYN2', 'DYN3'):
    size: ('run_node', ('', '')) --- 7:1:(2,):
    getitem: ('run_node', ('', '')) --- 7:0:():
    getitem_1: ('run_node', ('', '')) --- 7:0:():
    size_1: ('run_node', ('', '')) --- 7:1:(2,):
    getitem_2: ('run_node', ('', '')) --- 7:0:():
    getitem_3: ('run_node', ('', '')) --- 7:0:():
    empty: ('run_node', ('', '')) --- :::
--ONNX--
-- EXEPATH --
export
export_options=ExportOptions(tracing=TracingMode.TRACING, aten_as_function=('aten.histc.default', 'aten.index_copy.default', 'aten.index_put.default', 'aten._grouped_mm.default', 'aten.setitem', <built-in function setitem>))
function_options=None
-- process.graph_module --
ControlFlowScanInplace_153705()



def forward(self, x, y):
    size = x.size()
    getitem = size[0]
    getitem_1 = size[1];  size = getitem_1 = None
    size_1 = y.size()
    getitem_2 = size_1[0]
    getitem_3 = size_1[1];  size_1 = getitem_3 = None
    empty = torch.empty((getitem, getitem_2));  getitem = getitem_2 = None
    size_2 = x.size()
    getitem_4 = size_2[0]
    getitem_5 = size_2[1];  size_2 = getitem_5 = None
    arange = torch.arange(getitem_4, dtype = torch.int64);  getitem_4 = None
    _cb_scan_loop_body_1_0 = self._cb_scan_loop_body_1_0
    scancc = torch.ops.higher_order.scan(_cb_scan_loop_body_1_0, [empty], [arange], [x, y]);  _cb_scan_loop_body_1_0 = empty = arange = x = y = None
    getitem_6 = scancc[0];  scancc = None
    return getitem_6

# To see more debug info, please use `graph_module.print_readable()`
-- process.graph_module.graph --
graph():
    %x : [num_users=3] = placeholder[target=x]
    %y : [num_users=2] = placeholder[target=y]
    %size : [num_users=2] = call_method[target=size](args = (%x,), kwargs = {})
    %getitem : [num_users=1] = call_function[target=operator.getitem](args = (%size, 0), kwargs = {})
    %getitem_1 : [num_users=0] = call_function[target=operator.getitem](args = (%size, 1), kwargs = {})
    %size_1 : [num_users=2] = call_method[target=size](args = (%y,), kwargs = {})
    %getitem_2 : [num_users=1] = call_function[target=operator.getitem](args = (%size_1, 0), kwargs = {})
    %getitem_3 : [num_users=0] = call_function[target=operator.getitem](args = (%size_1, 1), kwargs = {})
    %empty : [num_users=1] = call_function[target=torch.empty](args = ((%getitem, %getitem_2),), kwargs = {})
    %size_2 : [num_users=2] = call_method[target=size](args = (%x,), kwargs = {})
    %getitem_4 : [num_users=1] = call_function[target=operator.getitem](args = (%size_2, 0), kwargs = {})
    %getitem_5 : [num_users=0] = call_function[target=operator.getitem](args = (%size_2, 1), kwargs = {})
    %arange : [num_users=1] = call_function[target=torch.arange](args = (%getitem_4,), kwargs = {dtype: torch.int64})
    %_cb_scan_loop_body_1_0 : [num_users=1] = get_attr[target=_cb_scan_loop_body_1_0]
    %scancc : [num_users=1] = call_function[target=torch.ops.higher_order.scan](args = (%_cb_scan_loop_body_1_0, [%empty], [%arange], [%x, %y]), kwargs = {})
    %getitem_6 : [num_users=1] = call_function[target=operator.getitem](args = (%scancc, 0), kwargs = {})
    return getitem_6
-- process.inputs_to_remove --
set()
-- process.progress --
node 8/17 target=<built-in method empty of type object at 0x7853d1560000>
-- 2 INPUTS
[GraphBuilder-TBC.1.make_tensor_input] x[1:DYN0xDYN1]
[GraphBuilder-TBC.1.make_tensor_input] y[1:DYN2xDYN3]
-- 2 INITIALIZERS
[GraphBuilder-TBC.1.make_initializer] init7_s1_0[int64:int64:[0]] - SOURCE: Opset.make_node.1/Shape##Opset.make_node.1/Shape##Opset.make_node.1/Shape##Opset.make_node.1/Shape##Opset.make_node.1/Shape##Opset.make_node.1/Shape
[GraphBuilder-TBC.1.make_initializer] init7_s1_1[int64:int64:[1]] - SOURCE: Opset.make_node.1/Shape##Opset.make_node.1/Shape
[GraphBuilder-TBC.4.make_node] .sizeA          [@:@   ] Shape:['x']->['size']
[GraphBuilder-TBC.4.make_node] getitemB_index  [@#:@  ] Gather:['size', 'init7_s1_0']->['_onx_gather_size']
[GraphBuilder-TBC.4.make_node] getitemB_index2 [@#:@  ] Squeeze:['_onx_gather_size', 'init7_s1_0']->['getitem']
[GraphBuilder-TBC.4.make_node] getitemB_index3 [@#:@  ] Gather:['size', 'init7_s1_1']->['_onx_gather_size2']
[GraphBuilder-TBC.4.make_node] getitemB_index4 [@#:@  ] Squeeze:['_onx_gather_size2', 'init7_s1_0']->['getitem_1']
[GraphBuilder-TBC.4.make_node] .sizeA2         [@:@   ] Shape:['y']->['size_1']
[GraphBuilder-TBC.4.make_node] getitemB_index5 [@#:@  ] Gather:['size_1', 'init7_s1_0']->['_onx_gather_size_1']
[GraphBuilder-TBC.4.make_node] getitemB_index6 [@#:@  ] Squeeze:['_onx_gather_size_1', 'init7_s1_0']->['getitem_2']
[GraphBuilder-TBC.4.make_node] getitemB_index7 [@#:@  ] Gather:['size_1', 'init7_s1_1']->['_onx_gather_size_12']
[GraphBuilder-TBC.4.make_node] getitemB_index8 [@#:@  ] Squeeze:['_onx_gather_size_12', 'init7_s1_0']->['getitem_3']
-- 0 OUTPUTS
[GraphBuilder-TBC] Message completed, there are 2 initializers, 10 nodes, 2 inputs, 2 outputs., 

new-tracing#

FAILED

empty(): argument 'size' (position 1) must be tuple of ints, but found element of type TracingInt at pos 0

ControlFlowShapeCheck#

code: yobx.torch.testing._model_eval_cases.ControlFlowShapeCheck

forward#

def forward(self, x, y):
    x1 = x + 1
    y1 = y + 2
    cat = torch.cat([x1, y1], dim=1)
    torch._check(cat.shape[0] > 2)
    if cat.shape[0] > 2:
        return cat / cat.shape[0]
    return cat / cat.ndim

yobx#

FAILED

Constraints violated (batch)! For more information, run with TORCH_LOGS="+dynamic".
  - Not all values of batch = L['x'].size()[0] in the specified range satisfy the generated guard 3 <= L['x'].size()[0] and L['x'].size()[0] <= IntInfinity()
Suggested fixes:
  batch = Dim('batch', min=3)

The error above occurred when calling torch.export.export. If you would like to view some more information about this error, and get a list of all other errors that may occur in your export call, you can replace your `export()` call with `draft_export()`.

dynamo-ir#

inputs: #2[(T1s3x4,T1s3x4),(T1s5x4,T1s5x2)]
shapes: dict(x:{0:Dim(batch)},y:{0:Dim(batch),1:Dim(seq)})

opset: domain='' version=20
input: name='x' type=dtype('float32') shape=['batch', 4]
input: name='y' type=dtype('float32') shape=['batch', 'seq']
init: name='scalar_tensor_default' type=float32 shape=() -- array([1.], dtype=float32)
init: name='scalar_tensor_default_1' type=float32 shape=() -- array([2.], dtype=float32)
Add(x, scalar_tensor_default) -> add
Shape(y, end=1, start=0) -> val_1
  Squeeze(val_1) -> sym_size_int_5
    Cast(sym_size_int_5, to=1) -> scalar_tensor_default_2
Add(y, scalar_tensor_default_1) -> add_4
  Concat(add, add_4, axis=1) -> cat
    Div(cat, scalar_tensor_default_2) -> div
output: name='div' type=dtype('float32') shape=['batch', 'seq + 4']

tracing#

FAILED

symbolically traced variables cannot be used as inputs to control flow

new-tracing#

inputs: #2[(T1s3x4,T1s3x4),(T1s5x4,T1s5x2)]
shapes: dict(x:{0:Dim(batch)},y:{0:Dim(batch),1:Dim(seq)})

opset: domain='' version=21
input: name='x' type=dtype('float32') shape=['batch', 4]
input: name='y' type=dtype('float32') shape=['batch', 'seq']
init: name='init1_s_::RSh1' type=float32 shape=(1,) -- array([1.], dtype=float32)-- GraphBuilder.constant_folding.from/fold(init1_s_,init7_s1_1)##init1_s_/shape_type_compute._cast_inputs.0##init7_s1_1/Opset.make_node.1/Shape##Opset.make_node.1/Shape##Opset.make_node.1/Shape
init: name='init1_s_2::RSh1' type=float32 shape=(1,) -- array([2.], dtype=float32)-- GraphBuilder.constant_folding.from/fold(init1_s_2,init7_s1_1)##init1_s_2/shape_type_compute._cast_inputs.0##init7_s1_1/Opset.make_node.1/Shape##Opset.make_node.1/Shape##Opset.make_node.1/Shape
init: name='init7_s1_0' type=int64 shape=(1,) -- array([0])           -- ReshapeIsSqueezePattern.m1
Add(x, init1_s_::RSh1) -> add_tensor
Add(y, init1_s_2::RSh1) -> add_tensor_1
  Concat(add_tensor, add_tensor_1, axis=1) -> cat_default
    Shape(cat_default, end=1, start=0) -> cat_default::Shape:1
      Squeeze(cat_default::Shape:1) -> sym_size_int
        Cast(sym_size_int, to=1) -> sym_size_int::C1
          Unsqueeze(sym_size_int::C1, init7_s1_0) -> sym_size_int::C1::RSh1
    Div(cat_default, sym_size_int::C1::RSh1) -> output
output: name='output' type=dtype('float32') shape=['batch', 'seq+4']

ControlFlowWhile#

code: yobx.torch.testing._model_eval_cases.ControlFlowWhile

forward#

def forward(self, ci, a, b):
    def cond_fn(i, x, y):
        return i > 0

    def body_fn(i, x, y):
        z = x + y
        return i - 1, z, y - z

    return torch._higher_order_ops.while_loop(cond_fn, body_fn, [ci, a, b])

yobx#

FAILED

Found the following conflicts between user-specified ranges and inferred ranges from model tracing:
- Received user-specified dim hint Dim.DYNAMIC(min=None, max=None), but tracing inferred a static shape of 3 for dimension inputs['a'].shape[1].

dynamo-ir#

FAILED

Failed to export the model with torch.export. [96mThis is step 1/3[0m of exporting the model to ONNX. Next steps:
- Modify the model code for `torch.export.export` to succeed. Refer to https://pytorch.org/docs/stable/generated/exportdb/index.html for more information.
- Debug `torch.export.export` and submit a PR to PyTorch.
- Create an issue in the PyTorch GitHub repository against the [96m*torch.export*[0m component and attach the full error stack as well as reproduction scripts.

## Exception summary

<class 'ValueError'>: Found the following conflicts between user-specified ranges and inferred ranges from model tracing:
- Received user-specified dim hint Dim.DYNAMIC(min=None, max=None), but tracing inferred a static shape of 3 for dimension inputs['a'].shape[1].

(Refer to the full stack trace above for more information.)

tracing#

FAILED

[CustomProxy(ci), CustomProxy(a), CustomProxy(b)] can only be of (<class 'torch.Tensor'>, <class 'int'>, <class 'torch.SymInt'>) but got (<class 'yobx.torch.tracing.CustomProxy'>, <class 'yobx.torch.tracing.CustomProxy'>, <class 'yobx.torch.tracing.CustomProxy'>)

new-tracing#

inputs: #1[(T7s,T1s2x3,T1s2x3)]
shapes: ({},{0:DYNAMIC,1:DYNAMIC},{0:DYNAMIC})

opset: domain='' version=21
opset: domain='local_functions' version=1
input: name='ci' type=dtype('int64') shape=None
input: name='a' type=dtype('float32') shape=['batch', 'channel']
input: name='b' type=dtype('float32') shape=['batch_2', 3]
init: name='init7_s_12_cst2init' type=int64 shape=() -- array([1])    -- GraphBuilderPatternOptimization.make_initializer.1/Shape
init: name='init7_s_022_cst2init' type=int64 shape=() -- array([0])   -- GraphBuilderPatternOptimization.make_initializer.1/Shape
init: name='while_loop_int64_max_cst2init' type=int64 shape=() -- array([9223372036854775807])-- GraphBuilderPatternOptimization.make_initializer.1/Shape
Greater(ci, init7_s_022_cst2init) -> while_loop_init_cond
  Loop(while_loop_int64_max_cst2init, while_loop_init_cond, ci, a, b, body=G1) -> output_0, output_1, output_2
output: name='output_0' type=dtype('int64') shape=None
output: name='output_1' type=dtype('float32') shape=['batch', 'channel']
output: name='output_2' type=dtype('float32') shape=['batch_2', 3]
----- subgraph ---- Loop - while_loop - att.body=G1 -- level=1 -- iter,cond_in,loop_in_0_ci,loop_in_1_a,loop_in_2_b -> while_cond_out,loop_out_0,loop_out_1,loop_out_2
input: name='iter' type=dtype('int64') shape=None
input: name='cond_in' type=dtype('bool') shape=None
input: name='loop_in_0_ci' type='NOTENSOR' shape=None
input: name='loop_in_1_a' type='NOTENSOR' shape=None
input: name='loop_in_2_b' type='NOTENSOR' shape=None
Add(loop_in_1_a, loop_in_2_b) -> loop_out_1
  Sub(loop_in_2_b, loop_out_1) -> loop_out_2
Sub(loop_in_0_ci, init7_s_12_cst2init) -> loop_out_0
Greater(loop_out_0, init7_s_022_cst2init) -> while_cond_out
output: name='while_cond_out' type='NOTENSOR' shape=None
output: name='loop_out_0' type='NOTENSOR' shape=None
output: name='loop_out_1' type='NOTENSOR' shape=None
output: name='loop_out_2' type='NOTENSOR' shape=None

ControlFlowWhileDec#

code: yobx.torch.testing._model_eval_cases.ControlFlowWhileDec

forward#

def forward(self, ci, a, b):
    def cond_fn(i, x, y):
        return i > 0

    def body_fn(i, x, y):
        return i - 1, x + y, y - x

    return torch._higher_order_ops.while_loop(cond_fn, body_fn, [ci, a, b])

yobx#

FAILED

Found the following conflicts between user-specified ranges and inferred ranges from model tracing:
- Received user-specified dim hint Dim.DYNAMIC(min=None, max=None), but tracing inferred a static shape of 3 for dimension inputs['a'].shape[1].

dynamo-ir#

FAILED

Failed to export the model with torch.export. [96mThis is step 1/3[0m of exporting the model to ONNX. Next steps:
- Modify the model code for `torch.export.export` to succeed. Refer to https://pytorch.org/docs/stable/generated/exportdb/index.html for more information.
- Debug `torch.export.export` and submit a PR to PyTorch.
- Create an issue in the PyTorch GitHub repository against the [96m*torch.export*[0m component and attach the full error stack as well as reproduction scripts.

## Exception summary

<class 'ValueError'>: Found the following conflicts between user-specified ranges and inferred ranges from model tracing:
- Received user-specified dim hint Dim.DYNAMIC(min=None, max=None), but tracing inferred a static shape of 3 for dimension inputs['a'].shape[1].

(Refer to the full stack trace above for more information.)

tracing#

FAILED

[CustomProxy(ci), CustomProxy(a), CustomProxy(b)] can only be of (<class 'torch.Tensor'>, <class 'int'>, <class 'torch.SymInt'>) but got (<class 'yobx.torch.tracing.CustomProxy'>, <class 'yobx.torch.tracing.CustomProxy'>, <class 'yobx.torch.tracing.CustomProxy'>)

new-tracing#

inputs: #1[(T7s,T1s2x3,T1s2x3)]
shapes: ({},{0:DYNAMIC,1:DYNAMIC},{0:DYNAMIC})

opset: domain='' version=21
opset: domain='local_functions' version=1
input: name='ci' type=dtype('int64') shape=None
input: name='a' type=dtype('float32') shape=['batch', 'channel']
input: name='b' type=dtype('float32') shape=['batch_2', 3]
init: name='init7_s_12_cst2init' type=int64 shape=() -- array([1])    -- GraphBuilderPatternOptimization.make_initializer.1/Shape
init: name='init7_s_022_cst2init' type=int64 shape=() -- array([0])   -- GraphBuilderPatternOptimization.make_initializer.1/Shape
init: name='while_loop_int64_max_cst2init' type=int64 shape=() -- array([9223372036854775807])-- GraphBuilderPatternOptimization.make_initializer.1/Shape
Greater(ci, init7_s_022_cst2init) -> while_loop_init_cond
  Loop(while_loop_int64_max_cst2init, while_loop_init_cond, ci, a, b, body=G1) -> output_0, output_1, output_2
output: name='output_0' type=dtype('int64') shape=None
output: name='output_1' type=dtype('float32') shape=['batch', 'channel']
output: name='output_2' type=dtype('float32') shape=['batch_2', 3]
----- subgraph ---- Loop - while_loop - att.body=G1 -- level=1 -- iter,cond_in,loop_in_0_ci,loop_in_1_a,loop_in_2_b -> while_cond_out,loop_out_0,loop_out_1,loop_out_2
input: name='iter' type=dtype('int64') shape=None
input: name='cond_in' type=dtype('bool') shape=None
input: name='loop_in_0_ci' type='NOTENSOR' shape=None
input: name='loop_in_1_a' type='NOTENSOR' shape=None
input: name='loop_in_2_b' type='NOTENSOR' shape=None
Add(loop_in_1_a, loop_in_2_b) -> loop_out_1
Sub(loop_in_0_ci, init7_s_12_cst2init) -> loop_out_0
Greater(loop_out_0, init7_s_022_cst2init) -> while_cond_out
Sub(loop_in_2_b, loop_in_1_a) -> loop_out_2
output: name='while_cond_out' type='NOTENSOR' shape=None
output: name='loop_out_0' type='NOTENSOR' shape=None
output: name='loop_out_1' type='NOTENSOR' shape=None
output: name='loop_out_2' type='NOTENSOR' shape=None

ControlFlowWhileInc#

code: yobx.torch.testing._model_eval_cases.ControlFlowWhileInc

forward#

def forward(self, ci, a, b):
    def cond_fn(i, x, y):
        return i < x.size(0)

    def body_fn(i, x, y):
        return i + 1, x + y, y - x

    return torch._higher_order_ops.while_loop(cond_fn, body_fn, [ci, a, b])

yobx#

FAILED

Found the following conflicts between user-specified ranges and inferred ranges from model tracing:
- Received user-specified dim hint Dim.DYNAMIC(min=None, max=None), but tracing inferred a static shape of 3 for dimension inputs['a'].shape[1].

dynamo-ir#

FAILED

Failed to export the model with torch.export. [96mThis is step 1/3[0m of exporting the model to ONNX. Next steps:
- Modify the model code for `torch.export.export` to succeed. Refer to https://pytorch.org/docs/stable/generated/exportdb/index.html for more information.
- Debug `torch.export.export` and submit a PR to PyTorch.
- Create an issue in the PyTorch GitHub repository against the [96m*torch.export*[0m component and attach the full error stack as well as reproduction scripts.

## Exception summary

<class 'ValueError'>: Found the following conflicts between user-specified ranges and inferred ranges from model tracing:
- Received user-specified dim hint Dim.DYNAMIC(min=None, max=None), but tracing inferred a static shape of 3 for dimension inputs['a'].shape[1].

(Refer to the full stack trace above for more information.)

tracing#

FAILED

[CustomProxy(ci), CustomProxy(a), CustomProxy(b)] can only be of (<class 'torch.Tensor'>, <class 'int'>, <class 'torch.SymInt'>) but got (<class 'yobx.torch.tracing.CustomProxy'>, <class 'yobx.torch.tracing.CustomProxy'>, <class 'yobx.torch.tracing.CustomProxy'>)

new-tracing#

inputs: #1[(T7s,T1s2x3,T1s2x3)]
shapes: ({},{0:DYNAMIC,1:DYNAMIC},{0:DYNAMIC})

opset: domain='' version=21
opset: domain='local_functions' version=1
input: name='ci' type=dtype('int64') shape=None
input: name='a' type=dtype('float32') shape=['batch', 'channel']
input: name='b' type=dtype('float32') shape=['batch_2', 3]
init: name='init7_s_12_cst2init' type=int64 shape=() -- array([1])    -- GraphBuilderPatternOptimization.make_initializer.1/Shape
init: name='init7_s1_022_cst2init' type=int64 shape=(1,) -- array([0])-- GraphBuilderPatternOptimization.make_initializer.1/Shape
init: name='while_loop_int64_max_cst2init' type=int64 shape=() -- array([9223372036854775807])-- GraphBuilderPatternOptimization.make_initializer.1/Shape
Shape(a, end=1, start=0) -> arg1::Shape:1
  Squeeze(arg1::Shape:1, init7_s1_022_cst2init) -> size
    Greater(size, ci) -> while_loop_init_cond
      Loop(while_loop_int64_max_cst2init, while_loop_init_cond, ci, a, b, body=G1) -> output_0, output_1, output_2
output: name='output_0' type=dtype('int64') shape=None
output: name='output_1' type=dtype('float32') shape=['batch', 'channel']
output: name='output_2' type=dtype('float32') shape=['batch_2', 3]
----- subgraph ---- Loop - while_loop - att.body=G1 -- level=1 -- iter,cond_in,loop_in_0_ci,loop_in_1_a,loop_in_2_b -> while_cond_out,loop_out_0,loop_out_1,loop_out_2
input: name='iter' type=dtype('int64') shape=None
input: name='cond_in' type=dtype('bool') shape=None
input: name='loop_in_0_ci' type='NOTENSOR' shape=None
input: name='loop_in_1_a' type='NOTENSOR' shape=None
input: name='loop_in_2_b' type='NOTENSOR' shape=None
Add(loop_in_0_ci, init7_s_12_cst2init) -> loop_out_0
Add(loop_in_1_a, loop_in_2_b) -> loop_out_1
  Shape(loop_out_1, end=1, start=0) -> arg1::Shape:122
Squeeze(arg1::Shape:122, init7_s1_022_cst2init) -> size22
  Greater(size22, loop_out_0) -> while_cond_out
Sub(loop_in_2_b, loop_in_1_a) -> loop_out_2
output: name='while_cond_out' type='NOTENSOR' shape=None
output: name='loop_out_0' type='NOTENSOR' shape=None
output: name='loop_out_1' type='NOTENSOR' shape=None
output: name='loop_out_2' type='NOTENSOR' shape=None

CreateFromShape#

code: yobx.torch.testing._model_eval_cases.CreateFromShape

forward#

def forward(self, x):
    y = torch.ones((x.shape[0], x.shape[1] + 1))
    return y

yobx#

inputs: #2[(T1s4x4,),(T1s5x5,)]
shapes: dict(x:{0:Dim(dx),1:Dim(dy)})

opset: domain='' version=21
input: name='x' type=dtype('float32') shape=['dx', 'dy']
init: name='SqueezeBinaryUnsqueezePattern_init7_s_1' type=int64 shape=(1,) -- array([1])-- GraphBuilder.constant_folding.from/fold(init7_s1_0,init7_s_1)##init7_s_1/shape_type_compute._cast_inputs.1(add)##init7_s1_0/Opset.make_node.1/Shape##Opset.make_node.1/Shape
Shape(x, end=1, start=0) -> x::Shape:1
Shape(x, end=2, start=1) -> x::Shape1:2
  Add(x::Shape1:2, SqueezeBinaryUnsqueezePattern_init7_s_1) -> add::UnSq0
  Concat(x::Shape:1, add::UnSq0, axis=0) -> _onx_concat_sym_size_int_2::UnSq0
    ConstantOfShape(_onx_concat_sym_size_int_2::UnSq0, value=[1.0]) -> output_0
output: name='output_0' type=dtype('float32') shape=['dx', 'dy']

dynamo-ir#

inputs: #2[(T1s4x4,),(T1s5x5,)]
shapes: dict(x:{0:Dim(dx),1:Dim(dy)})

opset: domain='' version=20
input: name='x' type=dtype('float32') shape=['dx', 'dy']
init: name='val_10' type=float32 shape=() -- array([1.], dtype=float32)
init: name='val_2' type=int64 shape=() -- array([1])
init: name='val_5' type=int64 shape=(1,) -- array([-1])
Shape(x, end=1, start=0) -> val_0
Shape(x, end=2, start=1) -> val_1
  Squeeze(val_1) -> sym_size_int_3
    Add(sym_size_int_3, val_2) -> add
      Reshape(add, val_5, allowzero=0) -> val_6
  Concat(val_0, val_6, axis=0) -> val_7
    Expand(val_10, val_7) -> ones
output: name='ones' type=dtype('float32') shape=['dx', 'dy + 1']

tracing#

inputs: #2[(T1s4x4,),(T1s5x5,)]
shapes: dict(x:{0:Dim(dx),1:Dim(dy)})

opset: domain='' version=21
input: name='x' type=dtype('float32') shape=['dx', 'dy']
init: name='init7_s1_0' type=int64 shape=(1,) -- array([0])           -- Opset.make_node.1/Shape##Opset.make_node.1/Shape##Opset.make_node.1/Shape##Opset.make_node.1/Shape##Opset.make_node.1/Shape##Opset.make_node.1/Shape##Opset.make_node.1/Shape##Opset.make_node.1/Shape
init: name='init7_s_1' type=int64 shape=() -- array([1])              -- shape_type_compute._cast_inputs.1(add)
Shape(x, end=1, start=0) -> _onx_gather_size
Shape(x, end=2, start=1) -> _onx_gather_size_12
  Squeeze(_onx_gather_size_12, init7_s1_0) -> getitem_3
    Add(getitem_3, init7_s_1) -> _onx_add_getitem_3
      Unsqueeze(_onx_add_getitem_3, init7_s1_0) -> add::UnSq0
  Concat(_onx_gather_size, add::UnSq0, axis=0) -> _onx_concat_getitem::UnSq0
    ConstantOfShape(_onx_concat_getitem::UnSq0, value=[1.0]) -> output
output: name='output' type=dtype('float32') shape=['dx', 'add']

new-tracing#

inputs: #2[(T1s4x4,),(T1s5x5,)]
shapes: dict(x:{0:Dim(dx),1:Dim(dy)})

opset: domain='' version=21
input: name='x' type=dtype('float32') shape=['dx', 'dy']
init: name='init7_s1_1' type=int64 shape=(1,) -- array([1])           -- GraphBuilder.get_dimension_as_result.axis_name##GraphBuilder._eval_dim_expr_node_as_1d.const
Shape(x, end=2, start=1) -> dy
  Add(dy, init7_s1_1) -> _onx_add_dy
Shape(x, end=1, start=0) -> x::Shape:1
  Concat(x::Shape:1, _onx_add_dy, axis=0) -> _onx_concat_sym_size_int::UnSq0
    ConstantOfShape(_onx_concat_sym_size_int::UnSq0, value=[1.0]) -> output
output: name='output' type=dtype('float32') shape=['dx', 'dy+1']

CreateFromShapeThroughFunction#

code: yobx.torch.testing._model_eval_cases.CreateFromShapeThroughFunction

forward#

def forward(self, x):
    def add_one(dim):
        return dim + 1

    dy1 = add_one(x.shape[1])
    y = torch.ones((x.shape[0], dy1))
    return y

yobx#

inputs: #1[(T1s4x4,)]
shapes: dict(x:{0:Dim(dx),1:Dim(dy)})

opset: domain='' version=21
input: name='x' type=dtype('float32') shape=['dx', 'dy']
init: name='SqueezeBinaryUnsqueezePattern_init7_s_1' type=int64 shape=(1,) -- array([1])-- GraphBuilder.constant_folding.from/fold(init7_s1_0,init7_s_1)##init7_s_1/shape_type_compute._cast_inputs.1(add)##init7_s1_0/Opset.make_node.1/Shape##Opset.make_node.1/Shape
Shape(x, end=1, start=0) -> x::Shape:1
Shape(x, end=2, start=1) -> x::Shape1:2
  Add(x::Shape1:2, SqueezeBinaryUnsqueezePattern_init7_s_1) -> add::UnSq0
  Concat(x::Shape:1, add::UnSq0, axis=0) -> _onx_concat_sym_size_int_2::UnSq0
    ConstantOfShape(_onx_concat_sym_size_int_2::UnSq0, value=[1.0]) -> output_0
output: name='output_0' type=dtype('float32') shape=['dx', 'dy']

dynamo-ir#

inputs: #1[(T1s4x4,)]
shapes: dict(x:{0:Dim(dx),1:Dim(dy)})

opset: domain='' version=20
input: name='x' type=dtype('float32') shape=['dx', 'dy']
init: name='val_10' type=float32 shape=() -- array([1.], dtype=float32)
init: name='val_2' type=int64 shape=() -- array([1])
init: name='val_5' type=int64 shape=(1,) -- array([-1])
Shape(x, end=1, start=0) -> val_0
Shape(x, end=2, start=1) -> val_1
  Squeeze(val_1) -> sym_size_int_3
    Add(sym_size_int_3, val_2) -> add
      Reshape(add, val_5, allowzero=0) -> val_6
  Concat(val_0, val_6, axis=0) -> val_7
    Expand(val_10, val_7) -> ones
output: name='ones' type=dtype('float32') shape=['dx', 'dy + 1']

tracing#

inputs: #1[(T1s4x4,)]
shapes: dict(x:{0:Dim(dx),1:Dim(dy)})

opset: domain='' version=21
input: name='x' type=dtype('float32') shape=['dx', 'dy']
init: name='init7_s1_0' type=int64 shape=(1,) -- array([0])           -- Opset.make_node.1/Shape##Opset.make_node.1/Shape##Opset.make_node.1/Shape##Opset.make_node.1/Shape##Opset.make_node.1/Shape##Opset.make_node.1/Shape##Opset.make_node.1/Shape##Opset.make_node.1/Shape
init: name='init7_s_1' type=int64 shape=() -- array([1])              -- shape_type_compute._cast_inputs.1(add)
Shape(x, end=2, start=1) -> _onx_gather_size2
  Squeeze(_onx_gather_size2, init7_s1_0) -> getitem_1
    Add(getitem_1, init7_s_1) -> _onx_add_getitem_1
      Unsqueeze(_onx_add_getitem_1, init7_s1_0) -> add::UnSq0
Shape(x, end=1, start=0) -> _onx_gather_size_1
  Concat(_onx_gather_size_1, add::UnSq0, axis=0) -> _onx_concat_getitem_2::UnSq0
    ConstantOfShape(_onx_concat_getitem_2::UnSq0, value=[1.0]) -> output
output: name='output' type=dtype('float32') shape=['dx', 'add']

new-tracing#

inputs: #1[(T1s4x4,)]
shapes: dict(x:{0:Dim(dx),1:Dim(dy)})

opset: domain='' version=21
input: name='x' type=dtype('float32') shape=['dx', 'dy']
init: name='init7_s1_1' type=int64 shape=(1,) -- array([1])           -- GraphBuilder.get_dimension_as_result.axis_name##GraphBuilder._eval_dim_expr_node_as_1d.const
Shape(x, end=2, start=1) -> dy
  Add(dy, init7_s1_1) -> _onx_add_dy
Shape(x, end=1, start=0) -> x::Shape:1
  Concat(x::Shape:1, _onx_add_dy, axis=0) -> _onx_concat_sym_size_int::UnSq0
    ConstantOfShape(_onx_concat_sym_size_int::UnSq0, value=[1.0]) -> output
output: name='output' type=dtype('float32') shape=['dx', 'dy+1']

CropLastDimensionWithTensorContent#

code: yobx.torch.testing._model_eval_cases.CropLastDimensionWithTensorContent

forward#

def forward(self, x, shape):
    return x[..., : shape.item()]

yobx#

inputs: #2[(T1s3x4x4,T7s1),(T1s6x4x4,T7s1)]
shapes: dict(x:{0:Dim(batch)},shape:{})

opset: domain='' version=21
input: name='x' type=dtype('float32') shape=['batch', 4, 4]
input: name='shape' type=dtype('int64') shape=[1]
init: name='init7_s1_0' type=int64 shape=(1,) -- array([0])           -- Opset.make_node.1/Shape##Opset.make_node.1/Shape##Opset.make_node.1/Shape
init: name='init7_s1_2' type=int64 shape=(1,) -- array([2])           -- Opset.make_node.1/Shape
Slice(x, init7_s1_0, shape, init7_s1_2) -> output_0
output: name='output_0' type=dtype('float32') shape=['batch', 4, 'item']

dynamo-ir#

inputs: #2[(T1s3x4x4,T7s1),(T1s6x4x4,T7s1)]
shapes: dict(x:{0:Dim(batch)},shape:{})

opset: domain='' version=20
input: name='x' type=dtype('float32') shape=['batch', 4, 4]
input: name='shape' type=dtype('int64') shape=[1]
init: name='val_11' type=int64 shape=(1,) -- array([2])
init: name='val_0' type=int64 shape=(1,) -- array([-1])
init: name='val_2' type=int64 shape=() -- array([0])
init: name='val_4' type=int64 shape=(1,) -- array([0])
init: name='val_12' type=int64 shape=(1,) -- array([1])
Reshape(shape, val_0, allowzero=0) -> val_1
  Gather(val_1, val_2, axis=0) -> val_3
    Reshape(val_3, val_0, allowzero=0) -> val_7
      Slice(x, val_4, val_7, val_11, val_12) -> slice_1
output: name='slice_1' type=dtype('float32') shape=['batch', 4, 'u1']

tracing#

inputs: #2[(T1s3x4x4,T7s1),(T1s6x4x4,T7s1)]
shapes: dict(x:{0:Dim(batch)},shape:{})

opset: domain='' version=21
input: name='x' type=dtype('float32') shape=['batch', 4, 4]
input: name='shape' type=dtype('int64') shape=[1]
init: name='init7_s1_0' type=int64 shape=(1,) -- array([0])           -- Opset.make_node.1/Shape##Opset.make_node.1/Shape
init: name='getitem_axis' type=int64 shape=(1,) -- array([-1])        -- _getitem_slice.axis.1
init: name='getitem_step' type=int64 shape=(1,) -- array([1])         -- _getitem_slice.3
Slice(x, init7_s1_0, shape, getitem_axis, getitem_step) -> output
output: name='output' type=dtype('float32') shape=['batch', 4, 'item']

new-tracing#

inputs: #2[(T1s3x4x4,T7s1),(T1s6x4x4,T7s1)]
shapes: dict(x:{0:Dim(batch)},shape:{})

opset: domain='' version=21
input: name='x' type=dtype('float32') shape=['batch', 4, 4]
input: name='shape' type=dtype('int64') shape=[1]
init: name='init7_s1_0' type=int64 shape=(1,) -- array([0])           -- Opset.make_node.1/Shape##Opset.make_node.1/Shape##Opset.make_node.1/Shape
init: name='init7_s1_2' type=int64 shape=(1,) -- array([2])           -- Opset.make_node.1/Shape
Slice(x, init7_s1_0, shape, init7_s1_2) -> output
output: name='output' type=dtype('float32') shape=['batch', 4, '_local_scalar_dense_default']

CropLastDimensionWithTensorShape#

code: yobx.torch.testing._model_eval_cases.CropLastDimensionWithTensorShape

forward#

def forward(self, x, y):
    return x[..., : y.shape[0]]

yobx#

inputs: #2[(T1s3x4x4,T1s2),(T1s6x4x4,T1s3)]
shapes: dict(x:{0:Dim(batch)},y:{0:Dim(crop)})

opset: domain='' version=21
input: name='x' type=dtype('float32') shape=['batch', 4, 4]
input: name='y' type=dtype('float32') shape=['crop']
init: name='init7_s1_0' type=int64 shape=(1,) -- array([0])           -- Opset.make_node.1/Shape##Opset.make_node.1/Shape
init: name='init7_s1_2' type=int64 shape=(1,) -- array([2])           -- Opset.make_node.1/Shape
Shape(y, end=1, start=0) -> y::Shape:1
  Slice(x, init7_s1_0, y::Shape:1, init7_s1_2) -> output_0
output: name='output_0' type=dtype('float32') shape=['batch', 4, 'crop']

dynamo-ir#

inputs: #2[(T1s3x4x4,T1s2),(T1s6x4x4,T1s3)]
shapes: dict(x:{0:Dim(batch)},y:{0:Dim(crop)})

opset: domain='' version=20
input: name='x' type=dtype('float32') shape=['batch', 4, 4]
input: name='y' type=dtype('float32') shape=['crop']
init: name='val_8' type=int64 shape=(1,) -- array([2])
init: name='val_1' type=int64 shape=(1,) -- array([0])
init: name='val_9' type=int64 shape=(1,) -- array([1])
Shape(y, end=1, start=0) -> val_0
  Slice(x, val_1, val_0, val_8, val_9) -> slice_1
output: name='slice_1' type=dtype('float32') shape=['batch', 4, 'crop']

tracing#

inputs: #2[(T1s3x4x4,T1s2),(T1s6x4x4,T1s3)]
shapes: dict(x:{0:Dim(batch)},y:{0:Dim(crop)})

opset: domain='' version=21
input: name='x' type=dtype('float32') shape=['batch', 4, 4]
input: name='y' type=dtype('float32') shape=['crop']
init: name='init7_s1_0' type=int64 shape=(1,) -- array([0])           -- Opset.make_node.1/Shape##Opset.make_node.1/Shape##Opset.make_node.1/Shape
init: name='getitem_1_axis' type=int64 shape=(1,) -- array([-1])      -- _getitem_slice.axis.1
init: name='getitem_1_step' type=int64 shape=(1,) -- array([1])       -- _getitem_slice.3
Shape(y, end=1, start=0) -> _onx_gather_size
  Slice(x, init7_s1_0, _onx_gather_size, getitem_1_axis, getitem_1_step) -> output
output: name='output' type=dtype('float32') shape=['batch', 4, 'crop']

new-tracing#

inputs: #2[(T1s3x4x4,T1s2),(T1s6x4x4,T1s3)]
shapes: dict(x:{0:Dim(batch)},y:{0:Dim(crop)})

opset: domain='' version=21
input: name='x' type=dtype('float32') shape=['batch', 4, 4]
input: name='y' type=dtype('float32') shape=['crop']
init: name='init7_s1_0' type=int64 shape=(1,) -- array([0])           -- Opset.make_node.1/Shape##Opset.make_node.1/Shape
init: name='init7_s1_2' type=int64 shape=(1,) -- array([2])           -- Opset.make_node.1/Shape
Shape(y, end=1, start=0) -> y::Shape:1
  Slice(x, init7_s1_0, y::Shape:1, init7_s1_2) -> output
output: name='output' type=dtype('float32') shape=['batch', 4, 'crop']

DynamicCacheInput#

code: yobx.torch.testing._model_eval_cases.DynamicCacheInput

forward#

def forward(self, x, cache):
    """Reduces each cache layer over dim 2 and returns (x, new_cache)."""
    pairs = [
        (layer.keys.mean(dim=2, keepdim=True), layer.values.mean(dim=2, keepdim=True))
        for layer in cache.layers
    ]
    return x, make_dynamic_cache(pairs)

yobx#

inputs: #2[(T1s2x4x3x7,DynamicCache(key_cache=#2[T1s2x4x3x7,T1s2x4x3x7], value_cache=#2[T1s2x4x3x7,T1s2x4x3x7])),(T1s3x4x5x7,DynamicCache(key_cache=#2[T1s3x4x5x7,T1s3x4x5x7], value_cache=#2[T1s3x4x5x7,T1s3x4x5x7]))]
shapes: dict(x:{0:DYNAMIC,2:DYNAMIC},cache:#4[{0:DYNAMIC,2:DYNAMIC},{0:DYNAMIC,2:DYNAMIC},{0:DYNAMIC,2:DYNAMIC},{0:DYNAMIC,2:DYNAMIC}])

opset: domain='' version=21
input: name='x' type=dtype('float32') shape=['batch', 4, 'D0', 7]
input: name='cache_key_0' type=dtype('float32') shape=['batch_5', 4, 'D0_5', 7]
input: name='cache_value_0' type=dtype('float32') shape=['batch_6', 4, 'D0_6', 7]
input: name='cache_key_1' type=dtype('float32') shape=['batch_7', 4, 'D0_7', 7]
input: name='cache_value_1' type=dtype('float32') shape=['batch_8', 4, 'D0_8', 7]
init: name='init7_s1_2' type=int64 shape=(1,) -- array([2])           -- Opset.make_node.1/Shape##Opset.make_node.1/Shape##Opset.make_node.1/Shape##Opset.make_node.1/Shape
Identity(x) -> output_0
ReduceMean(cache_key_0, init7_s1_2, keepdims=1) -> output_1
ReduceMean(cache_value_0, init7_s1_2, keepdims=1) -> output_2
ReduceMean(cache_key_1, init7_s1_2, keepdims=1) -> output_3
ReduceMean(cache_value_1, init7_s1_2, keepdims=1) -> output_4
output: name='output_0' type=dtype('float32') shape=['batch', 4, 'D0', 7]
output: name='output_1' type=dtype('float32') shape=['batch_5', 4, 1, 7]
output: name='output_2' type=dtype('float32') shape=['batch_6', 4, 1, 7]
output: name='output_3' type=dtype('float32') shape=['batch_7', 4, 1, 7]
output: name='output_4' type=dtype('float32') shape=['batch_8', 4, 1, 7]

dynamo-ir#

inputs: #2[(T1s2x4x3x7,DynamicCache(key_cache=#2[T1s2x4x3x7,T1s2x4x3x7], value_cache=#2[T1s2x4x3x7,T1s2x4x3x7])),(T1s3x4x5x7,DynamicCache(key_cache=#2[T1s3x4x5x7,T1s3x4x5x7], value_cache=#2[T1s3x4x5x7,T1s3x4x5x7]))]
shapes: dict(x:{0:DYNAMIC,2:DYNAMIC},cache:#4[{0:DYNAMIC,2:DYNAMIC},{0:DYNAMIC,2:DYNAMIC},{0:DYNAMIC,2:DYNAMIC},{0:DYNAMIC,2:DYNAMIC}])

opset: domain='' version=20
input: name='x_orig' type=dtype('float32') shape=['s77', 4, 's53', 7]
input: name='cache_key_0' type=dtype('float32') shape=['s60', 4, 's15', 7]
input: name='cache_value_0' type=dtype('float32') shape=['s81', 4, 's59', 7]
input: name='cache_key_1' type=dtype('float32') shape=['s67', 4, 's73', 7]
input: name='cache_value_1' type=dtype('float32') shape=['s50', 4, 's46', 7]
init: name='val_2' type=int64 shape=(1,) -- array([2])
Identity(x_orig) -> x
ReduceMean(cache_key_0, val_2, noop_with_empty_axes=0, keepdims=1) -> mean
ReduceMean(cache_value_0, val_2, noop_with_empty_axes=0, keepdims=1) -> mean_1
ReduceMean(cache_key_1, val_2, noop_with_empty_axes=0, keepdims=1) -> mean_2
ReduceMean(cache_value_1, val_2, noop_with_empty_axes=0, keepdims=1) -> mean_3
output: name='x' type=dtype('float32') shape=['s77', 4, 's53', 7]
output: name='mean' type=dtype('float32') shape=['s60', 4, 1, 7]
output: name='mean_1' type=dtype('float32') shape=['s81', 4, 1, 7]
output: name='mean_2' type=dtype('float32') shape=['s67', 4, 1, 7]
output: name='mean_3' type=dtype('float32') shape=['s50', 4, 1, 7]

tracing#

inputs: #2[(T1s2x4x3x7,DynamicCache(key_cache=#2[T1s2x4x3x7,T1s2x4x3x7], value_cache=#2[T1s2x4x3x7,T1s2x4x3x7])),(T1s3x4x5x7,DynamicCache(key_cache=#2[T1s3x4x5x7,T1s3x4x5x7], value_cache=#2[T1s3x4x5x7,T1s3x4x5x7]))]
shapes: dict(x:{0:DYNAMIC,2:DYNAMIC},cache:#4[{0:DYNAMIC,2:DYNAMIC},{0:DYNAMIC,2:DYNAMIC},{0:DYNAMIC,2:DYNAMIC},{0:DYNAMIC,2:DYNAMIC}])

opset: domain='' version=21
input: name='x' type=dtype('float32') shape=['batch', 4, 'D0', 7]
input: name='cache_0' type=dtype('float32') shape=['batch_5', 4, 'D0_5', 7]
input: name='cache_1' type=dtype('float32') shape=['batch_6', 4, 'D0_6', 7]
input: name='cache_2' type=dtype('float32') shape=['batch_7', 4, 'D0_7', 7]
input: name='cache_3' type=dtype('float32') shape=['batch_8', 4, 'D0_8', 7]
init: name='init7_s1_2' type=int64 shape=(1,) -- array([2])           -- aten_meth_mean.cst.1##aten_meth_mean.cst.1##aten_meth_mean.cst.1##aten_meth_mean.cst.1
Identity(x) -> output_0
ReduceMean(cache_0, init7_s1_2, keepdims=1) -> output_1
ReduceMean(cache_1, init7_s1_2, keepdims=1) -> output_2
ReduceMean(cache_2, init7_s1_2, keepdims=1) -> output_3
ReduceMean(cache_3, init7_s1_2, keepdims=1) -> output_4
output: name='output_0' type=dtype('float32') shape=['batch', 4, 'D0', 7]
output: name='output_1' type=dtype('float32') shape=['batch_5', 4, 1, 7]
output: name='output_2' type=dtype('float32') shape=['batch_6', 4, 1, 7]
output: name='output_3' type=dtype('float32') shape=['batch_7', 4, 1, 7]
output: name='output_4' type=dtype('float32') shape=['batch_8', 4, 1, 7]

new-tracing#

inputs: #2[(T1s2x4x3x7,DynamicCache(key_cache=#2[T1s2x4x3x7,T1s2x4x3x7], value_cache=#2[T1s2x4x3x7,T1s2x4x3x7])),(T1s3x4x5x7,DynamicCache(key_cache=#2[T1s3x4x5x7,T1s3x4x5x7], value_cache=#2[T1s3x4x5x7,T1s3x4x5x7]))]
shapes: dict(x:{0:DYNAMIC,2:DYNAMIC},cache:#4[{0:DYNAMIC,2:DYNAMIC},{0:DYNAMIC,2:DYNAMIC},{0:DYNAMIC,2:DYNAMIC},{0:DYNAMIC,2:DYNAMIC}])

opset: domain='' version=21
input: name='x' type=dtype('float32') shape=['batch', 4, 'D0', 7]
input: name='cache_0' type=dtype('float32') shape=['_dyn_32', 4, '_dyn_33', 7]
input: name='cache_1' type=dtype('float32') shape=['_dyn_34', 4, '_dyn_35', 7]
input: name='cache_2' type=dtype('float32') shape=['_dyn_36', 4, '_dyn_37', 7]
input: name='cache_3' type=dtype('float32') shape=['_dyn_38', 4, '_dyn_39', 7]
init: name='init7_s1_2' type=int64 shape=(1,) -- array([2])           -- Opset.make_node.1/Shape##Opset.make_node.1/Shape##Opset.make_node.1/Shape##Opset.make_node.1/Shape
Identity(x) -> output_0
ReduceMean(cache_0, init7_s1_2, keepdims=1) -> output_1
ReduceMean(cache_1, init7_s1_2, keepdims=1) -> output_2
ReduceMean(cache_2, init7_s1_2, keepdims=1) -> output_3
ReduceMean(cache_3, init7_s1_2, keepdims=1) -> output_4
output: name='output_0' type=dtype('float32') shape=['batch', 4, 'D0', 7]
output: name='output_1' type=dtype('float32') shape=['_dyn_32', 4, 1, 7]
output: name='output_2' type=dtype('float32') shape=['_dyn_34', 4, 1, 7]
output: name='output_3' type=dtype('float32') shape=['_dyn_36', 4, 1, 7]
output: name='output_4' type=dtype('float32') shape=['_dyn_38', 4, 1, 7]

DynamicCacheInputMixedLayers#

code: yobx.torch.testing._model_eval_cases.DynamicCacheInputMixedLayers

forward#

def forward(self, x, cache):
    """Reduces each cache layer over dim 2 and returns (x, new_cache)."""
    cls_layers = [type(layer) for layer in cache.layers]
    pairs = [
        (layer.keys.mean(dim=2, keepdim=True), layer.values.mean(dim=2, keepdim=True))
        for layer in cache.layers
    ]
    return x, make_dynamic_cache(pairs, cls_layers=cls_layers)

yobx#

inputs: #2[(T1s2x4x3x7,DynamicCache(DynamicLayer(T1s2x4x3x7, T1s2x4x3x7), DynamicSlidingWindowLayer(T1s2x4x3x7, T1s2x4x3x7))),(T1s3x4x5x7,DynamicCache(DynamicLayer(T1s3x4x5x7, T1s3x4x5x7), DynamicSlidingWindowLayer(T1s3x4x3x7, T1s3x4x3x7)))]
shapes: dict(x:{0:DYNAMIC,2:DYNAMIC},cache:#4[{0:DYNAMIC,2:DYNAMIC},{0:DYNAMIC,2:DYNAMIC},{0:DYNAMIC},{0:DYNAMIC}])

opset: domain='' version=21
input: name='x' type=dtype('float32') shape=['batch', 4, 'D0', 7]
input: name='cache_key_d_0' type=dtype('float32') shape=['batch_5', 4, 'D0_3', 7]
input: name='cache_value_d_0' type=dtype('float32') shape=['batch_6', 4, 'D0_4', 7]
input: name='cache_key_w_1' type=dtype('float32') shape=['batch_7', 4, 3, 7]
input: name='cache_value_w_1' type=dtype('float32') shape=['batch_8', 4, 3, 7]
init: name='init7_s1_2' type=int64 shape=(1,) -- array([2])           -- Opset.make_node.1/Shape##Opset.make_node.1/Shape##Opset.make_node.1/Shape##Opset.make_node.1/Shape
Identity(x) -> output_0
ReduceMean(cache_key_d_0, init7_s1_2, keepdims=1) -> output_1
ReduceMean(cache_value_d_0, init7_s1_2, keepdims=1) -> output_2
ReduceMean(cache_key_w_1, init7_s1_2, keepdims=1) -> output_3
ReduceMean(cache_value_w_1, init7_s1_2, keepdims=1) -> output_4
output: name='output_0' type=dtype('float32') shape=['batch', 4, 'D0', 7]
output: name='output_1' type=dtype('float32') shape=['batch_5', 4, 1, 7]
output: name='output_2' type=dtype('float32') shape=['batch_6', 4, 1, 7]
output: name='output_3' type=dtype('float32') shape=['batch_7', 4, 1, 7]
output: name='output_4' type=dtype('float32') shape=['batch_8', 4, 1, 7]

dynamo-ir#

inputs: #2[(T1s2x4x3x7,DynamicCache(DynamicLayer(T1s2x4x3x7, T1s2x4x3x7), DynamicSlidingWindowLayer(T1s2x4x3x7, T1s2x4x3x7))),(T1s3x4x5x7,DynamicCache(DynamicLayer(T1s3x4x5x7, T1s3x4x5x7), DynamicSlidingWindowLayer(T1s3x4x3x7, T1s3x4x3x7)))]
shapes: dict(x:{0:DYNAMIC,2:DYNAMIC},cache:#4[{0:DYNAMIC,2:DYNAMIC},{0:DYNAMIC,2:DYNAMIC},{0:DYNAMIC},{0:DYNAMIC}])

opset: domain='' version=20
input: name='x_orig' type=dtype('float32') shape=['s77', 4, 's53', 7]
input: name='cache_key_d_0' type=dtype('float32') shape=['s63', 4, 's13', 7]
input: name='cache_value_d_0' type=dtype('float32') shape=['s41', 4, 's1', 7]
input: name='cache_key_w_1' type=dtype('float32') shape=['s99', 4, 3, 7]
input: name='cache_value_w_1' type=dtype('float32') shape=['s44', 4, 3, 7]
init: name='val_2' type=int64 shape=(1,) -- array([2])
Identity(x_orig) -> x
ReduceMean(cache_key_d_0, val_2, noop_with_empty_axes=0, keepdims=1) -> mean
ReduceMean(cache_value_d_0, val_2, noop_with_empty_axes=0, keepdims=1) -> mean_1
ReduceMean(cache_key_w_1, val_2, noop_with_empty_axes=0, keepdims=1) -> mean_2
ReduceMean(cache_value_w_1, val_2, noop_with_empty_axes=0, keepdims=1) -> mean_3
output: name='x' type=dtype('float32') shape=['s77', 4, 's53', 7]
output: name='mean' type=dtype('float32') shape=['s63', 4, 1, 7]
output: name='mean_1' type=dtype('float32') shape=['s41', 4, 1, 7]
output: name='mean_2' type=dtype('float32') shape=['s99', 4, 1, 7]
output: name='mean_3' type=dtype('float32') shape=['s44', 4, 1, 7]

tracing#

FAILED

`__cuda_array_interface__` must be a dict

new-tracing#

inputs: #2[(T1s2x4x3x7,DynamicCache(DynamicLayer(T1s2x4x3x7, T1s2x4x3x7), DynamicSlidingWindowLayer(T1s2x4x3x7, T1s2x4x3x7))),(T1s3x4x5x7,DynamicCache(DynamicLayer(T1s3x4x5x7, T1s3x4x5x7), DynamicSlidingWindowLayer(T1s3x4x3x7, T1s3x4x3x7)))]
shapes: dict(x:{0:DYNAMIC,2:DYNAMIC},cache:#4[{0:DYNAMIC,2:DYNAMIC},{0:DYNAMIC,2:DYNAMIC},{0:DYNAMIC},{0:DYNAMIC}])

opset: domain='' version=21
input: name='x' type=dtype('float32') shape=['batch', 4, 'D0', 7]
input: name='cache_0' type=dtype('float32') shape=['_dyn_42', 4, '_dyn_43', 7]
input: name='cache_1' type=dtype('float32') shape=['_dyn_44', 4, '_dyn_45', 7]
input: name='cache_2' type=dtype('float32') shape=['_dyn_46', 4, 3, 7]
input: name='cache_3' type=dtype('float32') shape=['_dyn_47', 4, 3, 7]
init: name='init7_s1_2' type=int64 shape=(1,) -- array([2])           -- Opset.make_node.1/Shape##Opset.make_node.1/Shape##Opset.make_node.1/Shape##Opset.make_node.1/Shape
Identity(x) -> output_0
ReduceMean(cache_0, init7_s1_2, keepdims=1) -> output_1
ReduceMean(cache_1, init7_s1_2, keepdims=1) -> output_2
ReduceMean(cache_2, init7_s1_2, keepdims=1) -> output_3
ReduceMean(cache_3, init7_s1_2, keepdims=1) -> output_4
output: name='output_0' type=dtype('float32') shape=['batch', 4, 'D0', 7]
output: name='output_1' type=dtype('float32') shape=['_dyn_42', 4, 1, 7]
output: name='output_2' type=dtype('float32') shape=['_dyn_44', 4, 1, 7]
output: name='output_3' type=dtype('float32') shape=['_dyn_46', 4, 1, 7]
output: name='output_4' type=dtype('float32') shape=['_dyn_47', 4, 1, 7]

ExportWithDimension0#

code: yobx.torch.testing._model_eval_cases.ExportWithDimension0

forward#

def forward(self, x):
    return x @ torch.arange(x.shape[1], dtype=torch.float32).reshape((-1, 1))

yobx#

inputs: #1[(T1s0x3,)]
shapes: dict(x:{0:DYNAMIC,1:DYNAMIC})

opset: domain='' version=21
input: name='x' type=dtype('float32') shape=['batch', 'channel']
init: name='init1_s_' type=float32 shape=() -- array([0.], dtype=float32)-- Opset.make_node.1/Small
init: name='init1_s_2' type=float32 shape=() -- array([1.], dtype=float32)-- Opset.make_node.1/Small
init: name='init7_s1_1' type=int64 shape=(1,) -- array([1])           -- ReshapeIsSqueezePattern.m1
Shape(x, end=2, start=1) -> x::Shape1:2
  Squeeze(x::Shape1:2) -> sym_size_int_2
    Cast(sym_size_int_2, to=1) -> sym_size_int_2::C1
      Range(init1_s_, sym_size_int_2::C1, init1_s_2) -> arange
        Unsqueeze(arange, init7_s1_1) -> reshape
          MatMul(x, reshape) -> output_0
output: name='output_0' type=dtype('float32') shape=['batch', 1]

dynamo-ir#

inputs: #1[(T1s0x3,)]
shapes: dict(x:{0:DYNAMIC,1:DYNAMIC})

opset: domain='' version=20
input: name='x' type=dtype('float32') shape=['s77', 's27']
init: name='val_3' type=float32 shape=() -- array([0.], dtype=float32)
init: name='val_5' type=float32 shape=() -- array([1.], dtype=float32)
init: name='val_9' type=int64 shape=(2,) -- array([-1,  1])
Shape(x, end=2, start=1) -> val_0
  Squeeze(val_0) -> sym_size_int_5
    Cast(sym_size_int_5, to=1) -> val_1
      Range(val_3, val_1, val_5) -> arange
        Reshape(arange, val_9, allowzero=1) -> view
          MatMul(x, view) -> matmul
output: name='matmul' type=dtype('float32') shape=['s77', 1]

tracing#

inputs: #1[(T1s0x3,)]
shapes: dict(x:{0:DYNAMIC,1:DYNAMIC})

opset: domain='' version=21
input: name='x' type=dtype('float32') shape=['batch', 'channel']
init: name='init7_s1_0' type=int64 shape=(1,) -- array([0])           -- Opset.make_node.1/Shape##Opset.make_node.1/Shape##Opset.make_node.1/Shape
init: name='init7_s1_1' type=int64 shape=(1,) -- array([1])           -- Opset.make_node.1/Shape##ReshapeIsSqueezePattern.m1
init: name='init1_s_' type=float32 shape=() -- array([0.], dtype=float32)-- Opset.make_node.1/Small
init: name='init1_s_2' type=float32 shape=() -- array([1.], dtype=float32)-- Opset.make_node.1/Small
Shape(x, end=2, start=1) -> _onx_gather_size2
  Squeeze(_onx_gather_size2, init7_s1_0) -> getitem_1
    Cast(getitem_1, to=1) -> getitem_1::C1
      Range(init1_s_, getitem_1::C1, init1_s_2) -> arange
        Unsqueeze(arange, init7_s1_1) -> reshape
          MatMul(x, reshape) -> output
output: name='output' type=dtype('float32') shape=['batch', 1]

new-tracing#

inputs: #1[(T1s0x3,)]
shapes: dict(x:{0:DYNAMIC,1:DYNAMIC})

opset: domain='' version=21
input: name='x' type=dtype('float32') shape=['batch', 'channel']
init: name='init1_s_' type=float32 shape=() -- array([0.], dtype=float32)-- Opset.make_node.1/Small
init: name='init1_s_2' type=float32 shape=() -- array([1.], dtype=float32)-- Opset.make_node.1/Small
init: name='init7_s1_1' type=int64 shape=(1,) -- array([1])           -- ReshapeIsSqueezePattern.m1
Shape(x, end=2, start=1) -> x::Shape1:2
  Squeeze(x::Shape1:2) -> sym_size_int
    Cast(sym_size_int, to=1) -> sym_size_int::C1
      Range(init1_s_, sym_size_int::C1, init1_s_2) -> arange
        Unsqueeze(arange, init7_s1_1) -> view_default
          MatMul(x, view_default) -> output
output: name='output' type=dtype('float32') shape=['batch', 1]

ExportWithDimension1#

code: yobx.torch.testing._model_eval_cases.ExportWithDimension1

forward#

def forward(self, x):
    return x @ torch.arange(x.shape[1], dtype=torch.float32).reshape((-1, 1))

yobx#

inputs: #1[(T1s1x3,)]
shapes: dict(x:{0:DYNAMIC,1:DYNAMIC})

opset: domain='' version=21
input: name='x' type=dtype('float32') shape=['batch', 'channel']
init: name='init1_s_' type=float32 shape=() -- array([0.], dtype=float32)-- Opset.make_node.1/Small
init: name='init1_s_2' type=float32 shape=() -- array([1.], dtype=float32)-- Opset.make_node.1/Small
init: name='init7_s1_1' type=int64 shape=(1,) -- array([1])           -- ReshapeIsSqueezePattern.m1
Shape(x, end=2, start=1) -> x::Shape1:2
  Squeeze(x::Shape1:2) -> sym_size_int_2
    Cast(sym_size_int_2, to=1) -> sym_size_int_2::C1
      Range(init1_s_, sym_size_int_2::C1, init1_s_2) -> arange
        Unsqueeze(arange, init7_s1_1) -> reshape
          MatMul(x, reshape) -> output_0
output: name='output_0' type=dtype('float32') shape=['batch', 1]

dynamo-ir#

inputs: #1[(T1s1x3,)]
shapes: dict(x:{0:DYNAMIC,1:DYNAMIC})

opset: domain='' version=20
input: name='x' type=dtype('float32') shape=['s77', 's27']
init: name='val_3' type=float32 shape=() -- array([0.], dtype=float32)
init: name='val_5' type=float32 shape=() -- array([1.], dtype=float32)
init: name='val_9' type=int64 shape=(2,) -- array([-1,  1])
Shape(x, end=2, start=1) -> val_0
  Squeeze(val_0) -> sym_size_int_5
    Cast(sym_size_int_5, to=1) -> val_1
      Range(val_3, val_1, val_5) -> arange
        Reshape(arange, val_9, allowzero=1) -> view
          MatMul(x, view) -> matmul
output: name='matmul' type=dtype('float32') shape=['s77', 1]

tracing#

inputs: #1[(T1s1x3,)]
shapes: dict(x:{0:DYNAMIC,1:DYNAMIC})

opset: domain='' version=21
input: name='x' type=dtype('float32') shape=['batch', 'channel']
init: name='init7_s1_0' type=int64 shape=(1,) -- array([0])           -- Opset.make_node.1/Shape##Opset.make_node.1/Shape##Opset.make_node.1/Shape
init: name='init7_s1_1' type=int64 shape=(1,) -- array([1])           -- Opset.make_node.1/Shape##ReshapeIsSqueezePattern.m1
init: name='init1_s_' type=float32 shape=() -- array([0.], dtype=float32)-- Opset.make_node.1/Small
init: name='init1_s_2' type=float32 shape=() -- array([1.], dtype=float32)-- Opset.make_node.1/Small
Shape(x, end=2, start=1) -> _onx_gather_size2
  Squeeze(_onx_gather_size2, init7_s1_0) -> getitem_1
    Cast(getitem_1, to=1) -> getitem_1::C1
      Range(init1_s_, getitem_1::C1, init1_s_2) -> arange
        Unsqueeze(arange, init7_s1_1) -> reshape
          MatMul(x, reshape) -> output
output: name='output' type=dtype('float32') shape=['batch', 1]

new-tracing#

inputs: #1[(T1s1x3,)]
shapes: dict(x:{0:DYNAMIC,1:DYNAMIC})

opset: domain='' version=21
input: name='x' type=dtype('float32') shape=['batch', 'channel']
init: name='init1_s_' type=float32 shape=() -- array([0.], dtype=float32)-- Opset.make_node.1/Small
init: name='init1_s_2' type=float32 shape=() -- array([1.], dtype=float32)-- Opset.make_node.1/Small
init: name='init7_s1_1' type=int64 shape=(1,) -- array([1])           -- ReshapeIsSqueezePattern.m1
Shape(x, end=2, start=1) -> x::Shape1:2
  Squeeze(x::Shape1:2) -> sym_size_int
    Cast(sym_size_int, to=1) -> sym_size_int::C1
      Range(init1_s_, sym_size_int::C1, init1_s_2) -> arange
        Unsqueeze(arange, init7_s1_1) -> view_default
          MatMul(x, view_default) -> output
output: name='output' type=dtype('float32') shape=['batch', 1]

ExportWithNewConstant#

code: yobx.torch.testing._model_eval_cases.ExportWithNewConstant

forward#

def forward(self, x):
    new_shape = (x.shape[0], 1)
    ones = torch.ones(new_shape, dtype=x.dtype, device=x.device)
    return torch.cat([x, ones], dim=1)

yobx#

inputs: #2[(T1s4x4,),(T1s5x6,)]
shapes: dict(x:{0:Dim(batch),1:Dim(seq)})

opset: domain='' version=21
input: name='x' type=dtype('float32') shape=['batch', 'seq']
init: name='init7_s1_1' type=int64 shape=(1,) -- array([1])           -- GraphBuilder.make_shape_from_results.conc
Shape(x, end=1, start=0) -> x::Shape:1
  Concat(x::Shape:1, init7_s1_1, axis=0) -> _onx_concat_sym_size_int_1::UnSq0
    ConstantOfShape(_onx_concat_sym_size_int_1::UnSq0, value=[1.0]) -> ones
      Concat(x, ones, axis=1) -> output_0
output: name='output_0' type=dtype('float32') shape=['batch', 'seq+1']

dynamo-ir#

inputs: #2[(T1s4x4,),(T1s5x6,)]
shapes: dict(x:{0:Dim(batch),1:Dim(seq)})

opset: domain='' version=20
input: name='x' type=dtype('float32') shape=['batch', 'seq']
init: name='val_7' type=float32 shape=() -- array([1.], dtype=float32)
init: name='val_3' type=int64 shape=(1,) -- array([1])
Shape(x, end=1, start=0) -> val_0
  Concat(val_0, val_3, axis=0) -> val_4
    Expand(val_7, val_4) -> ones
      Concat(x, ones, axis=1) -> cat
output: name='cat' type=dtype('float32') shape=['batch', 'seq + 1']

tracing#

inputs: #2[(T1s4x4,),(T1s5x6,)]
shapes: dict(x:{0:Dim(batch),1:Dim(seq)})

opset: domain='' version=21
input: name='x' type=dtype('float32') shape=['batch', 'seq']
init: name='init7_s1_1' type=int64 shape=(1,) -- array([1])           -- Opset.make_node.1/Shape##GraphBuilder.make_shape_from_results.conc
Shape(x, end=1, start=0) -> _onx_gather_size
  Concat(_onx_gather_size, init7_s1_1, axis=0) -> _onx_concat_getitem::UnSq0
    ConstantOfShape(_onx_concat_getitem::UnSq0, value=[1.0]) -> ones
      Concat(x, ones, axis=1) -> output
output: name='output' type=dtype('float32') shape=['batch', 'seq+1']

new-tracing#

inputs: #2[(T1s4x4,),(T1s5x6,)]
shapes: dict(x:{0:Dim(batch),1:Dim(seq)})

opset: domain='' version=21
input: name='x' type=dtype('float32') shape=['batch', 'seq']
init: name='init7_s1_1' type=int64 shape=(1,) -- array([1])           -- GraphBuilder.make_shape_from_results.conc
Shape(x, end=1, start=0) -> x::Shape:1
  Concat(x::Shape:1, init7_s1_1, axis=0) -> _onx_concat_sym_size_int::UnSq0
    ConstantOfShape(_onx_concat_sym_size_int::UnSq0, value=[1.0]) -> ones
      Concat(x, ones, axis=1) -> output
output: name='output' type=dtype('float32') shape=['batch', 'seq+1']

ExportWithNewConstantTo#

code: yobx.torch.testing._model_eval_cases.ExportWithNewConstantTo

forward#

def forward(self, x):
    new_shape = (x.shape[0], 1)
    ones = torch.ones(new_shape, dtype=x.dtype)
    return torch.cat([x, ones.to(x.device)], dim=1)

yobx#

inputs: #2[(T1s4x4,),(T1s5x6,)]
shapes: dict(x:{0:Dim(batch),1:Dim(seq)})

opset: domain='' version=21
input: name='x' type=dtype('float32') shape=['batch', 'seq']
init: name='init7_s1_1' type=int64 shape=(1,) -- array([1])           -- GraphBuilder.make_shape_from_results.conc
Shape(x, end=1, start=0) -> x::Shape:1
  Concat(x::Shape:1, init7_s1_1, axis=0) -> _onx_concat_sym_size_int_1::UnSq0
    ConstantOfShape(_onx_concat_sym_size_int_1::UnSq0, value=[1.0]) -> ones
      Concat(x, ones, axis=1) -> output_0
output: name='output_0' type=dtype('float32') shape=['batch', 'seq+1']

dynamo-ir#

inputs: #2[(T1s4x4,),(T1s5x6,)]
shapes: dict(x:{0:Dim(batch),1:Dim(seq)})

opset: domain='' version=20
input: name='x' type=dtype('float32') shape=['batch', 'seq']
init: name='val_7' type=float32 shape=() -- array([1.], dtype=float32)
init: name='val_3' type=int64 shape=(1,) -- array([1])
Shape(x, end=1, start=0) -> val_0
  Concat(val_0, val_3, axis=0) -> val_4
    Expand(val_7, val_4) -> ones
      Concat(x, ones, axis=1) -> cat
output: name='cat' type=dtype('float32') shape=['batch', 'seq + 1']

tracing#

inputs: #2[(T1s4x4,),(T1s5x6,)]
shapes: dict(x:{0:Dim(batch),1:Dim(seq)})

opset: domain='' version=21
input: name='x' type=dtype('float32') shape=['batch', 'seq']
init: name='init7_s1_1' type=int64 shape=(1,) -- array([1])           -- Opset.make_node.1/Shape##GraphBuilder.make_shape_from_results.conc
Shape(x, end=1, start=0) -> _onx_gather_size
  Concat(_onx_gather_size, init7_s1_1, axis=0) -> _onx_concat_getitem::UnSq0
    ConstantOfShape(_onx_concat_getitem::UnSq0, value=[1.0]) -> ones
      Concat(x, ones, axis=1) -> output
output: name='output' type=dtype('float32') shape=['batch', 'seq+1']

new-tracing#

inputs: #2[(T1s4x4,),(T1s5x6,)]
shapes: dict(x:{0:Dim(batch),1:Dim(seq)})

opset: domain='' version=21
input: name='x' type=dtype('float32') shape=['batch', 'seq']
init: name='init7_s1_1' type=int64 shape=(1,) -- array([1])           -- GraphBuilder.make_shape_from_results.conc
Shape(x, end=1, start=0) -> x::Shape:1
  Concat(x::Shape:1, init7_s1_1, axis=0) -> _onx_concat_sym_size_int::UnSq0
    ConstantOfShape(_onx_concat_sym_size_int::UnSq0, value=[1.0]) -> ones
      Concat(x, ones, axis=1) -> output
output: name='output' type=dtype('float32') shape=['batch', 'seq+1']

InplaceAdd#

code: yobx.torch.testing._model_eval_cases.InplaceAdd

forward#

def forward(self, x):
    x += self.bias
    return x

yobx#

inputs: #2[(T1s3x4,),(T1s5x4,)]
shapes: dict(x:{0:Dim(batch)})

opset: domain='' version=21
input: name='x' type=dtype('float32') shape=['batch', 4]
init: name='c_bias' type=float32 shape=(1, 4) -- array([1., 1., 1., 1.], dtype=float32)-- DynamoInterpret.placeholder.0
Add(x, c_bias) -> output_0
output: name='output_0' type=dtype('float32') shape=['batch', 4]

dynamo-ir#

inputs: #2[(T1s3x4,),(T1s5x4,)]
shapes: dict(x:{0:Dim(batch)})

opset: domain='' version=20
input: name='x' type=dtype('float32') shape=['batch', 4]
init: name='bias' type=float32 shape=(1, 4) -- array([1., 1., 1., 1.], dtype=float32)
Add(x, bias) -> add_3
output: name='add_3' type=dtype('float32') shape=['batch', 4]

tracing#

inputs: #2[(T1s3x4,),(T1s5x4,)]
shapes: dict(x:{0:Dim(batch)})

opset: domain='' version=21
input: name='x' type=dtype('float32') shape=['batch', 4]
init: name='bias' type=float32 shape=(1, 4) -- array([1., 1., 1., 1.], dtype=float32)-- DynamoInterpret.get_attr.0
Add(x, bias) -> output
output: name='output' type=dtype('float32') shape=['batch', 4]

new-tracing#

inputs: #2[(T1s3x4,),(T1s5x4,)]
shapes: dict(x:{0:Dim(batch)})

opset: domain='' version=21
input: name='x' type=dtype('float32') shape=['batch', 4]
init: name='bias' type=float32 shape=(1, 4) -- array([1., 1., 1., 1.], dtype=float32)-- DynamoInterpret.placeholder.0
Add(x, bias) -> output
output: name='output' type=dtype('float32') shape=['batch', 4]

InplaceAdd2#

code: yobx.torch.testing._model_eval_cases.InplaceAdd2

forward#

def forward(self, x):
    x.add_(self.bias)
    return x

yobx#

inputs: #2[(T1s3x4,),(T1s5x4,)]
shapes: dict(x:{0:Dim(batch)})

opset: domain='' version=21
input: name='x' type=dtype('float32') shape=['batch', 4]
init: name='c_bias' type=float32 shape=(1, 4) -- array([1., 1., 1., 1.], dtype=float32)-- DynamoInterpret.placeholder.0
Add(x, c_bias) -> output_0
output: name='output_0' type=dtype('float32') shape=['batch', 4]

dynamo-ir#

inputs: #2[(T1s3x4,),(T1s5x4,)]
shapes: dict(x:{0:Dim(batch)})

opset: domain='' version=20
input: name='x' type=dtype('float32') shape=['batch', 4]
init: name='bias' type=float32 shape=(1, 4) -- array([1., 1., 1., 1.], dtype=float32)
Add(x, bias) -> add_3
output: name='add_3' type=dtype('float32') shape=['batch', 4]

tracing#

inputs: #2[(T1s3x4,),(T1s5x4,)]
shapes: dict(x:{0:Dim(batch)})

opset: domain='' version=21
input: name='x' type=dtype('float32') shape=['batch', 4]
init: name='bias' type=float32 shape=(1, 4) -- array([1., 1., 1., 1.], dtype=float32)-- DynamoInterpret.get_attr.0
Add(x, bias) -> output
output: name='output' type=dtype('float32') shape=['batch', 4]

new-tracing#

inputs: #2[(T1s3x4,),(T1s5x4,)]
shapes: dict(x:{0:Dim(batch)})

opset: domain='' version=21
input: name='x' type=dtype('float32') shape=['batch', 4]
init: name='bias' type=float32 shape=(1, 4) -- array([1., 1., 1., 1.], dtype=float32)-- DynamoInterpret.placeholder.0
Add(x, bias) -> output
output: name='output' type=dtype('float32') shape=['batch', 4]

InplaceAdd_Mul#

code: yobx.torch.testing._model_eval_cases.InplaceAdd_Mul

forward#

def forward(self, x):
    x.add_(self.bias)
    return x * 2

yobx#

inputs: #2[(T1s3x4,),(T1s5x4,)]
shapes: dict(x:{0:Dim(batch)})

opset: domain='' version=21
input: name='x' type=dtype('float32') shape=['batch', 4]
init: name='c_bias' type=float32 shape=(1, 4) -- array([1., 1., 1., 1.], dtype=float32)-- DynamoInterpret.placeholder.0
init: name='init1_s_::RSh1' type=float32 shape=(1,) -- array([2.], dtype=float32)-- GraphBuilder.constant_folding.from/fold(init1_s_,init7_s1_1)##init1_s_/shape_type_compute._cast_inputs.1(mul_Tensor)##init7_s1_1/Opset.make_node.1/Shape
Add(x, c_bias) -> add_
  Mul(add_, init1_s_::RSh1) -> output_0
output: name='output_0' type=dtype('float32') shape=['batch', 4]

dynamo-ir#

inputs: #2[(T1s3x4,),(T1s5x4,)]
shapes: dict(x:{0:Dim(batch)})

opset: domain='' version=20
input: name='x' type=dtype('float32') shape=['batch', 4]
init: name='bias' type=float32 shape=(1, 4) -- array([1., 1., 1., 1.], dtype=float32)
init: name='scalar_tensor_default' type=float32 shape=() -- array([2.], dtype=float32)
Add(x, bias) -> add_3
  Mul(add_3, scalar_tensor_default) -> mul_4
output: name='mul_4' type=dtype('float32') shape=['batch', 4]

tracing#

inputs: #2[(T1s3x4,),(T1s5x4,)]
shapes: dict(x:{0:Dim(batch)})

opset: domain='' version=21
input: name='x' type=dtype('float32') shape=['batch', 4]
init: name='bias' type=float32 shape=(1, 4) -- array([1., 1., 1., 1.], dtype=float32)-- DynamoInterpret.get_attr.0
init: name='init1_s_::RSh1' type=float32 shape=(1,) -- array([2.], dtype=float32)-- GraphBuilder.constant_folding.from/fold(init1_s_,init7_s1_1)##init1_s_/shape_type_compute._cast_inputs.1(mul)##init7_s1_1/Opset.make_node.1/Shape
Add(x, bias) -> add_
  Mul(add_, init1_s_::RSh1) -> output
output: name='output' type=dtype('float32') shape=['batch', 4]

new-tracing#

inputs: #2[(T1s3x4,),(T1s5x4,)]
shapes: dict(x:{0:Dim(batch)})

opset: domain='' version=21
input: name='x' type=dtype('float32') shape=['batch', 4]
init: name='bias' type=float32 shape=(1, 4) -- array([1., 1., 1., 1.], dtype=float32)-- DynamoInterpret.placeholder.0
init: name='init1_s_::RSh1' type=float32 shape=(1,) -- array([2.], dtype=float32)-- GraphBuilder.constant_folding.from/fold(init1_s_,init7_s1_1)##init1_s_/shape_type_compute._cast_inputs.1(mul_Tensor)##init7_s1_1/Opset.make_node.1/Shape
Add(x, bias) -> add__tensor
  Mul(add__tensor, init1_s_::RSh1) -> output
output: name='output' type=dtype('float32') shape=['batch', 4]

InplaceCloneAdd #

code: yobx.torch.testing._model_eval_cases.InplaceCloneAdd_

forward#

def forward(self, x):
    x = x.clone()
    x.add_(self.bias)
    return x

yobx#

inputs: #2[(T1s3x4,),(T1s5x4,)]
shapes: dict(x:{0:Dim(batch)})

opset: domain='' version=21
input: name='x' type=dtype('float32') shape=['batch', 4]
init: name='c_bias' type=float32 shape=(1, 4) -- array([1., 1., 1., 1.], dtype=float32)-- DynamoInterpret.placeholder.0
Add(x, c_bias) -> output_0
output: name='output_0' type=dtype('float32') shape=['batch', 4]

dynamo-ir#

inputs: #2[(T1s3x4,),(T1s5x4,)]
shapes: dict(x:{0:Dim(batch)})

opset: domain='' version=20
input: name='x' type=dtype('float32') shape=['batch', 4]
init: name='bias' type=float32 shape=(1, 4) -- array([1., 1., 1., 1.], dtype=float32)
Add(x, bias) -> add_6
output: name='add_6' type=dtype('float32') shape=['batch', 4]

tracing#

inputs: #2[(T1s3x4,),(T1s5x4,)]
shapes: dict(x:{0:Dim(batch)})

opset: domain='' version=21
input: name='x' type=dtype('float32') shape=['batch', 4]
init: name='bias' type=float32 shape=(1, 4) -- array([1., 1., 1., 1.], dtype=float32)-- DynamoInterpret.get_attr.0
Add(x, bias) -> output
output: name='output' type=dtype('float32') shape=['batch', 4]

new-tracing#

inputs: #2[(T1s3x4,),(T1s5x4,)]
shapes: dict(x:{0:Dim(batch)})

opset: domain='' version=21
input: name='x' type=dtype('float32') shape=['batch', 4]
init: name='bias' type=float32 shape=(1, 4) -- array([1., 1., 1., 1.], dtype=float32)-- DynamoInterpret.placeholder.0
Add(x, bias) -> output
output: name='output' type=dtype('float32') shape=['batch', 4]

InplaceSetItemEllipsis_1#

code: yobx.torch.testing._model_eval_cases.InplaceSetItemEllipsis_1

forward#

def forward(self, index, update):
    copy = update.new_zeros(self.params.shape)
    copy[..., index] = update
    return copy

yobx#

FAILED

L['update'].size()[0] = 8192 is not equal to L['index'].size()[0] = 4

The error above occurred when calling torch.export.export. If you would like to view some more information about this error, and get a list of all other errors that may occur in your export call, you can replace your `export()` call with `draft_export()`.

dynamo-ir#

FAILED

Failed to export the model with torch.export. [96mThis is step 1/3[0m of exporting the model to ONNX. Next steps:
- Modify the model code for `torch.export.export` to succeed. Refer to https://pytorch.org/docs/stable/generated/exportdb/index.html for more information.
- Debug `torch.export.export` and submit a PR to PyTorch.
- Create an issue in the PyTorch GitHub repository against the [96m*torch.export*[0m component and attach the full error stack as well as reproduction scripts.

## Exception summary

<class 'torch._dynamo.exc.UserError'>: L['update'].size()[0] = 8192 is not equal to L['index'].size()[0] = 4

The error above occurred when calling torch.export.export. If you would like to view some more information about this error, and get a list of all other errors that may occur in your export call, you can replace your `export()` call with `draft_export()`.

(Refer to the full stack trace above for more information.)

tracing#

inputs: #1[(T7s4,T1s8192x4)]
shapes: dict(index:{0:Dim(batch)},update:{0:Dim(batch),1:DYNAMIC})

opset: domain='' version=21
opset: domain='aten' version=1
input: name='index' type=dtype('int64') shape=['batch']
input: name='update' type=dtype('float32') shape=['batch', 'channel']
init: name='init7_s3_1_8192_4' type=int64 shape=(3,) -- array([   1, 8192,    4])-- Opset.make_node.1/Shape
ConstantOfShape(init7_s3_1_8192_4, value=[0.0]) -> new_zeros
  aten_setitem[aten](new_zeros, index, update) -> output
output: name='output' type=dtype('float32') shape=[1, 8192, 4]
----- function name=aten_setitem domain=aten
----- doc_string: -- function_options=FunctionOptions(export_as_function=...
opset: domain='' version=21
input: 'new_zeros'
input: 'index'
input: 'update'
Constant(value=[1, 1, -1]) -> init7_s3_1_1_-1
  Reshape(index, init7_s3_1_1_-1) -> index::RSh1x1x-1
Shape(new_zeros, end=2, start=0) -> new_zeros::Shape:2
Shape(index) -> index::Shape:
  Concat(new_zeros::Shape:2, index::Shape:, axis=0) -> _onx_concat_new_zeros::Shape:2
    Expand(index::RSh1x1x-1, _onx_concat_new_zeros::Shape:2) -> _onx_expand_index::RSh1x1x-1
Expand(update, _onx_concat_new_zeros::Shape:2) -> _onx_expand_update
  ScatterElements(new_zeros, _onx_expand_index::RSh1x1x-1, _onx_expand_update, axis=2) -> setitem
output: name='setitem' type=? shape=?

new-tracing#

inputs: #1[(T7s4,T1s8192x4)]
shapes: dict(index:{0:Dim(batch)},update:{0:Dim(batch),1:DYNAMIC})

opset: domain='' version=21
opset: domain='aten' version=1
input: name='index' type=dtype('int64') shape=['batch']
input: name='update' type=dtype('float32') shape=['batch', 'channel']
init: name='init7_s3_1_8192_4' type=int64 shape=(3,) -- array([   1, 8192,    4])-- Opset.make_node.1/Shape
ConstantOfShape(init7_s3_1_8192_4, value=[0.0]) -> new_zeros_default
  aten_setitem[aten](new_zeros_default, index, update) -> output
output: name='output' type=dtype('float32') shape=[1, 8192, 4]
----- function name=aten_setitem domain=aten
----- doc_string: -- function_options=FunctionOptions(export_as_function=...
opset: domain='' version=21
input: 'new_zeros_default'
input: 'index'
input: 'update'
Constant(value=[1, 1, -1]) -> init7_s3_1_1_-1
  Reshape(index, init7_s3_1_1_-1) -> index::RSh1x1x-1
Shape(new_zeros_default, end=2, start=0) -> new_zeros_default::Shape:2
Shape(index) -> index::Shape:
  Concat(new_zeros_default::Shape:2, index::Shape:, axis=0) -> _onx_concat_new_zeros_default::Shape:2
    Expand(index::RSh1x1x-1, _onx_concat_new_zeros_default::Shape:2) -> _onx_expand_index::RSh1x1x-1
Expand(update, _onx_concat_new_zeros_default::Shape:2) -> _onx_expand_update
  ScatterElements(new_zeros_default, _onx_expand_index::RSh1x1x-1, _onx_expand_update, axis=2) -> setitem
output: name='setitem' type=? shape=?

InplaceSetItemEllipsis_2#

code: yobx.torch.testing._model_eval_cases.InplaceSetItemEllipsis_2

forward#

def forward(self, index, update):
    copy = self.params.clone()
    copy[..., index] = update
    return copy

yobx#

FAILED

L['update'].size()[0] = 8192 is not equal to L['index'].size()[0] = 4

The error above occurred when calling torch.export.export. If you would like to view some more information about this error, and get a list of all other errors that may occur in your export call, you can replace your `export()` call with `draft_export()`.

dynamo-ir#

FAILED

Failed to export the model with torch.export. [96mThis is step 1/3[0m of exporting the model to ONNX. Next steps:
- Modify the model code for `torch.export.export` to succeed. Refer to https://pytorch.org/docs/stable/generated/exportdb/index.html for more information.
- Debug `torch.export.export` and submit a PR to PyTorch.
- Create an issue in the PyTorch GitHub repository against the [96m*torch.export*[0m component and attach the full error stack as well as reproduction scripts.

## Exception summary

<class 'torch._dynamo.exc.UserError'>: L['update'].size()[0] = 8192 is not equal to L['index'].size()[0] = 4

The error above occurred when calling torch.export.export. If you would like to view some more information about this error, and get a list of all other errors that may occur in your export call, you can replace your `export()` call with `draft_export()`.

(Refer to the full stack trace above for more information.)

tracing#

inputs: #1[(T7s4,T1s8192x4)]
shapes: dict(index:{0:Dim(batch)},update:{0:Dim(batch),1:DYNAMIC})

opset: domain='' version=21
opset: domain='aten' version=1
input: name='index' type=dtype('int64') shape=['batch']
input: name='update' type=dtype('float32') shape=['batch', 'channel']
init: name='_tensor_constant0' type=float32 shape=(1, 8192, 6)        -- DynamoInterpret.get_attr.0
aten_setitem[aten](_tensor_constant0, index, update) -> output
output: name='output' type=dtype('float32') shape=[1, 8192, 6]
----- function name=aten_setitem domain=aten
----- doc_string: -- function_options=FunctionOptions(export_as_function=...
opset: domain='' version=21
input: '_tensor_constant0'
input: 'index'
input: 'update'
Constant(value=[1, 1, -1]) -> init7_s3_1_1_-1
  Reshape(index, init7_s3_1_1_-1) -> index::RSh1x1x-1
Shape(_tensor_constant0, end=2, start=0) -> _tensor_constant0::Shape:2
Shape(index) -> index::Shape:
  Concat(_tensor_constant0::Shape:2, index::Shape:, axis=0) -> _onx_concat__tensor_constant0::Shape:2
    Expand(index::RSh1x1x-1, _onx_concat__tensor_constant0::Shape:2) -> _onx_expand_index::RSh1x1x-1
Expand(update, _onx_concat__tensor_constant0::Shape:2) -> _onx_expand_update
  ScatterElements(_tensor_constant0, _onx_expand_index::RSh1x1x-1, _onx_expand_update, axis=2) -> setitem
output: name='setitem' type=? shape=?

new-tracing#

inputs: #1[(T7s4,T1s8192x4)]
shapes: dict(index:{0:Dim(batch)},update:{0:Dim(batch),1:DYNAMIC})

opset: domain='' version=21
opset: domain='aten' version=1
input: name='index' type=dtype('int64') shape=['batch']
input: name='update' type=dtype('float32') shape=['batch', 'channel']
init: name='params' type=float32 shape=(1, 8192, 6)                   -- DynamoInterpret.placeholder.0
aten_setitem[aten](params, index, update) -> output
output: name='output' type=dtype('float32') shape=[1, 8192, 6]
----- function name=aten_setitem domain=aten
----- doc_string: -- function_options=FunctionOptions(export_as_function=...
opset: domain='' version=21
input: 'clone_default'
input: 'index'
input: 'update'
Constant(value=[1, 1, -1]) -> init7_s3_1_1_-1
  Reshape(index, init7_s3_1_1_-1) -> index::RSh1x1x-1
Shape(clone_default, end=2, start=0) -> clone_default::Shape:2
Shape(index) -> index::Shape:
  Concat(clone_default::Shape:2, index::Shape:, axis=0) -> _onx_concat_clone_default::Shape:2
    Expand(index::RSh1x1x-1, _onx_concat_clone_default::Shape:2) -> _onx_expand_index::RSh1x1x-1
Expand(update, _onx_concat_clone_default::Shape:2) -> _onx_expand_update
  ScatterElements(clone_default, _onx_expand_index::RSh1x1x-1, _onx_expand_update, axis=2) -> setitem
output: name='setitem' type=? shape=?

InplaceSetItemExp#

code: yobx.torch.testing._model_eval_cases.InplaceSetItemExp

forward#

def forward(self, x):
    K_33 = x.clone()
    torch.exp_(K_33[2:-2, 2:-2, :-1])
    return K_33

yobx#

FAILED

Constraints violated (batch)! For more information, run with TORCH_LOGS="+dynamic".
  - Not all values of batch = L['x'].size()[0] in the specified range satisfy the generated guard 6 <= L['x'].size()[0] and L['x'].size()[0] <= IntInfinity()
Suggested fixes:
  batch = Dim('batch', min=6)

The error above occurred when calling torch.export.export. If you would like to view some more information about this error, and get a list of all other errors that may occur in your export call, you can replace your `export()` call with `draft_export()`.

dynamo-ir#

inputs: #2[(T1s7x9x11,),(T1s8x9x11,)]
shapes: dict(x:{0:Dim(batch)})

opset: domain='' version=20
input: name='x' type=dtype('float32') shape=['batch', 9, 11]
init: name='val_11' type=int64 shape=(1,) -- array([2])
init: name='val_14' type=int64 shape=(1,) -- array([-2])
init: name='val_17' type=int64 shape=(1,) -- array([0])
init: name='val_28' type=int64 shape=(1,) -- array([1])
init: name='val_35' type=int64 shape=(1,) -- array([-1])
init: name='val_6' type=int64 shape=() -- array([0])
init: name='val_7' type=int64 shape=() -- array([2])
init: name='val_25' type=int64 shape=() -- array([1])
Shape(x, start=0) -> val_86
  Gather(val_86, val_6, axis=0) -> val_87
    Range(val_6, val_87, val_25) -> val_88
      Slice(val_88, val_11, val_14, val_17, val_28) -> val_92
        Unsqueeze(val_92, val_35) -> val_93
Slice(x, val_11, val_14, val_17, val_28) -> slice_1
  Slice(slice_1, val_11, val_14, val_28, val_28) -> slice_2
    Slice(slice_2, val_17, val_35, val_11, val_28) -> slice_3
      Exp(slice_3) -> exp
        Transpose(exp, perm=[2,1,0]) -> val_70
    Shape(slice_2, start=0) -> val_61
      Gather(val_61, val_7, axis=0) -> val_62
        Range(val_6, val_62, val_25) -> val_63
          Slice(val_63, val_17, val_35, val_17, val_28) -> val_67
            Unsqueeze(val_67, val_35) -> val_69
    Transpose(slice_2, perm=[2,1,0]) -> val_71
      ScatterND(val_71, val_69, val_70, reduction=b'none') -> val_72
        Transpose(val_72, perm=[1,2,0]) -> val_82
  Shape(slice_1, start=0) -> val_74
    Gather(val_74, val_25, axis=0) -> val_75
      Range(val_6, val_75, val_25) -> val_76
        Slice(val_76, val_11, val_14, val_17, val_28) -> val_80
          Unsqueeze(val_80, val_35) -> val_81
  Transpose(slice_1, perm=[1,0,2]) -> val_83
    ScatterND(val_83, val_81, val_82, reduction=b'none') -> val_84
      Transpose(val_84, perm=[1,0,2]) -> slice_scatter_1
        ScatterND(x, val_93, slice_scatter_1, reduction=b'none') -> slice_scatter_2
output: name='slice_scatter_2' type=dtype('float32') shape=['batch', 9, 11]

tracing#

inputs: #2[(T1s7x9x11,),(T1s8x9x11,)]
shapes: dict(x:{0:Dim(batch)})

opset: domain='' version=21
input: name='x' type=dtype('float32') shape=['batch', 9, 11]
init: name='init7_s3_2_2_0' type=int64 shape=(3,) -- array([2, 2, 0]) -- Opset.make_node.1/Shape
init: name='init7_s3_-2_-2_-1' type=int64 shape=(3,) -- array([-2, -2, -1])-- Opset.make_node.1/Shape
init: name='init7_s3_0_1_2' type=int64 shape=(3,) -- array([0, 1, 2]) -- Opset.make_node.1/Shape
init: name='init7_s6_' type=int64 shape=(6,)                          -- Opset.make_node.1/Shape##Opset.make_node.1/Shape
init: name='init1_s_' type=float32 shape=() -- array([1.], dtype=float32)-- Opset.make_node.1/Small
Slice(x, init7_s3_2_2_0, init7_s3_-2_-2_-1, init7_s3_0_1_2) -> _onx_slice_clone
  Exp(_onx_slice_clone) -> _onx_exp_slice_clone
    Shape(_onx_exp_slice_clone) -> _onx_exp_slice_clone::Shape:
      ConstantOfShape(_onx_exp_slice_clone::Shape:, value=[0.0]) -> _onx_constantofshape_exp_slice_clone::Shape:
        Pad(_onx_constantofshape_exp_slice_clone::Shape:, init7_s6_, init1_s_) -> clone_mask
          Mul(x, clone_mask) -> _onx_mul_clone
    Pad(_onx_exp_slice_clone, init7_s6_) -> _onx_exp_slice_clone_padded
      Add(_onx_mul_clone, _onx_exp_slice_clone_padded) -> output
output: name='output' type=dtype('float32') shape=['batch', 9, 11]

new-tracing#

inputs: #2[(T1s7x9x11,),(T1s8x9x11,)]
shapes: dict(x:{0:Dim(batch)})

opset: domain='' version=21
input: name='x' type=dtype('float32') shape=['batch', 9, 11]
init: name='init7_s3_2_2_0' type=int64 shape=(3,) -- array([2, 2, 0]) -- Opset.make_node.1/Shape
init: name='init7_s3_-2_-2_-1' type=int64 shape=(3,) -- array([-2, -2, -1])-- Opset.make_node.1/Shape
init: name='init7_s3_0_1_2' type=int64 shape=(3,) -- array([0, 1, 2]) -- Opset.make_node.1/Shape
init: name='init7_s6_' type=int64 shape=(6,)                          -- Opset.make_node.1/Shape##Opset.make_node.1/Shape
init: name='init1_s_' type=float32 shape=() -- array([1.], dtype=float32)-- Opset.make_node.1/Small
Slice(x, init7_s3_2_2_0, init7_s3_-2_-2_-1, init7_s3_0_1_2) -> _onx_slice_clone_default
  Exp(_onx_slice_clone_default) -> _onx_exp_slice_clone_default
    Shape(_onx_exp_slice_clone_default) -> _onx_exp_slice_clone_default::Shape:
      ConstantOfShape(_onx_exp_slice_clone_default::Shape:, value=[0.0]) -> _onx_constantofshape_exp_slice_clone_default::Shape:
        Pad(_onx_constantofshape_exp_slice_clone_default::Shape:, init7_s6_, init1_s_) -> clone_default_mask
          Mul(x, clone_default_mask) -> _onx_mul_clone_default
    Pad(_onx_exp_slice_clone_default, init7_s6_) -> _onx_exp_slice_clone_default_padded
      Add(_onx_mul_clone_default, _onx_exp_slice_clone_default_padded) -> output
output: name='output' type=dtype('float32') shape=['batch', 9, 11]

InplaceSetItemMask#

code: yobx.torch.testing._model_eval_cases.InplaceSetItemMask

forward#

def forward(self, x):
    mask = x.to(bool)
    x[mask] = 2
    return x

yobx#

inputs: #2[(T1s2x3x3,),(T1s3x3x3,)]
shapes: dict(x:{0:Dim(batch)})

opset: domain='' version=21
input: name='x' type=dtype('float32') shape=['batch', 3, 3]
init: name='c_lifted_tensor_0' type=float32 shape=() -- array([2.], dtype=float32)-- DynamoInterpret.placeholder.0
Cast(x, to=9) -> to
  Where(to, c_lifted_tensor_0, x) -> output_0
output: name='output_0' type=dtype('float32') shape=['batch', 3, 3]

dynamo-ir#

inputs: #2[(T1s2x3x3,),(T1s3x3x3,)]
shapes: dict(x:{0:Dim(batch)})

opset: domain='' version=20
input: name='x' type=dtype('float32') shape=['batch', 3, 3]
init: name='clone' type=float32 shape=() -- array([2.], dtype=float32)
Cast(x, to=9) -> _to_copy
  Where(_to_copy, clone, x) -> index_put
output: name='index_put' type=dtype('float32') shape=['batch', 3, 3]

tracing#

inputs: #2[(T1s2x3x3,),(T1s3x3x3,)]
shapes: dict(x:{0:Dim(batch)})

opset: domain='' version=21
opset: domain='aten' version=1
input: name='x' type=dtype('float32') shape=['batch', 3, 3]
Cast(x, to=9) -> to
  aten_setitem[aten](x, to) -> output
output: name='output' type=dtype('float32') shape=['batch', 3, 3]
----- function name=aten_setitem domain=aten
----- doc_string: -- function_options=FunctionOptions(export_as_function=...
opset: domain='' version=21
input: 'x'
input: 'to'
Constant(value=2.0) -> init1_s_
  Where(to, init1_s_, x) -> setitem
output: name='setitem' type=? shape=?

new-tracing#

inputs: #2[(T1s2x3x3,),(T1s3x3x3,)]
shapes: dict(x:{0:Dim(batch)})

opset: domain='' version=21
opset: domain='aten' version=1
input: name='x' type=dtype('float32') shape=['batch', 3, 3]
Cast(x, to=9) -> _to_copy_default
  aten_setitem[aten](x, _to_copy_default) -> output
output: name='output' type=dtype('float32') shape=['batch', 3, 3]
----- function name=aten_setitem domain=aten
----- doc_string: -- function_options=FunctionOptions(export_as_function=...
opset: domain='' version=21
input: 'x'
input: '_to_copy_default'
Constant(value=2.0) -> init1_s_
  Where(_to_copy_default, init1_s_, x) -> setitem
output: name='setitem' type=? shape=?

InplaceSetItemSquare#

code: yobx.torch.testing._model_eval_cases.InplaceSetItemSquare

forward#

def forward(self, x):
    x[:2, :3] = 1
    return x

yobx#

inputs: #2[(T1s5x5,),(T1s7x5,)]
shapes: dict(x:{0:Dim(batch)})

opset: domain='' version=21
input: name='x' type=dtype('float32') shape=['batch', 5]
init: name='init7_s1_0' type=int64 shape=(1,) -- array([0])           -- Opset.make_node.1/Shape##Opset.make_node.1/Shape##Opset.make_node.1/Shape##Opset.make_node.1/Shape##Opset.make_node.1/Shape##Opset.make_node.1/Shape##Opset.make_node.1/Shape
init: name='init7_s1_2' type=int64 shape=(1,) -- array([2])           -- Opset.make_node.1/Shape##Opset.make_node.1/Shape##Opset.make_node.1/Shape
init: name='init7_s1_1' type=int64 shape=(1,) -- array([1])           -- Opset.make_node.1/Shape##Opset.make_node.1/Shape##ReshapeIsSqueezePattern.m1
init: name='init7_s_0' type=int64 shape=() -- array([0])              -- Opset.make_node.1/Shape
init: name='init7_s_1' type=int64 shape=() -- array([1])              -- Opset.make_node.1/Shape
init: name='init7_s3_0_1_2::RSh-1x1' type=int64 shape=(3, 1) -- array([0, 1, 2])-- GraphBuilder.constant_folding.from/fold(init7_s2_-1_1,init7_s3_0_1_2)##init7_s3_0_1_2/Opset.make_node.1/Shape##init7_s2_-1_1/Opset.make_node.1/Shape##Opset.make_node.1/Shape
init: name='fill::T10' type=float32 shape=(3, 2)                      -- GraphBuilder.constant_folding.from/fold(fill)##fill/
Shape(x, end=1, start=0) -> x::Shape:1
  Squeeze(x::Shape:1) -> x::Shape:1::Sq
    Range(init7_s_0, x::Shape:1::Sq, init7_s_1) -> _onx_range_init7_s_0
      Slice(_onx_range_init7_s_0, init7_s1_0, init7_s1_2, init7_s1_0, init7_s1_1) -> _onx_slice_range_init7_s_0
        Unsqueeze(_onx_slice_range_init7_s_0, init7_s1_1) -> _onx_slice_range_init7_s_0::RSh-1x1
Slice(x, init7_s1_0, init7_s1_2, init7_s1_0) -> slice_3
  Transpose(slice_3, perm=[1,0]) -> slice_3::T10
    ScatterND(slice_3::T10, init7_s3_0_1_2::RSh-1x1, fill::T10) -> _onx_scatternd_slice_3::T10
      Transpose(_onx_scatternd_slice_3::T10, perm=[1,0]) -> slice_scatter
        ScatterND(x, _onx_slice_range_init7_s_0::RSh-1x1, slice_scatter) -> output_0
          Identity(output_0) -> output_1
output: name='output_1' type=dtype('float32') shape=['batch', 5]

dynamo-ir#

inputs: #2[(T1s5x5,),(T1s7x5,)]
shapes: dict(x:{0:Dim(batch)})

opset: domain='' version=20
input: name='x' type=dtype('float32') shape=['batch', 5]
init: name='val_3' type=int64 shape=(1,) -- array([0])
init: name='val_7' type=int64 shape=(1,) -- array([2])
init: name='val_18' type=int64 shape=(1,) -- array([3])
init: name='val_22' type=int64 shape=(1,) -- array([1])
init: name='value_0' type=float32 shape=() -- array([1.], dtype=float32)
init: name='val_0' type=int64 shape=() -- array([0])
init: name='val_19' type=int64 shape=() -- array([1])
init: name='val_42' type=int64 shape=(1,) -- array([-1])
Shape(x, start=0) -> val_48
  Gather(val_48, val_0, axis=0) -> val_49
    Range(val_0, val_49, val_19) -> val_50
      Slice(val_50, val_3, val_7, val_3, val_22) -> val_54
        Unsqueeze(val_54, val_42) -> val_55
Slice(x, val_3, val_7, val_3, val_22) -> slice_1
  Slice(slice_1, val_3, val_18, val_22, val_22) -> slice_2
    Shape(slice_2) -> shape
      Expand(value_0, shape) -> fill
        Transpose(fill, perm=[1,0]) -> val_44
  Shape(slice_1, start=0) -> val_35
    Gather(val_35, val_19, axis=0) -> val_36
      Range(val_0, val_36, val_19) -> val_37
        Slice(val_37, val_3, val_18, val_3, val_22) -> val_41
          Unsqueeze(val_41, val_42) -> val_43
  Transpose(slice_1, perm=[1,0]) -> val_45
    ScatterND(val_45, val_43, val_44, reduction=b'none') -> val_46
      Transpose(val_46, perm=[1,0]) -> slice_scatter
        ScatterND(x, val_55, slice_scatter, reduction=b'none') -> slice_scatter_1
output: name='slice_scatter_1' type=dtype('float32') shape=['batch', 5]

tracing#

inputs: #2[(T1s5x5,),(T1s7x5,)]
shapes: dict(x:{0:Dim(batch)})

opset: domain='' version=21
opset: domain='aten' version=1
input: name='x' type=dtype('float32') shape=['batch', 5]
aten_setitem[aten](x) -> output
output: name='output' type=dtype('float32') shape=['batch', 5]
----- function name=aten_setitem domain=aten
----- doc_string: -- function_options=FunctionOptions(export_as_function=...
opset: domain='' version=21
input: 'x'
Constant(value=1.0) -> setitem_val
Constant(value=[0]) -> init7_s1_0
  Concat(init7_s1_0, init7_s1_0, axis=0) -> _onx_concat_init7_s1_0
Constant(value=2) -> init7_s_2
Constant(value=3) -> init7_s_3
Shape(x, end=1, start=0) -> x::Shape:1
  Squeeze(x::Shape:1, init7_s1_0) -> x::Shape:1::Sq0
  Sub(x::Shape:1::Sq0, init7_s_2) -> _onx_sub_x::Shape:1::Sq0
  Unsqueeze(_onx_sub_x::Shape:1::Sq0, init7_s1_0) -> _onx_sub_x::Shape:1::Sq0::UnSq0
Shape(x, end=2, start=1) -> x::Shape1:2
  Squeeze(x::Shape1:2, init7_s1_0) -> x::Shape1:2::Sq0
  Sub(x::Shape1:2::Sq0, init7_s_3) -> _onx_sub_x::Shape1:2::Sq0
  Unsqueeze(_onx_sub_x::Shape1:2::Sq0, init7_s1_0) -> _onx_sub_x::Shape1:2::Sq0::UnSq0
    Concat(_onx_sub_x::Shape:1::Sq0::UnSq0, _onx_sub_x::Shape1:2::Sq0::UnSq0, axis=0) -> _onx_concat_sub_x::Shape:1::Sq0::UnSq0
    Add(_onx_concat_init7_s1_0, _onx_concat_sub_x::Shape:1::Sq0::UnSq0) -> _onx_add_concat_init7_s1_0
Shape(x) -> x::Shape:
  Sub(x::Shape:, _onx_add_concat_init7_s1_0) -> _onx_sub_x::Shape:
  Expand(setitem_val, _onx_sub_x::Shape:) -> _onx_expand_setitem_val
    Shape(_onx_expand_setitem_val) -> _onx_expand_setitem_val::Shape:
      ConstantOfShape(_onx_expand_setitem_val::Shape:, value=[0.0]) -> _onx_constantofshape_expand_setitem_val::Shape:
  Unsqueeze(_onx_sub_x::Shape:1::Sq0, init7_s1_0) -> _onx_sub_x::Shape:1::Sq0::UnSq02
Unsqueeze(_onx_sub_x::Shape1:2::Sq0, init7_s1_0) -> _onx_sub_x::Shape1:2::Sq0::UnSq02
  Concat(init7_s1_0, init7_s1_0, _onx_sub_x::Shape:1::Sq0::UnSq02, _onx_sub_x::Shape1:2::Sq0::UnSq02, axis=0) -> _onx_concat_init7_s1_02
    Pad(_onx_expand_setitem_val, _onx_concat_init7_s1_02) -> _onx_expand_setitem_val_padded
  Pad(_onx_constantofshape_expand_setitem_val::Shape:, _onx_concat_init7_s1_02, setitem_val) -> x_mask
    Mul(x, x_mask) -> _onx_mul_x
      Add(_onx_mul_x, _onx_expand_setitem_val_padded) -> setitem
output: name='setitem' type=? shape=?

new-tracing#

inputs: #2[(T1s5x5,),(T1s7x5,)]
shapes: dict(x:{0:Dim(batch)})

opset: domain='' version=21
opset: domain='aten' version=1
input: name='x' type=dtype('float32') shape=['batch', 5]
aten_setitem[aten](x) -> output
output: name='output' type=dtype('float32') shape=['batch', 5]
----- function name=aten_setitem domain=aten
----- doc_string: -- function_options=FunctionOptions(export_as_function=...
opset: domain='' version=21
input: 'x'
Constant(value=1.0) -> setitem_val
Constant(value=[0]) -> init7_s1_0
  Concat(init7_s1_0, init7_s1_0, axis=0) -> _onx_concat_init7_s1_0
Constant(value=2) -> init7_s_2
Constant(value=3) -> init7_s_3
Shape(x, end=1, start=0) -> x::Shape:1
  Squeeze(x::Shape:1, init7_s1_0) -> x::Shape:1::Sq0
  Sub(x::Shape:1::Sq0, init7_s_2) -> _onx_sub_x::Shape:1::Sq0
  Unsqueeze(_onx_sub_x::Shape:1::Sq0, init7_s1_0) -> _onx_sub_x::Shape:1::Sq0::UnSq0
Shape(x, end=2, start=1) -> x::Shape1:2
  Squeeze(x::Shape1:2, init7_s1_0) -> x::Shape1:2::Sq0
  Sub(x::Shape1:2::Sq0, init7_s_3) -> _onx_sub_x::Shape1:2::Sq0
  Unsqueeze(_onx_sub_x::Shape1:2::Sq0, init7_s1_0) -> _onx_sub_x::Shape1:2::Sq0::UnSq0
    Concat(_onx_sub_x::Shape:1::Sq0::UnSq0, _onx_sub_x::Shape1:2::Sq0::UnSq0, axis=0) -> _onx_concat_sub_x::Shape:1::Sq0::UnSq0
    Add(_onx_concat_init7_s1_0, _onx_concat_sub_x::Shape:1::Sq0::UnSq0) -> _onx_add_concat_init7_s1_0
Shape(x) -> x::Shape:
  Sub(x::Shape:, _onx_add_concat_init7_s1_0) -> _onx_sub_x::Shape:
  Expand(setitem_val, _onx_sub_x::Shape:) -> _onx_expand_setitem_val
    Shape(_onx_expand_setitem_val) -> _onx_expand_setitem_val::Shape:
      ConstantOfShape(_onx_expand_setitem_val::Shape:, value=[0.0]) -> _onx_constantofshape_expand_setitem_val::Shape:
  Unsqueeze(_onx_sub_x::Shape:1::Sq0, init7_s1_0) -> _onx_sub_x::Shape:1::Sq0::UnSq02
Unsqueeze(_onx_sub_x::Shape1:2::Sq0, init7_s1_0) -> _onx_sub_x::Shape1:2::Sq0::UnSq02
  Concat(init7_s1_0, init7_s1_0, _onx_sub_x::Shape:1::Sq0::UnSq02, _onx_sub_x::Shape1:2::Sq0::UnSq02, axis=0) -> _onx_concat_init7_s1_02
    Pad(_onx_expand_setitem_val, _onx_concat_init7_s1_02) -> _onx_expand_setitem_val_padded
  Pad(_onx_constantofshape_expand_setitem_val::Shape:, _onx_concat_init7_s1_02, setitem_val) -> x_mask
    Mul(x, x_mask) -> _onx_mul_x
      Add(_onx_mul_x, _onx_expand_setitem_val_padded) -> setitem
output: name='setitem' type=? shape=?

InplaceSetItemSquareAdd#

code: yobx.torch.testing._model_eval_cases.InplaceSetItemSquareAdd

forward#

def forward(self, x):
    x[:2, :3] = 1
    return x + 2

yobx#

inputs: #2[(T1s5x5,),(T1s7x5,)]
shapes: dict(x:{0:Dim(batch)})

opset: domain='' version=21
input: name='x' type=dtype('float32') shape=['batch', 5]
init: name='init7_s1_0' type=int64 shape=(1,) -- array([0])           -- Opset.make_node.1/Shape##Opset.make_node.1/Shape##Opset.make_node.1/Shape##Opset.make_node.1/Shape##Opset.make_node.1/Shape##Opset.make_node.1/Shape##Opset.make_node.1/Shape
init: name='init7_s1_2' type=int64 shape=(1,) -- array([2])           -- Opset.make_node.1/Shape##Opset.make_node.1/Shape##Opset.make_node.1/Shape
init: name='init7_s1_1' type=int64 shape=(1,) -- array([1])           -- Opset.make_node.1/Shape##Opset.make_node.1/Shape##Opset.make_node.1/Shape##ReshapeIsSqueezePattern.m1
init: name='init7_s_0' type=int64 shape=() -- array([0])              -- Opset.make_node.1/Shape
init: name='init7_s_1' type=int64 shape=() -- array([1])              -- Opset.make_node.1/Shape
init: name='init7_s3_0_1_2::RSh-1x1' type=int64 shape=(3, 1) -- array([0, 1, 2])-- GraphBuilder.constant_folding.from/fold(init7_s2_-1_1,init7_s3_0_1_2)##init7_s3_0_1_2/Opset.make_node.1/Shape##init7_s2_-1_1/Opset.make_node.1/Shape##Opset.make_node.1/Shape
init: name='fill::T10' type=float32 shape=(3, 2)                      -- GraphBuilder.constant_folding.from/fold(fill)##fill/
init: name='init1_s_::RSh1' type=float32 shape=(1,) -- array([2.], dtype=float32)-- GraphBuilder.constant_folding.from/fold(init1_s_,init7_s1_1)##init1_s_/shape_type_compute._cast_inputs.0##init7_s1_1/Opset.make_node.1/Shape##Opset.make_node.1/Shape##Opset.make_node.1/Shape
Shape(x, end=1, start=0) -> x::Shape:1
  Squeeze(x::Shape:1) -> x::Shape:1::Sq
    Range(init7_s_0, x::Shape:1::Sq, init7_s_1) -> _onx_range_init7_s_0
      Slice(_onx_range_init7_s_0, init7_s1_0, init7_s1_2, init7_s1_0, init7_s1_1) -> _onx_slice_range_init7_s_0
        Unsqueeze(_onx_slice_range_init7_s_0, init7_s1_1) -> _onx_slice_range_init7_s_0::RSh-1x1
Slice(x, init7_s1_0, init7_s1_2, init7_s1_0) -> slice_3
  Transpose(slice_3, perm=[1,0]) -> slice_3::T10
    ScatterND(slice_3::T10, init7_s3_0_1_2::RSh-1x1, fill::T10) -> _onx_scatternd_slice_3::T10
      Transpose(_onx_scatternd_slice_3::T10, perm=[1,0]) -> slice_scatter
        ScatterND(x, _onx_slice_range_init7_s_0::RSh-1x1, slice_scatter) -> output_0
          Add(output_0, init1_s_::RSh1) -> output_1
output: name='output_1' type=dtype('float32') shape=['batch', 5]

dynamo-ir#

inputs: #2[(T1s5x5,),(T1s7x5,)]
shapes: dict(x:{0:Dim(batch)})

opset: domain='' version=20
input: name='x' type=dtype('float32') shape=['batch', 5]
init: name='val_3' type=int64 shape=(1,) -- array([0])
init: name='val_7' type=int64 shape=(1,) -- array([2])
init: name='val_18' type=int64 shape=(1,) -- array([3])
init: name='val_22' type=int64 shape=(1,) -- array([1])
init: name='value_0' type=float32 shape=() -- array([1.], dtype=float32)
init: name='scalar_tensor_default' type=float32 shape=() -- array([2.], dtype=float32)
init: name='val_0' type=int64 shape=() -- array([0])
init: name='val_19' type=int64 shape=() -- array([1])
init: name='val_42' type=int64 shape=(1,) -- array([-1])
Shape(x, start=0) -> val_48
  Gather(val_48, val_0, axis=0) -> val_49
    Range(val_0, val_49, val_19) -> val_50
      Slice(val_50, val_3, val_7, val_3, val_22) -> val_54
        Unsqueeze(val_54, val_42) -> val_55
Slice(x, val_3, val_7, val_3, val_22) -> slice_1
  Slice(slice_1, val_3, val_18, val_22, val_22) -> slice_2
    Shape(slice_2) -> shape
      Expand(value_0, shape) -> fill
        Transpose(fill, perm=[1,0]) -> val_44
  Shape(slice_1, start=0) -> val_35
    Gather(val_35, val_19, axis=0) -> val_36
      Range(val_0, val_36, val_19) -> val_37
        Slice(val_37, val_3, val_18, val_3, val_22) -> val_41
          Unsqueeze(val_41, val_42) -> val_43
  Transpose(slice_1, perm=[1,0]) -> val_45
    ScatterND(val_45, val_43, val_44, reduction=b'none') -> val_46
      Transpose(val_46, perm=[1,0]) -> slice_scatter
        ScatterND(x, val_55, slice_scatter, reduction=b'none') -> slice_scatter_1
          Add(slice_scatter_1, scalar_tensor_default) -> add_12
output: name='add_12' type=dtype('float32') shape=['batch', 5]

tracing#

inputs: #2[(T1s5x5,),(T1s7x5,)]
shapes: dict(x:{0:Dim(batch)})

opset: domain='' version=21
opset: domain='aten' version=1
input: name='x' type=dtype('float32') shape=['batch', 5]
init: name='init1_s_::RSh1' type=float32 shape=(1,) -- array([2.], dtype=float32)-- GraphBuilder.constant_folding.from/fold(init1_s_,init7_s1_1)##init1_s_/shape_type_compute._cast_inputs.1(add)##init7_s1_1/Opset.make_node.1/Shape
aten_setitem[aten](x) -> setitem
  Add(setitem, init1_s_::RSh1) -> output
output: name='output' type=dtype('float32') shape=['batch', 5]
----- function name=aten_setitem domain=aten
----- doc_string: -- function_options=FunctionOptions(export_as_function=...
opset: domain='' version=21
input: 'x'
Constant(value=1.0) -> setitem_val
Constant(value=[0]) -> init7_s1_0
  Concat(init7_s1_0, init7_s1_0, axis=0) -> _onx_concat_init7_s1_0
Constant(value=2) -> init7_s_2
Constant(value=3) -> init7_s_3
Shape(x, end=1, start=0) -> x::Shape:1
  Squeeze(x::Shape:1, init7_s1_0) -> x::Shape:1::Sq0
  Sub(x::Shape:1::Sq0, init7_s_2) -> _onx_sub_x::Shape:1::Sq0
  Unsqueeze(_onx_sub_x::Shape:1::Sq0, init7_s1_0) -> _onx_sub_x::Shape:1::Sq0::UnSq0
Shape(x, end=2, start=1) -> x::Shape1:2
  Squeeze(x::Shape1:2, init7_s1_0) -> x::Shape1:2::Sq0
  Sub(x::Shape1:2::Sq0, init7_s_3) -> _onx_sub_x::Shape1:2::Sq0
  Unsqueeze(_onx_sub_x::Shape1:2::Sq0, init7_s1_0) -> _onx_sub_x::Shape1:2::Sq0::UnSq0
    Concat(_onx_sub_x::Shape:1::Sq0::UnSq0, _onx_sub_x::Shape1:2::Sq0::UnSq0, axis=0) -> _onx_concat_sub_x::Shape:1::Sq0::UnSq0
    Add(_onx_concat_init7_s1_0, _onx_concat_sub_x::Shape:1::Sq0::UnSq0) -> _onx_add_concat_init7_s1_0
Shape(x) -> x::Shape:
  Sub(x::Shape:, _onx_add_concat_init7_s1_0) -> _onx_sub_x::Shape:
  Expand(setitem_val, _onx_sub_x::Shape:) -> _onx_expand_setitem_val
    Shape(_onx_expand_setitem_val) -> _onx_expand_setitem_val::Shape:
      ConstantOfShape(_onx_expand_setitem_val::Shape:, value=[0.0]) -> _onx_constantofshape_expand_setitem_val::Shape:
  Unsqueeze(_onx_sub_x::Shape:1::Sq0, init7_s1_0) -> _onx_sub_x::Shape:1::Sq0::UnSq02
Unsqueeze(_onx_sub_x::Shape1:2::Sq0, init7_s1_0) -> _onx_sub_x::Shape1:2::Sq0::UnSq02
  Concat(init7_s1_0, init7_s1_0, _onx_sub_x::Shape:1::Sq0::UnSq02, _onx_sub_x::Shape1:2::Sq0::UnSq02, axis=0) -> _onx_concat_init7_s1_02
    Pad(_onx_expand_setitem_val, _onx_concat_init7_s1_02) -> _onx_expand_setitem_val_padded
  Pad(_onx_constantofshape_expand_setitem_val::Shape:, _onx_concat_init7_s1_02, setitem_val) -> x_mask
    Mul(x, x_mask) -> _onx_mul_x
      Add(_onx_mul_x, _onx_expand_setitem_val_padded) -> setitem
output: name='setitem' type=? shape=?

new-tracing#

inputs: #2[(T1s5x5,),(T1s7x5,)]
shapes: dict(x:{0:Dim(batch)})

opset: domain='' version=21
opset: domain='aten' version=1
input: name='x' type=dtype('float32') shape=['batch', 5]
init: name='init1_s_::RSh1' type=float32 shape=(1,) -- array([2.], dtype=float32)-- GraphBuilder.constant_folding.from/fold(init1_s_,init7_s1_1)##init1_s_/shape_type_compute._cast_inputs.0##init7_s1_1/Opset.make_node.1/Shape
aten_setitem[aten](x) -> setitem
  Add(setitem, init1_s_::RSh1) -> output
output: name='output' type=dtype('float32') shape=['batch', 5]
----- function name=aten_setitem domain=aten
----- doc_string: -- function_options=FunctionOptions(export_as_function=...
opset: domain='' version=21
input: 'x'
Constant(value=1.0) -> setitem_val
Constant(value=[0]) -> init7_s1_0
  Concat(init7_s1_0, init7_s1_0, axis=0) -> _onx_concat_init7_s1_0
Constant(value=2) -> init7_s_2
Constant(value=3) -> init7_s_3
Shape(x, end=1, start=0) -> x::Shape:1
  Squeeze(x::Shape:1, init7_s1_0) -> x::Shape:1::Sq0
  Sub(x::Shape:1::Sq0, init7_s_2) -> _onx_sub_x::Shape:1::Sq0
  Unsqueeze(_onx_sub_x::Shape:1::Sq0, init7_s1_0) -> _onx_sub_x::Shape:1::Sq0::UnSq0
Shape(x, end=2, start=1) -> x::Shape1:2
  Squeeze(x::Shape1:2, init7_s1_0) -> x::Shape1:2::Sq0
  Sub(x::Shape1:2::Sq0, init7_s_3) -> _onx_sub_x::Shape1:2::Sq0
  Unsqueeze(_onx_sub_x::Shape1:2::Sq0, init7_s1_0) -> _onx_sub_x::Shape1:2::Sq0::UnSq0
    Concat(_onx_sub_x::Shape:1::Sq0::UnSq0, _onx_sub_x::Shape1:2::Sq0::UnSq0, axis=0) -> _onx_concat_sub_x::Shape:1::Sq0::UnSq0
    Add(_onx_concat_init7_s1_0, _onx_concat_sub_x::Shape:1::Sq0::UnSq0) -> _onx_add_concat_init7_s1_0
Shape(x) -> x::Shape:
  Sub(x::Shape:, _onx_add_concat_init7_s1_0) -> _onx_sub_x::Shape:
  Expand(setitem_val, _onx_sub_x::Shape:) -> _onx_expand_setitem_val
    Shape(_onx_expand_setitem_val) -> _onx_expand_setitem_val::Shape:
      ConstantOfShape(_onx_expand_setitem_val::Shape:, value=[0.0]) -> _onx_constantofshape_expand_setitem_val::Shape:
  Unsqueeze(_onx_sub_x::Shape:1::Sq0, init7_s1_0) -> _onx_sub_x::Shape:1::Sq0::UnSq02
Unsqueeze(_onx_sub_x::Shape1:2::Sq0, init7_s1_0) -> _onx_sub_x::Shape1:2::Sq0::UnSq02
  Concat(init7_s1_0, init7_s1_0, _onx_sub_x::Shape:1::Sq0::UnSq02, _onx_sub_x::Shape1:2::Sq0::UnSq02, axis=0) -> _onx_concat_init7_s1_02
    Pad(_onx_expand_setitem_val, _onx_concat_init7_s1_02) -> _onx_expand_setitem_val_padded
  Pad(_onx_constantofshape_expand_setitem_val::Shape:, _onx_concat_init7_s1_02, setitem_val) -> x_mask
    Mul(x, x_mask) -> _onx_mul_x
      Add(_onx_mul_x, _onx_expand_setitem_val_padded) -> setitem
output: name='setitem' type=? shape=?

InplaceSetItemSquareAdd2#

code: yobx.torch.testing._model_eval_cases.InplaceSetItemSquareAdd2

forward#

def forward(self, x):
    x[:2, :3] = 1
    return x + 2, x + 3

yobx#

inputs: #2[(T1s5x5,),(T1s7x5,)]
shapes: dict(x:{0:Dim(batch)})

opset: domain='' version=21
input: name='x' type=dtype('float32') shape=['batch', 5]
init: name='init7_s1_0' type=int64 shape=(1,) -- array([0])           -- Opset.make_node.1/Shape##Opset.make_node.1/Shape##Opset.make_node.1/Shape##Opset.make_node.1/Shape##Opset.make_node.1/Shape##Opset.make_node.1/Shape##Opset.make_node.1/Shape
init: name='init7_s1_2' type=int64 shape=(1,) -- array([2])           -- Opset.make_node.1/Shape##Opset.make_node.1/Shape##Opset.make_node.1/Shape
init: name='init7_s1_1' type=int64 shape=(1,) -- array([1])           -- Opset.make_node.1/Shape##Opset.make_node.1/Shape##Opset.make_node.1/Shape##Opset.make_node.1/Shape##ReshapeIsSqueezePattern.m1
init: name='init7_s_0' type=int64 shape=() -- array([0])              -- Opset.make_node.1/Shape
init: name='init7_s_1' type=int64 shape=() -- array([1])              -- Opset.make_node.1/Shape
init: name='init7_s3_0_1_2::RSh-1x1' type=int64 shape=(3, 1) -- array([0, 1, 2])-- GraphBuilder.constant_folding.from/fold(init7_s2_-1_1,init7_s3_0_1_2)##init7_s3_0_1_2/Opset.make_node.1/Shape##init7_s2_-1_1/Opset.make_node.1/Shape##Opset.make_node.1/Shape
init: name='fill::T10' type=float32 shape=(3, 2)                      -- GraphBuilder.constant_folding.from/fold(fill)##fill/
init: name='init1_s_::RSh1' type=float32 shape=(1,) -- array([2.], dtype=float32)-- GraphBuilder.constant_folding.from/fold(init1_s_,init7_s1_1)##init1_s_/shape_type_compute._cast_inputs.0##init7_s1_1/Opset.make_node.1/Shape##Opset.make_node.1/Shape##Opset.make_node.1/Shape##Opset.make_node.1/Shape
init: name='init1_s_2::RSh1' type=float32 shape=(1,) -- array([3.], dtype=float32)-- GraphBuilder.constant_folding.from/fold(init1_s_2,init7_s1_1)##init1_s_2/shape_type_compute._cast_inputs.0##init7_s1_1/Opset.make_node.1/Shape##Opset.make_node.1/Shape##Opset.make_node.1/Shape##Opset.make_node.1/Shape
Shape(x, end=1, start=0) -> x::Shape:1
  Squeeze(x::Shape:1) -> x::Shape:1::Sq
    Range(init7_s_0, x::Shape:1::Sq, init7_s_1) -> _onx_range_init7_s_0
      Slice(_onx_range_init7_s_0, init7_s1_0, init7_s1_2, init7_s1_0, init7_s1_1) -> _onx_slice_range_init7_s_0
        Unsqueeze(_onx_slice_range_init7_s_0, init7_s1_1) -> _onx_slice_range_init7_s_0::RSh-1x1
Slice(x, init7_s1_0, init7_s1_2, init7_s1_0) -> slice_3
  Transpose(slice_3, perm=[1,0]) -> slice_3::T10
    ScatterND(slice_3::T10, init7_s3_0_1_2::RSh-1x1, fill::T10) -> _onx_scatternd_slice_3::T10
      Transpose(_onx_scatternd_slice_3::T10, perm=[1,0]) -> slice_scatter
        ScatterND(x, _onx_slice_range_init7_s_0::RSh-1x1, slice_scatter) -> output_0
          Add(output_0, init1_s_::RSh1) -> output_1
Add(output_0, init1_s_2::RSh1) -> output_2
output: name='output_1' type=dtype('float32') shape=['batch', 5]
output: name='output_2' type=dtype('float32') shape=['batch', 5]

dynamo-ir#

inputs: #2[(T1s5x5,),(T1s7x5,)]
shapes: dict(x:{0:Dim(batch)})

opset: domain='' version=20
input: name='x' type=dtype('float32') shape=['batch', 5]
init: name='val_3' type=int64 shape=(1,) -- array([0])
init: name='val_7' type=int64 shape=(1,) -- array([2])
init: name='val_18' type=int64 shape=(1,) -- array([3])
init: name='val_22' type=int64 shape=(1,) -- array([1])
init: name='value_0' type=float32 shape=() -- array([1.], dtype=float32)
init: name='scalar_tensor_default' type=float32 shape=() -- array([2.], dtype=float32)
init: name='scalar_tensor_default_1' type=float32 shape=() -- array([3.], dtype=float32)
init: name='val_0' type=int64 shape=() -- array([0])
init: name='val_19' type=int64 shape=() -- array([1])
init: name='val_42' type=int64 shape=(1,) -- array([-1])
Shape(x, start=0) -> val_48
  Gather(val_48, val_0, axis=0) -> val_49
    Range(val_0, val_49, val_19) -> val_50
      Slice(val_50, val_3, val_7, val_3, val_22) -> val_54
        Unsqueeze(val_54, val_42) -> val_55
Slice(x, val_3, val_7, val_3, val_22) -> slice_1
  Slice(slice_1, val_3, val_18, val_22, val_22) -> slice_2
    Shape(slice_2) -> shape
      Expand(value_0, shape) -> fill
        Transpose(fill, perm=[1,0]) -> val_44
  Shape(slice_1, start=0) -> val_35
    Gather(val_35, val_19, axis=0) -> val_36
      Range(val_0, val_36, val_19) -> val_37
        Slice(val_37, val_3, val_18, val_3, val_22) -> val_41
          Unsqueeze(val_41, val_42) -> val_43
  Transpose(slice_1, perm=[1,0]) -> val_45
    ScatterND(val_45, val_43, val_44, reduction=b'none') -> val_46
      Transpose(val_46, perm=[1,0]) -> slice_scatter
        ScatterND(x, val_55, slice_scatter, reduction=b'none') -> slice_scatter_1
          Add(slice_scatter_1, scalar_tensor_default) -> add_12
Add(slice_scatter_1, scalar_tensor_default_1) -> add_16
output: name='add_12' type=dtype('float32') shape=['batch', 5]
output: name='add_16' type=dtype('float32') shape=['batch', 5]

tracing#

inputs: #2[(T1s5x5,),(T1s7x5,)]
shapes: dict(x:{0:Dim(batch)})

opset: domain='' version=21
opset: domain='aten' version=1
input: name='x' type=dtype('float32') shape=['batch', 5]
init: name='init1_s_::RSh1' type=float32 shape=(1,) -- array([2.], dtype=float32)-- GraphBuilder.constant_folding.from/fold(init1_s_,init7_s1_1)##init1_s_/shape_type_compute._cast_inputs.1(add)##init7_s1_1/Opset.make_node.1/Shape##Opset.make_node.1/Shape
init: name='init1_s_2::RSh1' type=float32 shape=(1,) -- array([3.], dtype=float32)-- GraphBuilder.constant_folding.from/fold(init1_s_2,init7_s1_1)##init1_s_2/shape_type_compute._cast_inputs.1(add)##init7_s1_1/Opset.make_node.1/Shape##Opset.make_node.1/Shape
aten_setitem[aten](x) -> setitem
  Add(setitem, init1_s_::RSh1) -> output_0
Add(setitem, init1_s_2::RSh1) -> output_1
output: name='output_0' type=dtype('float32') shape=['batch', 5]
output: name='output_1' type=dtype('float32') shape=['batch', 5]
----- function name=aten_setitem domain=aten
----- doc_string: -- function_options=FunctionOptions(export_as_function=...
opset: domain='' version=21
input: 'x'
Constant(value=1.0) -> setitem_val
Constant(value=[0]) -> init7_s1_0
  Concat(init7_s1_0, init7_s1_0, axis=0) -> _onx_concat_init7_s1_0
Constant(value=2) -> init7_s_2
Constant(value=3) -> init7_s_3
Shape(x, end=1, start=0) -> x::Shape:1
  Squeeze(x::Shape:1, init7_s1_0) -> x::Shape:1::Sq0
  Sub(x::Shape:1::Sq0, init7_s_2) -> _onx_sub_x::Shape:1::Sq0
  Unsqueeze(_onx_sub_x::Shape:1::Sq0, init7_s1_0) -> _onx_sub_x::Shape:1::Sq0::UnSq0
Shape(x, end=2, start=1) -> x::Shape1:2
  Squeeze(x::Shape1:2, init7_s1_0) -> x::Shape1:2::Sq0
  Sub(x::Shape1:2::Sq0, init7_s_3) -> _onx_sub_x::Shape1:2::Sq0
  Unsqueeze(_onx_sub_x::Shape1:2::Sq0, init7_s1_0) -> _onx_sub_x::Shape1:2::Sq0::UnSq0
    Concat(_onx_sub_x::Shape:1::Sq0::UnSq0, _onx_sub_x::Shape1:2::Sq0::UnSq0, axis=0) -> _onx_concat_sub_x::Shape:1::Sq0::UnSq0
    Add(_onx_concat_init7_s1_0, _onx_concat_sub_x::Shape:1::Sq0::UnSq0) -> _onx_add_concat_init7_s1_0
Shape(x) -> x::Shape:
  Sub(x::Shape:, _onx_add_concat_init7_s1_0) -> _onx_sub_x::Shape:
  Expand(setitem_val, _onx_sub_x::Shape:) -> _onx_expand_setitem_val
    Shape(_onx_expand_setitem_val) -> _onx_expand_setitem_val::Shape:
      ConstantOfShape(_onx_expand_setitem_val::Shape:, value=[0.0]) -> _onx_constantofshape_expand_setitem_val::Shape:
  Unsqueeze(_onx_sub_x::Shape:1::Sq0, init7_s1_0) -> _onx_sub_x::Shape:1::Sq0::UnSq02
Unsqueeze(_onx_sub_x::Shape1:2::Sq0, init7_s1_0) -> _onx_sub_x::Shape1:2::Sq0::UnSq02
  Concat(init7_s1_0, init7_s1_0, _onx_sub_x::Shape:1::Sq0::UnSq02, _onx_sub_x::Shape1:2::Sq0::UnSq02, axis=0) -> _onx_concat_init7_s1_02
    Pad(_onx_expand_setitem_val, _onx_concat_init7_s1_02) -> _onx_expand_setitem_val_padded
  Pad(_onx_constantofshape_expand_setitem_val::Shape:, _onx_concat_init7_s1_02, setitem_val) -> x_mask
    Mul(x, x_mask) -> _onx_mul_x
      Add(_onx_mul_x, _onx_expand_setitem_val_padded) -> setitem
output: name='setitem' type=? shape=?

new-tracing#

inputs: #2[(T1s5x5,),(T1s7x5,)]
shapes: dict(x:{0:Dim(batch)})

opset: domain='' version=21
opset: domain='aten' version=1
input: name='x' type=dtype('float32') shape=['batch', 5]
init: name='init1_s_::RSh1' type=float32 shape=(1,) -- array([2.], dtype=float32)-- GraphBuilder.constant_folding.from/fold(init1_s_,init7_s1_1)##init1_s_/shape_type_compute._cast_inputs.0##init7_s1_1/Opset.make_node.1/Shape##Opset.make_node.1/Shape
init: name='init1_s_2::RSh1' type=float32 shape=(1,) -- array([3.], dtype=float32)-- GraphBuilder.constant_folding.from/fold(init1_s_2,init7_s1_1)##init1_s_2/shape_type_compute._cast_inputs.0##init7_s1_1/Opset.make_node.1/Shape##Opset.make_node.1/Shape
aten_setitem[aten](x) -> setitem
  Add(setitem, init1_s_::RSh1) -> output_0
Add(setitem, init1_s_2::RSh1) -> output_1
output: name='output_0' type=dtype('float32') shape=['batch', 5]
output: name='output_1' type=dtype('float32') shape=['batch', 5]
----- function name=aten_setitem domain=aten
----- doc_string: -- function_options=FunctionOptions(export_as_function=...
opset: domain='' version=21
input: 'x'
Constant(value=1.0) -> setitem_val
Constant(value=[0]) -> init7_s1_0
  Concat(init7_s1_0, init7_s1_0, axis=0) -> _onx_concat_init7_s1_0
Constant(value=2) -> init7_s_2
Constant(value=3) -> init7_s_3
Shape(x, end=1, start=0) -> x::Shape:1
  Squeeze(x::Shape:1, init7_s1_0) -> x::Shape:1::Sq0
  Sub(x::Shape:1::Sq0, init7_s_2) -> _onx_sub_x::Shape:1::Sq0
  Unsqueeze(_onx_sub_x::Shape:1::Sq0, init7_s1_0) -> _onx_sub_x::Shape:1::Sq0::UnSq0
Shape(x, end=2, start=1) -> x::Shape1:2
  Squeeze(x::Shape1:2, init7_s1_0) -> x::Shape1:2::Sq0
  Sub(x::Shape1:2::Sq0, init7_s_3) -> _onx_sub_x::Shape1:2::Sq0
  Unsqueeze(_onx_sub_x::Shape1:2::Sq0, init7_s1_0) -> _onx_sub_x::Shape1:2::Sq0::UnSq0
    Concat(_onx_sub_x::Shape:1::Sq0::UnSq0, _onx_sub_x::Shape1:2::Sq0::UnSq0, axis=0) -> _onx_concat_sub_x::Shape:1::Sq0::UnSq0
    Add(_onx_concat_init7_s1_0, _onx_concat_sub_x::Shape:1::Sq0::UnSq0) -> _onx_add_concat_init7_s1_0
Shape(x) -> x::Shape:
  Sub(x::Shape:, _onx_add_concat_init7_s1_0) -> _onx_sub_x::Shape:
  Expand(setitem_val, _onx_sub_x::Shape:) -> _onx_expand_setitem_val
    Shape(_onx_expand_setitem_val) -> _onx_expand_setitem_val::Shape:
      ConstantOfShape(_onx_expand_setitem_val::Shape:, value=[0.0]) -> _onx_constantofshape_expand_setitem_val::Shape:
  Unsqueeze(_onx_sub_x::Shape:1::Sq0, init7_s1_0) -> _onx_sub_x::Shape:1::Sq0::UnSq02
Unsqueeze(_onx_sub_x::Shape1:2::Sq0, init7_s1_0) -> _onx_sub_x::Shape1:2::Sq0::UnSq02
  Concat(init7_s1_0, init7_s1_0, _onx_sub_x::Shape:1::Sq0::UnSq02, _onx_sub_x::Shape1:2::Sq0::UnSq02, axis=0) -> _onx_concat_init7_s1_02
    Pad(_onx_expand_setitem_val, _onx_concat_init7_s1_02) -> _onx_expand_setitem_val_padded
  Pad(_onx_constantofshape_expand_setitem_val::Shape:, _onx_concat_init7_s1_02, setitem_val) -> x_mask
    Mul(x, x_mask) -> _onx_mul_x
      Add(_onx_mul_x, _onx_expand_setitem_val_padded) -> setitem
output: name='setitem' type=? shape=?

LayerNorm#

code: yobx.torch.testing._model_eval_cases.LayerNorm

forward#

def forward(self, x):
    return self.layer_norm(x)

yobx#

inputs: #2[(T1s3x4,),(T1s5x4,)]
shapes: dict(x:{0:Dim(batch)})

opset: domain='' version=21
input: name='x' type=dtype('float32') shape=['batch', 4]
init: name='init1_s4_' type=float32 shape=(4,) -- array([1., 1., 1., 1.], dtype=float32)-- LayerNormalizationPattern.apply.scale
init: name='init1_s4_2' type=float32 shape=(4,) -- array([0., 0., 0., 0.], dtype=float32)-- LayerNormalizationPattern.apply.bias
LayerNormalization(x, init1_s4_, init1_s4_2, axis=-1, epsilon=0.00, stash_type=1) -> output_0
output: name='output_0' type=dtype('float32') shape=['batch', 4]

dynamo-ir#

inputs: #2[(T1s3x4,),(T1s5x4,)]
shapes: dict(x:{0:Dim(batch)})

opset: domain='' version=20
input: name='x' type=dtype('float32') shape=['batch', 4]
init: name='layer_norm.weight' type=float32 shape=(4,) -- array([1., 1., 1., 1.], dtype=float32)
init: name='layer_norm.bias' type=float32 shape=(4,) -- array([0., 0., 0., 0.], dtype=float32)
LayerNormalization(x, layer_norm.weight, layer_norm.bias, stash_type=1, epsilon=0.00, axis=-1) -> layer_norm
output: name='layer_norm' type=dtype('float32') shape=['batch', 4]

tracing#

inputs: #2[(T1s3x4,),(T1s5x4,)]
shapes: dict(x:{0:Dim(batch)})

opset: domain='' version=21
input: name='x' type=dtype('float32') shape=['batch', 4]
init: name='init1_s4_' type=float32 shape=(4,) -- array([1., 1., 1., 1.], dtype=float32)-- LayerNormalizationPattern.apply.scale
init: name='init1_s4_2' type=float32 shape=(4,) -- array([0., 0., 0., 0.], dtype=float32)-- LayerNormalizationPattern.apply.bias
LayerNormalization(x, init1_s4_, init1_s4_2, axis=-1, epsilon=0.00, stash_type=1) -> output
output: name='output' type=dtype('float32') shape=['batch', 4]

new-tracing#

inputs: #2[(T1s3x4,),(T1s5x4,)]
shapes: dict(x:{0:Dim(batch)})

opset: domain='' version=21
input: name='x' type=dtype('float32') shape=['batch', 4]
init: name='layer_norm.weight' type=float32 shape=(4,) -- array([1., 1., 1., 1.], dtype=float32)-- DynamoInterpret.placeholder.1/P(layer_norm.weight)
init: name='layer_norm.bias' type=float32 shape=(4,) -- array([0., 0., 0., 0.], dtype=float32)-- DynamoInterpret.placeholder.1/P(layer_norm.bias)
LayerNormalization(x, layer_norm.weight, layer_norm.bias, axis=-1, epsilon=0.00) -> output, native_layer_norm_default#1, native_layer_norm_default#2
output: name='output' type=dtype('float32') shape=['batch', 4]

ShapeAndTypeAndDeviceBased#

code: yobx.torch.testing._model_eval_cases.ShapeAndTypeAndDeviceBased

forward#

def forward(self, x):
    shape = x.shape
    new_shape = (shape[0], shape[1] + 1)
    dtype = x.dtype
    if dtype == torch.float64 or dtype == torch.float16:
        dtype = torch.float32
    device = x.device
    return torch.zeros(new_shape, dtype=dtype, device=device)

yobx#

inputs: #2[(T10s3x4,),(T10s5x4,)]
shapes: dict(x:{0:Dim(batch),1:Dim(seq)})

opset: domain='' version=21
input: name='x' type=dtype('float16') shape=['batch', 'seq']
init: name='SqueezeBinaryUnsqueezePattern_init7_s_1' type=int64 shape=(1,) -- array([1])-- GraphBuilder.constant_folding.from/fold(init7_s1_0,init7_s_1)##init7_s_1/shape_type_compute._cast_inputs.1(add)##init7_s1_0/Opset.make_node.1/Shape##Opset.make_node.1/Shape
Shape(x, end=1, start=0) -> x::Shape:1
Shape(x, end=2, start=1) -> x::Shape1:2
  Add(x::Shape1:2, SqueezeBinaryUnsqueezePattern_init7_s_1) -> add::UnSq0
  Concat(x::Shape:1, add::UnSq0, axis=0) -> _onx_concat_sym_size_int_2::UnSq0
    ConstantOfShape(_onx_concat_sym_size_int_2::UnSq0, value=[0.0]) -> output_0
output: name='output_0' type=dtype('float32') shape=['batch', 'seq']

dynamo-ir#

inputs: #2[(T10s3x4,),(T10s5x4,)]
shapes: dict(x:{0:Dim(batch),1:Dim(seq)})

opset: domain='' version=20
input: name='x' type=dtype('float16') shape=['batch', 'seq']
init: name='val_2' type=int64 shape=() -- array([1])
init: name='val_3' type=float32 shape=() -- array([0.], dtype=float32)
init: name='val_4' type=int64 shape=(1,) -- array([-1])
Shape(x, end=1, start=0) -> val_0
Shape(x, end=2, start=1) -> val_1
  Squeeze(val_1) -> sym_size_int_3
    Add(sym_size_int_3, val_2) -> add
      Reshape(add, val_4, allowzero=0) -> val_6
  Concat(val_0, val_6, axis=0) -> val_7
    Expand(val_3, val_7) -> zeros
output: name='zeros' type=dtype('float32') shape=['batch', 'seq + 1']

tracing#

inputs: #2[(T10s3x4,),(T10s5x4,)]
shapes: dict(x:{0:Dim(batch),1:Dim(seq)})

opset: domain='' version=21
input: name='x' type=dtype('float16') shape=['batch', 'seq']
init: name='init7_s1_0' type=int64 shape=(1,) -- array([0])           -- Opset.make_node.1/Shape##Opset.make_node.1/Shape##Opset.make_node.1/Shape##Opset.make_node.1/Shape##Opset.make_node.1/Shape
init: name='init7_s_1' type=int64 shape=() -- array([1])              -- shape_type_compute._cast_inputs.1(add)
Shape(x, end=1, start=0) -> _onx_gather_size
Shape(x, end=2, start=1) -> _onx_gather_size2
  Squeeze(_onx_gather_size2, init7_s1_0) -> getitem_1
    Add(getitem_1, init7_s_1) -> _onx_add_getitem_1
      Unsqueeze(_onx_add_getitem_1, init7_s1_0) -> add::UnSq0
  Concat(_onx_gather_size, add::UnSq0, axis=0) -> _onx_concat_getitem::UnSq0
    ConstantOfShape(_onx_concat_getitem::UnSq0, value=[0.0]) -> output
output: name='output' type=dtype('float32') shape=['batch', 'add']

new-tracing#

inputs: #2[(T10s3x4,),(T10s5x4,)]
shapes: dict(x:{0:Dim(batch),1:Dim(seq)})

opset: domain='' version=21
input: name='x' type=dtype('float16') shape=['batch', 'seq']
init: name='init7_s1_1' type=int64 shape=(1,) -- array([1])           -- GraphBuilder.get_dimension_as_result.axis_name##GraphBuilder._eval_dim_expr_node_as_1d.const
Shape(x, end=2, start=1) -> seq
  Add(seq, init7_s1_1) -> _onx_add_seq
Shape(x, end=1, start=0) -> x::Shape:1
  Concat(x::Shape:1, _onx_add_seq, axis=0) -> _onx_concat_sym_size_int::UnSq0
    ConstantOfShape(_onx_concat_sym_size_int::UnSq0, value=[0.0]) -> output
output: name='output' type=dtype('float32') shape=['batch', 'seq+1']

ShapeAndTypeBased#

code: yobx.torch.testing._model_eval_cases.ShapeAndTypeBased

forward#

def forward(self, x):
    shape = x.shape
    new_shape = (shape[0], shape[1] + 1)
    dtype = x.dtype
    if dtype == torch.float64 or dtype == torch.float16:
        dtype = torch.float32
    return torch.zeros(new_shape, dtype=dtype)

yobx#

inputs: #2[(T10s3x4,),(T10s5x4,)]
shapes: dict(x:{0:Dim(batch),1:Dim(seq)})

opset: domain='' version=21
input: name='x' type=dtype('float16') shape=['batch', 'seq']
init: name='SqueezeBinaryUnsqueezePattern_init7_s_1' type=int64 shape=(1,) -- array([1])-- GraphBuilder.constant_folding.from/fold(init7_s1_0,init7_s_1)##init7_s_1/shape_type_compute._cast_inputs.1(add)##init7_s1_0/Opset.make_node.1/Shape##Opset.make_node.1/Shape
Shape(x, end=1, start=0) -> x::Shape:1
Shape(x, end=2, start=1) -> x::Shape1:2
  Add(x::Shape1:2, SqueezeBinaryUnsqueezePattern_init7_s_1) -> add::UnSq0
  Concat(x::Shape:1, add::UnSq0, axis=0) -> _onx_concat_sym_size_int_2::UnSq0
    ConstantOfShape(_onx_concat_sym_size_int_2::UnSq0, value=[0.0]) -> output_0
output: name='output_0' type=dtype('float32') shape=['batch', 'seq']

dynamo-ir#

inputs: #2[(T10s3x4,),(T10s5x4,)]
shapes: dict(x:{0:Dim(batch),1:Dim(seq)})

opset: domain='' version=20
input: name='x' type=dtype('float16') shape=['batch', 'seq']
init: name='val_2' type=int64 shape=() -- array([1])
init: name='val_3' type=float32 shape=() -- array([0.], dtype=float32)
init: name='val_4' type=int64 shape=(1,) -- array([-1])
Shape(x, end=1, start=0) -> val_0
Shape(x, end=2, start=1) -> val_1
  Squeeze(val_1) -> sym_size_int_3
    Add(sym_size_int_3, val_2) -> add
      Reshape(add, val_4, allowzero=0) -> val_6
  Concat(val_0, val_6, axis=0) -> val_7
    Expand(val_3, val_7) -> zeros
output: name='zeros' type=dtype('float32') shape=['batch', 'seq + 1']

tracing#

inputs: #2[(T10s3x4,),(T10s5x4,)]
shapes: dict(x:{0:Dim(batch),1:Dim(seq)})

opset: domain='' version=21
input: name='x' type=dtype('float16') shape=['batch', 'seq']
init: name='init7_s1_0' type=int64 shape=(1,) -- array([0])           -- Opset.make_node.1/Shape##Opset.make_node.1/Shape##Opset.make_node.1/Shape##Opset.make_node.1/Shape##Opset.make_node.1/Shape
init: name='init7_s_1' type=int64 shape=() -- array([1])              -- shape_type_compute._cast_inputs.1(add)
Shape(x, end=1, start=0) -> _onx_gather_size
Shape(x, end=2, start=1) -> _onx_gather_size2
  Squeeze(_onx_gather_size2, init7_s1_0) -> getitem_1
    Add(getitem_1, init7_s_1) -> _onx_add_getitem_1
      Unsqueeze(_onx_add_getitem_1, init7_s1_0) -> add::UnSq0
  Concat(_onx_gather_size, add::UnSq0, axis=0) -> _onx_concat_getitem::UnSq0
    ConstantOfShape(_onx_concat_getitem::UnSq0, value=[0.0]) -> output
output: name='output' type=dtype('float32') shape=['batch', 'add']

new-tracing#

inputs: #2[(T10s3x4,),(T10s5x4,)]
shapes: dict(x:{0:Dim(batch),1:Dim(seq)})

opset: domain='' version=21
input: name='x' type=dtype('float16') shape=['batch', 'seq']
init: name='init7_s1_1' type=int64 shape=(1,) -- array([1])           -- GraphBuilder.get_dimension_as_result.axis_name##GraphBuilder._eval_dim_expr_node_as_1d.const
Shape(x, end=2, start=1) -> seq
  Add(seq, init7_s1_1) -> _onx_add_seq
Shape(x, end=1, start=0) -> x::Shape:1
  Concat(x::Shape:1, _onx_add_seq, axis=0) -> _onx_concat_sym_size_int::UnSq0
    ConstantOfShape(_onx_concat_sym_size_int::UnSq0, value=[0.0]) -> output
output: name='output' type=dtype('float32') shape=['batch', 'seq+1']

ShapeBased#

code: yobx.torch.testing._model_eval_cases.ShapeBased

forward#

def forward(self, x):
    shape = x.shape
    new_shape = (shape[0], shape[1] + 1)
    return torch.zeros(new_shape, dtype=torch.float32)

yobx#

inputs: #2[(T1s3x4,),(T1s5x4,)]
shapes: dict(x:{0:Dim(batch),1:Dim(seq)})

opset: domain='' version=21
input: name='x' type=dtype('float32') shape=['batch', 'seq']
init: name='SqueezeBinaryUnsqueezePattern_init7_s_1' type=int64 shape=(1,) -- array([1])-- GraphBuilder.constant_folding.from/fold(init7_s1_0,init7_s_1)##init7_s_1/shape_type_compute._cast_inputs.1(add)##init7_s1_0/Opset.make_node.1/Shape##Opset.make_node.1/Shape
Shape(x, end=1, start=0) -> x::Shape:1
Shape(x, end=2, start=1) -> x::Shape1:2
  Add(x::Shape1:2, SqueezeBinaryUnsqueezePattern_init7_s_1) -> add::UnSq0
  Concat(x::Shape:1, add::UnSq0, axis=0) -> _onx_concat_sym_size_int_2::UnSq0
    ConstantOfShape(_onx_concat_sym_size_int_2::UnSq0, value=[0.0]) -> output_0
output: name='output_0' type=dtype('float32') shape=['batch', 'seq']

dynamo-ir#

inputs: #2[(T1s3x4,),(T1s5x4,)]
shapes: dict(x:{0:Dim(batch),1:Dim(seq)})

opset: domain='' version=20
input: name='x' type=dtype('float32') shape=['batch', 'seq']
init: name='val_2' type=int64 shape=() -- array([1])
init: name='val_3' type=float32 shape=() -- array([0.], dtype=float32)
init: name='val_4' type=int64 shape=(1,) -- array([-1])
Shape(x, end=1, start=0) -> val_0
Shape(x, end=2, start=1) -> val_1
  Squeeze(val_1) -> sym_size_int_3
    Add(sym_size_int_3, val_2) -> add
      Reshape(add, val_4, allowzero=0) -> val_6
  Concat(val_0, val_6, axis=0) -> val_7
    Expand(val_3, val_7) -> zeros
output: name='zeros' type=dtype('float32') shape=['batch', 'seq + 1']

tracing#

inputs: #2[(T1s3x4,),(T1s5x4,)]
shapes: dict(x:{0:Dim(batch),1:Dim(seq)})

opset: domain='' version=21
input: name='x' type=dtype('float32') shape=['batch', 'seq']
init: name='init7_s1_0' type=int64 shape=(1,) -- array([0])           -- Opset.make_node.1/Shape##Opset.make_node.1/Shape##Opset.make_node.1/Shape##Opset.make_node.1/Shape##Opset.make_node.1/Shape
init: name='init7_s_1' type=int64 shape=() -- array([1])              -- shape_type_compute._cast_inputs.1(add)
Shape(x, end=1, start=0) -> _onx_gather_size
Shape(x, end=2, start=1) -> _onx_gather_size2
  Squeeze(_onx_gather_size2, init7_s1_0) -> getitem_1
    Add(getitem_1, init7_s_1) -> _onx_add_getitem_1
      Unsqueeze(_onx_add_getitem_1, init7_s1_0) -> add::UnSq0
  Concat(_onx_gather_size, add::UnSq0, axis=0) -> _onx_concat_getitem::UnSq0
    ConstantOfShape(_onx_concat_getitem::UnSq0, value=[0.0]) -> output
output: name='output' type=dtype('float32') shape=['batch', 'add']

new-tracing#

inputs: #2[(T1s3x4,),(T1s5x4,)]
shapes: dict(x:{0:Dim(batch),1:Dim(seq)})

opset: domain='' version=21
input: name='x' type=dtype('float32') shape=['batch', 'seq']
init: name='init7_s1_1' type=int64 shape=(1,) -- array([1])           -- GraphBuilder.get_dimension_as_result.axis_name##GraphBuilder._eval_dim_expr_node_as_1d.const
Shape(x, end=2, start=1) -> seq
  Add(seq, init7_s1_1) -> _onx_add_seq
Shape(x, end=1, start=0) -> x::Shape:1
  Concat(x::Shape:1, _onx_add_seq, axis=0) -> _onx_concat_sym_size_int::UnSq0
    ConstantOfShape(_onx_concat_sym_size_int::UnSq0, value=[0.0]) -> output
output: name='output' type=dtype('float32') shape=['batch', 'seq+1']

SignatureFloat1#

code: yobx.torch.testing._model_eval_cases.SignatureFloat1

forward#

def forward(self, x, alpha: float = 2.0):
    return torch.sigmoid(self.linear(x)) - self.buff * alpha

yobx#

inputs: #2[(T1s4x3,float),(T1s8x3,float)]
shapes: ({0:Dim(batch)},None)

opset: domain='' version=21
input: name='x' type=dtype('float32') shape=['batch', 3]
init: name='mul' type=float32 shape=(1,) -- array([0.75], dtype=float32)-- GraphBuilder.constant_folding.from/fold(_onx_mul_b_buff)##_onx_mul_b_buff/GraphBuilder.constant_folding.from/fold(b_buff,init1_s_::RSh1)##b_buff/DynamoInterpret.placeholder.0##init1_s_::RSh1/GraphBuilder.constant_folding.from/fold(init1_s_,init7_s1_1)##init1_s_/shape_type_compute._cast_inputs.1(mul_Tensor)##init7_s1_1/Opset.make_node.1/Shape
init: name='GemmTransposePattern--p_linear_weight::T10' type=float32 shape=(1, 3) -- array([-0.40779606, -0.18821955, -0.0236439 ], dtype=float32)-- GraphBuilder.constant_folding.from/fold(init7_s2_1_3,p_linear_weight::T10)##p_linear_weight::T10/GraphBuilder.constant_folding.from/fold(p_linear_weight)##p_linear_weight/DynamoInterpret.placeholder.1/P(linear.weight)##init7_s2_1_3/TransposeEqualReshapePattern.apply.new_shape
init: name='linear.bias' type=float32 shape=(1,) -- array([-0.18471454], dtype=float32)-- DynamoInterpret.placeholder.1/P(linear.bias)
Gemm(x, GemmTransposePattern--p_linear_weight::T10, linear.bias, transB=1) -> _onx_add_matmul_x
  Sigmoid(_onx_add_matmul_x) -> sigmoid
    Sub(sigmoid, mul) -> output_0
output: name='output_0' type=dtype('float32') shape=['batch', 1]

FAILED

diff.1

dynamo-ir#

inputs: #2[(T1s4x3,float),(T1s8x3,float)]
shapes: ({0:Dim(batch)},None)

opset: domain='' version=20
input: name='x' type=dtype('float32') shape=['s77', 3]
init: name='linear.weight' type=float32 shape=(1, 3) -- array([-0.28443968,  0.13885555,  0.13746437], dtype=float32)
init: name='linear.bias' type=float32 shape=(1,) -- array([0.4556576], dtype=float32)
init: name='mul_2' type=float32 shape=(1,) -- array([0.75], dtype=float32)
Gemm(x, linear.weight, linear.bias, beta=1.00, transB=1, alpha=1.00, transA=0) -> linear
  Sigmoid(linear) -> sigmoid
    Sub(sigmoid, mul_2) -> sub_2
output: name='sub_2' type=dtype('float32') shape=['s77', 1]

FAILED

diff.1

tracing#

FAILED

Unable to interpret method 'aten_meth_mul', args=(buff, alpha), kwargs={}, dispatcher=None
--DEBUG--
-- to print the exported program: PRINT_EXPORTED_PROGRAM=1

[GraphBuilder-OAA] Message starts, there are 3 initializers, 8 nodes, 2 inputs, 2 outputs.
input_names=['x', 'alpha']
output_names=[]
--CONSTRAINTS--
    batch = {'s26'}
    s26 = {'batch'}
--SHAPE--
_dynamic_examples=
dynamic_objects=
   batch = WrapSym(batch)
dynamic_objects_rev=
   'batch' = <class 'list'>
     tuple
       'batch'
       ERR**: <class 'torch.SymInt'>:'batch'
dynamic_dimensions_source={'batch': [{'axis': 0, 'input_name': 'x'}]}
dynamic_dimensions_source_flat=[0]
output_dynamic_dimensions_source_flat=None
dynamic_alias={'s26': 'batch'}
dynamic_shapes=({0: Dim('batch', min=1, max=1024)}, None)
_known_shapes={'_sub_ime__linear__onx_add_matmul_input_1': ('batch', 1),
 '_sub_ime__linear__onx_matmul_input_1': ('batch', 1),
 '_sub_ime__linear_input_1': ('batch', 3),
 '_sub_ime__linear_linear': ('batch', 1),
 '_sub_ime__linear_output': ('batch', 1),
 '_sub_ime__linear_weight::T10': (3, 1),
 'alpha': (),
 'buff': (1,),
 'linear': ('batch', 1),
 'linear.bias': (1,),
 'linear.weight': (1, 3),
 'sigmoid': ('batch', 1),
 'x': ('batch', 3)}
_known_types={'_sub_ime__linear__onx_add_matmul_input_1': 1,
 '_sub_ime__linear__onx_matmul_input_1': 1,
 '_sub_ime__linear_input_1': 1,
 '_sub_ime__linear_linear': 1,
 '_sub_ime__linear_output': 1,
 '_sub_ime__linear_weight::T10': 1,
 'alpha': 1,
 'buff': 1,
 'linear': 1,
 'linear.bias': 1,
 'linear.weight': 1,
 'sigmoid': 1,
 'x': 1}
_known_devices={'_sub_ime__linear_input_1': -1, 'alpha': -1, 'x': -1}
_context=[]
_known_value_shape={}
_known_constants=['_sub_ime__linear_weight::T10', 'buff', 'linear.bias', 'linear.weight']
_known_ranks (with no shape)={}
--PARAMETERS--
_parameter_renaming=
--TORCH-USERS--
    alpha -> {mul}
    buff -> {mul}
    linear -> {sigmoid}
    mul -> {sub}
    sigmoid -> {sub}
    x -> {linear}
--TORCH-SHAPES--
    x: ('run_node', (('example_value', torch.float32, torch.Size([4, 3])), ('val', torch.float32, torch.Size([s26, 3])))) --- 1:2:('batch', 3):
    alpha: ('run_node', (('example_value', torch.float32, torch.Size([])), ('val', torch.float32, torch.Size([])))) --- 1:0:():
    linear: ('run_node', ('', '')) --- 1:2:('batch', 1):
    sigmoid: ('run_node', ('', '')) --- 1:2:('batch', 1):
    buff: ('run_node', ('', '')) --- 1:1:(1,):
    mul: ('run_node', ('', '')) --- :::
--ONNX--
-- EXEPATH --
export
export_options=ExportOptions(tracing=TracingMode.TRACING, aten_as_function=('aten.histc.default', 'aten.index_copy.default', 'aten.index_put.default', 'aten._grouped_mm.default', 'aten.setitem', <built-in function setitem>))
function_options=None
-- process.graph_module --
SignatureFloat1(
  (linear): Linear(in_features=3, out_features=1, bias=True)
)



def forward(self, x, alpha : float = 2.0):
    linear = self.linear(x);  x = None
    sigmoid = torch.sigmoid(linear);  linear = None
    buff = self.buff
    mul = buff.mul(alpha);  buff = alpha = None
    sub = sigmoid - mul;  sigmoid = mul = None
    return sub

# To see more debug info, please use `graph_module.print_readable()`
-- process.graph_module.graph --
graph():
    %x : [num_users=1] = placeholder[target=x]
    %alpha : float [num_users=1] = placeholder[target=alpha](default=2.0)
    %linear : [num_users=1] = call_module[target=linear](args = (%x,), kwargs = {})
    %sigmoid : [num_users=1] = call_function[target=torch.sigmoid](args = (%linear,), kwargs = {})
    %buff : [num_users=1] = get_attr[target=buff]
    %mul : [num_users=1] = call_method[target=mul](args = (%buff, %alpha), kwargs = {})
    %sub : [num_users=1] = call_function[target=operator.sub](args = (%sigmoid, %mul), kwargs = {})
    return sub
-- process.inputs_to_remove --
set()
-- process.progress --
node 5/8 target=mul
-- 2 INPUTS
[GraphBuilder-OAA.1.make_tensor_input] x[1:batchx3]
[GraphBuilder-OAA.1.make_tensor_input] alpha[1:]
-- 3 INITIALIZERS
[GraphBuilder-OAA.1.make_initializer] linear.weight[torch.float32:torch.float32:[-0.4114268124103546, -0.2296745926141739, 0.4496452808380127]] - SOURCE: GraphBuilder.make_nodes/fromlinear.weight##DynamoInterpret.get_attr.1/P(linear.weight)
[GraphBuilder-OAA.1.make_initializer] linear.bias[torch.float32:torch.float32:[-0.007972127757966518]] - SOURCE: GraphBuilder.make_nodes/fromlinear.bias##DynamoInterpret.get_attr.1/P(linear.bias)
[GraphBuilder-OAA.1.make_initializer] buff[torch.float32:torch.float32:[0.5]] - SOURCE: DynamoInterpret.get_attr.0
[GraphBuilder-OAA.4.make_node] .make_nodes     [@:@   ] Identity:['x']->['_sub_ime__linear_input_1']
[GraphBuilder-OAA.4.make_node] linear          [#:#   ] Transpose:['linear.weight']->['_sub_ime__linear_weight::T10']
[GraphBuilder-OAA.4.make_node] linear2         [@#:#  ] MatMul:['_sub_ime__linear_input_1', '_sub_ime__linear_weight::T10']->['_sub_ime__linear__onx_matmul_input_1']
[GraphBuilder-OAA.4.make_node] linear3         [##:#  ] Add:['_sub_ime__linear__onx_matmul_input_1', 'linear.bias']->['_sub_ime__linear__onx_add_matmul_input_1']
[GraphBuilder-OAA.4.make_node] linear4         [#:#   ] Identity:['_sub_ime__linear__onx_add_matmul_input_1']->['_sub_ime__linear_linear']
[GraphBuilder-OAA.4.make_node] .output         [#:#   ] Identity:['_sub_ime__linear_linear']->['_sub_ime__linear_output']
[GraphBuilder-OAA.4.make_node] .make_nodes2    [#:#   ] Identity:['_sub_ime__linear_output']->['linear']
[GraphBuilder-OAA.4.make_node] sigmoid         [#:#   ] Sigmoid:['linear']->['sigmoid']
-- 0 OUTPUTS
[GraphBuilder-OAA] Message completed, there are 3 initializers, 8 nodes, 2 inputs, 2 outputs.

new-tracing#

inputs: #2[(T1s4x3,float),(T1s8x3,float)]
shapes: ({0:Dim(batch)},None)

opset: domain='' version=21
input: name='x' type=dtype('float32') shape=['batch', 3]
init: name='param_2' type=float32 shape=(1,) -- array([0.75], dtype=float32)-- DynamoInterpret.placeholder.0
init: name='GemmTransposePattern--param_1' type=float32 shape=(1, 3) -- array([-0.19185105,  0.19220296,  0.4914421 ], dtype=float32)-- GraphBuilder.constant_folding.from/fold(init7_s2_1_3,param_1)##param_1/DynamoInterpret.placeholder.0##init7_s2_1_3/TransposeEqualReshapePattern.apply.new_shape
init: name='linear.bias' type=float32 shape=(1,) -- array([0.53566164], dtype=float32)-- DynamoInterpret.placeholder.1/P(linear.bias)
Gemm(x, GemmTransposePattern--param_1, linear.bias, transB=1) -> addmm_default
  Sigmoid(addmm_default) -> sigmoid_default
    Sub(sigmoid_default, param_2) -> output
output: name='output' type=dtype('float32') shape=['batch', 1]

FAILED

diff.1

SignatureInt1#

code: yobx.torch.testing._model_eval_cases.SignatureInt1

forward#

def forward(self, x, i: int = 2):
    return torch.sigmoid(self.linear(x)) - self.buff + x[:, i : i + 1]

yobx#

inputs: #2[(T1s4x3,int),(T1s8x3,int)]
shapes: ({0:Dim(batch)},None)

opset: domain='' version=21
input: name='x' type=dtype('float32') shape=['batch', 3]
init: name='b_buff' type=float32 shape=(1,) -- array([0.5], dtype=float32)-- DynamoInterpret.placeholder.0
init: name='init7_s1_1' type=int64 shape=(1,) -- array([1])           -- Opset.make_node.1/Shape##Opset.make_node.1/Shape
init: name='init7_s1_2' type=int64 shape=(1,) -- array([2])           -- Opset.make_node.1/Shape
init: name='GemmTransposePattern--p_linear_weight::T10' type=float32 shape=(1, 3) -- array([-0.07115567, -0.22352682, -0.52554685], dtype=float32)-- GraphBuilder.constant_folding.from/fold(init7_s2_1_3,p_linear_weight::T10)##p_linear_weight::T10/GraphBuilder.constant_folding.from/fold(p_linear_weight)##p_linear_weight/DynamoInterpret.placeholder.1/P(linear.weight)##init7_s2_1_3/TransposeEqualReshapePattern.apply.new_shape
init: name='linear.bias' type=float32 shape=(1,) -- array([-0.20124099], dtype=float32)-- DynamoInterpret.placeholder.1/P(linear.bias)
Gemm(x, GemmTransposePattern--p_linear_weight::T10, linear.bias, transB=1) -> _onx_add_matmul_x
  Sigmoid(_onx_add_matmul_x) -> sigmoid
    Sub(sigmoid, b_buff) -> sub
Slice(x, init7_s1_1, init7_s1_2, init7_s1_1) -> slice_1
  Add(sub, slice_1) -> output_0
output: name='output_0' type=dtype('float32') shape=['batch', 1]

FAILED

diff.1

dynamo-ir#

inputs: #2[(T1s4x3,int),(T1s8x3,int)]
shapes: ({0:Dim(batch)},None)

opset: domain='' version=20
input: name='x' type=dtype('float32') shape=['s77', 3]
init: name='linear.weight' type=float32 shape=(1, 3) -- array([-0.52504116,  0.10829681,  0.28061345], dtype=float32)
init: name='linear.bias' type=float32 shape=(1,) -- array([0.33598346], dtype=float32)
init: name='buff' type=float32 shape=(1,) -- array([0.5], dtype=float32)
init: name='val_3' type=int64 shape=(1,) -- array([1])
init: name='val_7' type=int64 shape=(1,) -- array([2])
Gemm(x, linear.weight, linear.bias, beta=1.00, transB=1, alpha=1.00, transA=0) -> linear
  Sigmoid(linear) -> sigmoid
    Sub(sigmoid, buff) -> sub_2
Slice(x, val_3, val_7, val_3, val_3) -> slice_1
  Add(sub_2, slice_1) -> add_12
output: name='add_12' type=dtype('float32') shape=['s77', 1]

FAILED

diff.1

tracing#

FAILED

[ONNXRuntimeError] : 6 : RUNTIME_EXCEPTION : Non-zero status code returned while running Concat node. Name:'_getitem_slicenSD' Status Message: ~/github/onnxruntime/onnxruntime/core/providers/cpu/tensor/concatbase.h:118 onnxruntime::common::Status onnxruntime::ConcatBase::PrepareForComputeImpl(KernelContextType*, const InlinedTensorsVector&, onnxruntime::Prepare&) const [with KernelContextType = onnxruntime::OpKernelContext; InlinedTensorsVector = absl::lts_20250814::InlinedVector<const onnxruntime::Tensor*, 5, std::allocator<const onnxruntime::Tensor*> >] input_rank == reference_rank was false. Ranks of input data are different, cannot concatenate them. expected rank: 1 got: 2

new-tracing#

inputs: #2[(T1s4x3,int),(T1s8x3,int)]
shapes: ({0:Dim(batch)},None)

opset: domain='' version=21
input: name='x' type=dtype('float32') shape=['batch', 3]
init: name='buff' type=float32 shape=(1,) -- array([0.5], dtype=float32)-- DynamoInterpret.placeholder.0
init: name='init7_s1_1' type=int64 shape=(1,) -- array([1])           -- Opset.make_node.1/Shape##Opset.make_node.1/Shape
init: name='init7_s1_2' type=int64 shape=(1,) -- array([2])           -- Opset.make_node.1/Shape
init: name='GemmTransposePattern--param_1' type=float32 shape=(1, 3) -- array([ 0.1572107 , -0.06829563,  0.06585996], dtype=float32)-- GraphBuilder.constant_folding.from/fold(init7_s2_1_3,param_1)##param_1/DynamoInterpret.placeholder.0##init7_s2_1_3/TransposeEqualReshapePattern.apply.new_shape
init: name='linear.bias' type=float32 shape=(1,) -- array([0.21279328], dtype=float32)-- DynamoInterpret.placeholder.1/P(linear.bias)
Gemm(x, GemmTransposePattern--param_1, linear.bias, transB=1) -> addmm_default
  Sigmoid(addmm_default) -> sigmoid_default
    Sub(sigmoid_default, buff) -> sub_tensor
Slice(x, init7_s1_1, init7_s1_2, init7_s1_1) -> slice_tensor
  Add(sub_tensor, slice_tensor) -> output
output: name='output' type=dtype('float32') shape=['batch', 1]

FAILED

diff.1

SignatureInt2#

code: yobx.torch.testing._model_eval_cases.SignatureInt2

forward#

def forward(self, x, i: int = 2):
    return torch.sigmoid(self.linear(x)) - self.buff + x[:, i]

yobx#

inputs: #1[(T1s4x3,int)]
shapes: dict(x:{0:Dim(batch)},i:None)

opset: domain='' version=21
input: name='x' type=dtype('float32') shape=['batch', 3]
init: name='b_buff' type=float32 shape=(1,) -- array([0.5], dtype=float32)-- DynamoInterpret.placeholder.0
init: name='init7_s_1' type=int64 shape=() -- array([1])              -- Opset.make_node.1/Shape
init: name='GemmTransposePattern--p_linear_weight::T10' type=float32 shape=(1, 3) -- array([0.04344961, 0.3944711 , 0.34804332], dtype=float32)-- GraphBuilder.constant_folding.from/fold(init7_s2_1_3,p_linear_weight::T10)##p_linear_weight::T10/GraphBuilder.constant_folding.from/fold(p_linear_weight)##p_linear_weight/DynamoInterpret.placeholder.1/P(linear.weight)##init7_s2_1_3/TransposeEqualReshapePattern.apply.new_shape
init: name='linear.bias' type=float32 shape=(1,) -- array([-0.15095033], dtype=float32)-- DynamoInterpret.placeholder.1/P(linear.bias)
Gather(x, init7_s_1, axis=1) -> select
Gemm(x, GemmTransposePattern--p_linear_weight::T10, linear.bias, transB=1) -> _onx_add_matmul_x
  Sigmoid(_onx_add_matmul_x) -> sigmoid
    Sub(sigmoid, b_buff) -> sub
  Add(sub, select) -> output_0
output: name='output_0' type=dtype('float32') shape=['batch', 'batch']

dynamo-ir#

inputs: #1[(T1s4x3,int)]
shapes: dict(x:{0:Dim(batch)},i:None)

opset: domain='' version=20
input: name='x' type=dtype('float32') shape=['s77', 3]
init: name='linear.weight' type=float32 shape=(1, 3) -- array([-0.55287105, -0.28999823,  0.42486408], dtype=float32)
init: name='linear.bias' type=float32 shape=(1,) -- array([0.4001918], dtype=float32)
init: name='buff' type=float32 shape=(1,) -- array([0.5], dtype=float32)
init: name='val_12' type=int64 shape=() -- array([1])
Gather(x, val_12, axis=1) -> select
Gemm(x, linear.weight, linear.bias, beta=1.00, transB=1, alpha=1.00, transA=0) -> linear
  Sigmoid(linear) -> sigmoid
    Sub(sigmoid, buff) -> sub_2
  Add(sub_2, select) -> add_14
output: name='add_14' type=dtype('float32') shape=['s77', 's77']

tracing#

inputs: #1[(T1s4x3,int)]
shapes: dict(x:{0:Dim(batch)},i:None)

opset: domain='' version=21
input: name='x' type=dtype('float32') shape=['batch', 3]
input: name='i' type=dtype('int64') shape=None
init: name='linear.bias' type=float32 shape=(1,) -- array([0.52044857], dtype=float32)-- GraphBuilder.make_nodes/fromlinear.bias##DynamoInterpret.get_attr.1/P(linear.bias)
init: name='buff' type=float32 shape=(1,) -- array([0.5], dtype=float32)-- DynamoInterpret.get_attr.0
init: name='getitem_axis' type=int64 shape=(2,) -- array([0, 1])      -- _getitem_slice.axis.1
init: name='getitem_axis_0' type=int64 shape=(1,) -- array([0])       -- _getitem_slice.axis.2##Opset.make_node.1/Shape##Opset.make_node.1/Shape##Opset.make_node.1/Shape
init: name='init7_s_1' type=int64 shape=() -- array([1])              -- _getitem_slice.int_end
init: name='getitem_step' type=int64 shape=(2,) -- array([1, 1])      -- _getitem_slice.3
init: name='init7_s1_1' type=int64 shape=(1,) -- array([1])           -- Opset.make_node.1/Shape
init: name='GemmTransposePattern--_sub_ime__linear_weight::T10' type=float32 shape=(1, 3) -- array([-0.01572291, -0.42851564, -0.208283  ], dtype=float32)-- GraphBuilder.constant_folding.from/fold(_sub_ime__linear_weight::T10,init7_s2_1_3)##_sub_ime__linear_weight::T10/GraphBuilder.constant_folding.from/fold(linear.weight)##linear.weight/GraphBuilder.make_nodes/fromlinear.weight##DynamoInterpret.get_attr.1/P(linear.weight)##init7_s2_1_3/TransposeEqualReshapePattern.apply.new_shape
Add(i, init7_s_1) -> getitem_slice_end_1
  Unsqueeze(getitem_slice_end_1, getitem_axis_0) -> getitem_slice_end_1::UnSq0
Shape(x) -> getitem_shape
  GatherElements(getitem_shape, getitem_axis_0) -> getitem_end
    Concat(getitem_end, getitem_slice_end_1::UnSq0, axis=0) -> _onx_concat_getitem_end
Gemm(x, GemmTransposePattern--_sub_ime__linear_weight::T10, linear.bias, transB=1) -> _sub_ime__linear__onx_add_matmul_input_1
  Sigmoid(_sub_ime__linear__onx_add_matmul_input_1) -> sigmoid
    Sub(sigmoid, buff) -> sub
Unsqueeze(i, getitem_axis_0) -> i::UnSq0
  Concat(getitem_axis_0, i::UnSq0, axis=0) -> _onx_concat_getitem_axis_0
    Slice(x, _onx_concat_getitem_axis_0, _onx_concat_getitem_end, getitem_axis, getitem_step) -> getitem_sliced
      Squeeze(getitem_sliced, init7_s1_1) -> getitem
      Add(sub, getitem) -> output
output: name='output' type=dtype('float32') shape=['batch', 'NEWDIM_slice']

FAILED

[ONNXRuntimeError] : 6 : RUNTIME_EXCEPTION : Non-zero status code returned while running Concat node. Name:'_getitem_slicenSD' Status Message: ~/github/onnxruntime/onnxruntime/core/providers/cpu/tensor/concatbase.h:118 onnxruntime::common::Status onnxruntime::ConcatBase::PrepareForComputeImpl(KernelContextType*, const InlinedTensorsVector&, onnxruntime::Prepare&) const [with KernelContextType = onnxruntime::OpKernelContext; InlinedTensorsVector = absl::lts_20250814::InlinedVector<const onnxruntime::Tensor*, 5, std::allocator<const onnxruntime::Tensor*> >] input_rank == reference_rank was false. Ranks of input data are different, cannot concatenate them. expected rank: 1 got: 2

new-tracing#

inputs: #1[(T1s4x3,int)]
shapes: dict(x:{0:Dim(batch)},i:None)

opset: domain='' version=21
input: name='x' type=dtype('float32') shape=['batch', 3]
init: name='buff' type=float32 shape=(1,) -- array([0.5], dtype=float32)-- DynamoInterpret.placeholder.0
init: name='init7_s_1' type=int64 shape=() -- array([1])              -- Opset.make_node.1/Shape
init: name='GemmTransposePattern--param_1' type=float32 shape=(1, 3) -- array([ 0.38627467,  0.50826746, -0.53939   ], dtype=float32)-- GraphBuilder.constant_folding.from/fold(init7_s2_1_3,param_1)##param_1/DynamoInterpret.placeholder.0##init7_s2_1_3/TransposeEqualReshapePattern.apply.new_shape
init: name='linear.bias' type=float32 shape=(1,) -- array([0.2202102], dtype=float32)-- DynamoInterpret.placeholder.1/P(linear.bias)
Gather(x, init7_s_1, axis=1) -> select_int
Gemm(x, GemmTransposePattern--param_1, linear.bias, transB=1) -> addmm_default
  Sigmoid(addmm_default) -> sigmoid_default
    Sub(sigmoid_default, buff) -> sub_tensor
  Add(sub_tensor, select_int) -> output
output: name='output' type=dtype('float32') shape=['batch', 'batch']

SignatureListFixedLength#

code: yobx.torch.testing._model_eval_cases.SignatureListFixedLength

forward#

def forward(self, x, lx: list):
    return torch.sigmoid(self.linear(x)) - self.buff + lx[0] * lx[1].sum(axis=1, keepdim=True)

yobx#

inputs: #2[(T1s4x3,#2[T1s4x1,T1s4x2]),(T1s8x3,#2[T1s8x1,T1s8x2])]
shapes: dict(x:{0:Dim(batch)},lx:#2[{0:Dim(batch)},{0:Dim(batch)}])

opset: domain='' version=21
input: name='x' type=dtype('float32') shape=['batch', 3]
input: name='lx_0' type=dtype('float32') shape=['batch', 1]
input: name='lx_1' type=dtype('float32') shape=['batch', 2]
init: name='b_buff' type=float32 shape=(1,) -- array([0.5], dtype=float32)-- DynamoInterpret.placeholder.0
init: name='init7_s1_1' type=int64 shape=(1,) -- array([1])           -- Opset.make_node.1/Shape
init: name='GemmTransposePattern--p_linear_weight::T10' type=float32 shape=(1, 3) -- array([ 0.13923305,  0.17565916, -0.00872061], dtype=float32)-- GraphBuilder.constant_folding.from/fold(init7_s2_1_3,p_linear_weight::T10)##p_linear_weight::T10/GraphBuilder.constant_folding.from/fold(p_linear_weight)##p_linear_weight/DynamoInterpret.placeholder.1/P(linear.weight)##init7_s2_1_3/TransposeEqualReshapePattern.apply.new_shape
init: name='linear.bias' type=float32 shape=(1,) -- array([-0.19147561], dtype=float32)-- DynamoInterpret.placeholder.1/P(linear.bias)
Gemm(x, GemmTransposePattern--p_linear_weight::T10, linear.bias, transB=1) -> _onx_add_matmul_x
  Sigmoid(_onx_add_matmul_x) -> sigmoid
    Sub(sigmoid, b_buff) -> sub
ReduceSum(lx_1, init7_s1_1, keepdims=1) -> sum_1
  Mul(lx_0, sum_1) -> mul
    Add(sub, mul) -> output_0
output: name='output_0' type=dtype('float32') shape=['batch', 1]

dynamo-ir#

inputs: #2[(T1s4x3,#2[T1s4x1,T1s4x2]),(T1s8x3,#2[T1s8x1,T1s8x2])]
shapes: dict(x:{0:Dim(batch)},lx:#2[{0:Dim(batch)},{0:Dim(batch)}])

opset: domain='' version=20
input: name='x' type=dtype('float32') shape=['batch', 3]
input: name='lx_0' type=dtype('float32') shape=['batch', 1]
input: name='lx_1' type=dtype('float32') shape=['batch', 2]
init: name='linear.weight' type=float32 shape=(1, 3) -- array([-0.3585697,  0.3834251, -0.0500835], dtype=float32)
init: name='linear.bias' type=float32 shape=(1,) -- array([-0.2717338], dtype=float32)
init: name='buff' type=float32 shape=(1,) -- array([0.5], dtype=float32)
init: name='val_3' type=int64 shape=(1,) -- array([1])
Gemm(x, linear.weight, linear.bias, beta=1.00, transB=1, alpha=1.00, transA=0) -> linear
  Sigmoid(linear) -> sigmoid
    Sub(sigmoid, buff) -> sub_2
ReduceSum(lx_1, val_3, noop_with_empty_axes=0, keepdims=1) -> sum_1
  Mul(lx_0, sum_1) -> mul_4
    Add(sub_2, mul_4) -> add_15
output: name='add_15' type=dtype('float32') shape=['batch', 1]

tracing#

inputs: #2[(T1s4x3,#2[T1s4x1,T1s4x2]),(T1s8x3,#2[T1s8x1,T1s8x2])]
shapes: dict(x:{0:Dim(batch)},lx:#2[{0:Dim(batch)},{0:Dim(batch)}])

opset: domain='' version=21
input: name='x' type=dtype('float32') shape=['batch', 3]
input: name='lx_0' type=dtype('float32') shape=['batch', 1]
input: name='lx_1' type=dtype('float32') shape=['batch', 2]
init: name='_traced_m2.linear.bias' type=float32 shape=(1,) -- array([-0.42563033], dtype=float32)-- GraphBuilder.make_nodes/from_traced_m2.linear.bias##DynamoInterpret.get_attr.1/P(_traced_m2.linear.bias)
init: name='_traced_m2_buff' type=float32 shape=(1,) -- array([0.5], dtype=float32)-- DynamoInterpret.get_attr.0
init: name='init7_s1_1' type=int64 shape=(1,) -- array([1])           -- Opset.make_node.1/Shape
init: name='GemmTransposePattern--_sub_ime___traced_m2_linear_weight::T10' type=float32 shape=(1, 3) -- array([ 0.57207966, -0.32817608,  0.03737363], dtype=float32)-- GraphBuilder.constant_folding.from/fold(_sub_ime___traced_m2_linear_weight::T10,init7_s2_1_3)##_sub_ime___traced_m2_linear_weight::T10/GraphBuilder.constant_folding.from/fold(_traced_m2.linear.weight)##_traced_m2.linear.weight/GraphBuilder.make_nodes/from_traced_m2.linear.weight##DynamoInterpret.get_attr.1/P(_traced_m2.linear.weight)##init7_s2_1_3/TransposeEqualReshapePattern.apply.new_shape
Gemm(x, GemmTransposePattern--_sub_ime___traced_m2_linear_weight::T10, _traced_m2.linear.bias, transB=1) -> _sub_ime___traced_m2_linear__onx_add_matmul_input_1
  Sigmoid(_sub_ime___traced_m2_linear__onx_add_matmul_input_1) -> sigmoid
    Sub(sigmoid, _traced_m2_buff) -> sub
ReduceSum(lx_1, init7_s1_1, keepdims=1) -> sum_1
  Mul(lx_0, sum_1) -> mul
    Add(sub, mul) -> output
output: name='output' type=dtype('float32') shape=['batch', 1]

new-tracing#

inputs: #2[(T1s4x3,#2[T1s4x1,T1s4x2]),(T1s8x3,#2[T1s8x1,T1s8x2])]
shapes: dict(x:{0:Dim(batch)},lx:#2[{0:Dim(batch)},{0:Dim(batch)}])

opset: domain='' version=21
input: name='x' type=dtype('float32') shape=['batch', 3]
input: name='lx_0' type=dtype('float32') shape=['batch', 1]
input: name='lx_1' type=dtype('float32') shape=['batch', 2]
init: name='buff' type=float32 shape=(1,) -- array([0.5], dtype=float32)-- DynamoInterpret.placeholder.0
init: name='init7_s1_1' type=int64 shape=(1,) -- array([1])           -- Opset.make_node.1/Shape
init: name='GemmTransposePattern--param_1' type=float32 shape=(1, 3) -- array([ 0.5291415 ,  0.13330814, -0.1590451 ], dtype=float32)-- GraphBuilder.constant_folding.from/fold(init7_s2_1_3,param_1)##param_1/DynamoInterpret.placeholder.0##init7_s2_1_3/TransposeEqualReshapePattern.apply.new_shape
init: name='linear.bias' type=float32 shape=(1,) -- array([0.47899845], dtype=float32)-- DynamoInterpret.placeholder.1/P(linear.bias)
Gemm(x, GemmTransposePattern--param_1, linear.bias, transB=1) -> addmm_default
  Sigmoid(addmm_default) -> sigmoid_default
    Sub(sigmoid_default, buff) -> sub_tensor
ReduceSum(lx_1, init7_s1_1, keepdims=1) -> sum_dim_int_list
  Mul(lx_0, sum_dim_int_list) -> mul_tensor
    Add(sub_tensor, mul_tensor) -> output
output: name='output' type=dtype('float32') shape=['batch', 1]

SignatureListFixedWithNone#

code: yobx.torch.testing._model_eval_cases.SignatureListFixedWithNone

forward#

def forward(self, lx):
    x = lx[0]
    if lx[1] is not None:
        x += lx[1]
    if lx[2] is not None:
        x += lx[2]
    return x

yobx#

FAILED

Detected mismatch between the structure of `inputs` and `dynamic_shapes`: `inputs['lx']` has 3 elements, but `dynamic_shapes['lx']` has 2 elements
For more information about this error, see: https://pytorch.org/docs/main/generated/exportdb/index.html#dynamic-shapes-validation

The error above occurred when calling torch.export.export. If you would like to view some more information about this error, and get a list of all other errors that may occur in your export call, you can replace your `export()` call with `draft_export()`.

dynamo-ir#

FAILED

Failed to export the model with torch.export. [96mThis is step 1/3[0m of exporting the model to ONNX. Next steps:
- Modify the model code for `torch.export.export` to succeed. Refer to https://pytorch.org/docs/stable/generated/exportdb/index.html for more information.
- Debug `torch.export.export` and submit a PR to PyTorch.
- Create an issue in the PyTorch GitHub repository against the [96m*torch.export*[0m component and attach the full error stack as well as reproduction scripts.

## Exception summary

<class 'torch._dynamo.exc.UserError'>: Detected mismatch between the structure of `inputs` and `dynamic_shapes`: `inputs['lx']` has 3 elements, but `dynamic_shapes['lx']` has 2 elements
For more information about this error, see: https://pytorch.org/docs/main/generated/exportdb/index.html#dynamic-shapes-validation

The error above occurred when calling torch.export.export. If you would like to view some more information about this error, and get a list of all other errors that may occur in your export call, you can replace your `export()` call with `draft_export()`.

(Refer to the full stack trace above for more information.)

tracing#

FAILED

Length mismatch between x (len=3) and dynamic_shapes (len=2); dynamic_shapes must have one entry per element of x, or be None to use no dynamic dimensions, dynamic_shapes=[{0: Dim('batch', min=0)}, {0: Dim('batch', min=0)}]

new-tracing#

FAILED

Length mismatch between arg (3) and the dynamic_shapes (2), name='lx'

SignatureListVariableLength#

code: yobx.torch.testing._model_eval_cases.SignatureListVariableLength

forward#

def forward(self, x, lx: list):
    t = torch.cat(lx, dim=1).sum(dim=1, keepdim=True)
    return torch.sigmoid(self.linear(x)) - self.buff + t

yobx#

inputs: #2[(T1s4x3,#2[T1s4x1,T1s4x2]),(T1s8x3,#3[T1s8x1,T1s8x2,T1s8x3])]
shapes: dict(x:{0:Dim(batch)},lx:#2[{0:Dim(batch)},{0:Dim(batch)}])

opset: domain='' version=21
input: name='x' type=dtype('float32') shape=['batch', 3]
input: name='lx_0' type=dtype('float32') shape=['batch', 1]
input: name='lx_1' type=dtype('float32') shape=['batch', 2]
init: name='b_buff' type=float32 shape=(1,) -- array([0.5], dtype=float32)-- DynamoInterpret.placeholder.0
init: name='init7_s1_1' type=int64 shape=(1,) -- array([1])           -- Opset.make_node.1/Shape
init: name='GemmTransposePattern--p_linear_weight::T10' type=float32 shape=(1, 3) -- array([-0.25930914, -0.15288831,  0.42998803], dtype=float32)-- GraphBuilder.constant_folding.from/fold(init7_s2_1_3,p_linear_weight::T10)##p_linear_weight::T10/GraphBuilder.constant_folding.from/fold(p_linear_weight)##p_linear_weight/DynamoInterpret.placeholder.1/P(linear.weight)##init7_s2_1_3/TransposeEqualReshapePattern.apply.new_shape
init: name='linear.bias' type=float32 shape=(1,) -- array([-0.05488036], dtype=float32)-- DynamoInterpret.placeholder.1/P(linear.bias)
Concat(lx_0, lx_1, axis=1) -> cat
  ReduceSum(cat, init7_s1_1, keepdims=1) -> sum_1
Gemm(x, GemmTransposePattern--p_linear_weight::T10, linear.bias, transB=1) -> _onx_add_matmul_x
  Sigmoid(_onx_add_matmul_x) -> sigmoid
    Sub(sigmoid, b_buff) -> sub
    Add(sub, sum_1) -> output_0
output: name='output_0' type=dtype('float32') shape=['batch', 1]

FAILED

diff.1

dynamo-ir#

inputs: #2[(T1s4x3,#2[T1s4x1,T1s4x2]),(T1s8x3,#3[T1s8x1,T1s8x2,T1s8x3])]
shapes: dict(x:{0:Dim(batch)},lx:#2[{0:Dim(batch)},{0:Dim(batch)}])

opset: domain='' version=20
input: name='x' type=dtype('float32') shape=['batch', 3]
input: name='lx_0' type=dtype('float32') shape=['batch', 1]
input: name='lx_1' type=dtype('float32') shape=['batch', 2]
init: name='linear.weight' type=float32 shape=(1, 3) -- array([-0.38115522,  0.27257815, -0.23105936], dtype=float32)
init: name='linear.bias' type=float32 shape=(1,) -- array([-0.1743596], dtype=float32)
init: name='buff' type=float32 shape=(1,) -- array([0.5], dtype=float32)
init: name='val_3' type=int64 shape=(1,) -- array([1])
Concat(lx_0, lx_1, axis=1) -> cat
  ReduceSum(cat, val_3, noop_with_empty_axes=0, keepdims=1) -> sum_1
Gemm(x, linear.weight, linear.bias, beta=1.00, transB=1, alpha=1.00, transA=0) -> linear
  Sigmoid(linear) -> sigmoid
    Sub(sigmoid, buff) -> sub_4
    Add(sub_4, sum_1) -> add_15
output: name='add_15' type=dtype('float32') shape=['batch', 1]

FAILED

diff.1

tracing#

inputs: #2[(T1s4x3,#2[T1s4x1,T1s4x2]),(T1s8x3,#3[T1s8x1,T1s8x2,T1s8x3])]
shapes: dict(x:{0:Dim(batch)},lx:#2[{0:Dim(batch)},{0:Dim(batch)}])

opset: domain='' version=21
input: name='x' type=dtype('float32') shape=['batch', 3]
input: name='lx_0' type=dtype('float32') shape=['batch', 1]
input: name='lx_1' type=dtype('float32') shape=['batch', 2]
init: name='init7_s1_1' type=int64 shape=(1,) -- array([1])           -- Opset.make_node.1/Shape
init: name='_traced_m2.linear.bias' type=float32 shape=(1,) -- array([-0.3475805], dtype=float32)-- GraphBuilder.make_nodes/from_traced_m2.linear.bias##DynamoInterpret.get_attr.1/P(_traced_m2.linear.bias)
init: name='_traced_m2_buff' type=float32 shape=(1,) -- array([0.5], dtype=float32)-- DynamoInterpret.get_attr.0
init: name='GemmTransposePattern--_sub_ime___traced_m2_linear_weight::T10' type=float32 shape=(1, 3) -- array([0.30823436, 0.08077445, 0.23983806], dtype=float32)-- GraphBuilder.constant_folding.from/fold(_sub_ime___traced_m2_linear_weight::T10,init7_s2_1_3)##_sub_ime___traced_m2_linear_weight::T10/GraphBuilder.constant_folding.from/fold(_traced_m2.linear.weight)##_traced_m2.linear.weight/GraphBuilder.make_nodes/from_traced_m2.linear.weight##DynamoInterpret.get_attr.1/P(_traced_m2.linear.weight)##init7_s2_1_3/TransposeEqualReshapePattern.apply.new_shape
Concat(lx_0, lx_1, axis=1) -> cat
  ReduceSum(cat, init7_s1_1, keepdims=1) -> sum_1
Gemm(x, GemmTransposePattern--_sub_ime___traced_m2_linear_weight::T10, _traced_m2.linear.bias, transB=1) -> _sub_ime___traced_m2_linear__onx_add_matmul_input_1
  Sigmoid(_sub_ime___traced_m2_linear__onx_add_matmul_input_1) -> sigmoid
    Sub(sigmoid, _traced_m2_buff) -> sub
    Add(sub, sum_1) -> output
output: name='output' type=dtype('float32') shape=['batch', 1]

FAILED

diff.1

new-tracing#

inputs: #2[(T1s4x3,#2[T1s4x1,T1s4x2]),(T1s8x3,#3[T1s8x1,T1s8x2,T1s8x3])]
shapes: dict(x:{0:Dim(batch)},lx:#2[{0:Dim(batch)},{0:Dim(batch)}])

opset: domain='' version=21
input: name='x' type=dtype('float32') shape=['batch', 3]
input: name='lx_0' type=dtype('float32') shape=['batch', 1]
input: name='lx_1' type=dtype('float32') shape=['batch', 2]
init: name='buff' type=float32 shape=(1,) -- array([0.5], dtype=float32)-- DynamoInterpret.placeholder.0
init: name='init7_s1_1' type=int64 shape=(1,) -- array([1])           -- Opset.make_node.1/Shape
init: name='GemmTransposePattern--param_1' type=float32 shape=(1, 3) -- array([ 0.4909917 , -0.54415566, -0.29012865], dtype=float32)-- GraphBuilder.constant_folding.from/fold(init7_s2_1_3,param_1)##param_1/DynamoInterpret.placeholder.0##init7_s2_1_3/TransposeEqualReshapePattern.apply.new_shape
init: name='linear.bias' type=float32 shape=(1,) -- array([0.24830414], dtype=float32)-- DynamoInterpret.placeholder.1/P(linear.bias)
Concat(lx_0, lx_1, axis=1) -> cat_default
  ReduceSum(cat_default, init7_s1_1, keepdims=1) -> sum_dim_int_list
Gemm(x, GemmTransposePattern--param_1, linear.bias, transB=1) -> addmm_default
  Sigmoid(addmm_default) -> sigmoid_default
    Sub(sigmoid_default, buff) -> sub_tensor
    Add(sub_tensor, sum_dim_int_list) -> output
output: name='output' type=dtype('float32') shape=['batch', 1]

FAILED

diff.1

SignatureShapeAsIndex#

code: yobx.torch.testing._model_eval_cases.SignatureShapeAsIndex

forward#

def forward(self, x, y):
    t = torch.sigmoid(self.linear(x)) + x
    return t[:, : y.shape[1]]

yobx#

inputs: #1[(T1s4x3,T1s4x2)]
shapes: dict(x:{0:Dim(batch)},y:{0:Dim(batch),1:Dim(length)})

opset: domain='' version=21
input: name='x' type=dtype('float32') shape=['batch', 3]
input: name='y' type=dtype('float32') shape=['batch', 'length']
init: name='init7_s1_0' type=int64 shape=(1,) -- array([0])           -- Opset.make_node.1/Shape##Opset.make_node.1/Shape
init: name='init7_s1_1' type=int64 shape=(1,) -- array([1])           -- Opset.make_node.1/Shape
init: name='GemmTransposePattern--p_linear_weight::T10' type=float32 shape=(1, 3) -- array([0.34414497, 0.20510678, 0.43299755], dtype=float32)-- GraphBuilder.constant_folding.from/fold(init7_s2_1_3,p_linear_weight::T10)##p_linear_weight::T10/GraphBuilder.constant_folding.from/fold(p_linear_weight)##p_linear_weight/DynamoInterpret.placeholder.1/P(linear.weight)##init7_s2_1_3/TransposeEqualReshapePattern.apply.new_shape
init: name='linear.bias' type=float32 shape=(1,) -- array([0.09058628], dtype=float32)-- DynamoInterpret.placeholder.1/P(linear.bias)
Gemm(x, GemmTransposePattern--p_linear_weight::T10, linear.bias, transB=1) -> _onx_add_matmul_x
  Sigmoid(_onx_add_matmul_x) -> sigmoid
    Add(sigmoid, x) -> add
Shape(y, end=2, start=1) -> y::Shape1:2
  Slice(add, init7_s1_0, y::Shape1:2, init7_s1_1) -> output_0
output: name='output_0' type=dtype('float32') shape=['batch', 'length']

dynamo-ir#

inputs: #1[(T1s4x3,T1s4x2)]
shapes: dict(x:{0:Dim(batch)},y:{0:Dim(batch),1:Dim(length)})

opset: domain='' version=20
input: name='x' type=dtype('float32') shape=['batch', 3]
input: name='y' type=dtype('float32') shape=['batch', 'length']
init: name='linear.weight' type=float32 shape=(1, 3) -- array([ 0.19873925, -0.1482717 , -0.19476023], dtype=float32)
init: name='linear.bias' type=float32 shape=(1,) -- array([-0.45374033], dtype=float32)
init: name='val_8' type=int64 shape=(1,) -- array([1])
init: name='val_1' type=int64 shape=(1,) -- array([0])
Gemm(x, linear.weight, linear.bias, beta=1.00, transB=1, alpha=1.00, transA=0) -> linear
  Sigmoid(linear) -> sigmoid
    Add(sigmoid, x) -> add_6
Shape(y, end=2, start=1) -> val_0
  Slice(add_6, val_1, val_0, val_8, val_8) -> slice_1
output: name='slice_1' type=dtype('float32') shape=['batch', 'length']

tracing#

inputs: #1[(T1s4x3,T1s4x2)]
shapes: dict(x:{0:Dim(batch)},y:{0:Dim(batch),1:Dim(length)})

opset: domain='' version=21
input: name='x' type=dtype('float32') shape=['batch', 3]
input: name='y' type=dtype('float32') shape=['batch', 'length']
init: name='linear.bias' type=float32 shape=(1,) -- array([0.49676564], dtype=float32)-- GraphBuilder.make_nodes/fromlinear.bias##DynamoInterpret.get_attr.1/P(linear.bias)
init: name='init7_s1_0' type=int64 shape=(1,) -- array([0])           -- Opset.make_node.1/Shape##Opset.make_node.1/Shape##Opset.make_node.1/Shape##Opset.make_node.1/Shape
init: name='getitem_2_axis' type=int64 shape=(2,) -- array([0, 1])    -- _getitem_slice.axis.1
init: name='getitem_2_start' type=int64 shape=(2,) -- array([0, 0])   -- _getitem_slice.2
init: name='getitem_2_step' type=int64 shape=(2,) -- array([1, 1])    -- _getitem_slice.3
init: name='GemmTransposePattern--_sub_ime__linear_weight::T10' type=float32 shape=(1, 3) -- array([ 0.19826373, -0.04755141,  0.11293001], dtype=float32)-- GraphBuilder.constant_folding.from/fold(_sub_ime__linear_weight::T10,init7_s2_1_3)##_sub_ime__linear_weight::T10/GraphBuilder.constant_folding.from/fold(linear.weight)##linear.weight/GraphBuilder.make_nodes/fromlinear.weight##DynamoInterpret.get_attr.1/P(linear.weight)##init7_s2_1_3/TransposeEqualReshapePattern.apply.new_shape
Gemm(x, GemmTransposePattern--_sub_ime__linear_weight::T10, linear.bias, transB=1) -> _sub_ime__linear__onx_add_matmul_input_1
  Sigmoid(_sub_ime__linear__onx_add_matmul_input_1) -> sigmoid
    Add(sigmoid, x) -> add
      Shape(add) -> getitem_2_shape
        GatherElements(getitem_2_shape, init7_s1_0) -> getitem_2_end
Shape(y, end=2, start=1) -> _onx_gather_size2
  Concat(getitem_2_end, _onx_gather_size2, axis=0) -> _onx_concat_getitem_2_end
    Slice(add, getitem_2_start, _onx_concat_getitem_2_end, getitem_2_axis, getitem_2_step) -> output
output: name='output' type=dtype('float32') shape=['NEWDIM_slice', 'NEWDIM_slice1']

new-tracing#

inputs: #1[(T1s4x3,T1s4x2)]
shapes: dict(x:{0:Dim(batch)},y:{0:Dim(batch),1:Dim(length)})

opset: domain='' version=21
input: name='x' type=dtype('float32') shape=['batch', 3]
input: name='y' type=dtype('float32') shape=['batch', 'length']
init: name='init7_s1_0' type=int64 shape=(1,) -- array([0])           -- Opset.make_node.1/Shape##Opset.make_node.1/Shape
init: name='init7_s1_1' type=int64 shape=(1,) -- array([1])           -- Opset.make_node.1/Shape
init: name='GemmTransposePattern--param_1' type=float32 shape=(1, 3) -- array([ 0.25528988, -0.27648592,  0.14595792], dtype=float32)-- GraphBuilder.constant_folding.from/fold(init7_s2_1_3,param_1)##param_1/DynamoInterpret.placeholder.0##init7_s2_1_3/TransposeEqualReshapePattern.apply.new_shape
init: name='linear.bias' type=float32 shape=(1,) -- array([-0.37533975], dtype=float32)-- DynamoInterpret.placeholder.1/P(linear.bias)
Gemm(x, GemmTransposePattern--param_1, linear.bias, transB=1) -> addmm_default
  Sigmoid(addmm_default) -> sigmoid_default
    Add(sigmoid_default, x) -> add_tensor
Shape(y, end=2, start=1) -> y::Shape1:2
  Slice(add_tensor, init7_s1_0, y::Shape1:2, init7_s1_1) -> output
output: name='output' type=dtype('float32') shape=['batch', 'length']

TinyLLM#

code: yobx.torch.testing._model_eval_cases.TinyLLM

forward#

def forward(self, input_ids, attention_mask, position_ids, past_key_0, past_value_0):
    """Performs the forward pass and returns the logits tensor."""
    past_key_values = make_dynamic_cache([(past_key_0, past_value_0)])
    return self._model(
        input_ids=input_ids,
        attention_mask=attention_mask,
        position_ids=position_ids,
        past_key_values=past_key_values,
    ).logits

yobx#

inputs: #1[(T7s2x3,T7s2x33,T7s2x3,T1s2x1x30x96,T1s2x1x30x96)]
shapes: dict(input_ids:{0:DYNAMIC,1:DYNAMIC},attention_mask:{0:DYNAMIC,1:DYNAMIC},position_ids:{0:DYNAMIC,1:DYNAMIC},past_key_0:{0:DYNAMIC,2:DYNAMIC},past_value_0:{0:DYNAMIC,2:DYNAMIC})

opset: domain='' version=21
opset: domain='intermediate' version=1
opset: domain='local_functions' version=1
input: name='input_ids' type=dtype('int64') shape=['batch', 'channel']
input: name='attention_mask' type=dtype('int64') shape=['batch_3', 'channel_3']
input: name='position_ids' type=dtype('int64') shape=['batch_4', 'channel_4']
input: name='past_key_0' type=dtype('float32') shape=['batch', 1, 'D0', 96]
input: name='past_value_0' type=dtype('float32') shape=['batch', 1, 'D0_2', 96]
init: name='init7_s1_1' type=int64 shape=(1,) -- array([1])           -- Opset.make_node.1/Shape##Opset.make_node.1/Shape##Opset.make_node.1/Shape##Opset.make_node.1/Shape##Opset.make_node.1/Shape##Opset.make_node.1/Shape##Opset.make_node.1/Shape##Opset.make_node.1/Shape##GraphBuilder.make_shape_from_results.conc##Opset.make_node.1/Shape##Opset.make_node.1/Shape##Opset.make_node.1/Shape##GraphBuilder.make_shape_from_results.conc##Opset.make_node.1/Shape##Opset.make_node.1/Shape##GraphBuilder.make_shape_from_results.conc##Opset.make_node.1/Shape##Opset.make_node.1/Shape##ReshapeIsSqueezePattern.m1##SwapUnsqueezeTransposePattern.apply.new_shape##SwapUnsqueezeTransposePattern.apply.new_shape
init: name='init7_s1_2' type=int64 shape=(1,) -- array([2])           -- Opset.make_node.1/Shape##Opset.make_node.1/Shape##Opset.make_node.1/Shape##GraphBuilder.make_shape_from_results.conc##GraphBuilder.make_shape_from_results.conc##Opset.make_node.1/Shape##GraphBuilder.make_shape_from_results.conc##GraphBuilder.make_shape_from_results.conc##ReshapeIsSqueezePattern.m1##ReshapeIsSqueezePattern.m1
init: name='init7_s1_-1' type=int64 shape=(1,) -- array([-1])         -- Opset.make_node.1/Shape##Opset.make_node.1/Shape##Opset.make_node.1/Shape##GraphBuilder.make_shape_from_results.conc##GraphBuilder.make_shape_from_results.conc##Opset.make_node.1/Shape##Opset.make_node.1/Shape##ConcatReshapePattern.m1
init: name='init1_s_' type=float32 shape=() -- array([2.], dtype=float32)-- Opset.make_node.1/Small##Opset.make_node.1/Small##Opset.make_node.1/Small
init: name='init7_s1_96' type=int64 shape=(1,) -- array([96])         -- GraphBuilder.make_shape_from_results.conc##GraphBuilder.make_shape_from_results.conc##GraphBuilder.make_shape_from_results.conc##GraphBuilder.make_shape_from_results.conc##GraphBuilder.make_shape_from_results.conc
init: name='init1_s1_' type=float32 shape=(1,) -- array([0.31947157], dtype=float32)-- Opset.make_node.1/Small##Opset.make_node.1/Small
init: name='init1_s1_2' type=float32 shape=(1,) -- array([0.], dtype=float32)-- Opset.make_node.1/Small##Opset.make_node.1/Small
init: name='init1_s1_3' type=float32 shape=(1,) -- array([-inf], dtype=float32)-- Opset.make_node.1/Small
init: name='init1_s_2::RSh1' type=float32 shape=(1,) -- array([1.e-05], dtype=float32)-- GraphBuilder.constant_folding.from/fold(init1_s_2,init7_s1_1)##init1_s_2/shape_type_compute._cast_inputs.0##shape_type_compute._cast_inputs.0##shape_type_compute._cast_inputs.0##init7_s1_1/Opset.make_node.1/Shape##Opset.make_node.1/Shape##Opset.make_node.1/Shape##Opset.make_node.1/Shape##Opset.make_node.1/Shape##Opset.make_node.1/Shape##Opset.make_node.1/Shape##Opset.make_node.1/Shape##GraphBuilder.make_shape_from_results.conc##Opset.make_node.1/Shape##Opset.make_node.1/Shape##Opset.make_node.1/Shape##GraphBuilder.make_shape_from_results.conc##Opset.make_node.1/Shape##Opset.make_node.1/Shape##GraphBuilder.make_shape_from_results.conc##Opset.make_node.1/Shape##Opset.make_node.1/Shape
init: name='p__model_model_layers_0_self_attn_q_proj_weight::T10' type=float32 shape=(192, 192)-- GraphBuilder.constant_folding.from/fold(p__model_model_layers_0_self_attn_q_proj_weight)##p__model_model_layers_0_self_attn_q_proj_weight/DynamoInterpret.placeholder.1/P(_model.model.layers.0.self_attn.q_proj.weight)
init: name='p__model_model_layers_0_self_attn_k_proj_weight::T10' type=float32 shape=(192, 96)-- GraphBuilder.constant_folding.from/fold(p__model_model_layers_0_self_attn_k_proj_weight)##p__model_model_layers_0_self_attn_k_proj_weight/DynamoInterpret.placeholder.1/P(_model.model.layers.0.self_attn.k_proj.weight)
init: name='p__model_model_layers_0_self_attn_v_proj_weight::T10' type=float32 shape=(192, 96)-- GraphBuilder.constant_folding.from/fold(p__model_model_layers_0_self_attn_v_proj_weight)##p__model_model_layers_0_self_attn_v_proj_weight/DynamoInterpret.placeholder.1/P(_model.model.layers.0.self_attn.v_proj.weight)
init: name='p__model_model_layers_0_self_attn_o_proj_weight::T10' type=float32 shape=(192, 192)-- GraphBuilder.constant_folding.from/fold(p__model_model_layers_0_self_attn_o_proj_weight)##p__model_model_layers_0_self_attn_o_proj_weight/DynamoInterpret.placeholder.1/P(_model.model.layers.0.self_attn.o_proj.weight)
init: name='p__model_model_layers_0_mlp_gate_proj_weight::T10' type=float32 shape=(192, 1024)-- GraphBuilder.constant_folding.from/fold(p__model_model_layers_0_mlp_gate_proj_weight)##p__model_model_layers_0_mlp_gate_proj_weight/DynamoInterpret.placeholder.1/P(_model.model.layers.0.mlp.gate_proj.weight)
init: name='p__model_model_layers_0_mlp_up_proj_weight::T10' type=float32 shape=(192, 1024)-- GraphBuilder.constant_folding.from/fold(p__model_model_layers_0_mlp_up_proj_weight)##p__model_model_layers_0_mlp_up_proj_weight/DynamoInterpret.placeholder.1/P(_model.model.layers.0.mlp.up_proj.weight)
init: name='p__model_model_layers_0_mlp_down_proj_weight::T10' type=float32 shape=(1024, 192)-- GraphBuilder.constant_folding.from/fold(p__model_model_layers_0_mlp_down_proj_weight)##p__model_model_layers_0_mlp_down_proj_weight/DynamoInterpret.placeholder.1/P(_model.model.layers.0.mlp.down_proj.weight)
init: name='p__model_lm_head_weight::T10' type=float32 shape=(192, 32000)-- GraphBuilder.constant_folding.from/fold(p__model_lm_head_weight)##p__model_lm_head_weight/DynamoInterpret.placeholder.1/P(_model.lm_head.weight)
init: name='expand_122' type=float32 shape=(1, 1, 48)                 -- GraphBuilder.constant_folding.from/fold(expand_1-ZEROS2,unsqueeze_13)##unsqueeze_13/GraphBuilder.constant_folding.from/fold(b__model_model_rotary_emb_inv_freq,init7_s2_0_2)##b__model_model_rotary_emb_inv_freq/DynamoInterpret.placeholder.0##init7_s2_0_2/##expand_1-ZEROS2/
init: name='init7_s5_1_1_2_1_1' type=int64 shape=(5,)                 -- ShapeBasedStaticExpandPattern.m1
init: name='init7_s4_0_0_2_96' type=int64 shape=(4,) -- array([ 0,  0,  2, 96])-- EditDistanceReshapePattern.m1
init: name='_model.model.embed_tokens.weight' type=float32 shape=(32000, 192)-- DynamoInterpret.placeholder.1/P(_model.model.embed_tokens.weight)
init: name='_model.model.layers.0.input_layernorm.weight' type=float32 shape=(192,)-- DynamoInterpret.placeholder.1/P(_model.model.layers.0.input_layernorm.weight)
init: name='_model.model.layers.0.post_attention_layernorm.weight' type=float32 shape=(192,)-- DynamoInterpret.placeholder.1/P(_model.model.layers.0.post_attention_layernorm.weight)
init: name='_model.model.norm.weight' type=float32 shape=(192,)       -- DynamoInterpret.placeholder.1/P(_model.model.norm.weight)
Cast(attention_mask, to=9) -> to
  Shape(to, start=-1) -> to::Shape-1:
Shape(input_ids, end=2, start=1) -> input_ids::Shape1:2
Shape(past_key_0, end=1, start=0) -> past_key_0::Shape:1
  Concat(past_key_0::Shape:1, init7_s1_1, init7_s1_1, axis=0) -> _onx_concat_sym_size_int_20::UnSq08
    Expand(expand_122, _onx_concat_sym_size_int_20::UnSq08) -> expand_12
      CosSinCache_p1[intermediate](position_ids, expand_12) -> uoutput_0, uoutput_1
        Unsqueeze(uoutput_0, init7_s1_1) -> uunsqueeze_15
          Concat(uunsqueeze_15, uunsqueeze_15, axis=-1) -> unsqueeze_15
Shape(past_key_0, end=3, start=2) -> past_key_0::Shape2:3
  Add(input_ids::Shape1:2, past_key_0::Shape2:3) -> SqueezeAddPattern_SwapRangeAddScalarPattern--sym_size_int_15
  CausalMask[intermediate](past_key_0::Shape2:3, SqueezeAddPattern_SwapRangeAddScalarPattern--sym_size_int_15) -> le_3
Gather(_model.model.embed_tokens.weight, input_ids) -> embedding
  Pow(embedding, init1_s_) -> pow_1
    ReduceMean(pow_1, init7_s1_-1, keepdims=1) -> mean
      Add(mean, init1_s_2::RSh1) -> add_4
        Sqrt(add_4) -> _onx_sqrt_add_4
          Reciprocal(_onx_sqrt_add_4) -> rsqrt
  Mul(embedding, rsqrt) -> mul_2
    Mul(_model.model.layers.0.input_layernorm.weight, mul_2) -> mul_3
      MatMul(mul_3, p__model_model_layers_0_self_attn_q_proj_weight::T10) -> _onx_matmul_mul_3
        Reshape(_onx_matmul_mul_3, init7_s4_0_0_2_96) -> view
          Transpose(view, perm=[0,2,1,3]) -> transpose_1
  CausalMaskMulAdd[intermediate](SqueezeAddPattern_SwapRangeAddScalarPattern--sym_size_int_15, past_key_0::Shape:1, to::Shape-1:) -> _onx_add_unsqueeze_11
    Shape(_onx_add_unsqueeze_11) -> _onx_add_unsqueeze_11::Shape:
  Reshape(to, init7_s1_-1) -> to::RSh-1
Reshape(_onx_add_unsqueeze_11, init7_s1_-1) -> _onx_add_unsqueeze_11::RSh-1
  Gather(to::RSh-1, _onx_add_unsqueeze_11::RSh-1) -> _onx_gather_to::RSh-1
    Reshape(_onx_gather_to::RSh-1, _onx_add_unsqueeze_11::Shape:) -> index
    And(le_3, index) -> and_2
Unsqueeze(uoutput_1, init7_s1_1) -> uunsqueeze_16
  Concat(uunsqueeze_16, uunsqueeze_16, axis=-1) -> unsqueeze_16
    HalfRotaryEmbedding[intermediate](transpose_1, unsqueeze_15, unsqueeze_16) -> add_5
      Mul(add_5, init1_s1_) -> _onx_mul_add_5
MatMul(mul_3, p__model_model_layers_0_self_attn_k_proj_weight::T10) -> _onx_matmul_mul_32
  Unsqueeze(_onx_matmul_mul_32, init7_s1_1) -> transpose_2
    HalfRotaryEmbedding[intermediate](transpose_2, unsqueeze_15, unsqueeze_16) -> add_6
      Concat(past_key_0, add_6, axis=-2) -> cat_3
        Unsqueeze(cat_3, init7_s1_2) -> unsqueeze_17
      MatMul(mul_3, p__model_model_layers_0_self_attn_v_proj_weight::T10) -> _onx_matmul_mul_33
        Unsqueeze(_onx_matmul_mul_33, init7_s1_1) -> transpose_3
          Concat(past_value_0, transpose_3, axis=-2) -> cat_4
            Unsqueeze(cat_4, init7_s1_2) -> unsqueeze_18
              Expand(unsqueeze_18, init7_s5_1_1_2_1_1) -> expand_3
                Squeeze(expand_3, init7_s1_1) -> reshape_1
  Concat(past_key_0::Shape:1, init7_s1_1, init7_s1_2, SqueezeAddPattern_SwapRangeAddScalarPattern--sym_size_int_15, init7_s1_96, axis=0) -> _onx_concat_sym_size_int_20::UnSq03
    Expand(unsqueeze_17, _onx_concat_sym_size_int_20::UnSq03) -> expand_2
      Mul(expand_2, init1_s1_) -> SwapUnaryPattern--reshape
  Concat(past_key_0::Shape:1, init7_s1_2, init7_s1_-1, init7_s1_96, axis=0) -> _onx_concat_sym_size_int_20::UnSq04--concat
    Reshape(SwapUnaryPattern--reshape, _onx_concat_sym_size_int_20::UnSq04--concat) -> _onx_mul_reshape
      Transpose(_onx_mul_reshape, perm=[0,1,3,2]) -> _onx_mul_reshape::T0132
        MatMul(_onx_mul_add_5, _onx_mul_reshape::T0132) -> _onx_matmul_mul_add_5
      Where(and_2, _onx_matmul_mul_add_5, init1_s1_3) -> _onx_add_matmul_mul_add_5
        Softmax(_onx_add_matmul_mul_add_5, axis=-1) -> _onx_softmax_add_matmul_mul_add_5
          IsNaN(_onx_softmax_add_matmul_mul_add_5) -> _onx_isnan_softmax_add_matmul_mul_add_5
          Where(_onx_isnan_softmax_add_matmul_mul_add_5, init1_s1_2, _onx_softmax_add_matmul_mul_add_5) -> _onx_where_isnan_softmax_add_matmul_mul_add_5
            MatMul(_onx_where_isnan_softmax_add_matmul_mul_add_5, reshape_1) -> scaled_dot_product_attention
              Transpose(scaled_dot_product_attention, perm=[0,2,1,3]) -> transpose_4
  Concat(past_key_0::Shape:1, input_ids::Shape1:2, init7_s1_-1, axis=0) -> _onx_concat_sym_size_int_20::UnSq07
    Reshape(transpose_4, _onx_concat_sym_size_int_20::UnSq07) -> reshape_2
      MatMul(reshape_2, p__model_model_layers_0_self_attn_o_proj_weight::T10) -> _onx_matmul_reshape_2
  Add(embedding, _onx_matmul_reshape_2) -> add_7
    Pow(add_7, init1_s_) -> pow_2
      ReduceMean(pow_2, init7_s1_-1, keepdims=1) -> mean_1
        Add(mean_1, init1_s_2::RSh1) -> add_8
          Sqrt(add_8) -> _onx_sqrt_add_8
            Reciprocal(_onx_sqrt_add_8) -> rsqrt_1
    Mul(add_7, rsqrt_1) -> mul_16
      Mul(_model.model.layers.0.post_attention_layernorm.weight, mul_16) -> mul_17
        MatMul(mul_17, p__model_model_layers_0_mlp_gate_proj_weight::T10) -> _onx_matmul_mul_17
          Sigmoid(_onx_matmul_mul_17) -> _onx_sigmoid_linear_4
          Mul(_onx_matmul_mul_17, _onx_sigmoid_linear_4) -> silu
        MatMul(mul_17, p__model_model_layers_0_mlp_up_proj_weight::T10) -> _onx_matmul_mul_172
          Mul(silu, _onx_matmul_mul_172) -> mul_18
            MatMul(mul_18, p__model_model_layers_0_mlp_down_proj_weight::T10) -> _onx_matmul_mul_18
    Add(add_7, _onx_matmul_mul_18) -> add_9
      Pow(add_9, init1_s_) -> pow_3
        ReduceMean(pow_3, init7_s1_-1, keepdims=1) -> mean_2
          Add(mean_2, init1_s_2::RSh1) -> add_10
            Sqrt(add_10) -> _onx_sqrt_add_10
              Reciprocal(_onx_sqrt_add_10) -> rsqrt_2
      Mul(add_9, rsqrt_2) -> mul_19
        Mul(_model.model.norm.weight, mul_19) -> mul_20
          MatMul(mul_20, p__model_lm_head_weight::T10) -> output_0
output: name='output_0' type=dtype('float32') shape=['batch', 'channel', 32000]
----- function name=CosSinCache_p1 domain=intermediate
----- doc_string: -- function_options=FunctionOptions(export_as_function=...
opset: domain='' version=21
input: 'position_ids'
input: 'weights'
Constant(value=[1]) -> init7_s1_1
  Unsqueeze(position_ids, init7_s1_1) -> position_ids::UnSq1
    Cast(position_ids::UnSq1, to=1) -> position_ids::UnSq1::C1
Constant(value=[0, -1, 1]) -> init7_s3_0_-1_1
  Reshape(position_ids::UnSq1::C1, init7_s3_0_-1_1) -> position_ids::UnSq1::C1::RSh0x-1x1
    Mul(weights, position_ids::UnSq1::C1::RSh0x-1x1) -> _onx_mul_weights
      Cos(_onx_mul_weights) -> cos
      Sin(_onx_mul_weights) -> sin
output: name='cos' type=? shape=?
output: name='sin' type=? shape=?
----- function name=HalfRotaryEmbedding domain=intermediate
----- doc_string: -- function_options=FunctionOptions(export_as_function=...
opset: domain='' version=21
input: 'X'
input: 'cos_cache'
input: 'sin_cache'
Mul(X, cos_cache) -> _onx_mul_X
Split(X, axis=-1, num_outputs=2) -> _onx_split_X_0, _onx_split_X_1
  Neg(_onx_split_X_1) -> _onx_neg_split_X_1
  Concat(_onx_neg_split_X_1, _onx_split_X_0, axis=-1) -> _onx_concat_neg_split_X_1
    Mul(_onx_concat_neg_split_X_1, sin_cache) -> _onx_mul_concat_neg_split_X_1
  Add(_onx_mul_X, _onx_mul_concat_neg_split_X_1) -> Y
output: name='Y' type=? shape=?
----- function name=CausalMask domain=intermediate
----- doc_string: -- function_options=FunctionOptions(export_as_function=...
opset: domain='' version=21
input: 'A'
input: 'B'
Constant(value=0) -> init7_s_0
Constant(value=1) -> init7_s_1
Constant(value=[0, 1, 2]) -> init7_s3_0_1_2
Constant(value=[0, 1, 3]) -> init7_s3_0_1_3
Squeeze(A) -> A::Sq
Squeeze(B) -> B::Sq
  Range(init7_s_0, B::Sq, init7_s_1) -> _onx_range_init7_s_0
  Unsqueeze(_onx_range_init7_s_0, init7_s3_0_1_2) -> _onx_range_init7_s_0::UnSq0x1x2
Range(A::Sq, B::Sq, init7_s_1) -> _onx_range_A::Sq
  Unsqueeze(_onx_range_A::Sq, init7_s3_0_1_3) -> _onx_range_A::Sq::UnSq0x1x3
    LessOrEqual(_onx_range_init7_s_0::UnSq0x1x2, _onx_range_A::Sq::UnSq0x1x3) -> mask
output: name='mask' type=? shape=?
----- function name=CausalMaskMulAdd domain=intermediate
----- doc_string: -- function_options=FunctionOptions(export_as_function=...
opset: domain='' version=21
input: 'A'
input: 'B'
input: 'C'
Constant(value=0) -> init7_s_0
Constant(value=1) -> init7_s_1
Constant(value=[0, 1, 2]) -> init7_s3_0_1_2
Constant(value=[1, 2, 3]) -> init7_s3_1_2_3
Squeeze(A) -> A::Sq
  Range(init7_s_0, A::Sq, init7_s_1) -> _onx_range_init7_s_0
  Unsqueeze(_onx_range_init7_s_0, init7_s3_0_1_2) -> _onx_range_init7_s_0::UnSq0x1x2
Squeeze(B) -> B::Sq
  Range(init7_s_0, B::Sq, init7_s_1) -> _onx_range_init7_s_02
  Unsqueeze(_onx_range_init7_s_02, init7_s3_1_2_3) -> _onx_range_init7_s_02::UnSq1x2x3
    Mul(_onx_range_init7_s_02::UnSq1x2x3, C) -> _onx_mul_range_init7_s_02::UnSq1x2x3
    Add(_onx_mul_range_init7_s_02::UnSq1x2x3, _onx_range_init7_s_0::UnSq0x1x2) -> mask
output: name='mask' type=? shape=?

dynamo-ir#

inputs: #1[(T7s2x3,T7s2x33,T7s2x3,T1s2x1x30x96,T1s2x1x30x96)]
shapes: dict(input_ids:{0:DYNAMIC,1:DYNAMIC},attention_mask:{0:DYNAMIC,1:DYNAMIC},position_ids:{0:DYNAMIC,1:DYNAMIC},past_key_0:{0:DYNAMIC,2:DYNAMIC},past_value_0:{0:DYNAMIC,2:DYNAMIC})

opset: domain='' version=20
input: name='input_ids' type=dtype('int64') shape=['s2', 's70']
input: name='attention_mask' type=dtype('int64') shape=['s43', 's53']
input: name='position_ids' type=dtype('int64') shape=['s2', 's70']
input: name='past_key_0' type=dtype('float32') shape=['s2', 1, 's15', 96]
input: name='past_value_0' type=dtype('float32') shape=['s2', 1, 's95', 96]
init: name='_model.model.embed_tokens.weight' type=float32 shape=(32000, 192)
init: name='_model.model.layers.0.input_layernorm.weight' type=float32 shape=(192,)
init: name='val_11' type=int64 shape=() -- array([0])
init: name='val_13' type=int64 shape=() -- array([1])
init: name='new_ones' type=bool shape=() -- array([ True])
init: name='unsqueeze_13' type=float32 shape=(1, 48, 1)
init: name='val_102' type=int64 shape=(1,) -- array([-1])
init: name='val_105' type=float32 shape=(192, 192)
init: name='val_112' type=float32 shape=(192, 96)
init: name='val_119' type=float32 shape=(192, 96)
init: name='val_128' type=int64 shape=(1,) -- array([0])
init: name='val_132' type=int64 shape=(1,) -- array([48])
init: name='val_135' type=int64 shape=(1,) -- array([3])
init: name='val_142' type=int64 shape=(1,) -- array([9223372036854775807])
init: name='val_248' type=float32 shape=() -- array([0.31947157], dtype=float32)
init: name='val_265' type=float32 shape=(192, 192)
init: name='val_270' type=float32 shape=(192, 1024)
init: name='val_272' type=float32 shape=(192, 1024)
init: name='val_273' type=float32 shape=(1024, 192)
init: name='val_298' type=float32 shape=(192, 32000)
init: name='val_30' type=int64 shape=(1,) -- array([1])
init: name='val_31' type=int64 shape=(1,) -- array([2])
init: name='val_299' type=int64 shape=(2,) -- array([1, 2])
init: name='val_300' type=int64 shape=(2,) -- array([0, 1])
init: name='val_100' type=float32 shape=() -- array([2.], dtype=float32)
init: name='val_103' type=float32 shape=() -- array([1.e-05], dtype=float32)
init: name='val_110' type=int64 shape=(1,) -- array([96])
init: name='val_238' type=int64 shape=(1,) -- array([-2])
init: name='val_240' type=int64 shape=(1,) -- array([-9223372036854775808])
init: name='val_252' type=float32 shape=() -- array([0.], dtype=float32)
init: name='val_253' type=float32 shape=() -- array([-3.4028235e+38], dtype=float32)
Cast(attention_mask, to=9) -> _to_copy
Shape(input_ids, end=2, start=1) -> val_1
  Squeeze(val_1) -> sym_size_int_27
    Range(val_11, sym_size_int_27, val_13) -> arange_2
Shape(past_key_0, end=1, start=0) -> val_5
  Squeeze(val_5) -> sym_size_int_32
    Range(val_11, sym_size_int_32, val_13) -> arange
      Unsqueeze(arange, val_299) -> unsqueeze_1
        Unsqueeze(unsqueeze_1, val_135) -> unsqueeze_2
Shape(past_key_0, end=3, start=2) -> val_6
  Squeeze(val_6) -> sym_size_int_33
    Add(sym_size_int_33, sym_size_int_27) -> add_441
      Range(val_11, add_441, val_13) -> arange_3
        Unsqueeze(arange_3, val_300) -> unsqueeze_10
          Unsqueeze(unsqueeze_10, val_31) -> unsqueeze_11
          Max(unsqueeze_2, unsqueeze_11) -> val_60
            Shape(val_60, start=0) -> val_61
          Expand(unsqueeze_2, val_61) -> val_62
            Unsqueeze(val_62, val_102) -> val_64
Shape(past_value_0, end=3, start=2) -> val_8
  Squeeze(val_8) -> sym_size_int_35
    Add(sym_size_int_27, sym_size_int_35) -> add_442
      Reshape(add_442, val_102, allowzero=0) -> val_224
  Concat(val_5, val_30, val_31, val_224, val_110, axis=0) -> val_226
Gather(_model.model.embed_tokens.weight, input_ids, axis=0) -> embedding
  Pow(embedding, val_100) -> pow_1
    ReduceMean(pow_1, val_102, noop_with_empty_axes=0, keepdims=1) -> mean
      Add(mean, val_103) -> add_124
        Sqrt(add_124) -> val_104
          Reciprocal(val_104) -> rsqrt
  Mul(embedding, rsqrt) -> mul_96
    Mul(_model.model.layers.0.input_layernorm.weight, mul_96) -> mul_100
      MatMul(mul_100, val_105) -> linear
    Add(arange_2, sym_size_int_33) -> add_13
      Unsqueeze(add_13, val_300) -> unsqueeze_7
        Unsqueeze(unsqueeze_7, val_135) -> unsqueeze_8
          LessOrEqual(unsqueeze_11, unsqueeze_8) -> le_2
            And(new_ones, le_2) -> bitwise_and
Expand(unsqueeze_11, val_61) -> val_65
  Unsqueeze(val_65, val_102) -> val_66
    Concat(val_64, val_66, axis=-1) -> val_67
  GatherND(_to_copy, val_67, batch_dims=0) -> val_68
    And(bitwise_and, val_68) -> bitwise_and_1
Reshape(add_441, val_102, allowzero=0) -> val_73
  Concat(val_5, val_30, val_1, val_73, axis=0) -> val_74
    Expand(bitwise_and_1, val_74) -> expand
      Where(expand, val_252, val_253) -> val_254
  Concat(val_5, val_30, val_30, axis=0) -> val_79
    Expand(unsqueeze_13, val_79) -> expand_1
Unsqueeze(position_ids, val_30) -> unsqueeze_14
  Cast(unsqueeze_14, to=1) -> _to_copy_1
    MatMul(expand_1, _to_copy_1) -> matmul
      Transpose(matmul, perm=[0,2,1]) -> transpose
        Concat(transpose, transpose, axis=-1) -> cat
          Cos(cat) -> cos
            Unsqueeze(cos, val_30) -> unsqueeze_15
          Sin(cat) -> sin
            Unsqueeze(sin, val_30) -> unsqueeze_16
  Concat(val_5, val_1, val_102, val_110, axis=0) -> val_111
    Reshape(linear, val_111, allowzero=1) -> view
      Transpose(view, perm=[0,2,1,3]) -> transpose_1
        Mul(transpose_1, unsqueeze_15) -> mul_139
      MatMul(mul_100, val_112) -> linear_1
    Reshape(linear_1, val_111, allowzero=1) -> view_1
      Transpose(view_1, perm=[0,2,1,3]) -> transpose_2
        Mul(transpose_2, unsqueeze_15) -> mul_162
      MatMul(mul_100, val_119) -> linear_2
    Reshape(linear_2, val_111, allowzero=1) -> view_2
      Transpose(view_2, perm=[0,2,1,3]) -> transpose_3
        Concat(past_value_0, transpose_3, axis=-2) -> cat_4
          Unsqueeze(cat_4, val_31) -> unsqueeze_18
    Expand(unsqueeze_18, val_226) -> expand_3
Slice(transpose_1, val_128, val_132, val_135, val_30) -> slice_6
Slice(transpose_1, val_132, val_142, val_135, val_30) -> slice_7
  Neg(slice_7) -> neg
  Concat(neg, slice_6, axis=-1) -> cat_1
    Mul(cat_1, unsqueeze_16) -> mul_155
      Add(mul_139, mul_155) -> add_223
        Mul(add_223, val_248) -> val_249
Slice(transpose_2, val_128, val_132, val_135, val_30) -> slice_8
Slice(transpose_2, val_132, val_142, val_135, val_30) -> slice_9
  Neg(slice_9) -> neg_1
  Concat(neg_1, slice_8, axis=-1) -> cat_2
    Mul(cat_2, unsqueeze_16) -> mul_180
      Add(mul_162, mul_180) -> add_259
        Concat(past_key_0, add_259, axis=-2) -> cat_3
          Unsqueeze(cat_3, val_31) -> unsqueeze_17
  Concat(val_5, val_30, val_31, val_73, val_110, axis=0) -> val_193
    Expand(unsqueeze_17, val_193) -> expand_2
  Concat(val_5, val_31, val_73, val_110, axis=0) -> val_199
    Reshape(expand_2, val_199, allowzero=1) -> view_3
      Shape(view_3, start=0) -> val_235
        Slice(val_235, val_102, val_142) -> val_237
  Concat(val_5, val_31, val_224, val_110, axis=0) -> val_232
    Reshape(expand_3, val_232, allowzero=1) -> view_4
Slice(val_235, val_238, val_102) -> val_239
  Concat(val_102, val_239, val_237, axis=0) -> val_243
    Reshape(view_3, val_243, allowzero=0) -> val_244
      Transpose(val_244, perm=[0,2,1]) -> val_245
Slice(val_235, val_240, val_238) -> val_241
  Concat(val_241, val_237, val_239, axis=0) -> val_246
    Reshape(val_245, val_246, allowzero=0) -> val_247
      Mul(val_247, val_248) -> val_251
        MatMul(val_249, val_251) -> val_255
        Add(val_255, val_254) -> val_256
          Softmax(val_256, axis=-1) -> val_257
            IsNaN(val_257) -> val_258
            Where(val_258, val_252, val_257) -> val_259
      MatMul(val_259, view_4) -> scaled_dot_product_attention
        Transpose(scaled_dot_product_attention, perm=[0,2,1,3]) -> transpose_4
  Concat(val_5, val_1, val_102, axis=0) -> val_264
    Reshape(transpose_4, val_264, allowzero=1) -> view_5
      MatMul(view_5, val_265) -> linear_3
  Add(embedding, linear_3) -> add_349
    Pow(add_349, val_100) -> pow_2
      ReduceMean(pow_2, val_102, noop_with_empty_axes=0, keepdims=1) -> mean_1
        Add(mean_1, val_103) -> add_362
          Sqrt(add_362) -> val_269
            Reciprocal(val_269) -> rsqrt_1
    Mul(add_349, rsqrt_1) -> mul_304
      Mul(_model.model.layers.0.input_layernorm.weight, mul_304) -> mul_308
        MatMul(mul_308, val_270) -> linear_4
          Sigmoid(linear_4) -> val_271
          Mul(linear_4, val_271) -> silu
        MatMul(mul_308, val_272) -> linear_5
          Mul(silu, linear_5) -> mul_321
            MatMul(mul_321, val_273) -> linear_6
    Add(add_349, linear_6) -> add_399
      Pow(add_399, val_100) -> pow_3
        ReduceMean(pow_3, val_102, noop_with_empty_axes=0, keepdims=1) -> mean_2
          Add(mean_2, val_103) -> add_412
            Sqrt(add_412) -> val_277
              Reciprocal(val_277) -> rsqrt_2
      Mul(add_399, rsqrt_2) -> mul_340
        Mul(_model.model.layers.0.input_layernorm.weight, mul_340) -> mul_344
          MatMul(mul_344, val_298) -> linear_7
output: name='linear_7' type=dtype('float32') shape=['s2', 's70', 32000]

tracing#

FAILED

symbolically traced variables cannot be used as inputs to control flow

new-tracing#

FAILED

TracingBool('96*_dyn_60*_dyn_61==0') cannot be converted to a Python bool; the result depends on a symbolic/dynamic dimension. Use torch._check(condition) to assert the condition holds.

TinyLLMfp16#

code: yobx.torch.testing._model_eval_cases.TinyLLMfp16

forward#

def forward(self, input_ids, attention_mask, position_ids, past_key_0, past_value_0):
    """Performs the forward pass and returns the logits tensor."""
    past_key_values = make_dynamic_cache([(past_key_0, past_value_0)])
    return self._model(
        input_ids=input_ids,
        attention_mask=attention_mask,
        position_ids=position_ids,
        past_key_values=past_key_values,
    ).logits

yobx#

inputs: #1[(T7s2x3,T7s2x33,T7s2x3,T10s2x1x30x96,T10s2x1x30x96)]
shapes: dict(input_ids:{0:DYNAMIC,1:DYNAMIC},attention_mask:{0:DYNAMIC,1:DYNAMIC},position_ids:{0:DYNAMIC,1:DYNAMIC},past_key_0:{0:DYNAMIC,2:DYNAMIC},past_value_0:{0:DYNAMIC,2:DYNAMIC})

opset: domain='' version=21
opset: domain='intermediate' version=1
opset: domain='local_functions' version=1
input: name='input_ids' type=dtype('int64') shape=['batch', 'channel']
input: name='attention_mask' type=dtype('int64') shape=['batch_3', 'channel_3']
input: name='position_ids' type=dtype('int64') shape=['batch_4', 'channel_4']
input: name='past_key_0' type=dtype('float16') shape=['batch', 1, 'D0', 96]
input: name='past_value_0' type=dtype('float16') shape=['batch', 1, 'D0_2', 96]
init: name='init7_s1_1' type=int64 shape=(1,) -- array([1])           -- Opset.make_node.1/Shape##Opset.make_node.1/Shape##Opset.make_node.1/Shape##Opset.make_node.1/Shape##Opset.make_node.1/Shape##Opset.make_node.1/Shape##Opset.make_node.1/Shape##Opset.make_node.1/Shape##GraphBuilder.make_shape_from_results.conc##Opset.make_node.1/Shape##Opset.make_node.1/Shape##Opset.make_node.1/Shape##GraphBuilder.make_shape_from_results.conc##Opset.make_node.1/Shape##Opset.make_node.1/Shape##GraphBuilder.make_shape_from_results.conc##Opset.make_node.1/Shape##Opset.make_node.1/Shape##ReshapeIsSqueezePattern.m1##SwapUnsqueezeTransposePattern.apply.new_shape##SwapUnsqueezeTransposePattern.apply.new_shape
init: name='init7_s1_2' type=int64 shape=(1,) -- array([2])           -- Opset.make_node.1/Shape##Opset.make_node.1/Shape##Opset.make_node.1/Shape##GraphBuilder.make_shape_from_results.conc##GraphBuilder.make_shape_from_results.conc##Opset.make_node.1/Shape##GraphBuilder.make_shape_from_results.conc##GraphBuilder.make_shape_from_results.conc##ReshapeIsSqueezePattern.m1##ReshapeIsSqueezePattern.m1
init: name='init7_s1_-1' type=int64 shape=(1,) -- array([-1])         -- Opset.make_node.1/Shape##Opset.make_node.1/Shape##Opset.make_node.1/Shape##GraphBuilder.make_shape_from_results.conc##GraphBuilder.make_shape_from_results.conc##Opset.make_node.1/Shape##Opset.make_node.1/Shape##ConcatReshapePattern.m1
init: name='init1_s_' type=float32 shape=() -- array([2.], dtype=float32)-- Opset.make_node.1/Small##Opset.make_node.1/Small##Opset.make_node.1/Small
init: name='init7_s1_96' type=int64 shape=(1,) -- array([96])         -- GraphBuilder.make_shape_from_results.conc##GraphBuilder.make_shape_from_results.conc##GraphBuilder.make_shape_from_results.conc##GraphBuilder.make_shape_from_results.conc##GraphBuilder.make_shape_from_results.conc
init: name='init10_s1_' type=float16 shape=(1,) -- array([0.3193], dtype=float16)-- Opset.make_node.1/Small##Opset.make_node.1/Small
init: name='init10_s1_2' type=float16 shape=(1,) -- array([0.], dtype=float16)-- Opset.make_node.1/Small##Opset.make_node.1/Small
init: name='init10_s1_3' type=float16 shape=(1,) -- array([-inf], dtype=float16)-- Opset.make_node.1/Small
init: name='init1_s_2::RSh1' type=float32 shape=(1,) -- array([1.e-05], dtype=float32)-- GraphBuilder.constant_folding.from/fold(init1_s_2,init7_s1_1)##init1_s_2/shape_type_compute._cast_inputs.0##shape_type_compute._cast_inputs.0##shape_type_compute._cast_inputs.0##init7_s1_1/Opset.make_node.1/Shape##Opset.make_node.1/Shape##Opset.make_node.1/Shape##Opset.make_node.1/Shape##Opset.make_node.1/Shape##Opset.make_node.1/Shape##Opset.make_node.1/Shape##Opset.make_node.1/Shape##GraphBuilder.make_shape_from_results.conc##Opset.make_node.1/Shape##Opset.make_node.1/Shape##Opset.make_node.1/Shape##GraphBuilder.make_shape_from_results.conc##Opset.make_node.1/Shape##Opset.make_node.1/Shape##GraphBuilder.make_shape_from_results.conc##Opset.make_node.1/Shape##Opset.make_node.1/Shape
init: name='p__model_model_layers_0_self_attn_q_proj_weight::T10' type=float16 shape=(192, 192)-- GraphBuilder.constant_folding.from/fold(p__model_model_layers_0_self_attn_q_proj_weight)##p__model_model_layers_0_self_attn_q_proj_weight/DynamoInterpret.placeholder.1/P(_model.model.layers.0.self_attn.q_proj.weight)
init: name='p__model_model_layers_0_self_attn_k_proj_weight::T10' type=float16 shape=(192, 96)-- GraphBuilder.constant_folding.from/fold(p__model_model_layers_0_self_attn_k_proj_weight)##p__model_model_layers_0_self_attn_k_proj_weight/DynamoInterpret.placeholder.1/P(_model.model.layers.0.self_attn.k_proj.weight)
init: name='p__model_model_layers_0_self_attn_v_proj_weight::T10' type=float16 shape=(192, 96)-- GraphBuilder.constant_folding.from/fold(p__model_model_layers_0_self_attn_v_proj_weight)##p__model_model_layers_0_self_attn_v_proj_weight/DynamoInterpret.placeholder.1/P(_model.model.layers.0.self_attn.v_proj.weight)
init: name='p__model_model_layers_0_self_attn_o_proj_weight::T10' type=float16 shape=(192, 192)-- GraphBuilder.constant_folding.from/fold(p__model_model_layers_0_self_attn_o_proj_weight)##p__model_model_layers_0_self_attn_o_proj_weight/DynamoInterpret.placeholder.1/P(_model.model.layers.0.self_attn.o_proj.weight)
init: name='p__model_model_layers_0_mlp_gate_proj_weight::T10' type=float16 shape=(192, 1024)-- GraphBuilder.constant_folding.from/fold(p__model_model_layers_0_mlp_gate_proj_weight)##p__model_model_layers_0_mlp_gate_proj_weight/DynamoInterpret.placeholder.1/P(_model.model.layers.0.mlp.gate_proj.weight)
init: name='p__model_model_layers_0_mlp_up_proj_weight::T10' type=float16 shape=(192, 1024)-- GraphBuilder.constant_folding.from/fold(p__model_model_layers_0_mlp_up_proj_weight)##p__model_model_layers_0_mlp_up_proj_weight/DynamoInterpret.placeholder.1/P(_model.model.layers.0.mlp.up_proj.weight)
init: name='p__model_model_layers_0_mlp_down_proj_weight::T10' type=float16 shape=(1024, 192)-- GraphBuilder.constant_folding.from/fold(p__model_model_layers_0_mlp_down_proj_weight)##p__model_model_layers_0_mlp_down_proj_weight/DynamoInterpret.placeholder.1/P(_model.model.layers.0.mlp.down_proj.weight)
init: name='p__model_lm_head_weight::T10' type=float16 shape=(192, 32000)-- GraphBuilder.constant_folding.from/fold(p__model_lm_head_weight)##p__model_lm_head_weight/DynamoInterpret.placeholder.1/P(_model.lm_head.weight)
init: name='expand_122' type=float32 shape=(1, 1, 48)                 -- GraphBuilder.constant_folding.from/fold(expand_1-ZEROS2,to_3)##to_3/GraphBuilder.constant_folding.from/fold(unsqueeze_13)##unsqueeze_13/GraphBuilder.constant_folding.from/fold(b__model_model_rotary_emb_inv_freq,init7_s2_0_2)##b__model_model_rotary_emb_inv_freq/DynamoInterpret.placeholder.0##init7_s2_0_2/##expand_1-ZEROS2/
init: name='init7_s5_1_1_2_1_1' type=int64 shape=(5,)                 -- ShapeBasedStaticExpandPattern.m1
init: name='init7_s4_0_0_2_96' type=int64 shape=(4,) -- array([ 0,  0,  2, 96])-- EditDistanceReshapePattern.m1
init: name='_model.model.embed_tokens.weight' type=float16 shape=(32000, 192)-- DynamoInterpret.placeholder.1/P(_model.model.embed_tokens.weight)
init: name='_model.model.layers.0.input_layernorm.weight' type=float16 shape=(192,)-- DynamoInterpret.placeholder.1/P(_model.model.layers.0.input_layernorm.weight)
init: name='_model.model.layers.0.post_attention_layernorm.weight' type=float16 shape=(192,)-- DynamoInterpret.placeholder.1/P(_model.model.layers.0.post_attention_layernorm.weight)
init: name='_model.model.norm.weight' type=float16 shape=(192,)       -- DynamoInterpret.placeholder.1/P(_model.model.norm.weight)
Cast(attention_mask, to=9) -> to
  Shape(to, start=-1) -> to::Shape-1:
Shape(input_ids, end=2, start=1) -> input_ids::Shape1:2
Shape(past_key_0, end=1, start=0) -> past_key_0::Shape:1
  Concat(past_key_0::Shape:1, init7_s1_1, init7_s1_1, axis=0) -> _onx_concat_sym_size_int_20::UnSq08
    Expand(expand_122, _onx_concat_sym_size_int_20::UnSq08) -> expand_12
      CosSinCache_to10_p1[intermediate](position_ids, expand_12) -> uoutput_0, uoutput_1
        Unsqueeze(uoutput_0, init7_s1_1) -> uunsqueeze_15
          Concat(uunsqueeze_15, uunsqueeze_15, axis=-1) -> unsqueeze_15
Shape(past_key_0, end=3, start=2) -> past_key_0::Shape2:3
  Add(input_ids::Shape1:2, past_key_0::Shape2:3) -> SqueezeAddPattern_SwapRangeAddScalarPattern--sym_size_int_15
  CausalMask[intermediate](past_key_0::Shape2:3, SqueezeAddPattern_SwapRangeAddScalarPattern--sym_size_int_15) -> le_3
Gather(_model.model.embed_tokens.weight, input_ids) -> embedding
  Cast(embedding, to=1) -> to_10
    Pow(to_10, init1_s_) -> pow_1
      ReduceMean(pow_1, init7_s1_-1, keepdims=1) -> mean
        Add(mean, init1_s_2::RSh1) -> add_4
          Sqrt(add_4) -> _onx_sqrt_add_4
            Reciprocal(_onx_sqrt_add_4) -> rsqrt
    Mul(to_10, rsqrt) -> mul_2
      Cast(mul_2, to=10) -> to_11
        Mul(_model.model.layers.0.input_layernorm.weight, to_11) -> mul_3
          MatMul(mul_3, p__model_model_layers_0_self_attn_q_proj_weight::T10) -> _onx_matmul_mul_3
            Reshape(_onx_matmul_mul_3, init7_s4_0_0_2_96) -> view
              Transpose(view, perm=[0,2,1,3]) -> transpose_1
  CausalMaskMulAdd[intermediate](SqueezeAddPattern_SwapRangeAddScalarPattern--sym_size_int_15, past_key_0::Shape:1, to::Shape-1:) -> _onx_add_unsqueeze_11
    Shape(_onx_add_unsqueeze_11) -> _onx_add_unsqueeze_11::Shape:
  Reshape(to, init7_s1_-1) -> to::RSh-1
Reshape(_onx_add_unsqueeze_11, init7_s1_-1) -> _onx_add_unsqueeze_11::RSh-1
  Gather(to::RSh-1, _onx_add_unsqueeze_11::RSh-1) -> _onx_gather_to::RSh-1
    Reshape(_onx_gather_to::RSh-1, _onx_add_unsqueeze_11::Shape:) -> index
    And(le_3, index) -> and_2
Unsqueeze(uoutput_1, init7_s1_1) -> uunsqueeze_16
  Concat(uunsqueeze_16, uunsqueeze_16, axis=-1) -> unsqueeze_16
    HalfRotaryEmbedding[intermediate](transpose_1, unsqueeze_15, unsqueeze_16) -> add_5
      Mul(add_5, init10_s1_) -> _onx_mul_add_5
MatMul(mul_3, p__model_model_layers_0_self_attn_k_proj_weight::T10) -> _onx_matmul_mul_32
  Unsqueeze(_onx_matmul_mul_32, init7_s1_1) -> transpose_2
    HalfRotaryEmbedding[intermediate](transpose_2, unsqueeze_15, unsqueeze_16) -> add_6
      Concat(past_key_0, add_6, axis=-2) -> cat_3
        Unsqueeze(cat_3, init7_s1_2) -> unsqueeze_17
MatMul(mul_3, p__model_model_layers_0_self_attn_v_proj_weight::T10) -> _onx_matmul_mul_33
  Unsqueeze(_onx_matmul_mul_33, init7_s1_1) -> transpose_3
    Concat(past_value_0, transpose_3, axis=-2) -> cat_4
      Unsqueeze(cat_4, init7_s1_2) -> unsqueeze_18
        Expand(unsqueeze_18, init7_s5_1_1_2_1_1) -> expand_3
          Squeeze(expand_3, init7_s1_1) -> reshape_1
  Concat(past_key_0::Shape:1, init7_s1_1, init7_s1_2, SqueezeAddPattern_SwapRangeAddScalarPattern--sym_size_int_15, init7_s1_96, axis=0) -> _onx_concat_sym_size_int_20::UnSq03
    Expand(unsqueeze_17, _onx_concat_sym_size_int_20::UnSq03) -> expand_2
      Mul(expand_2, init10_s1_) -> SwapUnaryPattern--reshape
  Concat(past_key_0::Shape:1, init7_s1_2, init7_s1_-1, init7_s1_96, axis=0) -> _onx_concat_sym_size_int_20::UnSq04--concat
    Reshape(SwapUnaryPattern--reshape, _onx_concat_sym_size_int_20::UnSq04--concat) -> _onx_mul_reshape
      Transpose(_onx_mul_reshape, perm=[0,1,3,2]) -> _onx_mul_reshape::T0132
        MatMul(_onx_mul_add_5, _onx_mul_reshape::T0132) -> _onx_matmul_mul_add_5
      Where(and_2, _onx_matmul_mul_add_5, init10_s1_3) -> _onx_add_matmul_mul_add_5
        Softmax(_onx_add_matmul_mul_add_5, axis=-1) -> _onx_softmax_add_matmul_mul_add_5
          IsNaN(_onx_softmax_add_matmul_mul_add_5) -> _onx_isnan_softmax_add_matmul_mul_add_5
          Where(_onx_isnan_softmax_add_matmul_mul_add_5, init10_s1_2, _onx_softmax_add_matmul_mul_add_5) -> _onx_where_isnan_softmax_add_matmul_mul_add_5
            MatMul(_onx_where_isnan_softmax_add_matmul_mul_add_5, reshape_1) -> scaled_dot_product_attention
              Transpose(scaled_dot_product_attention, perm=[0,2,1,3]) -> transpose_4
  Concat(past_key_0::Shape:1, input_ids::Shape1:2, init7_s1_-1, axis=0) -> _onx_concat_sym_size_int_20::UnSq07
    Reshape(transpose_4, _onx_concat_sym_size_int_20::UnSq07) -> reshape_2
      MatMul(reshape_2, p__model_model_layers_0_self_attn_o_proj_weight::T10) -> _onx_matmul_reshape_2
  Add(embedding, _onx_matmul_reshape_2) -> add_7
    Cast(add_7, to=1) -> to_12
      Pow(to_12, init1_s_) -> pow_2
        ReduceMean(pow_2, init7_s1_-1, keepdims=1) -> mean_1
          Add(mean_1, init1_s_2::RSh1) -> add_8
            Sqrt(add_8) -> _onx_sqrt_add_8
              Reciprocal(_onx_sqrt_add_8) -> rsqrt_1
      Mul(to_12, rsqrt_1) -> mul_16
        Cast(mul_16, to=10) -> to_13
          Mul(_model.model.layers.0.post_attention_layernorm.weight, to_13) -> mul_17
            MatMul(mul_17, p__model_model_layers_0_mlp_gate_proj_weight::T10) -> _onx_matmul_mul_17
              Sigmoid(_onx_matmul_mul_17) -> _onx_sigmoid_linear_4
              Mul(_onx_matmul_mul_17, _onx_sigmoid_linear_4) -> silu
            MatMul(mul_17, p__model_model_layers_0_mlp_up_proj_weight::T10) -> _onx_matmul_mul_172
              Mul(silu, _onx_matmul_mul_172) -> mul_18
                MatMul(mul_18, p__model_model_layers_0_mlp_down_proj_weight::T10) -> _onx_matmul_mul_18
    Add(add_7, _onx_matmul_mul_18) -> add_9
      Cast(add_9, to=1) -> to_14
        Pow(to_14, init1_s_) -> pow_3
          ReduceMean(pow_3, init7_s1_-1, keepdims=1) -> mean_2
            Add(mean_2, init1_s_2::RSh1) -> add_10
              Sqrt(add_10) -> _onx_sqrt_add_10
                Reciprocal(_onx_sqrt_add_10) -> rsqrt_2
        Mul(to_14, rsqrt_2) -> mul_19
          Cast(mul_19, to=10) -> to_15
            Mul(_model.model.norm.weight, to_15) -> mul_20
              MatMul(mul_20, p__model_lm_head_weight::T10) -> output_0
output: name='output_0' type=dtype('float16') shape=['batch', 'channel', 32000]
----- function name=CosSinCache_to10_p1 domain=intermediate
----- doc_string: -- function_options=FunctionOptions(export_as_function=...
opset: domain='' version=21
input: 'position_ids'
input: 'weights'
Constant(value=[1]) -> init7_s1_1
  Unsqueeze(position_ids, init7_s1_1) -> position_ids::UnSq1
    Cast(position_ids::UnSq1, to=1) -> position_ids::UnSq1::C1
Constant(value=[0, -1, 1]) -> init7_s3_0_-1_1
  Reshape(position_ids::UnSq1::C1, init7_s3_0_-1_1) -> position_ids::UnSq1::C1::RSh0x-1x1
    Mul(weights, position_ids::UnSq1::C1::RSh0x-1x1) -> _onx_mul_weights
      Cos(_onx_mul_weights) -> _onx_cos_mul_weights
        Cast(_onx_cos_mul_weights, to=10) -> cos
      Sin(_onx_mul_weights) -> _onx_sin_mul_weights
        Cast(_onx_sin_mul_weights, to=10) -> sin
output: name='cos' type=? shape=?
output: name='sin' type=? shape=?
----- function name=HalfRotaryEmbedding domain=intermediate
----- doc_string: -- function_options=FunctionOptions(export_as_function=...
opset: domain='' version=21
input: 'X'
input: 'cos_cache'
input: 'sin_cache'
Mul(X, cos_cache) -> _onx_mul_X
Split(X, axis=-1, num_outputs=2) -> _onx_split_X_0, _onx_split_X_1
  Neg(_onx_split_X_1) -> _onx_neg_split_X_1
  Concat(_onx_neg_split_X_1, _onx_split_X_0, axis=-1) -> _onx_concat_neg_split_X_1
    Mul(_onx_concat_neg_split_X_1, sin_cache) -> _onx_mul_concat_neg_split_X_1
  Add(_onx_mul_X, _onx_mul_concat_neg_split_X_1) -> Y
output: name='Y' type=? shape=?
----- function name=CausalMask domain=intermediate
----- doc_string: -- function_options=FunctionOptions(export_as_function=...
opset: domain='' version=21
input: 'A'
input: 'B'
Constant(value=0) -> init7_s_0
Constant(value=1) -> init7_s_1
Constant(value=[0, 1, 2]) -> init7_s3_0_1_2
Constant(value=[0, 1, 3]) -> init7_s3_0_1_3
Squeeze(A) -> A::Sq
Squeeze(B) -> B::Sq
  Range(init7_s_0, B::Sq, init7_s_1) -> _onx_range_init7_s_0
  Unsqueeze(_onx_range_init7_s_0, init7_s3_0_1_2) -> _onx_range_init7_s_0::UnSq0x1x2
Range(A::Sq, B::Sq, init7_s_1) -> _onx_range_A::Sq
  Unsqueeze(_onx_range_A::Sq, init7_s3_0_1_3) -> _onx_range_A::Sq::UnSq0x1x3
    LessOrEqual(_onx_range_init7_s_0::UnSq0x1x2, _onx_range_A::Sq::UnSq0x1x3) -> mask
output: name='mask' type=? shape=?
----- function name=CausalMaskMulAdd domain=intermediate
----- doc_string: -- function_options=FunctionOptions(export_as_function=...
opset: domain='' version=21
input: 'A'
input: 'B'
input: 'C'
Constant(value=0) -> init7_s_0
Constant(value=1) -> init7_s_1
Constant(value=[0, 1, 2]) -> init7_s3_0_1_2
Constant(value=[1, 2, 3]) -> init7_s3_1_2_3
Squeeze(A) -> A::Sq
  Range(init7_s_0, A::Sq, init7_s_1) -> _onx_range_init7_s_0
  Unsqueeze(_onx_range_init7_s_0, init7_s3_0_1_2) -> _onx_range_init7_s_0::UnSq0x1x2
Squeeze(B) -> B::Sq
  Range(init7_s_0, B::Sq, init7_s_1) -> _onx_range_init7_s_02
  Unsqueeze(_onx_range_init7_s_02, init7_s3_1_2_3) -> _onx_range_init7_s_02::UnSq1x2x3
    Mul(_onx_range_init7_s_02::UnSq1x2x3, C) -> _onx_mul_range_init7_s_02::UnSq1x2x3
    Add(_onx_mul_range_init7_s_02::UnSq1x2x3, _onx_range_init7_s_0::UnSq0x1x2) -> mask
output: name='mask' type=? shape=?

dynamo-ir#

inputs: #1[(T7s2x3,T7s2x33,T7s2x3,T10s2x1x30x96,T10s2x1x30x96)]
shapes: dict(input_ids:{0:DYNAMIC,1:DYNAMIC},attention_mask:{0:DYNAMIC,1:DYNAMIC},position_ids:{0:DYNAMIC,1:DYNAMIC},past_key_0:{0:DYNAMIC,2:DYNAMIC},past_value_0:{0:DYNAMIC,2:DYNAMIC})

opset: domain='' version=20
input: name='input_ids' type=dtype('int64') shape=['s2', 's70']
input: name='attention_mask' type=dtype('int64') shape=['s43', 's53']
input: name='position_ids' type=dtype('int64') shape=['s2', 's70']
input: name='past_key_0' type=dtype('float16') shape=['s2', 1, 's15', 96]
input: name='past_value_0' type=dtype('float16') shape=['s2', 1, 's95', 96]
init: name='_model.model.embed_tokens.weight' type=float16 shape=(32000, 192)
init: name='_model.model.layers.0.input_layernorm.weight' type=float16 shape=(192,)
init: name='val_11' type=int64 shape=() -- array([0])
init: name='val_13' type=int64 shape=() -- array([1])
init: name='new_ones' type=bool shape=() -- array([ True])
init: name='_to_copy_1' type=float32 shape=(1, 48, 1)
init: name='val_102' type=int64 shape=(1,) -- array([-1])
init: name='val_105' type=float16 shape=(192, 192)
init: name='val_112' type=float16 shape=(192, 96)
init: name='val_119' type=float16 shape=(192, 96)
init: name='val_128' type=int64 shape=(1,) -- array([0])
init: name='val_132' type=int64 shape=(1,) -- array([48])
init: name='val_135' type=int64 shape=(1,) -- array([3])
init: name='val_142' type=int64 shape=(1,) -- array([9223372036854775807])
init: name='val_248' type=float16 shape=() -- array([0.3193], dtype=float16)
init: name='val_265' type=float16 shape=(192, 192)
init: name='val_270' type=float16 shape=(192, 1024)
init: name='val_272' type=float16 shape=(192, 1024)
init: name='val_273' type=float16 shape=(1024, 192)
init: name='val_298' type=float16 shape=(192, 32000)
init: name='val_30' type=int64 shape=(1,) -- array([1])
init: name='val_31' type=int64 shape=(1,) -- array([2])
init: name='val_299' type=int64 shape=(2,) -- array([1, 2])
init: name='val_300' type=int64 shape=(2,) -- array([0, 1])
init: name='val_100' type=float32 shape=() -- array([2.], dtype=float32)
init: name='val_103' type=float32 shape=() -- array([1.e-05], dtype=float32)
init: name='val_110' type=int64 shape=(1,) -- array([96])
init: name='val_238' type=int64 shape=(1,) -- array([-2])
init: name='val_240' type=int64 shape=(1,) -- array([-9223372036854775808])
init: name='val_252' type=float16 shape=() -- array([0.], dtype=float16)
init: name='val_253' type=float16 shape=() -- array([-6.55e+04], dtype=float16)
Cast(attention_mask, to=9) -> _to_copy
Shape(input_ids, end=2, start=1) -> val_1
  Squeeze(val_1) -> sym_size_int_27
    Range(val_11, sym_size_int_27, val_13) -> arange_2
Shape(past_key_0, end=1, start=0) -> val_5
  Squeeze(val_5) -> sym_size_int_32
    Range(val_11, sym_size_int_32, val_13) -> arange
      Unsqueeze(arange, val_299) -> unsqueeze_1
        Unsqueeze(unsqueeze_1, val_135) -> unsqueeze_2
Shape(past_key_0, end=3, start=2) -> val_6
  Squeeze(val_6) -> sym_size_int_33
    Add(sym_size_int_33, sym_size_int_27) -> add_473
      Range(val_11, add_473, val_13) -> arange_3
        Unsqueeze(arange_3, val_300) -> unsqueeze_10
          Unsqueeze(unsqueeze_10, val_31) -> unsqueeze_11
          Max(unsqueeze_2, unsqueeze_11) -> val_60
            Shape(val_60, start=0) -> val_61
          Expand(unsqueeze_2, val_61) -> val_62
            Unsqueeze(val_62, val_102) -> val_64
Shape(past_value_0, end=3, start=2) -> val_8
  Squeeze(val_8) -> sym_size_int_35
    Add(sym_size_int_27, sym_size_int_35) -> add_474
      Reshape(add_474, val_102, allowzero=0) -> val_224
  Concat(val_5, val_30, val_31, val_224, val_110, axis=0) -> val_226
Gather(_model.model.embed_tokens.weight, input_ids, axis=0) -> embedding
  Cast(embedding, to=1) -> _to_copy_5
    Pow(_to_copy_5, val_100) -> pow_1
      ReduceMean(pow_1, val_102, noop_with_empty_axes=0, keepdims=1) -> mean
        Add(mean, val_103) -> add_136
          Sqrt(add_136) -> val_104
            Reciprocal(val_104) -> rsqrt
    Mul(_to_copy_5, rsqrt) -> mul_105
      Cast(mul_105, to=10) -> _to_copy_6
        Mul(_model.model.layers.0.input_layernorm.weight, _to_copy_6) -> mul_112
          MatMul(mul_112, val_105) -> linear
    Add(arange_2, sym_size_int_33) -> add_13
      Unsqueeze(add_13, val_300) -> unsqueeze_7
        Unsqueeze(unsqueeze_7, val_135) -> unsqueeze_8
          LessOrEqual(unsqueeze_11, unsqueeze_8) -> le_2
            And(new_ones, le_2) -> bitwise_and
Expand(unsqueeze_11, val_61) -> val_65
  Unsqueeze(val_65, val_102) -> val_66
    Concat(val_64, val_66, axis=-1) -> val_67
  GatherND(_to_copy, val_67, batch_dims=0) -> val_68
    And(bitwise_and, val_68) -> bitwise_and_1
Reshape(add_473, val_102, allowzero=0) -> val_73
  Concat(val_5, val_30, val_1, val_73, axis=0) -> val_74
    Expand(bitwise_and_1, val_74) -> expand
      Where(expand, val_252, val_253) -> val_254
  Concat(val_5, val_30, val_30, axis=0) -> val_79
    Expand(_to_copy_1, val_79) -> expand_1
Unsqueeze(position_ids, val_30) -> unsqueeze_14
  Cast(unsqueeze_14, to=1) -> _to_copy_2
    MatMul(expand_1, _to_copy_2) -> matmul
      Transpose(matmul, perm=[0,2,1]) -> transpose
        Concat(transpose, transpose, axis=-1) -> cat
          Cos(cat) -> cos
            Cast(cos, to=10) -> _to_copy_3
              Unsqueeze(_to_copy_3, val_30) -> unsqueeze_15
          Sin(cat) -> sin
            Cast(sin, to=10) -> _to_copy_4
              Unsqueeze(_to_copy_4, val_30) -> unsqueeze_16
  Concat(val_5, val_1, val_102, val_110, axis=0) -> val_111
    Reshape(linear, val_111, allowzero=1) -> view
      Transpose(view, perm=[0,2,1,3]) -> transpose_1
        Mul(transpose_1, unsqueeze_15) -> mul_151
MatMul(mul_112, val_112) -> linear_1
  Reshape(linear_1, val_111, allowzero=1) -> view_1
    Transpose(view_1, perm=[0,2,1,3]) -> transpose_2
      Mul(transpose_2, unsqueeze_15) -> mul_174
MatMul(mul_112, val_119) -> linear_2
  Reshape(linear_2, val_111, allowzero=1) -> view_2
    Transpose(view_2, perm=[0,2,1,3]) -> transpose_3
      Concat(past_value_0, transpose_3, axis=-2) -> cat_4
        Unsqueeze(cat_4, val_31) -> unsqueeze_18
    Expand(unsqueeze_18, val_226) -> expand_3
Slice(transpose_1, val_128, val_132, val_135, val_30) -> slice_6
Slice(transpose_1, val_132, val_142, val_135, val_30) -> slice_7
  Neg(slice_7) -> neg
  Concat(neg, slice_6, axis=-1) -> cat_1
    Mul(cat_1, unsqueeze_16) -> mul_167
      Add(mul_151, mul_167) -> add_239
        Mul(add_239, val_248) -> val_249
      Slice(transpose_2, val_128, val_132, val_135, val_30) -> slice_8
Slice(transpose_2, val_132, val_142, val_135, val_30) -> slice_9
  Neg(slice_9) -> neg_1
    Concat(neg_1, slice_8, axis=-1) -> cat_2
      Mul(cat_2, unsqueeze_16) -> mul_192
        Add(mul_174, mul_192) -> add_275
          Concat(past_key_0, add_275, axis=-2) -> cat_3
            Unsqueeze(cat_3, val_31) -> unsqueeze_17
  Concat(val_5, val_30, val_31, val_73, val_110, axis=0) -> val_193
    Expand(unsqueeze_17, val_193) -> expand_2
  Concat(val_5, val_31, val_73, val_110, axis=0) -> val_199
    Reshape(expand_2, val_199, allowzero=1) -> view_3
      Shape(view_3, start=0) -> val_235
        Slice(val_235, val_102, val_142) -> val_237
  Concat(val_5, val_31, val_224, val_110, axis=0) -> val_232
    Reshape(expand_3, val_232, allowzero=1) -> view_4
Slice(val_235, val_238, val_102) -> val_239
  Concat(val_102, val_239, val_237, axis=0) -> val_243
    Reshape(view_3, val_243, allowzero=0) -> val_244
      Transpose(val_244, perm=[0,2,1]) -> val_245
Slice(val_235, val_240, val_238) -> val_241
  Concat(val_241, val_237, val_239, axis=0) -> val_246
    Reshape(val_245, val_246, allowzero=0) -> val_247
      Mul(val_247, val_248) -> val_251
        MatMul(val_249, val_251) -> val_255
        Add(val_255, val_254) -> val_256
          Softmax(val_256, axis=-1) -> val_257
            IsNaN(val_257) -> val_258
            Where(val_258, val_252, val_257) -> val_259
      MatMul(val_259, view_4) -> scaled_dot_product_attention
        Transpose(scaled_dot_product_attention, perm=[0,2,1,3]) -> transpose_4
  Concat(val_5, val_1, val_102, axis=0) -> val_264
    Reshape(transpose_4, val_264, allowzero=1) -> view_5
      MatMul(view_5, val_265) -> linear_3
  Add(embedding, linear_3) -> add_365
    Cast(add_365, to=1) -> _to_copy_7
      Pow(_to_copy_7, val_100) -> pow_2
        ReduceMean(pow_2, val_102, noop_with_empty_axes=0, keepdims=1) -> mean_1
          Add(mean_1, val_103) -> add_382
            Sqrt(add_382) -> val_269
              Reciprocal(val_269) -> rsqrt_1
      Mul(_to_copy_7, rsqrt_1) -> mul_319
        Cast(mul_319, to=10) -> _to_copy_8
          Mul(_model.model.layers.0.input_layernorm.weight, _to_copy_8) -> mul_326
            MatMul(mul_326, val_270) -> linear_4
              Sigmoid(linear_4) -> val_271
              Mul(linear_4, val_271) -> silu
            MatMul(mul_326, val_272) -> linear_5
              Mul(silu, linear_5) -> mul_339
                MatMul(mul_339, val_273) -> linear_6
    Add(add_365, linear_6) -> add_423
      Cast(add_423, to=1) -> _to_copy_9
        Pow(_to_copy_9, val_100) -> pow_3
          ReduceMean(pow_3, val_102, noop_with_empty_axes=0, keepdims=1) -> mean_2
            Add(mean_2, val_103) -> add_440
              Sqrt(add_440) -> val_277
                Reciprocal(val_277) -> rsqrt_2
        Mul(_to_copy_9, rsqrt_2) -> mul_361
          Cast(mul_361, to=10) -> _to_copy_10
            Mul(_model.model.layers.0.input_layernorm.weight, _to_copy_10) -> mul_368
              MatMul(mul_368, val_298) -> linear_7
output: name='linear_7' type=dtype('float16') shape=['s2', 's70', 32000]

tracing#

FAILED

symbolically traced variables cannot be used as inputs to control flow

new-tracing#

FAILED

TracingBool('96*_dyn_70*_dyn_71==0') cannot be converted to a Python bool; the result depends on a symbolic/dynamic dimension. Use torch._check(condition) to assert the condition holds.

TypeBFloat16#

code: yobx.torch.testing._model_eval_cases.TypeBFloat16

forward#

def forward(self, x):
    xb = x.to(torch.bfloat16)
    return (xb + xb).to(torch.float32)

yobx#

inputs: #1[(T1s4x4,)]
shapes: dict(x:{0:Dim(batch)})

opset: domain='' version=21
input: name='x' type=dtype('float32') shape=['batch', 4]
Add(x, x) -> add-x
  Cast(add-x, to=16) -> add
    Cast(add, to=1) -> output_0
output: name='output_0' type=dtype('float32') shape=['batch', 4]

dynamo-ir#

inputs: #1[(T1s4x4,)]
shapes: dict(x:{0:Dim(batch)})

opset: domain='' version=20
input: name='x' type=dtype('float32') shape=['batch', 4]
Cast(x, to=16) -> _to_copy
  Add(_to_copy, _to_copy) -> add_3
    Cast(add_3, to=1) -> _to_copy_1
output: name='_to_copy_1' type=dtype('float32') shape=['batch', 4]

FAILED

[ONNXRuntimeError] : 9 : NOT_IMPLEMENTED : Could not find an implementation for Add(14) node with name 'node_add_3'

tracing#

inputs: #1[(T1s4x4,)]
shapes: dict(x:{0:Dim(batch)})

opset: domain='' version=21
input: name='x' type=dtype('float32') shape=['batch', 4]
Add(x, x) -> add-x
  Cast(add-x, to=16) -> add
    Cast(add, to=1) -> output
output: name='output' type=dtype('float32') shape=['batch', 4]

new-tracing#

inputs: #1[(T1s4x4,)]
shapes: dict(x:{0:Dim(batch)})

opset: domain='' version=21
input: name='x' type=dtype('float32') shape=['batch', 4]
Add(x, x) -> add-x
  Cast(add-x, to=16) -> add_tensor
    Cast(add_tensor, to=1) -> output
output: name='output' type=dtype('float32') shape=['batch', 4]

Vmap#

code: yobx.torch.testing._model_eval_cases.Vmap

forward#

def forward(self, x, y):
    f = lambda x, y: x * y + 1  # noqa: E731
    return torch.vmap(f)(x, y)

yobx#

inputs: #1[(T1s3,T1s3)]
shapes: dict(x:{0:DYNAMIC},y:{0:DYNAMIC})

opset: domain='' version=21
input: name='x' type=dtype('float32') shape=['batch']
input: name='y' type=dtype('float32') shape=['batch']
init: name='init1_s_' type=float32 shape=() -- array([1.], dtype=float32)-- shape_type_compute._cast_inputs.0
Shape(y, end=1, start=0) -> y::Shape:1
Squeeze(x) -> clone_default::Sq
Squeeze(y) -> clone_default_1::Sq
  Mul(clone_default::Sq, clone_default_1::Sq) -> mul
    Add(mul, init1_s_) -> add
  Expand(add, y::Shape:1) -> output_0
output: name='output_0' type=dtype('float32') shape=['batch']

dynamo-ir#

inputs: #1[(T1s3,T1s3)]
shapes: dict(x:{0:DYNAMIC},y:{0:DYNAMIC})

opset: domain='' version=20
input: name='x' type=dtype('float32') shape=['s17']
input: name='y' type=dtype('float32') shape=['s17']
init: name='_to_copy' type=float32 shape=() -- array([1.], dtype=float32)
Mul(x, y) -> mul
  Add(mul, _to_copy) -> add_2
output: name='add_2' type=dtype('float32') shape=['s17']

tracing#

inputs: #1[(T1s3,T1s3)]
shapes: dict(x:{0:DYNAMIC},y:{0:DYNAMIC})

opset: domain='' version=21
input: name='x' type=dtype('float32') shape=['batch']
input: name='y' type=dtype('float32') shape=['batch_2']
init: name='init1_s_::RSh1' type=float32 shape=(1,) -- array([1.], dtype=float32)-- GraphBuilder.constant_folding.from/fold(init1_s_,init7_s1_1)##init1_s_/shape_type_compute._cast_inputs.1(add)##init7_s1_1/Opset.make_node.1/Shape
Mul(x, y) -> mul
  Add(mul, init1_s_::RSh1) -> output
output: name='output' type=dtype('float32') shape=['DYN0^DYN1']

new-tracing#

inputs: #1[(T1s3,T1s3)]
shapes: dict(x:{0:DYNAMIC},y:{0:DYNAMIC})

opset: domain='' version=21
input: name='x' type=dtype('float32') shape=['batch']
input: name='y' type=dtype('float32') shape=['batch_2']
init: name='param_1::RSh1' type=float32 shape=(1,) -- array([1.], dtype=float32)-- GraphBuilder.constant_folding.from/fold(init7_s1_1,param_1)##param_1/DynamoInterpret.placeholder.0##init7_s1_1/Opset.make_node.1/Shape
Mul(x, y) -> mul_tensor
  Add(mul_tensor, param_1::RSh1) -> output
output: name='output' type=dtype('float32') shape=['batch']

VmapPython#

code: yobx.torch.testing._model_eval_cases.VmapPython

forward#

def forward(self, x, y):
    f = lambda x, y: x * y + 1  # noqa: E731
    return patched_vmap(f)(x, y)

yobx#

inputs: #1[(T1s3,T1s3)]
shapes: dict(x:{0:DYNAMIC},y:{0:DYNAMIC})

opset: domain='' version=21
opset: domain='local_functions' version=1
input: name='x' type=dtype('float32') shape=['batch']
input: name='y' type=dtype('float32') shape=['batch_2']
init: name='init1_s_2_cst2init' type=float32 shape=() -- array([1.], dtype=float32)-- GraphBuilderPatternOptimization.make_initializer.1/Small
Scan(x, y, body=G1, num_scan_inputs=2, scan_input_directions=[0,0], scan_output_axes=[0], scan_output_directions=[0]) -> output_0
output: name='output_0' type=dtype('float32') shape=['batch']
----- subgraph ---- Scan - scan - att.body=G1 -- level=1 -- scan_0_movedim,scan_1_movedim_1 -> output_0
input: name='scan_0_movedim' type=dtype('float32') shape=None
input: name='scan_1_movedim_1' type=dtype('float32') shape=None
Mul(scan_0_movedim, scan_1_movedim_1) -> mul2
Add(mul2, init1_s_2_cst2init) -> output_0
output: name='output_0' type=dtype('float32') shape=None

dynamo-ir#

inputs: #1[(T1s3,T1s3)]
shapes: dict(x:{0:DYNAMIC},y:{0:DYNAMIC})

opset: domain='' version=20
input: name='x' type=dtype('float32') shape=['s77']
input: name='y' type=dtype('float32') shape=['s17']
init: name='scalar_tensor_default' type=float32 shape=() -- array([1.], dtype=float32)
Scan(x, y, body=G1, num_scan_inputs=2, scan_input_directions=[0,0], scan_output_directions=[0]) -> getitem
output: name='getitem' type=dtype('float32') shape=['s77']
----- subgraph ---- Scan - node_scan__0 - att.body=G1 -- level=1 -- permute_scan_combine_graph_0__subgraph_in,permute_1_scan_combine_graph_0__subgraph_in -> add_scan_combine_graph_0
input: name='permute_scan_combine_graph_0__subgraph_in' type=dtype('float32') shape=None
input: name='permute_1_scan_combine_graph_0__subgraph_in' type=dtype('float32') shape=None
Mul(permute_scan_combine_graph_0__subgraph_in, permute_1_scan_combine_graph_0__subgraph_in) -> mul
Add(mul, scalar_tensor_default) -> add_scan_combine_graph_0
output: name='add_scan_combine_graph_0' type=dtype('float32') shape=None

tracing#

FAILED

'CustomProxyInt' object cannot be interpreted as an integer

new-tracing#

inputs: #1[(T1s3,T1s3)]
shapes: dict(x:{0:DYNAMIC},y:{0:DYNAMIC})

opset: domain='' version=21
opset: domain='local_functions' version=1
input: name='x' type=dtype('float32') shape=['batch']
input: name='y' type=dtype('float32') shape=['batch_2']
init: name='init1_s_2_cst2init' type=float32 shape=() -- array([1.], dtype=float32)-- GraphBuilderPatternOptimization.make_initializer.1/Small
Scan(x, y, body=G1, num_scan_inputs=2, scan_input_directions=[0,0], scan_output_axes=[0], scan_output_directions=[0]) -> output
output: name='output' type=dtype('float32') shape=['batch']
----- subgraph ---- Scan - scan - att.body=G1 -- level=1 -- scan_0_permute_default,scan_1_permute_default_1 -> output_0
input: name='scan_0_permute_default' type=dtype('float32') shape=None
input: name='scan_1_permute_default_1' type=dtype('float32') shape=None
Mul(scan_0_permute_default, scan_1_permute_default_1) -> mul_tensor2
Add(mul_tensor2, init1_s_2_cst2init) -> output_0
output: name='output_0' type=dtype('float32') shape=None

Summary#

case	dynamo-ir	new-tracing	tracing	yobx
AtenAsStrided	6	3	3	3
AtenInterpolate	1	3	3	3
AtenNnFunctionalBilinear	8	6	6	6
AtenNonZero	2	2	2	2
AtenNonZeroTuple	6	5	4	4
AtenRollPos	FAIL	3	3	3
AtenRollRelu	6	4	4	4
BuildInIsInstance	6	6	6	6
BuildInLen	FAIL	FAIL	FAIL	FAIL
ComplexPolar	FAIL	FAIL	FAIL	FAIL
ControlFlowCond	3	3	3	3
ControlFlowCond2Inputs	3	3	3	3
ControlFlowCond2Outputs	3	3	3	3
ControlFlowCondConstant	3	3	3	3
ControlFlowCondIdentity_153832	FAIL	FAIL	FAIL	FAIL
ControlFlowCondNestedModule	3	3	3	3
ControlFlowCondNonZero	FAIL	7	3	FAIL
ControlFlowIndirectRanks	1	1	1	1
ControlFlowIndirectRanksCat	3	3	3	3
ControlFlowNestCond	3	3	3	3
ControlFlowNumelZero1	3	3	3	3
ControlFlowNumelZero2	3	3	3	3
ControlFlowNumelZero3	3	3	3	3
ControlFlowNumelZero4	3	3	3	3
ControlFlowNumelZero5	0	1	1	1
ControlFlowRanks	1	1	1	1
ControlFlowRanksType	1	1	1	1
ControlFlowScan	FAIL	2	2	2
ControlFlowScan2Carried	FAIL	4	4	4
ControlFlowScanCDist	1	1	1	1
ControlFlowScanCDist2	1	2	2	2
ControlFlowScanCDistXY	1	1	1	1
ControlFlowScanDecomposition_151564	1	1	1	1
ControlFlowScanInplace_153705	FAIL	FAIL	FAIL	FAIL
ControlFlowShapeCheck	7	8	FAIL	FAIL
ControlFlowWhile	FAIL	2	FAIL	FAIL
ControlFlowWhileDec	FAIL	2	FAIL	FAIL
ControlFlowWhileInc	FAIL	4	FAIL	FAIL
CreateFromShape	7	5	7	5
CreateFromShapeThroughFunction	7	5	7	5
CropLastDimensionWithTensorContent	4	1	1	1
CropLastDimensionWithTensorShape	2	2	2	2
DynamicCacheInput	5	5	5	5
DynamicCacheInputMixedLayers	5	5	FAIL	5
ExportWithDimension0	6	6	6	6
ExportWithDimension1	6	6	6	6
ExportWithNewConstant	4	4	4	4
ExportWithNewConstantTo	4	4	4	4
InplaceAdd	1	1	1	1
InplaceAdd2	1	1	1	1
InplaceAdd_Mul	2	2	2	2
InplaceCloneAdd_	1	1	1	1
InplaceSetItemEllipsis_1	FAIL	2	2	FAIL
InplaceSetItemEllipsis_2	FAIL	1	1	FAIL
InplaceSetItemExp	27	8	8	FAIL
InplaceSetItemMask	2	2	2	2
InplaceSetItemSquare	19	1	1	11
InplaceSetItemSquareAdd	20	2	2	11
InplaceSetItemSquareAdd2	21	3	3	12
LayerNorm	1	1	1	1
ShapeAndTypeAndDeviceBased	7	5	7	5
ShapeAndTypeBased	7	5	7	5
ShapeBased	7	5	7	5
SignatureFloat1	FAIL	FAIL	FAIL	FAIL
SignatureInt1	FAIL	FAIL	FAIL	FAIL
SignatureInt2	5	5	FAIL	5
SignatureListFixedLength	6	6	6	6
SignatureListFixedWithNone	FAIL	FAIL	FAIL	FAIL
SignatureListVariableLength	FAIL	FAIL	FAIL	FAIL
SignatureShapeAsIndex	5	5	8	5
TinyLLM	136	FAIL	FAIL	84
TinyLLMfp16	144	FAIL	FAIL	90
TypeBFloat16	FAIL	3	3	3
Vmap	2	2	2	6
VmapPython	1	1	FAIL	1

[runpythonerror] E0519 11:39:49.433000 36374 torch/_subclasses/fake_tensor.py:2964] failed while attempting to run meta for aten.complex.default E0519 11:39:49.433000 36374 torch/_subclasses/fake_tensor.py:2964] Traceback (most recent call last): E0519 11:39:49.433000 36374 torch/_subclasses/fake_tensor.py:2964] File “~/vv/this312/lib/python3.12/site-packages/torch/_subclasses/fake_tensor.py”, line 2960, in _dispatch_impl E0519 11:39:49.433000 36374 torch/_subclasses/fake_tensor.py:2964] r = func(args, **kwargs) E0519 11:39:49.433000 36374 torch/_subclasses/fake_tensor.py:2964] ^^^^^^^^^^^^^^^^^^^^^ E0519 11:39:49.433000 36374 torch/_subclasses/fake_tensor.py:2964] File “~/vv/this312/lib/python3.12/site-packages/torch/_ops.py”, line 871, in __call__ E0519 11:39:49.433000 36374 torch/_subclasses/fake_tensor.py:2964] return self._op(*args, **kwargs) E0519 11:39:49.433000 36374 torch/_subclasses/fake_tensor.py:2964] ^^^^^^^^^^^^^^^^^^^^^^^^^ E0519 11:39:49.433000 36374 torch/_subclasses/fake_tensor.py:2964] File “~/vv/this312/lib/python3.12/site-packages/torch/_prims_common/wrappers.py”, line 315, in _fn E0519 11:39:49.433000 36374 torch/_subclasses/fake_tensor.py:2964] result = fn(*args, **kwargs) E0519 11:39:49.433000 36374 torch/_subclasses/fake_tensor.py:2964] ^^^^^^^^^^^^^^^^^^^ E0519 11:39:49.433000 36374 torch/_subclasses/fake_tensor.py:2964] File “~/vv/this312/lib/python3.12/site-packages/torch/_meta_registrations.py”, line 3586, in meta_complex E0519 11:39:49.433000 36374 torch/_subclasses/fake_tensor.py:2964] real.to(corresponding_complex_dtype(real.dtype)), E0519 11:39:49.433000 36374 torch/_subclasses/fake_tensor.py:2964] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ E0519 11:39:49.433000 36374 torch/_subclasses/fake_tensor.py:2964] File “~/vv/this312/lib/python3.12/site-packages/torch/utils/_stats.py”, line 29, in wrapper E0519 11:39:49.433000 36374 torch/_subclasses/fake_tensor.py:2964] return fn(*args, **kwargs) E0519 11:39:49.433000 36374 torch/_subclasses/fake_tensor.py:2964] ^^^^^^^^^^^^^^^^^^^ E0519 11:39:49.433000 36374 torch/_subclasses/fake_tensor.py:2964] File “~/vv/this312/lib/python3.12/site-packages/torch/_subclasses/fake_tensor.py”, line 924, in __torch_dispatch__ E0519 11:39:49.433000 36374 torch/_subclasses/fake_tensor.py:2964] return handler(args) E0519 11:39:49.433000 36374 torch/_subclasses/fake_tensor.py:2964] ^^^^^^^^^^^^^ E0519 11:39:49.433000 36374 torch/_subclasses/fake_tensor.py:2964] File “~/vv/this312/lib/python3.12/site-packages/torch/_subclasses/fake_tensor.py”, line 3448, in <lambda> E0519 11:39:49.433000 36374 torch/_subclasses/fake_tensor.py:2964] torch.ops.aten.size.default: lambda args: tuple( E0519 11:39:49.433000 36374 torch/_subclasses/fake_tensor.py:2964] ^^^^^^ E0519 11:39:49.433000 36374 torch/_subclasses/fake_tensor.py:2964] File “~/vv/this312/lib/python3.12/site-packages/torch/_subclasses/fake_tensor.py”, line 3449, in <genexpr> E0519 11:39:49.433000 36374 torch/_subclasses/fake_tensor.py:2964] int(s) for s in cast(Tensor, args[0]).size() E0519 11:39:49.433000 36374 torch/_subclasses/fake_tensor.py:2964] ^^^^^^ E0519 11:39:49.433000 36374 torch/_subclasses/fake_tensor.py:2964] File “~/vv/this312/lib/python3.12/site-packages/torch/__init__.py”, line 463, in __int__ E0519 11:39:49.433000 36374 torch/_subclasses/fake_tensor.py:2964] return self.node.int_() E0519 11:39:49.433000 36374 torch/_subclasses/fake_tensor.py:2964] ^^^^^^^^^^^^^^^^ E0519 11:39:49.433000 36374 torch/_subclasses/fake_tensor.py:2964] File “~/vv/this312/lib/python3.12/site-packages/torch/fx/experimental/sym_node.py”, line 502, in int_ E0519 11:39:49.433000 36374 torch/_subclasses/fake_tensor.py:2964] return self.guard_int(“”, 0) # NB: uses Python backtrace E0519 11:39:49.433000 36374 torch/_subclasses/fake_tensor.py:2964] ^^^^^^^^^^^^^^^^^^^^^ E0519 11:39:49.433000 36374 torch/_subclasses/fake_tensor.py:2964] File “~/vv/this312/lib/python3.12/site-packages/torch/fx/experimental/sym_node.py”, line 556, in guard_int E0519 11:39:49.433000 36374 torch/_subclasses/fake_tensor.py:2964] r = self.evaluate() E0519 11:39:49.433000 36374 torch/_subclasses/fake_tensor.py:2964] ^^^^^^^^^^^^^^^ E0519 11:39:49.433000 36374 torch/_subclasses/fake_tensor.py:2964] File “~/vv/this312/lib/python3.12/site-packages/torch/fx/experimental/sym_node.py”, line 550, in evaluate E0519 11:39:49.433000 36374 torch/_subclasses/fake_tensor.py:2964] return self.shape_env.evaluate_sym_node(self, size_oblivious) E0519 11:39:49.433000 36374 torch/_subclasses/fake_tensor.py:2964] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ E0519 11:39:49.433000 36374 torch/_subclasses/fake_tensor.py:2964] File “~/vv/this312/lib/python3.12/site-packages/torch/fx/experimental/symbolic_shapes.py”, line 7871, in evaluate_sym_node E0519 11:39:49.433000 36374 torch/_subclasses/fake_tensor.py:2964] return self.evaluate_expr( E0519 11:39:49.433000 36374 torch/_subclasses/fake_tensor.py:2964] ^^^^^^^^^^^^^^^^^^^ E0519 11:39:49.433000 36374 torch/_subclasses/fake_tensor.py:2964] File “~/vv/this312/lib/python3.12/site-packages/torch/fx/experimental/symbolic_shapes.py”, line 7967, in evaluate_expr E0519 11:39:49.433000 36374 torch/_subclasses/fake_tensor.py:2964] return self._inner_evaluate_expr( E0519 11:39:49.433000 36374 torch/_subclasses/fake_tensor.py:2964] ^^^^^^^^^^^^^^^^^^^^^^^^^^ E0519 11:39:49.433000 36374 torch/_subclasses/fake_tensor.py:2964] File “~/vv/this312/lib/python3.12/site-packages/torch/fx/experimental/recording.py”, line 297, in wrapper E0519 11:39:49.433000 36374 torch/_subclasses/fake_tensor.py:2964] return retlog(fn(*args, **kwargs)) E0519 11:39:49.433000 36374 torch/_subclasses/fake_tensor.py:2964] ^^^^^^^^^^^^^^^^^^^ E0519 11:39:49.433000 36374 torch/_subclasses/fake_tensor.py:2964] File “~/vv/this312/lib/python3.12/site-packages/torch/fx/experimental/symbolic_shapes.py”, line 7990, in _inner_evaluate_expr E0519 11:39:49.433000 36374 torch/_subclasses/fake_tensor.py:2964] return self._evaluate_expr( E0519 11:39:49.433000 36374 torch/_subclasses/fake_tensor.py:2964] ^^^^^^^^^^^^^^^^^^^^ E0519 11:39:49.433000 36374 torch/_subclasses/fake_tensor.py:2964] File “~/vv/this312/lib/python3.12/site-packages/torch/fx/experimental/symbolic_shapes.py”, line 8223, in _evaluate_expr E0519 11:39:49.433000 36374 torch/_subclasses/fake_tensor.py:2964] raise self._make_data_dependent_error( E0519 11:39:49.433000 36374 torch/_subclasses/fake_tensor.py:2964] torch.fx.experimental.symbolic_shapes.GuardOnDataDependentSymNode: Could not extract specialized integer from data-dependent expression u0 (unhinted: u0). (Size-like symbols: none) E0519 11:39:49.433000 36374 torch/_subclasses/fake_tensor.py:2964] E0519 11:39:49.433000 36374 torch/_subclasses/fake_tensor.py:2964] E0519 11:39:49.433000 36374 torch/_subclasses/fake_tensor.py:2964] Caused by: (utils/_stats.py:29 in wrapper) E0519 11:39:49.433000 36374 torch/_subclasses/fake_tensor.py:2964] For more information, run with TORCH_LOGS=”dynamic” E0519 11:39:49.433000 36374 torch/_subclasses/fake_tensor.py:2964] For extended logs when we create symbols, also add TORCHDYNAMO_EXTENDED_DEBUG_CREATE_SYMBOL=”u0” E0519 11:39:49.433000 36374 torch/_subclasses/fake_tensor.py:2964] If you suspect the guard was triggered from C++, add TORCHDYNAMO_EXTENDED_DEBUG_CPP=1 E0519 11:39:49.433000 36374 torch/_subclasses/fake_tensor.py:2964] For more debugging help, see https://docs.google.com/document/d/1HSuTTVvYH1pTew89Rtpeu84Ht3nQEFTYhAX3Ypa_xJs/edit?usp=sharing E0519 11:39:49.433000 36374 torch/_subclasses/fake_tensor.py:2964] E0519 11:39:49.433000 36374 torch/_subclasses/fake_tensor.py:2964] For C++ stack trace, run with TORCHDYNAMO_EXTENDED_DEBUG_CPP=1 E0519 11:40:42.306000 36374 torch/_guards.py:370] Error while creating guard: E0519 11:40:42.306000 36374 torch/_guards.py:370] Name: ‘’ E0519 11:40:42.306000 36374 torch/_guards.py:370] Source: shape_env E0519 11:40:42.306000 36374 torch/_guards.py:370] Create Function: SHAPE_ENV E0519 11:40:42.306000 36374 torch/_guards.py:370] Guard Types: None E0519 11:40:42.306000 36374 torch/_guards.py:370] Code List: None E0519 11:40:42.306000 36374 torch/_guards.py:370] Object Weakref: None E0519 11:40:42.306000 36374 torch/_guards.py:370] Guarded Class Weakref: None E0519 11:40:42.306000 36374 torch/_guards.py:370] Traceback (most recent call last): E0519 11:40:42.306000 36374 torch/_guards.py:370] File “~/vv/this312/lib/python3.12/site-packages/torch/_guards.py”, line 368, in create E0519 11:40:42.306000 36374 torch/_guards.py:370] return self.create_fn(builder, self) E0519 11:40:42.306000 36374 torch/_guards.py:370] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ E0519 11:40:42.306000 36374 torch/_guards.py:370] File “~/vv/this312/lib/python3.12/site-packages/torch/_dynamo/guards.py”, line 3204, in SHAPE_ENV E0519 11:40:42.306000 36374 torch/_guards.py:370] python_code_parts, verbose_code_parts = _get_code_parts( E0519 11:40:42.306000 36374 torch/_guards.py:370] ^^^^^^^^^^^^^^^^ E0519 11:40:42.306000 36374 torch/_guards.py:370] File “~/vv/this312/lib/python3.12/site-packages/torch/_dynamo/guards.py”, line 3177, in _get_code_parts E0519 11:40:42.306000 36374 torch/_guards.py:370] return output_graph.shape_env.produce_guards_verbose( E0519 11:40:42.306000 36374 torch/_guards.py:370] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ E0519 11:40:42.306000 36374 torch/_guards.py:370] File “~/vv/this312/lib/python3.12/site-packages/torch/fx/experimental/symbolic_shapes.py”, line 5864, in produce_guards_verbose E0519 11:40:42.306000 36374 torch/_guards.py:370] raise ConstraintViolationError( E0519 11:40:42.306000 36374 torch/_guards.py:370] torch.fx.experimental.symbolic_shapes.ConstraintViolationError: L[‘flat_args’][1].size()[0] = 8192 is not equal to L[‘flat_args’][0].size()[0] = 4 E0519 11:40:42.313000 36374 torch/_guards.py:372] Created at: E0519 11:40:42.313000 36374 torch/_guards.py:372] File “~/vv/this312/lib/python3.12/site-packages/torch/_dynamo/convert_frame.py”, line 906, in trace_frame E0519 11:40:42.313000 36374 torch/_guards.py:372] tracer = InstructionTranslator( E0519 11:40:42.313000 36374 torch/_guards.py:372] File “~/vv/this312/lib/python3.12/site-packages/torch/_dynamo/symbolic_convert.py”, line 4937, in __init__ E0519 11:40:42.313000 36374 torch/_guards.py:372] output=OutputGraph( E0519 11:40:42.313000 36374 torch/_guards.py:372] File “~/vv/this312/lib/python3.12/site-packages/torch/_dynamo/output_graph.py”, line 706, in __init__ E0519 11:40:42.313000 36374 torch/_guards.py:372] self.init_ambient_guards() E0519 11:40:42.313000 36374 torch/_guards.py:372] File “~/vv/this312/lib/python3.12/site-packages/torch/_dynamo/output_graph.py”, line 1047, in init_ambient_guards E0519 11:40:42.313000 36374 torch/_guards.py:372] self.guards.add(ShapeEnvSource().make_guard(GuardBuilder.SHAPE_ENV)) E0519 11:40:43.210000 36374 torch/_guards.py:370] Error while creating guard: E0519 11:40:43.210000 36374 torch/_guards.py:370] Name: ‘’ E0519 11:40:43.210000 36374 torch/_guards.py:370] Source: shape_env E0519 11:40:43.210000 36374 torch/_guards.py:370] Create Function: SHAPE_ENV E0519 11:40:43.210000 36374 torch/_guards.py:370] Guard Types: None E0519 11:40:43.210000 36374 torch/_guards.py:370] Code List: None E0519 11:40:43.210000 36374 torch/_guards.py:370] Object Weakref: None E0519 11:40:43.210000 36374 torch/_guards.py:370] Guarded Class Weakref: None E0519 11:40:43.210000 36374 torch/_guards.py:370] Traceback (most recent call last): E0519 11:40:43.210000 36374 torch/_guards.py:370] File “~/vv/this312/lib/python3.12/site-packages/torch/_guards.py”, line 368, in create E0519 11:40:43.210000 36374 torch/_guards.py:370] return self.create_fn(builder, self) E0519 11:40:43.210000 36374 torch/_guards.py:370] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ E0519 11:40:43.210000 36374 torch/_guards.py:370] File “~/vv/this312/lib/python3.12/site-packages/torch/_dynamo/guards.py”, line 3204, in SHAPE_ENV E0519 11:40:43.210000 36374 torch/_guards.py:370] python_code_parts, verbose_code_parts = _get_code_parts( E0519 11:40:43.210000 36374 torch/_guards.py:370] ^^^^^^^^^^^^^^^^ E0519 11:40:43.210000 36374 torch/_guards.py:370] File “~/vv/this312/lib/python3.12/site-packages/torch/_dynamo/guards.py”, line 3177, in _get_code_parts E0519 11:40:43.210000 36374 torch/_guards.py:370] return output_graph.shape_env.produce_guards_verbose( E0519 11:40:43.210000 36374 torch/_guards.py:370] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ E0519 11:40:43.210000 36374 torch/_guards.py:370] File “~/vv/this312/lib/python3.12/site-packages/torch/fx/experimental/symbolic_shapes.py”, line 5864, in produce_guards_verbose E0519 11:40:43.210000 36374 torch/_guards.py:370] raise ConstraintViolationError( E0519 11:40:43.210000 36374 torch/_guards.py:370] torch.fx.experimental.symbolic_shapes.ConstraintViolationError: L[‘flat_args’][1].size()[0] = 8192 is not equal to L[‘flat_args’][0].size()[0] = 4 E0519 11:40:43.211000 36374 torch/_guards.py:372] Created at: E0519 11:40:43.211000 36374 torch/_guards.py:372] File “~/vv/this312/lib/python3.12/site-packages/torch/_dynamo/convert_frame.py”, line 906, in trace_frame E0519 11:40:43.211000 36374 torch/_guards.py:372] tracer = InstructionTranslator( E0519 11:40:43.211000 36374 torch/_guards.py:372] File “~/vv/this312/lib/python3.12/site-packages/torch/_dynamo/symbolic_convert.py”, line 4937, in __init__ E0519 11:40:43.211000 36374 torch/_guards.py:372] output=OutputGraph( E0519 11:40:43.211000 36374 torch/_guards.py:372] File “~/vv/this312/lib/python3.12/site-packages/torch/_dynamo/output_graph.py”, line 706, in __init__ E0519 11:40:43.211000 36374 torch/_guards.py:372] self.init_ambient_guards() E0519 11:40:43.211000 36374 torch/_guards.py:372] File “~/vv/this312/lib/python3.12/site-packages/torch/_dynamo/output_graph.py”, line 1047, in init_ambient_guards E0519 11:40:43.211000 36374 torch/_guards.py:372] self.guards.add(ShapeEnvSource().make_guard(GuardBuilder.SHAPE_ENV)) [0;93m2026-05-19 11:40:51.637030387 [W:onnxruntime:, graph.cc:122 MergeShapeInfo] Error merging shape info for output. ‘native_layer_norm_default#1’ source:{-1,1} target:{4}. Falling back to lenient merge.[m [0;93m2026-05-19 11:40:51.637113492 [W:onnxruntime:, graph.cc:122 MergeShapeInfo] Error merging shape info for output. ‘native_layer_norm_default#2’ source:{-1,1} target:{4}. Falling back to lenient merge.[m [1;31m2026-05-19 11:40:56.359052224 [E:onnxruntime:, sequential_executor.cc:615 ExecuteKernel] Non-zero status code returned while running Concat node. Name:’_getitem_slicenSD’ Status Message: ~/github/onnxruntime/onnxruntime/core/providers/cpu/tensor/concatbase.h:118 onnxruntime::common::Status onnxruntime::ConcatBase::PrepareForComputeImpl(KernelContextType, const InlinedTensorsVector&, onnxruntime::Prepare&) const [with KernelContextType = onnxruntime::OpKernelContext; InlinedTensorsVector = absl::lts_20250814::InlinedVector<const onnxruntime::Tensor*, 5, std::allocator<const onnxruntime::Tensor*> >] input_rank == reference_rank was false. Ranks of input data are different, cannot concatenate them. expected rank: 1 got: 2 [m [1;31m2026-05-19 11:40:56.360215039 [E:onnxruntime:, sequential_executor.cc:615 ExecuteKernel] Non-zero status code returned while running Concat node. Name:’_getitem_slicenSD’ Status Message: ~/github/onnxruntime/onnxruntime/core/providers/cpu/tensor/concatbase.h:118 onnxruntime::common::Status onnxruntime::ConcatBase::PrepareForComputeImpl(KernelContextType*, const InlinedTensorsVector&, onnxruntime::Prepare&) const [with KernelContextType = onnxruntime::OpKernelContext; InlinedTensorsVector = absl::lts_20250814::InlinedVector<const onnxruntime::Tensor*, 5, std::allocator<const onnxruntime::Tensor*> >] input_rank == reference_rank was false. Ranks of input data are different, cannot concatenate them. expected rank: 1 got: 2 [m [1;31m2026-05-19 11:40:57.748757819 [E:onnxruntime:, sequential_executor.cc:615 ExecuteKernel] Non-zero status code returned while running Concat node. Name:’_getitem_slicenSD’ Status Message: ~/github/onnxruntime/onnxruntime/core/providers/cpu/tensor/concatbase.h:118 onnxruntime::common::Status onnxruntime::ConcatBase::PrepareForComputeImpl(KernelContextType*, const InlinedTensorsVector&, onnxruntime::Prepare&) const [with KernelContextType = onnxruntime::OpKernelContext; InlinedTensorsVector = absl::lts_20250814::InlinedVector<const onnxruntime::Tensor*, 5, std::allocator<const onnxruntime::Tensor*> >] input_rank == reference_rank was false. Ranks of input data are different, cannot concatenate them. expected rank: 1 got: 2 [m [transformers] loss_type=None was set in the config but it is unrecognized. Using the default loss: ForCausalLMLoss.