Pattern Optimizer¶
The pattern optimizer is implemented by class GraphBuilderPatternOptimization.
It searches for a specific sequence of nodes in the graph and
replaces it by another one without changing the inputs or the outputs
of the graph. The goal of the optimizer is to make the whole computation
graph more efficient. The goal of this implementation is to make this
optimization as fast as possible.
Assuming the nodes in an onnx graph are ordered in a way every input of a
node was created by previous nodes, the optimizer must not require
any global reordering. The cost should be in O(N P I) in the worst
case where N is the number of nodes, P is the number of patterns,
I is the number of iterations.
It is difficult to foresee what a pattern needs in order to rewrite a part of the graph. This API tries to give as much freedom as it can without leaving too much work to the developer who adds a new pattern.
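The ordering assumption above can be checked in a single pass over the nodes. The sketch below is only an illustration: it uses plain tuples instead of onnx NodeProto objects, and the name `is_topologically_ordered` is made up for this page, it is not part of the library.

```python
from typing import Iterable, List, Sequence, Set, Tuple

# A toy node: (op_type, input names, output names). Not an onnx NodeProto.
ToyNode = Tuple[str, Sequence[str], Sequence[str]]


def is_topologically_ordered(
    nodes: Iterable[ToyNode], graph_inputs: Iterable[str]
) -> bool:
    """Returns True if every input of a node is a graph input,
    an initializer, or an output of a previous node."""
    known: Set[str] = set(graph_inputs)
    for _op_type, inputs, outputs in nodes:
        # An empty name stands for a missing optional input.
        if any(name and name not in known for name in inputs):
            return False
        known.update(outputs)
    return True


nodes: List[ToyNode] = [
    ("MatMul", ["x", "w"], ["mm"]),
    ("Add", ["mm", "b"], ["y"]),
]
ordered = is_topologically_ordered(nodes, ["x", "w", "b"])
reversed_ok = is_topologically_ordered(nodes[::-1], ["x", "w", "b"])
print(ordered, reversed_ok)  # prints: True False
```

The check visits each node once, which is the kind of cost a pattern optimizer can afford at every iteration without reordering the graph.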
Patterns¶
Patterns must inherit from PatternOptimization.
This class defines two methods.
PatternOptimization.match¶
def match(
    self,
    g: "GraphBuilderPatternOptimization",
    node: NodeProto,
    matched: List[MatchResult],
) -> Optional[MatchResult]:
g: a GraphBuilderPatternOptimization. It holds all the existing nodes and can return any information about types, shapes, the node before, or the node after another one.
node: the matching must determine whether some nodes around this one are part of a set of nodes this pattern optimizer can rewrite. From there, the function explores wherever it needs, checking any condition it needs.
matched: usually unused, it contains the matches already found, that is to say the nodes already matching a pattern.
The method must not modify the graph.
The method returns None if no match is found or an instance of class MatchResult. It must contain:
A list of nodes involved in the rewriting. It does not mean all of them will be removed, but all of them are needed to do the rewriting and must not be impacted by other pattern optimizers.
A function doing the rewriting (usually method apply of the pattern class).
An existing node where the rewritten nodes can be inserted. Knowing it makes the rewriting faster. If not specified, the optimizer will automatically determine the position of the new nodes.
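To make the three items concrete, here is a minimal sketch of what such a result could hold. `ToyNode`, `ToyMatchResult` and their field names are hypothetical stand-ins written for this page; the real MatchResult class may be organized differently.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class ToyNode:
    # Simplified stand-in for an onnx NodeProto.
    op_type: str
    inputs: List[str]
    outputs: List[str]


@dataclass
class ToyMatchResult:
    # Nodes involved in the rewriting; other patterns must not touch them.
    nodes: List[Optional[ToyNode]]
    # Function doing the rewriting; it receives the nodes above.
    apply: Callable[..., List[ToyNode]]
    # Optional existing node where the new nodes can be inserted.
    insert_at: Optional[ToyNode] = None


def fuse(matmul: ToyNode, add: ToyNode) -> List[ToyNode]:
    # Replaces MatMul + Add by a single Gemm (shapes are not checked here).
    return [ToyNode("Gemm", [*matmul.inputs, add.inputs[1]], add.outputs)]


mm = ToyNode("MatMul", ["x", "w"], ["mm"])
add = ToyNode("Add", ["mm", "b"], ["y"])
result = ToyMatchResult(nodes=[mm, add], apply=fuse, insert_at=add)
new_nodes = result.apply(*result.nodes)
print(new_nodes[0])
```

Keeping the rewriting function inside the result lets the optimizer collect all matches first and apply them later, without calling back into the pattern's matching logic.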
Debugging: method none
def none(
    self,
    node: Optional[NodeProto] = None,
    lineno: Optional[int] = None,
    msg: str = "",
):
It may be useful to know why a pattern failed to match. Instead of returning None, method match can return the following expression:
return self.none(node, inspect.currentframe().f_lineno)
By setting the verbosity (see section Verbosity), the user can then see which lines in the code returned None and which condition failed.
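The mechanism can be reproduced with the standard library alone. The class below is a hypothetical, simplified pattern written for this page (it matches strings, not NodeProto, and its message format is not the library's); it only shows how `inspect.currentframe().f_lineno` identifies the line that rejected a node.

```python
import inspect
from typing import List, Optional, Tuple


class ToyPattern:
    """Hypothetical pattern recording why a match failed."""

    def __init__(self, verbose: int = 0):
        self.verbose = verbose
        self.failures: List[Tuple[Optional[str], Optional[int]]] = []

    def none(
        self,
        node: Optional[str] = None,
        lineno: Optional[int] = None,
        msg: str = "",
    ):
        # Records the failing line, then returns None like a failed match.
        self.failures.append((node, lineno))
        if self.verbose:
            name = self.__class__.__name__
            print(f"[{name}.match] NONE - line: {lineno}, node={node} {msg}")
        return None

    def match(self, node: str) -> Optional[List[str]]:
        if node != "Add":
            # f_lineno is the line number of this very statement.
            return self.none(node, inspect.currentframe().f_lineno)
        return [node]


p = ToyPattern(verbose=1)
r1 = p.match("Relu")  # rejected, one line is logged
r2 = p.match("Add")   # accepted
```

Because every rejection goes through the same helper, each early exit in a long match method can be told apart by its line number alone.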
PatternOptimization.apply¶
@classmethod
def apply(
    cls, g: "GraphBuilder", *nodes: Sequence[NodeProto]
) -> List[NodeProto]:
The method does the rewriting. It assumes the rewriting can happen. It takes the list of nodes impacted by the rewriting and assumes no other pattern optimizer modified them or will modify them. It receives the list of nodes returned by method match. Since they are passed as a list of arguments, method match can include None values. The method returns the new nodes. The optimizer considers that any node given to this function is removed from the graph, and any node returned by it is added. If a received node must be kept, it must be added to the list of returned nodes.
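The removed/added bookkeeping can be illustrated with a small sketch. Everything here is hypothetical: nodes are plain tuples, `apply_rewrite` is not a library function, and inserting the new nodes where the last matched node was is a choice made for this example only. The rewrite returns its first argument to show how a received node is kept.

```python
from typing import Callable, List, Optional, Tuple

# Toy node: (op_type, inputs, outputs). Not an onnx NodeProto.
ToyNode = Tuple[str, Tuple[str, ...], Tuple[str, ...]]


def apply_rewrite(
    graph: List[ToyNode],
    matched: List[Optional[ToyNode]],
    rewrite: Callable[..., List[ToyNode]],
) -> List[ToyNode]:
    """Drops every matched node and inserts the nodes returned by
    ``rewrite`` where the last matched node was."""
    new_nodes = rewrite(*matched)
    removed = [n for n in matched if n is not None]
    position = max(graph.index(n) for n in removed)
    result: List[ToyNode] = []
    for i, node in enumerate(graph):
        if i == position:
            result.extend(new_nodes)
        elif node not in removed:
            result.append(node)
    return result


def drop_second_transpose(t1: ToyNode, t2: ToyNode) -> List[ToyNode]:
    # Assumes the two permutations cancel out. t1 is returned as well
    # because its output still feeds another node, so it must be kept.
    return [t1, ("Identity", t1[1], t2[2])]


t1 = ("Transpose", ("x",), ("a",))
t2 = ("Transpose", ("a",), ("b",))
relu = ("Relu", ("a",), ("c",))
new_graph = apply_rewrite([t1, t2, relu], [t1, t2], drop_second_transpose)
for node in new_graph:
    print(node)
```

Had the rewrite returned only the Identity node, the Relu node would have lost the producer of its input `a`, which is why kept nodes must appear in the returned list.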
Optimization Algorithm¶
It is implemented in method optimize:
def optimize(
    self, max_iter=-1, remove_identity: bool = True
) -> List[Dict[str, Any]]:
The algorithm runs multiple iterations until the graph stops changing or max_iter is reached. By default, max_iter is equal to the number of nodes. An iteration is:
matches = []
builds all successors and predecessors
# Step 1: match
for all patterns p:
    for all nodes n:
        r = p.match(n)
        if r:
            if no node already scheduled to be rewritten by another match:
                matches.append(r)
# Step 2: apply
for all matches r:
    apply the match r
# Step 3: clean
remove unused nodes
remove identity nodes
This algorithm may apply more than one rewriting at each iteration, but it guarantees that the local structure a rewriting relies on was not altered by another one.
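The loop can be sketched in a few lines of plain Python. This is a toy version under stated assumptions, not the library's implementation: nodes are tuples, shapes and attributes are ignored, `matmul_add` is a hypothetical pattern imitating a MatMul + Add fusion, new nodes are inserted where the last matched node was, and the cleaning step is left out.

```python
from typing import Callable, List, Optional, Tuple

# Toy node: (op_type, inputs, outputs). Shapes and attributes are ignored.
ToyNode = Tuple[str, Tuple[str, ...], Tuple[str, ...]]
ToyMatch = Tuple[List[ToyNode], Callable[..., List[ToyNode]]]
ToyPattern = Callable[[List[ToyNode], ToyNode], Optional[ToyMatch]]


def matmul_add(graph: List[ToyNode], node: ToyNode) -> Optional[ToyMatch]:
    """Matches an Add whose first input comes from a MatMul used only once."""
    if node[0] != "Add":
        return None
    for prev in graph:
        if prev[0] != "MatMul" or prev[2][0] != node[1][0]:
            continue
        if sum(prev[2][0] in n[1] for n in graph) != 1:
            return None
        return [prev, node], lambda mm, add: [("Gemm", (*mm[1], add[1][1]), add[2])]
    return None


def optimize(
    graph: List[ToyNode], patterns: List[ToyPattern], max_iter: int = -1
) -> List[ToyNode]:
    if max_iter < 0:
        max_iter = len(graph)
    for _ in range(max_iter):
        # Step 1: match, skipping nodes already claimed by another match.
        matches: List[ToyMatch] = []
        claimed = set()
        for pattern in patterns:
            for node in graph:
                r = pattern(graph, node)
                if r is not None and not (claimed & set(r[0])):
                    claimed |= set(r[0])
                    matches.append(r)
        if not matches:
            break  # the graph is not evolving anymore
        # Step 2: apply, new nodes replace the last matched node.
        for nodes, rewrite in matches:
            position = max(graph.index(n) for n in nodes)
            new_graph: List[ToyNode] = []
            for i, n in enumerate(graph):
                if i == position:
                    new_graph.extend(rewrite(*nodes))
                elif n not in nodes:
                    new_graph.append(n)
            graph = new_graph
        # Step 3 (removing unused and identity nodes) is left out of this sketch.
    return graph


graph = [
    ("MatMul", ("x", "w0"), ("mm0",)),
    ("Add", ("mm0", "b0"), ("linear",)),
    ("Relu", ("linear",), ("relu",)),
    ("MatMul", ("relu", "w2"), ("mm2",)),
    ("Add", ("mm2", "b2"), ("output_0",)),
]
optimized = optimize(graph, [matmul_add])
print([n[0] for n in optimized])  # prints: ['Gemm', 'Relu', 'Gemm']
```

Both fusions are found during the same iteration because they share no node; the `claimed` set is what prevents two matches from rewriting the same node.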
Adding a pattern¶
See #80 about the addition of a new pattern.
Example¶
Simple API¶
We consider the following simple model:
<<<
import torch
from experimental_experiment.helpers import pretty_onnx
from experimental_experiment.xbuilder import OptimizationOptions
from experimental_experiment.torch_interpreter import to_onnx
class MLP(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.layers = torch.nn.Sequential(
            torch.nn.Linear(10, 32),
            torch.nn.ReLU(),
            torch.nn.Linear(32, 1),
        )

    def forward(self, x):
        return self.layers(x)


x = torch.rand(3, 10)
onx = to_onnx(
    MLP(), (x,), input_names=["x"], options=OptimizationOptions(patterns=None)
)
with open("temp_doc_mlp.onnx", "wb") as f:
    f.write(onx.SerializeToString())
print(pretty_onnx(onx))
>>>
opset: domain='' version=18
input: name='x' type=dtype('float32') shape=[3, 10]
init: name='p_layers_0_weight::T10' type=float32 shape=(10, 32) -- GraphBuilder.constant_folding.from/fold(p_layers_0_weight)##p_layers_0_weight/DynamoInterpret.placeholder.1/P(layers.0.weight)
init: name='p_layers_2_weight::T10' type=float32 shape=(32, 1) -- GraphBuilder.constant_folding.from/fold(p_layers_2_weight)##p_layers_2_weight/DynamoInterpret.placeholder.1/P(layers.2.weight)
init: name='layers.0.bias' type=float32 shape=(32,) -- DynamoInterpret.placeholder.1/P(layers.0.bias)
init: name='layers.2.bias' type=float32 shape=(1,) -- array([0.082], dtype=float32)-- DynamoInterpret.placeholder.1/P(layers.2.bias)
MatMul(x, p_layers_0_weight::T10) -> _onx_matmul_x
Add(_onx_matmul_x, layers.0.bias) -> linear
Relu(linear) -> relu
MatMul(relu, p_layers_2_weight::T10) -> _onx_matmul_relu
Add(_onx_matmul_relu, layers.2.bias) -> output_0
output: name='output_0' type=dtype('float32') shape=[3, 1]
Which renders as follows:
![Graph of the model before optimization: MatMul, Add, Relu, MatMul, Add](../_images/graphviz-a6cdbc720a89a1a5c57a3eb6cdeb009aae1819fd.png)
We then apply the optimizations by writing the following code:
<<<
import onnx
from experimental_experiment.helpers import pretty_onnx
from experimental_experiment.xbuilder import GraphBuilder
onx = onnx.load("temp_doc_mlp.onnx")
# The model is placed in a GraphBuilder.
# It creates dictionaries to store shapes, ranks, types
# to make it easier for the optimizers to find the information
# they need. It still uses NodeProto to store nodes.
gr = GraphBuilder(onx, infer_shapes_options=True)
# Let's optimize.
opt_onx = gr.to_onnx(optimize=True)
with open("temp_doc_mlp_opt.onnx", "wb") as f:
    f.write(opt_onx.SerializeToString())
print(pretty_onnx(opt_onx))
>>>
opset: domain='' version=18
input: name='x' type=dtype('float32') shape=[3, 10]
init: name='layers.0.bias' type=float32 shape=(32,) -- DynamoInterpret.placeholder.1/P(layers.0.bias)GraphBuilder._update_structures_with_proto.1/from(layers.0.bias)
init: name='layers.2.bias' type=float32 shape=(1,) -- array([0.082], dtype=float32)-- DynamoInterpret.placeholder.1/P(layers.2.bias)GraphBuilder._update_structures_with_proto.1/from(layers.2.bias)
init: name='GemmTransposePattern--p_layers_0_weight::T10' type=float32 shape=(32, 10)-- GraphBuilder.constant_folding.from/fold(p_layers_0_weight::T10)##p_layers_0_weight::T10/GraphBuilder._update_structures_with_proto.1/from(p_layers_0_weight::T10)
init: name='GemmTransposePattern--p_layers_2_weight::T10' type=float32 shape=(1, 32)-- GraphBuilder.constant_folding.from/fold(init7_s2_1_-1,p_layers_2_weight::T10)##p_layers_2_weight::T10/GraphBuilder._update_structures_with_proto.1/from(p_layers_2_weight::T10)##init7_s2_1_-1/TransposeEqualReshapePattern.apply.new_shape
Gemm(x, GemmTransposePattern--p_layers_0_weight::T10, layers.0.bias, transB=1) -> linear
Relu(linear) -> relu
Gemm(relu, GemmTransposePattern--p_layers_2_weight::T10, layers.2.bias, transB=1) -> output_0
output: name='output_0' type=dtype('float32') shape=[3, 1]
Which renders as follows:
![Graph of the model after optimization: Gemm (transB=1), Relu, Gemm (transB=1)](../_images/graphviz-ccf65828ccb3ec717ffec4e1be73791e84308e4c.png)
Verbosity¶
<<<
import onnx
from experimental_experiment.xbuilder import GraphBuilder
onx = onnx.load("temp_doc_mlp.onnx")
gr = GraphBuilder(onx, infer_shapes_options=True, verbose=1)
opt_onx = gr.to_onnx(optimize=True)
>>>
[GraphBuilder-XGA._add_shape_information] dynamic shapes replacements={}
[GraphBuilder-XGA.optimize] start with 5 nodes
[GraphBuilder-XGA.optimize] #patterns=72
[GraphBuilder-XGA.optimize] start with subgraphs
[GraphBuilder-XGA.optimize] done with subgraphs
[GraphBuilderPatternOptimization-XGA.optimize] start with 5 nodes, 4 initializers, 72 patterns, priorities=[0, 1, 3], max_iter=30
[GraphBuilderPatternOptimization-XGA.optimize] iteration 0: 5 nodes, priority=0
[GraphBuilderPatternOptimization-XGA.optimize] increase priority to 1
[GraphBuilderPatternOptimization-XGA.optimize] iteration 1: 5 nodes, priority=1
[GraphBuilderPatternOptimization-XGA.optimize] increase priority to 3
[GraphBuilderPatternOptimization-XGA.optimize] iteration 2: 5 nodes, priority=3
[GraphBuilderPatternOptimization-XGA.optimize] applies 2 matches, 2*MatMulAddPattern - time=0.001 | max_time=BatchNormalizationTrainingPattern:0.000
[GraphBuilderPatternOptimization-XGA.optimize] iteration 3: 3 nodes, priority=3
[GraphBuilderPatternOptimization-XGA.optimize] applies 2 matches, 2*GemmTransposePattern - time=0.000 | max_time=GemmTransposePattern:0.000
[GraphBuilderPatternOptimization-XGA.optimize] iteration 4: 5 nodes, priority=3
[GraphBuilderPatternOptimization-XGA.optimize] applies 1 matches, [0]=MatchResult: TransposeEqualReshapePattern replaces ['Transpose'] - time=0.000 | max_time=TransposeMatMulPattern:0.000
[GraphBuilderPatternOptimization-XGA.optimize] iteration 5: 5 nodes, priority=3
[GraphBuilderPatternOptimization-XGA.optimize] stops current_priority_index=3, priorities=[0, 1, 3]
[GraphBuilderPatternOptimization-XGA.optimize] done after 6 iterations with 5 nodes in 0.008
[GraphBuilder-XGA.optimize] done with 3 nodes in 0.010
[GraphBuilder-XGA.to_onnx] make_model 4 inits 0 params
[GraphBuilder-XGA.time_evaluation_constants_] 0
[GraphBuilder-XGA._build_initializers] start with 4 initializers, large_model=False, external_threshold=1024
[GraphBuilder-XGA._build_initializers] switch low/high order
[GraphBuilder-XGA._build_initializers] done in 1.2380005500745028e-06s with 4 initializers, 0 large initializers
[GraphBuilder-XGA._add_shape_information] dynamic shapes replacements={}
With more verbosity:
<<<
import onnx
from experimental_experiment.xbuilder import GraphBuilder
onx = onnx.load("temp_doc_mlp.onnx")
gr = GraphBuilder(onx, infer_shapes_options=True, verbose=11)
opt_onx = gr.to_onnx(optimize=True)
>>>
[GraphBuilder-VNK._update_structures_with_proto] -- starts with 5 nodes
[GraphBuilder-VNK.set_shape] p_layers_0_weight::T10:(10, 32)
[GraphBuilder-VNK.set_rank] p_layers_0_weight::T10:2
[GraphBuilder-VNK.set_type] p_layers_0_weight::T10:1
[GraphBuilder-VNK.make_initializer] p_layers_0_weight::T10[1:(10, 32)]
[GraphBuilder-VNK.update_node_constant] new constant 'p_layers_0_weight::T10', node=None
[GraphBuilder-VNK.set_shape] p_layers_2_weight::T10:(32, 1)
[GraphBuilder-VNK.set_rank] p_layers_2_weight::T10:2
[GraphBuilder-VNK.set_type] p_layers_2_weight::T10:1
[GraphBuilder-VNK.make_initializer] p_layers_2_weight::T10[1:(32, 1)]
[GraphBuilder-VNK.update_node_constant] new constant 'p_layers_2_weight::T10', node=None
[GraphBuilder-VNK.set_shape] layers.0.bias:(32,)
[GraphBuilder-VNK.set_rank] layers.0.bias:1
[GraphBuilder-VNK.set_type] layers.0.bias:1
[GraphBuilder-VNK.make_initializer] layers.0.bias[1:(32,)]
[GraphBuilder-VNK.update_node_constant] new constant 'layers.0.bias', node=None
[GraphBuilder-VNK.set_shape] layers.2.bias:(1,)
[GraphBuilder-VNK.set_rank] layers.2.bias:1
[GraphBuilder-VNK.set_type] layers.2.bias:1
[GraphBuilder-VNK.make_initializer] layers.2.bias[1:(1,)]
[GraphBuilder-VNK.update_node_constant] new constant 'layers.2.bias', node=None
[GraphBuilder-VNK.set_type] x:1
[GraphBuilder-VNK.set_shape] x:(3, 10)
[GraphBuilder-VNK.set_rank] x:2
[GraphBuilder-VNK.set_type] output_0:1
[GraphBuilder-VNK.set_shape] output_0:(3, 1)
[GraphBuilder-VNK.set_rank] output_0:2
[GraphBuilder-VNK.set_type] _onx_matmul_x:1
[GraphBuilder-VNK.set_shape] _onx_matmul_x:(3, 32)
[GraphBuilder-VNK.set_rank] _onx_matmul_x:2
[GraphBuilder-VNK.set_type] linear:1
[GraphBuilder-VNK.set_shape] linear:(3, 32)
[GraphBuilder-VNK.set_rank] linear:2
[GraphBuilder-VNK.set_type] relu:1
[GraphBuilder-VNK.set_shape] relu:(3, 32)
[GraphBuilder-VNK.set_rank] relu:2
[GraphBuilder-VNK.set_type] _onx_matmul_relu:1
[GraphBuilder-VNK.set_shape] _onx_matmul_relu:(3, 1)
[GraphBuilder-VNK.set_rank] _onx_matmul_relu:2
[GraphBuilder-VNK.set_type] output_0:1
[GraphBuilder-VNK._update_structures_with_proto] ends with 5 nodes in 0.0008961910007201368
[GraphBuilder-VNK.constant_folding] -- starts with 4 constants and 5 nodes.
[GraphBuilder-VNK.constant_folding] cst:: 1 :: layers.0.bias
[GraphBuilder-VNK.constant_folding] cst:: . :: _onx_matmul_x
[GraphBuilder-VNK.constant_folding] cst:: . :: x
[GraphBuilder-VNK.constant_folding] cst:: . :: relu
[GraphBuilder-VNK.constant_folding] cst:: . :: output_0
[GraphBuilder-VNK.constant_folding] cst:: 1 :: layers.2.bias
[GraphBuilder-VNK.constant_folding] cst:: . :: linear
[GraphBuilder-VNK.constant_folding] cst:: . :: _onx_matmul_relu
[GraphBuilder-VNK.constant_folding] cst:: 1 :: p_layers_2_weight::T10
[GraphBuilder-VNK.constant_folding] cst:: 1 :: p_layers_0_weight::T10
[GraphBuilder-VNK.constant_folding] initializer: p_layers_0_weight::T10
[GraphBuilder-VNK.constant_folding] initializer: p_layers_2_weight::T10
[GraphBuilder-VNK.constant_folding] initializer: layers.0.bias
[GraphBuilder-VNK.constant_folding] initializer: layers.2.bias
[GraphBuilder-VNK.constant_folding] ends with 4 constants and 5 nodes in 3.7802999941050075e-05 seconds
[GraphBuilder-VNK._update_shape_types_with_proto] -- starts with 5 nodes and 0 shapes.
[GraphBuilder._update_shape_types_with_proto] infer shapes
[GraphBuilder._update_shape_types_with_proto] infer shapes done 0.00022670500038657337 seconds
[GraphBuilder._update_shape_types_with_proto] _clean_shapes after 0.0002520689995435532 seconds
[GraphBuilder-VNK._update_shape_types_with_proto] walk through 0 shapes.
[GraphBuilder-VNK.set_type] _onx_matmul_x:1
[_update_shape_types_with_proto_one_result] update shape(_onx_matmul_x) with (3, 32)
[GraphBuilder-VNK.set_type] linear:1
[_update_shape_types_with_proto_one_result] update shape(linear) with (3, 32)
[GraphBuilder-VNK.set_type] relu:1
[_update_shape_types_with_proto_one_result] update shape(relu) with (3, 32)
[GraphBuilder-VNK.set_type] _onx_matmul_relu:1
[_update_shape_types_with_proto_one_result] update shape(_onx_matmul_relu) with (3, 1)
[GraphBuilder-VNK._update_shape_types_with_proto] ends in 7.657599962840322e-05 seconds.
[GraphBuilder-VNK._add_shape_information] dynamic shapes replacements={}
[GraphBuilder-VNK.optimize] start with 5 nodes
[GraphBuilder-VNK.optimize] options=OptimizationOptions(constant_folding={'Concat', 'Add', 'Sub', 'Transpose', 'Div', 'Cast', 'Mul', 'Reshape'}, patterns=[BatchNormalizationPattern(), BatchNormalizationTrainingPattern(), CastLayerNormalizationCastPattern(), CastPattern(), CastCastBinaryPattern(), CastOpCastPattern(), ClipClipPattern(), ComputationCastOpCastPattern(), ConcatEmptyPattern(), ConcatGatherPattern(), ConcatReshapePattern(), ConcatTwiceUnaryPattern(), ConvBiasNullPattern(), DropoutPattern(), ExpandPattern(), ExpandBroadcastPattern(), ExpandSwapPattern(), GeluPattern(), IdentityPattern(), LayerNormalizationPattern(), LayerNormalizationScalePattern(), LeakyReluPattern(), MulMulMulScalarPattern(), ReduceReshapePattern(), ReduceSumNormalizePattern(), ReshapePattern(), ReshapeMatMulReshapePattern(), Reshape2Of3Pattern(), ReshapeReshapeBinaryPattern(), MatMulAddPattern(), GemmTransposePattern(), MatMulReshape2Of3Pattern(), MulMulMatMulPattern(), ShapeBasedReshapeIsSqueezePattern(), ShapeBasedStaticExpandPattern(), ShapeBasedConcatExpandPattern(), ShapeBasedEditDistanceReshapePattern(), ShapeBasedIdentityPattern(), ShapeBasedExpandBroadcastPattern(), ShapeBasedExpandBroadcastMatMulPattern(), ShapeBasedExpandCastWhereSwapPattern(), ShapeBasedExpandSwapPattern(), ShapeBasedMatMulToMulPattern(), ShapeBasedSameChildrenPattern(), ShapeBasedShapeShapeAddPattern(), ReshapeReshapePattern(), RotaryEmbeddingPattern(), SameChildrenPattern(), SequenceConstructAtPattern(), SliceSlicePattern(), SlicesSplitPattern(), SoftmaxCrossEntropyLossCastPattern(), SplitConcatPattern(), SqueezeAddPattern(), SqueezeUnsqueezePattern(), StaticConcatReshapePattern(), Sub1MulPattern(), SwitchOrderBinaryPattern(), SwitchReshapeActivationPattern(), TransposeEqualReshapePattern(), TransposeMatMulPattern(), TransposeReshapeMatMulPattern(), TransposeReshapeTransposePattern(), TransposeTransposePattern(), UnsqueezeEqualPattern(), UnsqueezeUnsqueezePattern(), RotaryConcatPartPattern(), 
FunctionCausalMaskPattern(), FunctionCausalMaskMulAddPattern(), FunctionCosSinCachePattern(), FunctionHalfRotaryEmbeddingPattern(), RMSNormalizationPattern()], verbose=11)
-- GRAPH BEFORE OPTIMIZATON --
opset: : 18
init: p_layers_0_weight::T10: ?: ? -- GraphBuilder._update_structures_with_proto.1/from(p_layers_0_weight::T10)
init: p_layers_2_weight::T10: ?: ? -- GraphBuilder._update_structures_with_proto.1/from(p_layers_2_weight::T10)
init: layers.0.bias: ?: ? -- GraphBuilder._update_structures_with_proto.1/from(layers.0.bias)
init: layers.2.bias: ?: ? -- GraphBuilder._update_structures_with_proto.1/from(layers.2.bias)
input:: x |T1: 3 x 10
MatMul: x, p_layers_0_weight::T10 -> _onx_matmul_x |T1: 3 x 32 - Opset
Add: _onx_matmul_x, layers.0.bias -> linear |T1: 3 x 32 - Opset2
Relu: linear -> relu |T1: 3 x 32 - relu
MatMul: relu, p_layers_2_weight::T10 -> _onx_matmul_relu |T1: 3 x 1 - Opset3
Add: _onx_matmul_relu, layers.2.bias -> output_0 |T1: 3 x 1 - Opset4
output:: output_0 |T1: 3 x 1
-- END --
[GraphBuilder-VNK.optimize] start with subgraphs
[GraphBuilder-VNK.optimize] done with subgraphs
[GraphBuilder-VNK.remove_identity_nodes] -- starts with 5
[GraphBuilder-VNK.remove_identity_nodes] found 0 replacements
[GraphBuilder-VNK.remove_identity_nodes] kept 5 nodes
[GraphBuilder-VNK.remove_identity_nodes] ends with 5 nodes in 2.9308001103345305e-05 seconds
[GraphBuilder-VNK.constant_folding] -- starts with 4 constants and 5 nodes.
[GraphBuilder-VNK.constant_folding] cst:: 1 :: layers.0.bias
[GraphBuilder-VNK.constant_folding] cst:: . :: _onx_matmul_x
[GraphBuilder-VNK.constant_folding] cst:: . :: x
[GraphBuilder-VNK.constant_folding] cst:: . :: relu
[GraphBuilder-VNK.constant_folding] cst:: . :: output_0
[GraphBuilder-VNK.constant_folding] cst:: 1 :: layers.2.bias
[GraphBuilder-VNK.constant_folding] cst:: . :: linear
[GraphBuilder-VNK.constant_folding] cst:: . :: _onx_matmul_relu
[GraphBuilder-VNK.constant_folding] cst:: 1 :: p_layers_2_weight::T10
[GraphBuilder-VNK.constant_folding] cst:: 1 :: p_layers_0_weight::T10
[GraphBuilder-VNK.constant_folding] initializer: p_layers_0_weight::T10
[GraphBuilder-VNK.constant_folding] initializer: p_layers_2_weight::T10
[GraphBuilder-VNK.constant_folding] initializer: layers.0.bias
[GraphBuilder-VNK.constant_folding] initializer: layers.2.bias
[GraphBuilder-VNK.constant_folding] ends with 4 constants and 5 nodes in 3.101299989793915e-05 seconds
[GraphBuilderPatternOptimization-VNK.optimize] start with 5 nodes, 4 initializers, 72 patterns, priorities=[0, 1, 3], max_iter=30
[GraphBuilderPatternOptimization-VNK.optimize] use pattern 1/72 - P0 - BatchNormalizationPattern()
[GraphBuilderPatternOptimization-VNK.optimize] use pattern 2/72 - P0 - BatchNormalizationTrainingPattern()
[GraphBuilderPatternOptimization-VNK.optimize] use pattern 3/72 - P0 - CastPattern()
[GraphBuilderPatternOptimization-VNK.optimize] use pattern 4/72 - P0 - ConcatGatherPattern()
[GraphBuilderPatternOptimization-VNK.optimize] use pattern 5/72 - P0 - ConcatReshapePattern()
[GraphBuilderPatternOptimization-VNK.optimize] use pattern 6/72 - P0 - ConvBiasNullPattern()
[GraphBuilderPatternOptimization-VNK.optimize] use pattern 7/72 - P0 - ExpandPattern()
[GraphBuilderPatternOptimization-VNK.optimize] use pattern 8/72 - P0 - GeluPattern()
[GraphBuilderPatternOptimization-VNK.optimize] use pattern 9/72 - P0 - IdentityPattern()
[GraphBuilderPatternOptimization-VNK.optimize] use pattern 10/72 - P0 - LeakyReluPattern()
[GraphBuilderPatternOptimization-VNK.optimize] use pattern 11/72 - P0 - ReshapePattern()
[GraphBuilderPatternOptimization-VNK.optimize] use pattern 12/72 - P0 - ReshapeReshapePattern()
[GraphBuilderPatternOptimization-VNK.optimize] use pattern 13/72 - P0 - SameChildrenPattern()
[GraphBuilderPatternOptimization-VNK.optimize] use pattern 14/72 - P0 - ShapeBasedEditDistanceReshapePattern()
[GraphBuilderPatternOptimization-VNK.optimize] use pattern 15/72 - P0 - ShapeBasedIdentityPattern()
[GraphBuilderPatternOptimization-VNK.optimize] use pattern 16/72 - P0 - ShapeBasedReshapeIsSqueezePattern()
[GraphBuilderPatternOptimization-VNK.optimize] use pattern 17/72 - P0 - ShapeBasedSameChildrenPattern()
[GraphBuilderPatternOptimization-VNK.optimize] use pattern 18/72 - P0 - ShapeBasedShapeShapeAddPattern()
[GraphBuilderPatternOptimization-VNK.optimize] use pattern 19/72 - P0 - ShapeBasedStaticExpandPattern()
[GraphBuilderPatternOptimization-VNK.optimize] use pattern 20/72 - P0 - SoftmaxCrossEntropyLossCastPattern()
[GraphBuilderPatternOptimization-VNK.optimize] use pattern 21/72 - P0 - SqueezeAddPattern()
[GraphBuilderPatternOptimization-VNK.optimize] use pattern 22/72 - P0 - SqueezeUnsqueezePattern()
[GraphBuilderPatternOptimization-VNK.optimize] use pattern 23/72 - P0 - StaticConcatReshapePattern()
[GraphBuilderPatternOptimization-VNK.optimize] use pattern 24/72 - P0 - TransposeReshapeTransposePattern()
[GraphBuilderPatternOptimization-VNK.optimize] use pattern 25/72 - P0 - TransposeTransposePattern()
[GraphBuilderPatternOptimization-VNK.optimize] use pattern 26/72 - P0 - UnsqueezeUnsqueezePattern()
[GraphBuilderPatternOptimization-VNK.optimize] use pattern 27/72 - P1 - CastCastBinaryPattern()
[GraphBuilderPatternOptimization-VNK.optimize] use pattern 28/72 - P1 - CastLayerNormalizationCastPattern()
[GraphBuilderPatternOptimization-VNK.optimize] use pattern 29/72 - P1 - CastOpCastPattern()
[GraphBuilderPatternOptimization-VNK.optimize] use pattern 30/72 - P1 - ClipClipPattern()
[GraphBuilderPatternOptimization-VNK.optimize] use pattern 31/72 - P1 - ComputationCastOpCastPattern()
[GraphBuilderPatternOptimization-VNK.optimize] use pattern 32/72 - P1 - ConcatEmptyPattern()
[GraphBuilderPatternOptimization-VNK.optimize] use pattern 33/72 - P1 - ConcatTwiceUnaryPattern()
[GraphBuilderPatternOptimization-VNK.optimize] use pattern 34/72 - P1 - DropoutPattern()
[GraphBuilderPatternOptimization-VNK.optimize] use pattern 35/72 - P1 - ExpandBroadcastPattern()
[GraphBuilderPatternOptimization-VNK.optimize] use pattern 36/72 - P1 - ExpandSwapPattern()
[GraphBuilderPatternOptimization-VNK.optimize] use pattern 37/72 - P1 - FunctionCausalMaskMulAddPattern()
[GraphBuilderPatternOptimization-VNK.optimize] use pattern 38/72 - P1 - FunctionCausalMaskPattern()
[GraphBuilderPatternOptimization-VNK.optimize] use pattern 39/72 - P1 - FunctionCosSinCachePattern()
[GraphBuilderPatternOptimization-VNK.optimize] use pattern 40/72 - P1 - FunctionHalfRotaryEmbeddingPattern()
[GraphBuilderPatternOptimization-VNK.optimize] use pattern 41/72 - P1 - GemmTransposePattern()
[GraphBuilderPatternOptimization-VNK.optimize] use pattern 42/72 - P1 - LayerNormalizationPattern()
[GraphBuilderPatternOptimization-VNK.optimize] use pattern 43/72 - P1 - LayerNormalizationScalePattern()
[GraphBuilderPatternOptimization-VNK.optimize] use pattern 44/72 - P1 - MatMulReshape2Of3Pattern()
[GraphBuilderPatternOptimization-VNK.optimize] use pattern 45/72 - P1 - MulMulMatMulPattern()
[GraphBuilderPatternOptimization-VNK.optimize] use pattern 46/72 - P1 - MulMulMulScalarPattern()
[GraphBuilderPatternOptimization-VNK.optimize] use pattern 47/72 - P1 - RMSNormalizationPattern()
[GraphBuilderPatternOptimization-VNK.optimize] use pattern 48/72 - P1 - ReduceReshapePattern()
[GraphBuilderPatternOptimization-VNK.optimize] use pattern 49/72 - P1 - ReduceSumNormalizePattern()
[GraphBuilderPatternOptimization-VNK.optimize] use pattern 50/72 - P1 - Reshape2Of3Pattern()
[GraphBuilderPatternOptimization-VNK.optimize] use pattern 51/72 - P1 - ReshapeMatMulReshapePattern()
[GraphBuilderPatternOptimization-VNK.optimize] use pattern 52/72 - P1 - ReshapeReshapeBinaryPattern()
[GraphBuilderPatternOptimization-VNK.optimize] use pattern 53/72 - P1 - RotaryConcatPartPattern()
[GraphBuilderPatternOptimization-VNK.optimize] use pattern 54/72 - P1 - RotaryEmbeddingPattern()
[GraphBuilderPatternOptimization-VNK.optimize] use pattern 55/72 - P1 - SequenceConstructAtPattern()
[GraphBuilderPatternOptimization-VNK.optimize] use pattern 56/72 - P1 - ShapeBasedConcatExpandPattern()
[GraphBuilderPatternOptimization-VNK.optimize] use pattern 57/72 - P1 - ShapeBasedExpandBroadcastMatMulPattern()
[GraphBuilderPatternOptimization-VNK.optimize] use pattern 58/72 - P1 - ShapeBasedExpandBroadcastPattern()
[GraphBuilderPatternOptimization-VNK.optimize] use pattern 59/72 - P1 - ShapeBasedExpandCastWhereSwapPattern()
[GraphBuilderPatternOptimization-VNK.optimize] use pattern 60/72 - P1 - ShapeBasedExpandSwapPattern()
[GraphBuilderPatternOptimization-VNK.optimize] use pattern 61/72 - P1 - ShapeBasedMatMulToMulPattern()
[GraphBuilderPatternOptimization-VNK.optimize] use pattern 62/72 - P1 - SliceSlicePattern()
[GraphBuilderPatternOptimization-VNK.optimize] use pattern 63/72 - P1 - SlicesSplitPattern()
[GraphBuilderPatternOptimization-VNK.optimize] use pattern 64/72 - P1 - SplitConcatPattern()
[GraphBuilderPatternOptimization-VNK.optimize] use pattern 65/72 - P1 - Sub1MulPattern()
[GraphBuilderPatternOptimization-VNK.optimize] use pattern 66/72 - P1 - SwitchOrderBinaryPattern()
[GraphBuilderPatternOptimization-VNK.optimize] use pattern 67/72 - P1 - SwitchReshapeActivationPattern()
[GraphBuilderPatternOptimization-VNK.optimize] use pattern 68/72 - P1 - TransposeEqualReshapePattern()
[GraphBuilderPatternOptimization-VNK.optimize] use pattern 69/72 - P1 - TransposeMatMulPattern()
[GraphBuilderPatternOptimization-VNK.optimize] use pattern 70/72 - P1 - TransposeReshapeMatMulPattern()
[GraphBuilderPatternOptimization-VNK.optimize] use pattern 71/72 - P1 - UnsqueezeEqualPattern()
[GraphBuilderPatternOptimization-VNK.optimize] use pattern 72/72 - P3 - MatMulAddPattern()
-- optimize starts with...
opset: : 18
init: p_layers_0_weight::T10: ?: ? -- GraphBuilder._update_structures_with_proto.1/from(p_layers_0_weight::T10)
init: p_layers_2_weight::T10: ?: ? -- GraphBuilder._update_structures_with_proto.1/from(p_layers_2_weight::T10)
init: layers.0.bias: ?: ? -- GraphBuilder._update_structures_with_proto.1/from(layers.0.bias)
init: layers.2.bias: ?: ? -- GraphBuilder._update_structures_with_proto.1/from(layers.2.bias)
input:: x |T1: 3 x 10
MatMul: x, p_layers_0_weight::T10 -> _onx_matmul_x |T1: 3 x 32 - Opset
Add: _onx_matmul_x, layers.0.bias -> linear |T1: 3 x 32 - Opset2
Relu: linear -> relu |T1: 3 x 32 - relu
MatMul: relu, p_layers_2_weight::T10 -> _onx_matmul_relu |T1: 3 x 1 - Opset3
Add: _onx_matmul_relu, layers.2.bias -> output_0 |T1: 3 x 1 - Opset4
output:: output_0 |T1: 3 x 1
-- starts optimization
[GraphBuilderPatternOptimization-VNK.optimize] iteration 0: 5 nodes, priority=0
[PatternOptimization.enumerate_matches] start BatchNormalizationPattern with main_opset=18 and min_opset=1
[PatternOptimization.enumerate_matches] start BatchNormalizationTrainingPattern with main_opset=18 and min_opset=1
[GraphBuilderPatternOptimization-VNK.optimize] skips CastLayerNormalizationCastPattern, pattern.priority=1, current_priority_index=0, priorities[current_priority_index]=0 priorities=[0, 1, 3]
[PatternOptimization.enumerate_matches] start CastPattern with main_opset=18 and min_opset=1
[GraphBuilderPatternOptimization-VNK.optimize] skips CastCastBinaryPattern, pattern.priority=1, current_priority_index=0, priorities[current_priority_index]=0 priorities=[0, 1, 3]
[GraphBuilderPatternOptimization-VNK.optimize] skips CastOpCastPattern, pattern.priority=1, current_priority_index=0, priorities[current_priority_index]=0 priorities=[0, 1, 3]
[GraphBuilderPatternOptimization-VNK.optimize] skips ClipClipPattern, pattern.priority=1, current_priority_index=0, priorities[current_priority_index]=0 priorities=[0, 1, 3]
[GraphBuilderPatternOptimization-VNK.optimize] skips ComputationCastOpCastPattern, pattern.priority=1, current_priority_index=0, priorities[current_priority_index]=0 priorities=[0, 1, 3]
[GraphBuilderPatternOptimization-VNK.optimize] skips ConcatEmptyPattern, pattern.priority=1, current_priority_index=0, priorities[current_priority_index]=0 priorities=[0, 1, 3]
[PatternOptimization.enumerate_matches] start ConcatGatherPattern with main_opset=18 and min_opset=1
[PatternOptimization.enumerate_matches] start ConcatReshapePattern with main_opset=18 and min_opset=1
[GraphBuilderPatternOptimization-VNK.optimize] skips ConcatTwiceUnaryPattern, pattern.priority=1, current_priority_index=0, priorities[current_priority_index]=0 priorities=[0, 1, 3]
[PatternOptimization.enumerate_matches] start ConvBiasNullPattern with main_opset=18 and min_opset=1
[GraphBuilderPatternOptimization-VNK.optimize] skips DropoutPattern, pattern.priority=1, current_priority_index=0, priorities[current_priority_index]=0 priorities=[0, 1, 3]
[PatternOptimization.enumerate_matches] start ExpandPattern with main_opset=18 and min_opset=1
[GraphBuilderPatternOptimization-VNK.optimize] skips ExpandBroadcastPattern, pattern.priority=1, current_priority_index=0, priorities[current_priority_index]=0 priorities=[0, 1, 3]
[GraphBuilderPatternOptimization-VNK.optimize] skips ExpandSwapPattern, pattern.priority=1, current_priority_index=0, priorities[current_priority_index]=0 priorities=[0, 1, 3]
[PatternOptimization.enumerate_matches] start GeluPattern with main_opset=18 and min_opset=20
[PatternOptimization.enumerate_matches] start IdentityPattern with main_opset=18 and min_opset=1
[IdentityPattern.match] NONE - line: 297:experimental_experiment.xoptim.patterns.onnx_any, op_type=Add, name=Opset2, inputs=_onx_matmul_x,layers.0.bias
[IdentityPattern.match] NONE - line: 339:experimental_experiment.xoptim.patterns.onnx_any, op_type=Add, name=Opset4, inputs=_onx_matmul_relu,layers.2.bias
[GraphBuilderPatternOptimization-VNK.optimize] skips LayerNormalizationPattern, pattern.priority=1, current_priority_index=0, priorities[current_priority_index]=0 priorities=[0, 1, 3]
[GraphBuilderPatternOptimization-VNK.optimize] skips LayerNormalizationScalePattern, pattern.priority=1, current_priority_index=0, priorities[current_priority_index]=0 priorities=[0, 1, 3]
[PatternOptimization.enumerate_matches] start LeakyReluPattern with main_opset=18 and min_opset=6
[GraphBuilder-KQQ.make_tensor_input] x[0:None] -- marker=_build_pattern1_x
[GraphBuilder-KQQ.set_type] x:0
[GraphBuilder-KQQ.set_type] x:-1
[GraphBuilder-KQQ.make_tensor_input] zero[0:None] -- marker=_build_pattern1_zero
[GraphBuilder-KQQ.set_type] zero:0
[GraphBuilder-KQQ.set_type] zero:-1
[GraphBuilder-KQQ.make_tensor_input] slope[0:None] -- marker=_build_pattern1_slope
[GraphBuilder-KQQ.set_type] slope:0
[GraphBuilder-KQQ.set_type] slope:-1
[GraphBuilder-KQQ.make_node] [TT:-] Greater: ['x', 'zero']->['_onx_greater_x']
[GraphBuilder-KQQ.set_type] _onx_greater_x:9
[GraphBuilder-KQQ.make_node] [TT:-] Mul: ['x', 'slope']->['_onx_mul_x']
[GraphBuilder-KQQ.set_type] _onx_mul_x:-1
[GraphBuilder-KQQ.make_node] [TTT:-] Where: ['_onx_greater_x', 'x', '_onx_mul_x']->['_onx_where_greater_x']
[GraphBuilder-KQQ.set_type] _onx_where_greater_x:-1
[GraphBuilder-KQQ.make_tensor_output] _onx_where_greater_x[0: None]
[GraphBuilderPatternOptimization-VNK.optimize] skips MulMulMulScalarPattern, pattern.priority=1, current_priority_index=0, priorities[current_priority_index]=0 priorities=[0, 1, 3]
[GraphBuilderPatternOptimization-VNK.optimize] skips ReduceReshapePattern, pattern.priority=1, current_priority_index=0, priorities[current_priority_index]=0 priorities=[0, 1, 3]
[GraphBuilderPatternOptimization-VNK.optimize] skips ReduceSumNormalizePattern, pattern.priority=1, current_priority_index=0, priorities[current_priority_index]=0 priorities=[0, 1, 3]
[PatternOptimization.enumerate_matches] start ReshapePattern with main_opset=18 and min_opset=1
[GraphBuilderPatternOptimization-VNK.optimize] skips ReshapeMatMulReshapePattern, pattern.priority=1, current_priority_index=0, priorities[current_priority_index]=0 priorities=[0, 1, 3]
[GraphBuilderPatternOptimization-VNK.optimize] skips Reshape2Of3Pattern, pattern.priority=1, current_priority_index=0, priorities[current_priority_index]=0 priorities=[0, 1, 3]
[GraphBuilderPatternOptimization-VNK.optimize] skips ReshapeReshapeBinaryPattern, pattern.priority=1, current_priority_index=0, priorities[current_priority_index]=0 priorities=[0, 1, 3]
[GraphBuilderPatternOptimization-VNK.optimize] skips MatMulAddPattern, pattern.priority=3, current_priority_index=0, priorities[current_priority_index]=0 priorities=[0, 1, 3]
[GraphBuilderPatternOptimization-VNK.optimize] skips GemmTransposePattern, pattern.priority=1, current_priority_index=0, priorities[current_priority_index]=0 priorities=[0, 1, 3]
[GraphBuilderPatternOptimization-VNK.optimize] skips MatMulReshape2Of3Pattern, pattern.priority=1, current_priority_index=0, priorities[current_priority_index]=0 priorities=[0, 1, 3]
[GraphBuilderPatternOptimization-VNK.optimize] skips MulMulMatMulPattern, pattern.priority=1, current_priority_index=0, priorities[current_priority_index]=0 priorities=[0, 1, 3]
[PatternOptimization.enumerate_matches] start ShapeBasedReshapeIsSqueezePattern with main_opset=18 and min_opset=1
[PatternOptimization.enumerate_matches] start ShapeBasedStaticExpandPattern with main_opset=18 and min_opset=1
[GraphBuilderPatternOptimization-VNK.optimize] skips ShapeBasedConcatExpandPattern, pattern.priority=1, current_priority_index=0, priorities[current_priority_index]=0 priorities=[0, 1, 3]
[PatternOptimization.enumerate_matches] start ShapeBasedEditDistanceReshapePattern with main_opset=18 and min_opset=1
[PatternOptimization.enumerate_matches] start ShapeBasedIdentityPattern with main_opset=18 and min_opset=1
[GraphBuilderPatternOptimization-VNK.optimize] skips ShapeBasedExpandBroadcastPattern, pattern.priority=1, current_priority_index=0, priorities[current_priority_index]=0 priorities=[0, 1, 3]
[GraphBuilderPatternOptimization-VNK.optimize] skips ShapeBasedExpandBroadcastMatMulPattern, pattern.priority=1, current_priority_index=0, priorities[current_priority_index]=0 priorities=[0, 1, 3]
[GraphBuilderPatternOptimization-VNK.optimize] skips ShapeBasedExpandCastWhereSwapPattern, pattern.priority=1, current_priority_index=0, priorities[current_priority_index]=0 priorities=[0, 1, 3]
[GraphBuilderPatternOptimization-VNK.optimize] skips ShapeBasedExpandSwapPattern, pattern.priority=1, current_priority_index=0, priorities[current_priority_index]=0 priorities=[0, 1, 3]
[GraphBuilderPatternOptimization-VNK.optimize] skips ShapeBasedMatMulToMulPattern, pattern.priority=1, current_priority_index=0, priorities[current_priority_index]=0 priorities=[0, 1, 3]
[PatternOptimization.enumerate_matches] start ShapeBasedSameChildrenPattern with main_opset=18 and min_opset=1
[PatternOptimization.enumerate_matches] start ShapeBasedShapeShapeAddPattern with main_opset=18 and min_opset=1
[ShapeBasedShapeShapeAddPattern.match] NONE - line: 23:experimental_experiment.xoptim.patterns.onnx_shape, op_type=Add, name=Opset2, inputs=_onx_matmul_x,layers.0.bias
[ShapeBasedShapeShapeAddPattern.match] NONE - line: 23:experimental_experiment.xoptim.patterns.onnx_shape, op_type=Add, name=Opset4, inputs=_onx_matmul_relu,layers.2.bias
[PatternOptimization.enumerate_matches] start ReshapeReshapePattern with main_opset=18 and min_opset=1
[GraphBuilderPatternOptimization-VNK.optimize] skips RotaryEmbeddingPattern, pattern.priority=1, current_priority_index=0, priorities[current_priority_index]=0 priorities=[0, 1, 3]
[PatternOptimization.enumerate_matches] start SameChildrenPattern with main_opset=18 and min_opset=1
[GraphBuilderPatternOptimization-VNK.optimize] skips SequenceConstructAtPattern, pattern.priority=1, current_priority_index=0, priorities[current_priority_index]=0 priorities=[0, 1, 3]
[GraphBuilderPatternOptimization-VNK.optimize] skips SliceSlicePattern, pattern.priority=1, current_priority_index=0, priorities[current_priority_index]=0 priorities=[0, 1, 3]
[GraphBuilderPatternOptimization-VNK.optimize] skips SlicesSplitPattern, pattern.priority=1, current_priority_index=0, priorities[current_priority_index]=0 priorities=[0, 1, 3]
[PatternOptimization.enumerate_matches] start SoftmaxCrossEntropyLossCastPattern with main_opset=18 and min_opset=14
[GraphBuilder-VMU.make_tensor_input] X[0:None] -- marker=_build_pattern1_X
[GraphBuilder-VMU.set_type] X:0
[GraphBuilder-VMU.set_type] X:-1
[GraphBuilder-VMU.make_tensor_input] indices[0:None] -- marker=_build_pattern1_indices
[GraphBuilder-VMU.set_type] indices:0
[GraphBuilder-VMU.set_type] indices:-1
[GraphBuilder-VMU.make_tensor_input] axis[0:None] -- marker=_build_pattern1_axis
[GraphBuilder-VMU.set_type] axis:0
[GraphBuilder-VMU.set_type] axis:-1
[GraphBuilder-VMU.make_tensor_input] zerof[0:None] -- marker=_build_pattern1_zerof
[GraphBuilder-VMU.set_type] zerof:0
[GraphBuilder-VMU.set_type] zerof:-1
[GraphBuilder-VMU.make_tensor_input] zeroi[0:None] -- marker=_build_pattern1_zeroi
[GraphBuilder-VMU.set_type] zeroi:0
[GraphBuilder-VMU.set_type] zeroi:-1
[GraphBuilder-VMU.make_tensor_input] b[0:None] -- marker=_build_pattern1_b
[GraphBuilder-VMU.set_type] b:0
[GraphBuilder-VMU.set_type] b:-1
[GraphBuilder-VMU.make_node] [TT:-] Equal: ['indices', 'b']->['_onx_equal_indices']
[GraphBuilder-VMU.set_type] _onx_equal_indices:9
[GraphBuilder-VMU.make_node] [T:-] Not: ['_onx_equal_indices']->['_onx_not_equal_indices']
[GraphBuilder-VMU.set_type] _onx_not_equal_indices:9
[GraphBuilder-VMU.make_node] [TTT:-] Where: ['_onx_not_equal_indices', 'indices', 'zeroi']->['_onx_where_not_equal_indices']
[GraphBuilder-VMU.set_type] _onx_where_not_equal_indices:-1
[GraphBuilder-VMU.make_node] [TT:-] Unsqueeze: ['_onx_where_not_equal_indices', 'axis']->['_onx_where_not_equal_indices::UnSq']
[GraphBuilder-VMU.set_type] _onx_where_not_equal_indices::UnSq:-1
[GraphBuilder-VMU.make_node] [T:-] LogSoftmax: ['X']->['_onx_logsoftmax_X']
[GraphBuilder-VMU.set_type] _onx_logsoftmax_X:-1
[GraphBuilder-VMU.set_type] _onx_gatherelements_logsoftmax_X:-1
[GraphBuilder-VMU.make_node] [TT:T] GatherElements: ['_onx_logsoftmax_X', '_onx_where_not_equal_indices::UnSq']->['_onx_gatherelements_logsoftmax_X']
[GraphBuilder-VMU.set_type] _onx_gatherelements_logsoftmax_X:-1
[GraphBuilder-VMU.make_node] [TT:-] Squeeze: ['_onx_gatherelements_logsoftmax_X', 'axis']->['_onx_gatherelements_logsoftmax_X::Sq']
[GraphBuilder-VMU.set_type] _onx_gatherelements_logsoftmax_X::Sq:-1
[GraphBuilder-VMU.make_node] [T:-] Neg: ['_onx_gatherelements_logsoftmax_X::Sq']->['_onx_neg_gatherelements_logsoftmax_X::Sq']
[GraphBuilder-VMU.set_type] _onx_neg_gatherelements_logsoftmax_X::Sq:-1
[GraphBuilder-VMU.make_node] [TTT:-] Where: ['_onx_not_equal_indices', '_onx_neg_gatherelements_logsoftmax_X::Sq', 'zerof']->['_onx_where_not_equal_indices2']
[GraphBuilder-VMU.set_type] _onx_where_not_equal_indices2:-1
[GraphBuilder-VMU.make_node] [T:-] Cast: ['_onx_not_equal_indices']->['_onx_not_equal_indices::C1']
[GraphBuilder-VMU.set_type] _onx_not_equal_indices::C1:1
[GraphBuilder-VMU.make_node] [T:-] ReduceSum: ['_onx_not_equal_indices::C1']->['_onx_reducesum_not_equal_indices::C1']
[GraphBuilder-VMU.set_type] _onx_reducesum_not_equal_indices::C1:1
[GraphBuilder-VMU.set_shape] _onx_reducesum_not_equal_indices::C1:()
[GraphBuilder-VMU.set_rank] _onx_reducesum_not_equal_indices::C1:0
[GraphBuilder-VMU.make_node] [#:-] Cast: ['_onx_reducesum_not_equal_indices::C1']->['_onx_reducesum_not_equal_indices::C1::C10']
[GraphBuilder-VMU.set_type] _onx_reducesum_not_equal_indices::C1::C10:10
[GraphBuilder-VMU.set_shape] _onx_reducesum_not_equal_indices::C1::C10:()
[GraphBuilder-VMU.set_rank] _onx_reducesum_not_equal_indices::C1::C10:0
[GraphBuilder-VMU.make_node] [T:-] Cast: ['_onx_where_not_equal_indices2']->['_onx_where_not_equal_indices2::C1']
[GraphBuilder-VMU.set_type] _onx_where_not_equal_indices2::C1:1
[GraphBuilder-VMU.make_node] [T:-] ReduceSum: ['_onx_where_not_equal_indices2::C1']->['_onx_reducesum_where_not_equal_indices2::C1']
[GraphBuilder-VMU.set_type] _onx_reducesum_where_not_equal_indices2::C1:1
[GraphBuilder-VMU.set_shape] _onx_reducesum_where_not_equal_indices2::C1:()
[GraphBuilder-VMU.set_rank] _onx_reducesum_where_not_equal_indices2::C1:0
[GraphBuilder-VMU.make_node] [#:-] Cast: ['_onx_reducesum_where_not_equal_indices2::C1']->['_onx_reducesum_where_not_equal_indices2::C1::C10']
[GraphBuilder-VMU.set_type] _onx_reducesum_where_not_equal_indices2::C1::C10:10
[GraphBuilder-VMU.set_shape] _onx_reducesum_where_not_equal_indices2::C1::C10:()
[GraphBuilder-VMU.set_rank] _onx_reducesum_where_not_equal_indices2::C1::C10:0
[GraphBuilder-VMU.make_node] [##:-] Div: ['_onx_reducesum_where_not_equal_indices2::C1::C10', '_onx_reducesum_not_equal_indices::C1::C10']->['_onx_div_reducesum_where_not_equal_indices2::C1::C10']
[GraphBuilder-VMU.set_type] _onx_div_reducesum_where_not_equal_indices2::C1::C10:10
[GraphBuilder-VMU.set_shape] _onx_div_reducesum_where_not_equal_indices2::C1::C10:()
[GraphBuilder-VMU.set_rank] _onx_div_reducesum_where_not_equal_indices2::C1::C10:0
[GraphBuilder-VMU.make_tensor_output] _onx_div_reducesum_where_not_equal_indices2::C1::C10[0: None]
[GraphBuilderPatternOptimization-VNK.optimize] skips SplitConcatPattern, pattern.priority=1, current_priority_index=0, priorities[current_priority_index]=0 priorities=[0, 1, 3]
[PatternOptimization.enumerate_matches] start SqueezeAddPattern with main_opset=18 and min_opset=1
[SqueezeAddPattern.match] NONE - line: 211:experimental_experiment.xoptim.patterns.onnx_unsqueeze, op_type=Add, name=Opset2, inputs=_onx_matmul_x,layers.0.bias
[SqueezeAddPattern.match] NONE - line: 211:experimental_experiment.xoptim.patterns.onnx_unsqueeze, op_type=Add, name=Opset4, inputs=_onx_matmul_relu,layers.2.bias
[PatternOptimization.enumerate_matches] start SqueezeUnsqueezePattern with main_opset=18 and min_opset=1
[PatternOptimization.enumerate_matches] start StaticConcatReshapePattern with main_opset=18 and min_opset=1
[GraphBuilderPatternOptimization-VNK.optimize] skips Sub1MulPattern, pattern.priority=1, current_priority_index=0, priorities[current_priority_index]=0 priorities=[0, 1, 3]
[GraphBuilderPatternOptimization-VNK.optimize] skips SwitchOrderBinaryPattern, pattern.priority=1, current_priority_index=0, priorities[current_priority_index]=0 priorities=[0, 1, 3]
[GraphBuilderPatternOptimization-VNK.optimize] skips SwitchReshapeActivationPattern, pattern.priority=1, current_priority_index=0, priorities[current_priority_index]=0 priorities=[0, 1, 3]
[GraphBuilderPatternOptimization-VNK.optimize] skips TransposeEqualReshapePattern, pattern.priority=1, current_priority_index=0, priorities[current_priority_index]=0 priorities=[0, 1, 3]
[GraphBuilderPatternOptimization-VNK.optimize] skips TransposeMatMulPattern, pattern.priority=1, current_priority_index=0, priorities[current_priority_index]=0 priorities=[0, 1, 3]
[GraphBuilderPatternOptimization-VNK.optimize] skips TransposeReshapeMatMulPattern, pattern.priority=1, current_priority_index=0, priorities[current_priority_index]=0 priorities=[0, 1, 3]
[PatternOptimization.enumerate_matches] start TransposeReshapeTransposePattern with main_opset=18 and min_opset=1
[PatternOptimization.enumerate_matches] start TransposeTransposePattern with main_opset=18 and min_opset=1
[GraphBuilderPatternOptimization-VNK.optimize] skips UnsqueezeEqualPattern, pattern.priority=1, current_priority_index=0, priorities[current_priority_index]=0 priorities=[0, 1, 3]
[PatternOptimization.enumerate_matches] start UnsqueezeUnsqueezePattern with main_opset=18 and min_opset=1
[GraphBuilderPatternOptimization-VNK.optimize] skips RotaryConcatPartPattern, pattern.priority=1, current_priority_index=0, priorities[current_priority_index]=0 priorities=[0, 1, 3]
[GraphBuilderPatternOptimization-VNK.optimize] skips FunctionCausalMaskPattern, pattern.priority=1, current_priority_index=0, priorities[current_priority_index]=0 priorities=[0, 1, 3]
[GraphBuilderPatternOptimization-VNK.optimize] skips FunctionCausalMaskMulAddPattern, pattern.priority=1, current_priority_index=0, priorities[current_priority_index]=0 priorities=[0, 1, 3]
[GraphBuilderPatternOptimization-VNK.optimize] skips FunctionCosSinCachePattern, pattern.priority=1, current_priority_index=0, priorities[current_priority_index]=0 priorities=[0, 1, 3]
[GraphBuilderPatternOptimization-VNK.optimize] skips FunctionHalfRotaryEmbeddingPattern, pattern.priority=1, current_priority_index=0, priorities[current_priority_index]=0 priorities=[0, 1, 3]
[GraphBuilderPatternOptimization-VNK.optimize] skips RMSNormalizationPattern, pattern.priority=1, current_priority_index=0, priorities[current_priority_index]=0 priorities=[0, 1, 3]
[GraphBuilderPatternOptimization-VNK.optimize] done all: -0 +0 nodes
[GraphBuilder-VNK.remove_identity_nodes] -- starts with 5
[GraphBuilder-VNK.remove_identity_nodes] found 0 replacements
[GraphBuilder-VNK.remove_identity_nodes] kept 5 nodes
[GraphBuilder-VNK.remove_identity_nodes] ends with 5 nodes in 4.004099901067093e-05 seconds
[GraphBuilderPatternOptimization-VNK.optimize] increase priority to 1
[GraphBuilderPatternOptimization-VNK.optimize] iteration 1: 5 nodes, priority=1
[PatternOptimization.enumerate_matches] start BatchNormalizationPattern with main_opset=18 and min_opset=1
[PatternOptimization.enumerate_matches] start BatchNormalizationTrainingPattern with main_opset=18 and min_opset=1
[PatternOptimization.enumerate_matches] start CastLayerNormalizationCastPattern with main_opset=18 and min_opset=1
[PatternOptimization.enumerate_matches] start CastPattern with main_opset=18 and min_opset=1
[PatternOptimization.enumerate_matches] start CastCastBinaryPattern with main_opset=18 and min_opset=1
[CastCastBinaryPattern.match] NONE - line: 87:experimental_experiment.xoptim.patterns.onnx_cast, op_type=Add, name=Opset2, inputs=_onx_matmul_x,layers.0.bias
[CastCastBinaryPattern.match] NONE - line: 87:experimental_experiment.xoptim.patterns.onnx_cast, op_type=Add, name=Opset4, inputs=_onx_matmul_relu,layers.2.bias
[PatternOptimization.enumerate_matches] start CastOpCastPattern with main_opset=18 and min_opset=1
[CastOpCastPattern.match] NONE - line: 180:experimental_experiment.xoptim.patterns.onnx_cast, op_type=Add, name=Opset2, inputs=_onx_matmul_x,layers.0.bias
[CastOpCastPattern.match] NONE - line: 177:experimental_experiment.xoptim.patterns.onnx_cast, op_type=Add, name=Opset4, inputs=_onx_matmul_relu,layers.2.bias
[PatternOptimization.enumerate_matches] start ClipClipPattern with main_opset=18 and min_opset=1
[PatternOptimization.enumerate_matches] start ComputationCastOpCastPattern with main_opset=18 and min_opset=1
[ComputationCastOpCastPattern.match] NONE - line: 334:experimental_experiment.xoptim.patterns.onnx_cast, op_type=Add, name=Opset2, inputs=_onx_matmul_x,layers.0.bias
[ComputationCastOpCastPattern.match] NONE - line: 334:experimental_experiment.xoptim.patterns.onnx_cast, op_type=Add, name=Opset4, inputs=_onx_matmul_relu,layers.2.bias
[PatternOptimization.enumerate_matches] start ConcatEmptyPattern with main_opset=18 and min_opset=1
[PatternOptimization.enumerate_matches] start ConcatGatherPattern with main_opset=18 and min_opset=1
[PatternOptimization.enumerate_matches] start ConcatReshapePattern with main_opset=18 and min_opset=1
[PatternOptimization.enumerate_matches] start ConcatTwiceUnaryPattern with main_opset=18 and min_opset=1
[PatternOptimization.enumerate_matches] start ConvBiasNullPattern with main_opset=18 and min_opset=1
[PatternOptimization.enumerate_matches] start DropoutPattern with main_opset=18 and min_opset=1
[PatternOptimization.enumerate_matches] start ExpandPattern with main_opset=18 and min_opset=1
[PatternOptimization.enumerate_matches] start ExpandBroadcastPattern with main_opset=18 and min_opset=1
[PatternOptimization.enumerate_matches] start ExpandSwapPattern with main_opset=18 and min_opset=1
[PatternOptimization.enumerate_matches] start GeluPattern with main_opset=18 and min_opset=20
[PatternOptimization.enumerate_matches] start IdentityPattern with main_opset=18 and min_opset=1
[IdentityPattern.match] NONE - line: 297:experimental_experiment.xoptim.patterns.onnx_any, op_type=Add, name=Opset2, inputs=_onx_matmul_x,layers.0.bias
[IdentityPattern.match] NONE - line: 339:experimental_experiment.xoptim.patterns.onnx_any, op_type=Add, name=Opset4, inputs=_onx_matmul_relu,layers.2.bias
[PatternOptimization.enumerate_matches] start LayerNormalizationPattern with main_opset=18 and min_opset=1
[PatternOptimization.enumerate_matches] start LayerNormalizationScalePattern with main_opset=18 and min_opset=1
[PatternOptimization.enumerate_matches] start LeakyReluPattern with main_opset=18 and min_opset=6
[PatternOptimization.enumerate_matches] start MulMulMulScalarPattern with main_opset=18 and min_opset=1
[PatternOptimization.enumerate_matches] start ReduceReshapePattern with main_opset=18 and min_opset=1
[PatternOptimization.enumerate_matches] start ReduceSumNormalizePattern with main_opset=18 and min_opset=1
[PatternOptimization.enumerate_matches] start ReshapePattern with main_opset=18 and min_opset=1
[PatternOptimization.enumerate_matches] start ReshapeMatMulReshapePattern with main_opset=18 and min_opset=1
[ReshapeMatMulReshapePattern.match] NONE - line: 780:experimental_experiment.xoptim.patterns.onnx_matmul, op_type=MatMul, name=Opset, inputs=x,p_layers_0_weight::T10
[ReshapeMatMulReshapePattern.match] NONE - line: 780:experimental_experiment.xoptim.patterns.onnx_matmul, op_type=MatMul, name=Opset3, inputs=relu,p_layers_2_weight::T10
[PatternOptimization.enumerate_matches] start Reshape2Of3Pattern with main_opset=18 and min_opset=1
[Reshape2Of3Pattern.match] NONE - line: 306:experimental_experiment.xoptim.patterns.onnx_reshape, op_type=Add, name=Opset2, inputs=_onx_matmul_x,layers.0.bias
[Reshape2Of3Pattern.match] NONE - line: 306:experimental_experiment.xoptim.patterns.onnx_reshape, op_type=Add, name=Opset4, inputs=_onx_matmul_relu,layers.2.bias
[PatternOptimization.enumerate_matches] start ReshapeReshapeBinaryPattern with main_opset=18 and min_opset=1
[ReshapeReshapeBinaryPattern.match] NONE - line: 486:experimental_experiment.xoptim.patterns.onnx_reshape, op_type=Add, name=Opset2, inputs=_onx_matmul_x,layers.0.bias
[ReshapeReshapeBinaryPattern.match] NONE - line: 486:experimental_experiment.xoptim.patterns.onnx_reshape, op_type=Add, name=Opset4, inputs=_onx_matmul_relu,layers.2.bias
[GraphBuilderPatternOptimization-VNK.optimize] skips MatMulAddPattern, pattern.priority=3, current_priority_index=1, priorities[current_priority_index]=1 priorities=[0, 1, 3]
[PatternOptimization.enumerate_matches] start GemmTransposePattern with main_opset=18 and min_opset=1
[PatternOptimization.enumerate_matches] start MatMulReshape2Of3Pattern with main_opset=18 and min_opset=1
[MatMulReshape2Of3Pattern.match] NONE - line: 400:experimental_experiment.xoptim.patterns.onnx_matmul, op_type=MatMul, name=Opset, inputs=x,p_layers_0_weight::T10
[MatMulReshape2Of3Pattern.match] NONE - line: 400:experimental_experiment.xoptim.patterns.onnx_matmul, op_type=MatMul, name=Opset3, inputs=relu,p_layers_2_weight::T10
[PatternOptimization.enumerate_matches] start MulMulMatMulPattern with main_opset=18 and min_opset=1
[MulMulMatMulPattern.match] NONE - line: 716:experimental_experiment.xoptim.patterns.onnx_matmul, op_type=MatMul, name=Opset, inputs=x,p_layers_0_weight::T10
[MulMulMatMulPattern.match] NONE - line: 716:experimental_experiment.xoptim.patterns.onnx_matmul, op_type=MatMul, name=Opset3, inputs=relu,p_layers_2_weight::T10
[PatternOptimization.enumerate_matches] start ShapeBasedReshapeIsSqueezePattern with main_opset=18 and min_opset=1
[PatternOptimization.enumerate_matches] start ShapeBasedStaticExpandPattern with main_opset=18 and min_opset=1
[PatternOptimization.enumerate_matches] start ShapeBasedConcatExpandPattern with main_opset=18 and min_opset=1
[PatternOptimization.enumerate_matches] start ShapeBasedEditDistanceReshapePattern with main_opset=18 and min_opset=1
[PatternOptimization.enumerate_matches] start ShapeBasedIdentityPattern with main_opset=18 and min_opset=1
[PatternOptimization.enumerate_matches] start ShapeBasedExpandBroadcastPattern with main_opset=18 and min_opset=1
[ShapeBasedExpandBroadcastPattern.match] NONE - line: 232:experimental_experiment.xoptim.patterns.onnx_expand, op_type=Add, name=Opset2, inputs=_onx_matmul_x,layers.0.bias
[ShapeBasedExpandBroadcastPattern.match] NONE - line: 232:experimental_experiment.xoptim.patterns.onnx_expand, op_type=Add, name=Opset4, inputs=_onx_matmul_relu,layers.2.bias
[PatternOptimization.enumerate_matches] start ShapeBasedExpandBroadcastMatMulPattern with main_opset=18 and min_opset=1
[ShapeBasedExpandBroadcastMatMulPattern.match] NONE - line: 710:experimental_experiment.xoptim.patterns.onnx_expand, op_type=MatMul, name=Opset, inputs=x,p_layers_0_weight::T10
[ShapeBasedExpandBroadcastMatMulPattern.match] NONE - line: 710:experimental_experiment.xoptim.patterns.onnx_expand, op_type=MatMul, name=Opset3, inputs=relu,p_layers_2_weight::T10
[PatternOptimization.enumerate_matches] start ShapeBasedExpandCastWhereSwapPattern with main_opset=18 and min_opset=1
[PatternOptimization.enumerate_matches] start ShapeBasedExpandSwapPattern with main_opset=18 and min_opset=1
[ShapeBasedExpandSwapPattern.match] NONE - line: 560:experimental_experiment.xoptim.patterns.onnx_expand, op_type=Add, name=Opset2, inputs=_onx_matmul_x,layers.0.bias
[ShapeBasedExpandSwapPattern.match] NONE - line: 560:experimental_experiment.xoptim.patterns.onnx_expand, op_type=Add, name=Opset4, inputs=_onx_matmul_relu,layers.2.bias
[PatternOptimization.enumerate_matches] start ShapeBasedMatMulToMulPattern with main_opset=18 and min_opset=1
[ShapeBasedMatMulToMulPattern.match] NONE - line: 1255:experimental_experiment.xoptim.patterns.onnx_matmul, op_type=MatMul, name=Opset, inputs=x,p_layers_0_weight::T10
[ShapeBasedMatMulToMulPattern.match] NONE - line: 1255:experimental_experiment.xoptim.patterns.onnx_matmul, op_type=MatMul, name=Opset3, inputs=relu,p_layers_2_weight::T10
[PatternOptimization.enumerate_matches] start ShapeBasedSameChildrenPattern with main_opset=18 and min_opset=1
[PatternOptimization.enumerate_matches] start ShapeBasedShapeShapeAddPattern with main_opset=18 and min_opset=1
[ShapeBasedShapeShapeAddPattern.match] NONE - line: 23:experimental_experiment.xoptim.patterns.onnx_shape, op_type=Add, name=Opset2, inputs=_onx_matmul_x,layers.0.bias
[ShapeBasedShapeShapeAddPattern.match] NONE - line: 23:experimental_experiment.xoptim.patterns.onnx_shape, op_type=Add, name=Opset4, inputs=_onx_matmul_relu,layers.2.bias
[PatternOptimization.enumerate_matches] start ReshapeReshapePattern with main_opset=18 and min_opset=1
[PatternOptimization.enumerate_matches] start RotaryEmbeddingPattern with main_opset=18 and min_opset=1
[PatternOptimization.enumerate_matches] start SameChildrenPattern with main_opset=18 and min_opset=1
[PatternOptimization.enumerate_matches] start SequenceConstructAtPattern with main_opset=18 and min_opset=1
[PatternOptimization.enumerate_matches] start SliceSlicePattern with main_opset=18 and min_opset=1
[PatternOptimization.enumerate_matches] start SlicesSplitPattern with main_opset=18 and min_opset=1
[PatternOptimization.enumerate_matches] start SoftmaxCrossEntropyLossCastPattern with main_opset=18 and min_opset=14
[PatternOptimization.enumerate_matches] start SplitConcatPattern with main_opset=18 and min_opset=1
[PatternOptimization.enumerate_matches] start SqueezeAddPattern with main_opset=18 and min_opset=1
[SqueezeAddPattern.match] NONE - line: 211:experimental_experiment.xoptim.patterns.onnx_unsqueeze, op_type=Add, name=Opset2, inputs=_onx_matmul_x,layers.0.bias
[SqueezeAddPattern.match] NONE - line: 211:experimental_experiment.xoptim.patterns.onnx_unsqueeze, op_type=Add, name=Opset4, inputs=_onx_matmul_relu,layers.2.bias
[PatternOptimization.enumerate_matches] start SqueezeUnsqueezePattern with main_opset=18 and min_opset=1
[PatternOptimization.enumerate_matches] start StaticConcatReshapePattern with main_opset=18 and min_opset=1
[PatternOptimization.enumerate_matches] start Sub1MulPattern with main_opset=18 and min_opset=1
[PatternOptimization.enumerate_matches] start SwitchOrderBinaryPattern with main_opset=18 and min_opset=1
[PatternOptimization.enumerate_matches] start SwitchReshapeActivationPattern with main_opset=18 and min_opset=1
[SwitchReshapeActivationPattern.match] NONE - line: 1178:experimental_experiment.xoptim.patterns.onnx_matmul, op_type=Relu, name=relu, inputs=linear
[PatternOptimization.enumerate_matches] start TransposeEqualReshapePattern with main_opset=18 and min_opset=1
[PatternOptimization.enumerate_matches] start TransposeMatMulPattern with main_opset=18 and min_opset=1
[TransposeMatMulPattern.match] NONE - line: 890:experimental_experiment.xoptim.patterns.onnx_matmul, op_type=MatMul, name=Opset, inputs=x,p_layers_0_weight::T10
[TransposeMatMulPattern.match] NONE - line: 890:experimental_experiment.xoptim.patterns.onnx_matmul, op_type=MatMul, name=Opset3, inputs=relu,p_layers_2_weight::T10
[PatternOptimization.enumerate_matches] start TransposeReshapeMatMulPattern with main_opset=18 and min_opset=1
[TransposeReshapeMatMulPattern.match] NONE - line: 1033:experimental_experiment.xoptim.patterns.onnx_matmul, op_type=MatMul, name=Opset, inputs=x,p_layers_0_weight::T10
[TransposeReshapeMatMulPattern.match] NONE - line: 1033:experimental_experiment.xoptim.patterns.onnx_matmul, op_type=MatMul, name=Opset3, inputs=relu,p_layers_2_weight::T10
[PatternOptimization.enumerate_matches] start TransposeReshapeTransposePattern with main_opset=18 and min_opset=1
[PatternOptimization.enumerate_matches] start TransposeTransposePattern with main_opset=18 and min_opset=1
[PatternOptimization.enumerate_matches] start UnsqueezeEqualPattern with main_opset=18 and min_opset=1
[PatternOptimization.enumerate_matches] start UnsqueezeUnsqueezePattern with main_opset=18 and min_opset=1
[PatternOptimization.enumerate_matches] start RotaryConcatPartPattern with main_opset=18 and min_opset=1
[PatternOptimization.enumerate_matches] start FunctionCausalMaskPattern with main_opset=18 and min_opset=1
[PatternOptimization.enumerate_matches] start FunctionCausalMaskMulAddPattern with main_opset=18 and min_opset=1
[FunctionCausalMaskMulAddPattern.match] NONE - line: 1119:experimental_experiment.xoptim.patterns.onnx_rotary, op_type=Add, name=Opset2, inputs=_onx_matmul_x,layers.0.bias
[FunctionCausalMaskMulAddPattern.match] NONE - line: 1119:experimental_experiment.xoptim.patterns.onnx_rotary, op_type=Add, name=Opset4, inputs=_onx_matmul_relu,layers.2.bias
[PatternOptimization.enumerate_matches] start FunctionCosSinCachePattern with main_opset=18 and min_opset=1
[PatternOptimization.enumerate_matches] start FunctionHalfRotaryEmbeddingPattern with main_opset=18 and min_opset=1
[PatternOptimization.enumerate_matches] start RMSNormalizationPattern with main_opset=18 and min_opset=1
[GraphBuilderPatternOptimization-VNK.optimize] done all: -0 +0 nodes
[GraphBuilder-VNK.remove_identity_nodes] -- starts with 5
[GraphBuilder-VNK.remove_identity_nodes] found 0 replacements
[GraphBuilder-VNK.remove_identity_nodes] kept 5 nodes
[GraphBuilder-VNK.remove_identity_nodes] ends with 5 nodes in 4.6549999751732685e-05 seconds
[GraphBuilderPatternOptimization-VNK.optimize] increase priority to 3
[GraphBuilderPatternOptimization-VNK.optimize] iteration 2: 5 nodes, priority=3
[PatternOptimization.enumerate_matches] start BatchNormalizationPattern with main_opset=18 and min_opset=1
[PatternOptimization.enumerate_matches] start BatchNormalizationTrainingPattern with main_opset=18 and min_opset=1
[PatternOptimization.enumerate_matches] start CastLayerNormalizationCastPattern with main_opset=18 and min_opset=1
[PatternOptimization.enumerate_matches] start CastPattern with main_opset=18 and min_opset=1
[PatternOptimization.enumerate_matches] start CastCastBinaryPattern with main_opset=18 and min_opset=1
[CastCastBinaryPattern.match] NONE - line: 87:experimental_experiment.xoptim.patterns.onnx_cast, op_type=Add, name=Opset2, inputs=_onx_matmul_x,layers.0.bias
[CastCastBinaryPattern.match] NONE - line: 87:experimental_experiment.xoptim.patterns.onnx_cast, op_type=Add, name=Opset4, inputs=_onx_matmul_relu,layers.2.bias
[PatternOptimization.enumerate_matches] start CastOpCastPattern with main_opset=18 and min_opset=1
[CastOpCastPattern.match] NONE - line: 180:experimental_experiment.xoptim.patterns.onnx_cast, op_type=Add, name=Opset2, inputs=_onx_matmul_x,layers.0.bias
[CastOpCastPattern.match] NONE - line: 177:experimental_experiment.xoptim.patterns.onnx_cast, op_type=Add, name=Opset4, inputs=_onx_matmul_relu,layers.2.bias
[PatternOptimization.enumerate_matches] start ClipClipPattern with main_opset=18 and min_opset=1
[PatternOptimization.enumerate_matches] start ComputationCastOpCastPattern with main_opset=18 and min_opset=1
[ComputationCastOpCastPattern.match] NONE - line: 334:experimental_experiment.xoptim.patterns.onnx_cast, op_type=Add, name=Opset2, inputs=_onx_matmul_x,layers.0.bias
[ComputationCastOpCastPattern.match] NONE - line: 334:experimental_experiment.xoptim.patterns.onnx_cast, op_type=Add, name=Opset4, inputs=_onx_matmul_relu,layers.2.bias
[PatternOptimization.enumerate_matches] start ConcatEmptyPattern with main_opset=18 and min_opset=1
[PatternOptimization.enumerate_matches] start ConcatGatherPattern with main_opset=18 and min_opset=1
[PatternOptimization.enumerate_matches] start ConcatReshapePattern with main_opset=18 and min_opset=1
[PatternOptimization.enumerate_matches] start ConcatTwiceUnaryPattern with main_opset=18 and min_opset=1
[PatternOptimization.enumerate_matches] start ConvBiasNullPattern with main_opset=18 and min_opset=1
[PatternOptimization.enumerate_matches] start DropoutPattern with main_opset=18 and min_opset=1
[PatternOptimization.enumerate_matches] start ExpandPattern with main_opset=18 and min_opset=1
[PatternOptimization.enumerate_matches] start ExpandBroadcastPattern with main_opset=18 and min_opset=1
[PatternOptimization.enumerate_matches] start ExpandSwapPattern with main_opset=18 and min_opset=1
[PatternOptimization.enumerate_matches] start GeluPattern with main_opset=18 and min_opset=20
[PatternOptimization.enumerate_matches] start IdentityPattern with main_opset=18 and min_opset=1
[IdentityPattern.match] NONE - line: 297:experimental_experiment.xoptim.patterns.onnx_any, op_type=Add, name=Opset2, inputs=_onx_matmul_x,layers.0.bias
[IdentityPattern.match] NONE - line: 339:experimental_experiment.xoptim.patterns.onnx_any, op_type=Add, name=Opset4, inputs=_onx_matmul_relu,layers.2.bias
[PatternOptimization.enumerate_matches] start LayerNormalizationPattern with main_opset=18 and min_opset=1
[PatternOptimization.enumerate_matches] start LayerNormalizationScalePattern with main_opset=18 and min_opset=1
[PatternOptimization.enumerate_matches] start LeakyReluPattern with main_opset=18 and min_opset=6
[PatternOptimization.enumerate_matches] start MulMulMulScalarPattern with main_opset=18 and min_opset=1
[PatternOptimization.enumerate_matches] start ReduceReshapePattern with main_opset=18 and min_opset=1
[PatternOptimization.enumerate_matches] start ReduceSumNormalizePattern with main_opset=18 and min_opset=1
[PatternOptimization.enumerate_matches] start ReshapePattern with main_opset=18 and min_opset=1
[PatternOptimization.enumerate_matches] start ReshapeMatMulReshapePattern with main_opset=18 and min_opset=1
[ReshapeMatMulReshapePattern.match] NONE - line: 780:experimental_experiment.xoptim.patterns.onnx_matmul, op_type=MatMul, name=Opset, inputs=x,p_layers_0_weight::T10
[ReshapeMatMulReshapePattern.match] NONE - line: 780:experimental_experiment.xoptim.patterns.onnx_matmul, op_type=MatMul, name=Opset3, inputs=relu,p_layers_2_weight::T10
[PatternOptimization.enumerate_matches] start Reshape2Of3Pattern with main_opset=18 and min_opset=1
[Reshape2Of3Pattern.match] NONE - line: 306:experimental_experiment.xoptim.patterns.onnx_reshape, op_type=Add, name=Opset2, inputs=_onx_matmul_x,layers.0.bias
[Reshape2Of3Pattern.match] NONE - line: 306:experimental_experiment.xoptim.patterns.onnx_reshape, op_type=Add, name=Opset4, inputs=_onx_matmul_relu,layers.2.bias
[PatternOptimization.enumerate_matches] start ReshapeReshapeBinaryPattern with main_opset=18 and min_opset=1
[ReshapeReshapeBinaryPattern.match] NONE - line: 486:experimental_experiment.xoptim.patterns.onnx_reshape, op_type=Add, name=Opset2, inputs=_onx_matmul_x,layers.0.bias
[ReshapeReshapeBinaryPattern.match] NONE - line: 486:experimental_experiment.xoptim.patterns.onnx_reshape, op_type=Add, name=Opset4, inputs=_onx_matmul_relu,layers.2.bias
[PatternOptimization.enumerate_matches] start MatMulAddPattern with main_opset=18 and min_opset=1
[MatchResult.match] MATCH MatMulAddPattern with 2 nodes and types ['MatMul', 'Add']
[GraphBuilderPatternOptimization-VNK.optimize] match=MatchResult: MatMulAddPattern replaces ['MatMul', 'Add']
[MatchResult.match] MATCH MatMulAddPattern with 2 nodes and types ['MatMul', 'Add']
[GraphBuilderPatternOptimization-VNK.optimize] match=MatchResult: MatMulAddPattern replaces ['MatMul', 'Add']
[PatternOptimization.enumerate_matches] start GemmTransposePattern with main_opset=18 and min_opset=1
[PatternOptimization.enumerate_matches] start MatMulReshape2Of3Pattern with main_opset=18 and min_opset=1
[MatMulReshape2Of3Pattern.match] NONE - line: 400:experimental_experiment.xoptim.patterns.onnx_matmul, op_type=MatMul, name=Opset, inputs=x,p_layers_0_weight::T10
[MatMulReshape2Of3Pattern.match] NONE - line: 400:experimental_experiment.xoptim.patterns.onnx_matmul, op_type=MatMul, name=Opset3, inputs=relu,p_layers_2_weight::T10
[PatternOptimization.enumerate_matches] start MulMulMatMulPattern with main_opset=18 and min_opset=1
[MulMulMatMulPattern.match] NONE - line: 716:experimental_experiment.xoptim.patterns.onnx_matmul, op_type=MatMul, name=Opset, inputs=x,p_layers_0_weight::T10
[MulMulMatMulPattern.match] NONE - line: 716:experimental_experiment.xoptim.patterns.onnx_matmul, op_type=MatMul, name=Opset3, inputs=relu,p_layers_2_weight::T10
[PatternOptimization.enumerate_matches] start ShapeBasedReshapeIsSqueezePattern with main_opset=18 and min_opset=1
[PatternOptimization.enumerate_matches] start ShapeBasedStaticExpandPattern with main_opset=18 and min_opset=1
[PatternOptimization.enumerate_matches] start ShapeBasedConcatExpandPattern with main_opset=18 and min_opset=1
[PatternOptimization.enumerate_matches] start ShapeBasedEditDistanceReshapePattern with main_opset=18 and min_opset=1
[PatternOptimization.enumerate_matches] start ShapeBasedIdentityPattern with main_opset=18 and min_opset=1
[PatternOptimization.enumerate_matches] start ShapeBasedExpandBroadcastPattern with main_opset=18 and min_opset=1
[ShapeBasedExpandBroadcastPattern.match] NONE - line: 232:experimental_experiment.xoptim.patterns.onnx_expand, op_type=Add, name=Opset2, inputs=_onx_matmul_x,layers.0.bias
[ShapeBasedExpandBroadcastPattern.match] NONE - line: 232:experimental_experiment.xoptim.patterns.onnx_expand, op_type=Add, name=Opset4, inputs=_onx_matmul_relu,layers.2.bias
[PatternOptimization.enumerate_matches] start ShapeBasedExpandBroadcastMatMulPattern with main_opset=18 and min_opset=1
[ShapeBasedExpandBroadcastMatMulPattern.match] NONE - line: 710:experimental_experiment.xoptim.patterns.onnx_expand, op_type=MatMul, name=Opset, inputs=x,p_layers_0_weight::T10
[ShapeBasedExpandBroadcastMatMulPattern.match] NONE - line: 710:experimental_experiment.xoptim.patterns.onnx_expand, op_type=MatMul, name=Opset3, inputs=relu,p_layers_2_weight::T10
[PatternOptimization.enumerate_matches] start ShapeBasedExpandCastWhereSwapPattern with main_opset=18 and min_opset=1
[PatternOptimization.enumerate_matches] start ShapeBasedExpandSwapPattern with main_opset=18 and min_opset=1
[ShapeBasedExpandSwapPattern.match] NONE - line: 560:experimental_experiment.xoptim.patterns.onnx_expand, op_type=Add, name=Opset2, inputs=_onx_matmul_x,layers.0.bias
[ShapeBasedExpandSwapPattern.match] NONE - line: 560:experimental_experiment.xoptim.patterns.onnx_expand, op_type=Add, name=Opset4, inputs=_onx_matmul_relu,layers.2.bias
[PatternOptimization.enumerate_matches] start ShapeBasedMatMulToMulPattern with main_opset=18 and min_opset=1
[ShapeBasedMatMulToMulPattern.match] NONE - line: 1255:experimental_experiment.xoptim.patterns.onnx_matmul, op_type=MatMul, name=Opset, inputs=x,p_layers_0_weight::T10
[ShapeBasedMatMulToMulPattern.match] NONE - line: 1255:experimental_experiment.xoptim.patterns.onnx_matmul, op_type=MatMul, name=Opset3, inputs=relu,p_layers_2_weight::T10
[PatternOptimization.enumerate_matches] start ShapeBasedSameChildrenPattern with main_opset=18 and min_opset=1
[PatternOptimization.enumerate_matches] start ShapeBasedShapeShapeAddPattern with main_opset=18 and min_opset=1
[ShapeBasedShapeShapeAddPattern.match] NONE - line: 23:experimental_experiment.xoptim.patterns.onnx_shape, op_type=Add, name=Opset2, inputs=_onx_matmul_x,layers.0.bias
[ShapeBasedShapeShapeAddPattern.match] NONE - line: 23:experimental_experiment.xoptim.patterns.onnx_shape, op_type=Add, name=Opset4, inputs=_onx_matmul_relu,layers.2.bias
[PatternOptimization.enumerate_matches] start ReshapeReshapePattern with main_opset=18 and min_opset=1
[PatternOptimization.enumerate_matches] start RotaryEmbeddingPattern with main_opset=18 and min_opset=1
[PatternOptimization.enumerate_matches] start SameChildrenPattern with main_opset=18 and min_opset=1
[PatternOptimization.enumerate_matches] start SequenceConstructAtPattern with main_opset=18 and min_opset=1
[PatternOptimization.enumerate_matches] start SliceSlicePattern with main_opset=18 and min_opset=1
[PatternOptimization.enumerate_matches] start SlicesSplitPattern with main_opset=18 and min_opset=1
[PatternOptimization.enumerate_matches] start SoftmaxCrossEntropyLossCastPattern with main_opset=18 and min_opset=14
[PatternOptimization.enumerate_matches] start SplitConcatPattern with main_opset=18 and min_opset=1
[PatternOptimization.enumerate_matches] start SqueezeAddPattern with main_opset=18 and min_opset=1
[SqueezeAddPattern.match] NONE - line: 211:experimental_experiment.xoptim.patterns.onnx_unsqueeze, op_type=Add, name=Opset2, inputs=_onx_matmul_x,layers.0.bias
[SqueezeAddPattern.match] NONE - line: 211:experimental_experiment.xoptim.patterns.onnx_unsqueeze, op_type=Add, name=Opset4, inputs=_onx_matmul_relu,layers.2.bias
[PatternOptimization.enumerate_matches] start SqueezeUnsqueezePattern with main_opset=18 and min_opset=1
[PatternOptimization.enumerate_matches] start StaticConcatReshapePattern with main_opset=18 and min_opset=1
[PatternOptimization.enumerate_matches] start Sub1MulPattern with main_opset=18 and min_opset=1
[PatternOptimization.enumerate_matches] start SwitchOrderBinaryPattern with main_opset=18 and min_opset=1
[PatternOptimization.enumerate_matches] start SwitchReshapeActivationPattern with main_opset=18 and min_opset=1
[SwitchReshapeActivationPattern.match] NONE - line: 1178:experimental_experiment.xoptim.patterns.onnx_matmul, op_type=Relu, name=relu, inputs=linear
[PatternOptimization.enumerate_matches] start TransposeEqualReshapePattern with main_opset=18 and min_opset=1
[PatternOptimization.enumerate_matches] start TransposeMatMulPattern with main_opset=18 and min_opset=1
[TransposeMatMulPattern.match] NONE - line: 890:experimental_experiment.xoptim.patterns.onnx_matmul, op_type=MatMul, name=Opset, inputs=x,p_layers_0_weight::T10
[TransposeMatMulPattern.match] NONE - line: 890:experimental_experiment.xoptim.patterns.onnx_matmul, op_type=MatMul, name=Opset3, inputs=relu,p_layers_2_weight::T10
[PatternOptimization.enumerate_matches] start TransposeReshapeMatMulPattern with main_opset=18 and min_opset=1
[TransposeReshapeMatMulPattern.match] NONE - line: 1033:experimental_experiment.xoptim.patterns.onnx_matmul, op_type=MatMul, name=Opset, inputs=x,p_layers_0_weight::T10
[TransposeReshapeMatMulPattern.match] NONE - line: 1033:experimental_experiment.xoptim.patterns.onnx_matmul, op_type=MatMul, name=Opset3, inputs=relu,p_layers_2_weight::T10
[PatternOptimization.enumerate_matches] start TransposeReshapeTransposePattern with main_opset=18 and min_opset=1
[PatternOptimization.enumerate_matches] start TransposeTransposePattern with main_opset=18 and min_opset=1
[PatternOptimization.enumerate_matches] start UnsqueezeEqualPattern with main_opset=18 and min_opset=1
[PatternOptimization.enumerate_matches] start UnsqueezeUnsqueezePattern with main_opset=18 and min_opset=1
[PatternOptimization.enumerate_matches] start RotaryConcatPartPattern with main_opset=18 and min_opset=1
[PatternOptimization.enumerate_matches] start FunctionCausalMaskPattern with main_opset=18 and min_opset=1
[PatternOptimization.enumerate_matches] start FunctionCausalMaskMulAddPattern with main_opset=18 and min_opset=1
[FunctionCausalMaskMulAddPattern.match] NONE - line: 1119:experimental_experiment.xoptim.patterns.onnx_rotary, op_type=Add, name=Opset2, inputs=_onx_matmul_x,layers.0.bias
[FunctionCausalMaskMulAddPattern.match] NONE - line: 1119:experimental_experiment.xoptim.patterns.onnx_rotary, op_type=Add, name=Opset4, inputs=_onx_matmul_relu,layers.2.bias
[PatternOptimization.enumerate_matches] start FunctionCosSinCachePattern with main_opset=18 and min_opset=1
[PatternOptimization.enumerate_matches] start FunctionHalfRotaryEmbeddingPattern with main_opset=18 and min_opset=1
[PatternOptimization.enumerate_matches] start RMSNormalizationPattern with main_opset=18 and min_opset=1
[GraphBuilderPatternOptimization-VNK.optimize] applies 2 matches, 2*MatMulAddPattern - time=0.001 | max_time=IdentityPattern:0.000
[GraphBuilderPatternOptimization-VNK.optimize] apply MatchResult: MatMulAddPattern replaces ['MatMul', 'Add'], inputs: ['x', 'p_layers_0_weight::T10', '_onx_matmul_x', 'layers.0.bias'], outputs: ['_onx_matmul_x', 'linear']
[GraphBuilderPatternOptimization-VNK.apply_match] MatchResult: MatMulAddPattern replaces ['MatMul', 'Add']
- MatMul: ['x', 'p_layers_0_weight::T10'] -> ['_onx_matmul_x']
- Add: ['_onx_matmul_x', 'layers.0.bias'] -> ['linear']
+ Gemm: ['x', 'p_layers_0_weight::T10', 'layers.0.bias'] -> ['linear']
[GraphBuilder-VNK.set_type] linear:1
[GraphBuilderPatternOptimization-VNK.apply_match] MatchResult: MatMulAddPattern replaces ['MatMul', 'Add'] applied.
[GraphBuilderPatternOptimization-VNK.optimize] - add ['Gemm']
[GraphBuilderPatternOptimization-VNK.optimize] done MatchResult: MatMulAddPattern replaces ['MatMul', 'Add']: -2 +1 nodes
[GraphBuilderPatternOptimization-VNK.optimize] removed outputs {'_onx_matmul_x'}
[GraphBuilderPatternOptimization-VNK.optimize] apply MatchResult: MatMulAddPattern replaces ['MatMul', 'Add'], inputs: ['relu', 'p_layers_2_weight::T10', '_onx_matmul_relu', 'layers.2.bias'], outputs: ['_onx_matmul_relu', 'output_0']
[GraphBuilderPatternOptimization-VNK.apply_match] MatchResult: MatMulAddPattern replaces ['MatMul', 'Add']
- MatMul: ['relu', 'p_layers_2_weight::T10'] -> ['_onx_matmul_relu']
- Add: ['_onx_matmul_relu', 'layers.2.bias'] -> ['output_0']
+ Gemm: ['relu', 'p_layers_2_weight::T10', 'layers.2.bias'] -> ['output_0']
[GraphBuilder-VNK.set_type] output_0:1
[GraphBuilderPatternOptimization-VNK.apply_match] MatchResult: MatMulAddPattern replaces ['MatMul', 'Add'] applied.
[GraphBuilderPatternOptimization-VNK.optimize] - add ['Gemm']
[GraphBuilderPatternOptimization-VNK.optimize] done MatchResult: MatMulAddPattern replaces ['MatMul', 'Add']: -2 +1 nodes
[GraphBuilderPatternOptimization-VNK.optimize] removed outputs {'_onx_matmul_relu'}
[GraphBuilderPatternOptimization-VNK.optimize] done all: -4 +2 nodes
[GraphBuilder-VNK.remove_identity_nodes] -- starts with 3
[GraphBuilder-VNK.remove_identity_nodes] found 0 replacements
[GraphBuilder-VNK.remove_identity_nodes] kept 3 nodes
[GraphBuilder-VNK.remove_identity_nodes] ends with 3 nodes in 3.006999941135291e-05 seconds
[GraphBuilderPatternOptimization-VNK.optimize] iteration 3: 3 nodes, priority=3
[PatternOptimization.enumerate_matches] start BatchNormalizationPattern with main_opset=18 and min_opset=1
[PatternOptimization.enumerate_matches] start BatchNormalizationTrainingPattern with main_opset=18 and min_opset=1
[PatternOptimization.enumerate_matches] start CastLayerNormalizationCastPattern with main_opset=18 and min_opset=1
[PatternOptimization.enumerate_matches] start CastPattern with main_opset=18 and min_opset=1
[PatternOptimization.enumerate_matches] start CastCastBinaryPattern with main_opset=18 and min_opset=1
[PatternOptimization.enumerate_matches] start CastOpCastPattern with main_opset=18 and min_opset=1
[PatternOptimization.enumerate_matches] start ClipClipPattern with main_opset=18 and min_opset=1
[PatternOptimization.enumerate_matches] start ComputationCastOpCastPattern with main_opset=18 and min_opset=1
[PatternOptimization.enumerate_matches] start ConcatEmptyPattern with main_opset=18 and min_opset=1
[PatternOptimization.enumerate_matches] start ConcatGatherPattern with main_opset=18 and min_opset=1
[PatternOptimization.enumerate_matches] start ConcatReshapePattern with main_opset=18 and min_opset=1
[PatternOptimization.enumerate_matches] start ConcatTwiceUnaryPattern with main_opset=18 and min_opset=1
[PatternOptimization.enumerate_matches] start ConvBiasNullPattern with main_opset=18 and min_opset=1
[PatternOptimization.enumerate_matches] start DropoutPattern with main_opset=18 and min_opset=1
[PatternOptimization.enumerate_matches] start ExpandPattern with main_opset=18 and min_opset=1
[PatternOptimization.enumerate_matches] start ExpandBroadcastPattern with main_opset=18 and min_opset=1
[PatternOptimization.enumerate_matches] start ExpandSwapPattern with main_opset=18 and min_opset=1
[PatternOptimization.enumerate_matches] start GeluPattern with main_opset=18 and min_opset=20
[PatternOptimization.enumerate_matches] start IdentityPattern with main_opset=18 and min_opset=1
[PatternOptimization.enumerate_matches] start LayerNormalizationPattern with main_opset=18 and min_opset=1
[PatternOptimization.enumerate_matches] start LayerNormalizationScalePattern with main_opset=18 and min_opset=1
[PatternOptimization.enumerate_matches] start LeakyReluPattern with main_opset=18 and min_opset=6
[PatternOptimization.enumerate_matches] start MulMulMulScalarPattern with main_opset=18 and min_opset=1
[PatternOptimization.enumerate_matches] start ReduceReshapePattern with main_opset=18 and min_opset=1
[PatternOptimization.enumerate_matches] start ReduceSumNormalizePattern with main_opset=18 and min_opset=1
[PatternOptimization.enumerate_matches] start ReshapePattern with main_opset=18 and min_opset=1
[PatternOptimization.enumerate_matches] start ReshapeMatMulReshapePattern with main_opset=18 and min_opset=1
[PatternOptimization.enumerate_matches] start Reshape2Of3Pattern with main_opset=18 and min_opset=1
[PatternOptimization.enumerate_matches] start ReshapeReshapeBinaryPattern with main_opset=18 and min_opset=1
[PatternOptimization.enumerate_matches] start MatMulAddPattern with main_opset=18 and min_opset=1
[MatMulAddPattern.match] NONE - line: 58:experimental_experiment.xoptim.patterns.onnx_matmul, op_type=Gemm, name=MatMulAddPattern--Opset, inputs=x,p_layers_0_weight::T10,layers.0.bias
[MatMulAddPattern.match] NONE - line: 55:experimental_experiment.xoptim.patterns.onnx_matmul, op_type=Gemm, name=MatMulAddPattern--Opset3, inputs=relu,p_layers_2_weight::T10,layers.2.bias
[PatternOptimization.enumerate_matches] start GemmTransposePattern with main_opset=18 and min_opset=1
[MatchResult.match] MATCH GemmTransposePattern with 1 nodes and types ['Gemm']
[GraphBuilderPatternOptimization-VNK.optimize] match=MatchResult: GemmTransposePattern replaces ['Gemm']
[MatchResult.match] MATCH GemmTransposePattern with 1 nodes and types ['Gemm']
[GraphBuilderPatternOptimization-VNK.optimize] match=MatchResult: GemmTransposePattern replaces ['Gemm']
[PatternOptimization.enumerate_matches] start MatMulReshape2Of3Pattern with main_opset=18 and min_opset=1
[PatternOptimization.enumerate_matches] start MulMulMatMulPattern with main_opset=18 and min_opset=1
[PatternOptimization.enumerate_matches] start ShapeBasedReshapeIsSqueezePattern with main_opset=18 and min_opset=1
[PatternOptimization.enumerate_matches] start ShapeBasedStaticExpandPattern with main_opset=18 and min_opset=1
[PatternOptimization.enumerate_matches] start ShapeBasedConcatExpandPattern with main_opset=18 and min_opset=1
[PatternOptimization.enumerate_matches] start ShapeBasedEditDistanceReshapePattern with main_opset=18 and min_opset=1
[PatternOptimization.enumerate_matches] start ShapeBasedIdentityPattern with main_opset=18 and min_opset=1
[PatternOptimization.enumerate_matches] start ShapeBasedExpandBroadcastPattern with main_opset=18 and min_opset=1
[PatternOptimization.enumerate_matches] start ShapeBasedExpandBroadcastMatMulPattern with main_opset=18 and min_opset=1
[PatternOptimization.enumerate_matches] start ShapeBasedExpandCastWhereSwapPattern with main_opset=18 and min_opset=1
[PatternOptimization.enumerate_matches] start ShapeBasedExpandSwapPattern with main_opset=18 and min_opset=1
[PatternOptimization.enumerate_matches] start ShapeBasedMatMulToMulPattern with main_opset=18 and min_opset=1
[PatternOptimization.enumerate_matches] start ShapeBasedSameChildrenPattern with main_opset=18 and min_opset=1
[PatternOptimization.enumerate_matches] start ShapeBasedShapeShapeAddPattern with main_opset=18 and min_opset=1
[PatternOptimization.enumerate_matches] start ReshapeReshapePattern with main_opset=18 and min_opset=1
[PatternOptimization.enumerate_matches] start RotaryEmbeddingPattern with main_opset=18 and min_opset=1
[PatternOptimization.enumerate_matches] start SameChildrenPattern with main_opset=18 and min_opset=1
[PatternOptimization.enumerate_matches] start SequenceConstructAtPattern with main_opset=18 and min_opset=1
[PatternOptimization.enumerate_matches] start SliceSlicePattern with main_opset=18 and min_opset=1
[PatternOptimization.enumerate_matches] start SlicesSplitPattern with main_opset=18 and min_opset=1
[PatternOptimization.enumerate_matches] start SoftmaxCrossEntropyLossCastPattern with main_opset=18 and min_opset=14
[PatternOptimization.enumerate_matches] start SplitConcatPattern with main_opset=18 and min_opset=1
[PatternOptimization.enumerate_matches] start SqueezeAddPattern with main_opset=18 and min_opset=1
[PatternOptimization.enumerate_matches] start SqueezeUnsqueezePattern with main_opset=18 and min_opset=1
[PatternOptimization.enumerate_matches] start StaticConcatReshapePattern with main_opset=18 and min_opset=1
[PatternOptimization.enumerate_matches] start Sub1MulPattern with main_opset=18 and min_opset=1
[PatternOptimization.enumerate_matches] start SwitchOrderBinaryPattern with main_opset=18 and min_opset=1
[PatternOptimization.enumerate_matches] start SwitchReshapeActivationPattern with main_opset=18 and min_opset=1
[SwitchReshapeActivationPattern.match] NONE - line: 1178:experimental_experiment.xoptim.patterns.onnx_matmul, op_type=Relu, name=relu, inputs=linear
[PatternOptimization.enumerate_matches] start TransposeEqualReshapePattern with main_opset=18 and min_opset=1
[PatternOptimization.enumerate_matches] start TransposeMatMulPattern with main_opset=18 and min_opset=1
[TransposeMatMulPattern.match] NONE - line: 890:experimental_experiment.xoptim.patterns.onnx_matmul, op_type=Gemm, name=MatMulAddPattern--Opset, inputs=x,p_layers_0_weight::T10,layers.0.bias
[TransposeMatMulPattern.match] NONE - line: 890:experimental_experiment.xoptim.patterns.onnx_matmul, op_type=Gemm, name=MatMulAddPattern--Opset3, inputs=relu,p_layers_2_weight::T10,layers.2.bias
[PatternOptimization.enumerate_matches] start TransposeReshapeMatMulPattern with main_opset=18 and min_opset=1
[PatternOptimization.enumerate_matches] start TransposeReshapeTransposePattern with main_opset=18 and min_opset=1
[PatternOptimization.enumerate_matches] start TransposeTransposePattern with main_opset=18 and min_opset=1
[PatternOptimization.enumerate_matches] start UnsqueezeEqualPattern with main_opset=18 and min_opset=1
[PatternOptimization.enumerate_matches] start UnsqueezeUnsqueezePattern with main_opset=18 and min_opset=1
[PatternOptimization.enumerate_matches] start RotaryConcatPartPattern with main_opset=18 and min_opset=1
[PatternOptimization.enumerate_matches] start FunctionCausalMaskPattern with main_opset=18 and min_opset=1
[PatternOptimization.enumerate_matches] start FunctionCausalMaskMulAddPattern with main_opset=18 and min_opset=1
[PatternOptimization.enumerate_matches] start FunctionCosSinCachePattern with main_opset=18 and min_opset=1
[PatternOptimization.enumerate_matches] start FunctionHalfRotaryEmbeddingPattern with main_opset=18 and min_opset=1
[PatternOptimization.enumerate_matches] start RMSNormalizationPattern with main_opset=18 and min_opset=1
[GraphBuilderPatternOptimization-VNK.optimize] applies 2 matches, 2*GemmTransposePattern - time=0.001 | max_time=GemmTransposePattern:0.000
[GraphBuilderPatternOptimization-VNK.optimize] apply MatchResult: GemmTransposePattern replaces ['Gemm'], inputs: ['x', 'p_layers_0_weight::T10', 'layers.0.bias'], outputs: ['linear']
[GraphBuilder-VNK.update_node_constant] new constant 'GemmTransposePattern--p_layers_0_weight::T10', node=Transpose
[GraphBuilderPatternOptimization-VNK.apply_match] MatchResult: GemmTransposePattern replaces ['Gemm']
- Gemm: ['x', 'p_layers_0_weight::T10', 'layers.0.bias'] -> ['linear']
+ Transpose: ['p_layers_0_weight::T10'] -> ['GemmTransposePattern--p_layers_0_weight::T10']
+ Gemm: ['x', 'GemmTransposePattern--p_layers_0_weight::T10', 'layers.0.bias'] -> ['linear']
[GraphBuilder-VNK.update_node_constant] new constant 'GemmTransposePattern--p_layers_0_weight::T10', node=Transpose
[GraphBuilder-VNK.set_type] GemmTransposePattern--p_layers_0_weight::T10:1
[GraphBuilder-VNK.set_shape] GemmTransposePattern--p_layers_0_weight::T10:(32, 10)
[GraphBuilder-VNK.set_rank] GemmTransposePattern--p_layers_0_weight::T10:2
[GraphBuilder-VNK.set_type] linear:1
[GraphBuilderPatternOptimization-VNK.apply_match] MatchResult: GemmTransposePattern replaces ['Gemm'] applied.
[GraphBuilderPatternOptimization-VNK.optimize] - add ['Transpose', 'Gemm']
[GraphBuilderPatternOptimization-VNK.optimize] done MatchResult: GemmTransposePattern replaces ['Gemm']: -1 +2 nodes
[GraphBuilderPatternOptimization-VNK.optimize] apply MatchResult: GemmTransposePattern replaces ['Gemm'], inputs: ['relu', 'p_layers_2_weight::T10', 'layers.2.bias'], outputs: ['output_0']
[GraphBuilder-VNK.update_node_constant] new constant 'GemmTransposePattern--p_layers_2_weight::T10', node=Transpose
[GraphBuilderPatternOptimization-VNK.apply_match] MatchResult: GemmTransposePattern replaces ['Gemm']
- Gemm: ['relu', 'p_layers_2_weight::T10', 'layers.2.bias'] -> ['output_0']
+ Transpose: ['p_layers_2_weight::T10'] -> ['GemmTransposePattern--p_layers_2_weight::T10']
+ Gemm: ['relu', 'GemmTransposePattern--p_layers_2_weight::T10', 'layers.2.bias'] -> ['output_0']
[GraphBuilder-VNK.update_node_constant] new constant 'GemmTransposePattern--p_layers_2_weight::T10', node=Transpose
[GraphBuilder-VNK.set_type] GemmTransposePattern--p_layers_2_weight::T10:1
[GraphBuilder-VNK.set_shape] GemmTransposePattern--p_layers_2_weight::T10:(1, 32)
[GraphBuilder-VNK.set_rank] GemmTransposePattern--p_layers_2_weight::T10:2
[GraphBuilder-VNK.set_type] output_0:1
[GraphBuilderPatternOptimization-VNK.apply_match] MatchResult: GemmTransposePattern replaces ['Gemm'] applied.
[GraphBuilderPatternOptimization-VNK.optimize] - add ['Transpose', 'Gemm']
[GraphBuilderPatternOptimization-VNK.optimize] done MatchResult: GemmTransposePattern replaces ['Gemm']: -1 +2 nodes
[GraphBuilderPatternOptimization-VNK.optimize] done all: -2 +4 nodes
[GraphBuilderPatternOptimization-VNK.optimize] iteration 4: 5 nodes, priority=3
[PatternOptimization.enumerate_matches] start BatchNormalizationPattern with main_opset=18 and min_opset=1
[PatternOptimization.enumerate_matches] start BatchNormalizationTrainingPattern with main_opset=18 and min_opset=1
[PatternOptimization.enumerate_matches] start CastLayerNormalizationCastPattern with main_opset=18 and min_opset=1
[PatternOptimization.enumerate_matches] start CastPattern with main_opset=18 and min_opset=1
[PatternOptimization.enumerate_matches] start CastCastBinaryPattern with main_opset=18 and min_opset=1
[PatternOptimization.enumerate_matches] start CastOpCastPattern with main_opset=18 and min_opset=1
[PatternOptimization.enumerate_matches] start ClipClipPattern with main_opset=18 and min_opset=1
[PatternOptimization.enumerate_matches] start ComputationCastOpCastPattern with main_opset=18 and min_opset=1
[PatternOptimization.enumerate_matches] start ConcatEmptyPattern with main_opset=18 and min_opset=1
[PatternOptimization.enumerate_matches] start ConcatGatherPattern with main_opset=18 and min_opset=1
[PatternOptimization.enumerate_matches] start ConcatReshapePattern with main_opset=18 and min_opset=1
[PatternOptimization.enumerate_matches] start ConcatTwiceUnaryPattern with main_opset=18 and min_opset=1
[PatternOptimization.enumerate_matches] start ConvBiasNullPattern with main_opset=18 and min_opset=1
[PatternOptimization.enumerate_matches] start DropoutPattern with main_opset=18 and min_opset=1
[PatternOptimization.enumerate_matches] start ExpandPattern with main_opset=18 and min_opset=1
[PatternOptimization.enumerate_matches] start ExpandBroadcastPattern with main_opset=18 and min_opset=1
[PatternOptimization.enumerate_matches] start ExpandSwapPattern with main_opset=18 and min_opset=1
[PatternOptimization.enumerate_matches] start GeluPattern with main_opset=18 and min_opset=20
[PatternOptimization.enumerate_matches] start IdentityPattern with main_opset=18 and min_opset=1
[IdentityPattern.match] NONE - line: 258:experimental_experiment.xoptim.patterns.onnx_any, op_type=Transpose, name=GemmTransposePattern--MatMulAddPattern--Opset, inputs=p_layers_0_weight::T10
[IdentityPattern.match] NONE - line: 258:experimental_experiment.xoptim.patterns.onnx_any, op_type=Transpose, name=GemmTransposePattern--MatMulAddPattern--Opset3, inputs=p_layers_2_weight::T10
[PatternOptimization.enumerate_matches] start LayerNormalizationPattern with main_opset=18 and min_opset=1
[PatternOptimization.enumerate_matches] start LayerNormalizationScalePattern with main_opset=18 and min_opset=1
[PatternOptimization.enumerate_matches] start LeakyReluPattern with main_opset=18 and min_opset=6
[PatternOptimization.enumerate_matches] start MulMulMulScalarPattern with main_opset=18 and min_opset=1
[PatternOptimization.enumerate_matches] start ReduceReshapePattern with main_opset=18 and min_opset=1
[PatternOptimization.enumerate_matches] start ReduceSumNormalizePattern with main_opset=18 and min_opset=1
[PatternOptimization.enumerate_matches] start ReshapePattern with main_opset=18 and min_opset=1
[PatternOptimization.enumerate_matches] start ReshapeMatMulReshapePattern with main_opset=18 and min_opset=1
[PatternOptimization.enumerate_matches] start Reshape2Of3Pattern with main_opset=18 and min_opset=1
[PatternOptimization.enumerate_matches] start ReshapeReshapeBinaryPattern with main_opset=18 and min_opset=1
[PatternOptimization.enumerate_matches] start MatMulAddPattern with main_opset=18 and min_opset=1
[MatMulAddPattern.match] NONE - line: 58:experimental_experiment.xoptim.patterns.onnx_matmul, op_type=Gemm, name=GemmTransposePattern--MatMulAddPattern--Opset2, inputs=x,GemmTransposePattern--p_layers_0_weight::T10,layers.0.bias
[MatMulAddPattern.match] NONE - line: 55:experimental_experiment.xoptim.patterns.onnx_matmul, op_type=Gemm, name=GemmTransposePattern--MatMulAddPattern--Opset32, inputs=relu,GemmTransposePattern--p_layers_2_weight::T10,layers.2.bias
[PatternOptimization.enumerate_matches] start GemmTransposePattern with main_opset=18 and min_opset=1
[GemmTransposePattern.match] NONE - line: 307:experimental_experiment.xoptim.patterns.onnx_matmul, op_type=Gemm, name=GemmTransposePattern--MatMulAddPattern--Opset2, inputs=x,GemmTransposePattern--p_layers_0_weight::T10,layers.0.bias
[GemmTransposePattern.match] NONE - line: 307:experimental_experiment.xoptim.patterns.onnx_matmul, op_type=Gemm, name=GemmTransposePattern--MatMulAddPattern--Opset32, inputs=relu,GemmTransposePattern--p_layers_2_weight::T10,layers.2.bias
[PatternOptimization.enumerate_matches] start MatMulReshape2Of3Pattern with main_opset=18 and min_opset=1
[PatternOptimization.enumerate_matches] start MulMulMatMulPattern with main_opset=18 and min_opset=1
[PatternOptimization.enumerate_matches] start ShapeBasedReshapeIsSqueezePattern with main_opset=18 and min_opset=1
[PatternOptimization.enumerate_matches] start ShapeBasedStaticExpandPattern with main_opset=18 and min_opset=1
[PatternOptimization.enumerate_matches] start ShapeBasedConcatExpandPattern with main_opset=18 and min_opset=1
[PatternOptimization.enumerate_matches] start ShapeBasedEditDistanceReshapePattern with main_opset=18 and min_opset=1
[PatternOptimization.enumerate_matches] start ShapeBasedIdentityPattern with main_opset=18 and min_opset=1
[ShapeBasedIdentityPattern.match] NONE - line: 401:experimental_experiment.xoptim.patterns.onnx_any, op_type=Transpose, name=GemmTransposePattern--MatMulAddPattern--Opset, inputs=p_layers_0_weight::T10
[ShapeBasedIdentityPattern.match] NONE - line: 401:experimental_experiment.xoptim.patterns.onnx_any, op_type=Transpose, name=GemmTransposePattern--MatMulAddPattern--Opset3, inputs=p_layers_2_weight::T10
[PatternOptimization.enumerate_matches] start ShapeBasedExpandBroadcastPattern with main_opset=18 and min_opset=1
[PatternOptimization.enumerate_matches] start ShapeBasedExpandBroadcastMatMulPattern with main_opset=18 and min_opset=1
[PatternOptimization.enumerate_matches] start ShapeBasedExpandCastWhereSwapPattern with main_opset=18 and min_opset=1
[PatternOptimization.enumerate_matches] start ShapeBasedExpandSwapPattern with main_opset=18 and min_opset=1
[PatternOptimization.enumerate_matches] start ShapeBasedMatMulToMulPattern with main_opset=18 and min_opset=1
[PatternOptimization.enumerate_matches] start ShapeBasedSameChildrenPattern with main_opset=18 and min_opset=1
[PatternOptimization.enumerate_matches] start ShapeBasedShapeShapeAddPattern with main_opset=18 and min_opset=1
[PatternOptimization.enumerate_matches] start ReshapeReshapePattern with main_opset=18 and min_opset=1
[PatternOptimization.enumerate_matches] start RotaryEmbeddingPattern with main_opset=18 and min_opset=1
[PatternOptimization.enumerate_matches] start SameChildrenPattern with main_opset=18 and min_opset=1
[PatternOptimization.enumerate_matches] start SequenceConstructAtPattern with main_opset=18 and min_opset=1
[PatternOptimization.enumerate_matches] start SliceSlicePattern with main_opset=18 and min_opset=1
[PatternOptimization.enumerate_matches] start SlicesSplitPattern with main_opset=18 and min_opset=1
[PatternOptimization.enumerate_matches] start SoftmaxCrossEntropyLossCastPattern with main_opset=18 and min_opset=14
[PatternOptimization.enumerate_matches] start SplitConcatPattern with main_opset=18 and min_opset=1
[PatternOptimization.enumerate_matches] start SqueezeAddPattern with main_opset=18 and min_opset=1
[PatternOptimization.enumerate_matches] start SqueezeUnsqueezePattern with main_opset=18 and min_opset=1
[PatternOptimization.enumerate_matches] start StaticConcatReshapePattern with main_opset=18 and min_opset=1
[PatternOptimization.enumerate_matches] start Sub1MulPattern with main_opset=18 and min_opset=1
[PatternOptimization.enumerate_matches] start SwitchOrderBinaryPattern with main_opset=18 and min_opset=1
[PatternOptimization.enumerate_matches] start SwitchReshapeActivationPattern with main_opset=18 and min_opset=1
[SwitchReshapeActivationPattern.match] NONE - line: 1178:experimental_experiment.xoptim.patterns.onnx_matmul, op_type=Relu, name=relu, inputs=linear
[PatternOptimization.enumerate_matches] start TransposeEqualReshapePattern with main_opset=18 and min_opset=1
[TransposeEqualReshapePattern.match] NONE - line: 342:experimental_experiment.xoptim.patterns.onnx_transpose, op_type=Transpose, name=GemmTransposePattern--MatMulAddPattern--Opset, inputs=p_layers_0_weight::T10
[MatchResult.match] MATCH TransposeEqualReshapePattern with 1 nodes and types ['Transpose']
[GraphBuilderPatternOptimization-VNK.optimize] match=MatchResult: TransposeEqualReshapePattern replaces ['Transpose']
[PatternOptimization.enumerate_matches] start TransposeMatMulPattern with main_opset=18 and min_opset=1
[TransposeMatMulPattern.match] NONE - line: 928:experimental_experiment.xoptim.patterns.onnx_matmul, op_type=Gemm, name=GemmTransposePattern--MatMulAddPattern--Opset2, inputs=x,GemmTransposePattern--p_layers_0_weight::T10,layers.0.bias
[TransposeMatMulPattern.match] NONE - line: 928:experimental_experiment.xoptim.patterns.onnx_matmul, op_type=Gemm, name=GemmTransposePattern--MatMulAddPattern--Opset32, inputs=relu,GemmTransposePattern--p_layers_2_weight::T10,layers.2.bias
[PatternOptimization.enumerate_matches] start TransposeReshapeMatMulPattern with main_opset=18 and min_opset=1
[PatternOptimization.enumerate_matches] start TransposeReshapeTransposePattern with main_opset=18 and min_opset=1
[TransposeReshapeTransposePattern.match] NONE - line: 140:experimental_experiment.xoptim.patterns.onnx_transpose, op_type=Transpose, name=GemmTransposePattern--MatMulAddPattern--Opset, inputs=p_layers_0_weight::T10
[TransposeReshapeTransposePattern.match] NONE - line: 140:experimental_experiment.xoptim.patterns.onnx_transpose, op_type=Transpose, name=GemmTransposePattern--MatMulAddPattern--Opset3, inputs=p_layers_2_weight::T10
[PatternOptimization.enumerate_matches] start TransposeTransposePattern with main_opset=18 and min_opset=1
[TransposeTransposePattern.match] NONE - line: 51:experimental_experiment.xoptim.patterns.onnx_transpose, op_type=Transpose, name=GemmTransposePattern--MatMulAddPattern--Opset, inputs=p_layers_0_weight::T10
[TransposeTransposePattern.match] NONE - line: 51:experimental_experiment.xoptim.patterns.onnx_transpose, op_type=Transpose, name=GemmTransposePattern--MatMulAddPattern--Opset3, inputs=p_layers_2_weight::T10
[PatternOptimization.enumerate_matches] start UnsqueezeEqualPattern with main_opset=18 and min_opset=1
[PatternOptimization.enumerate_matches] start UnsqueezeUnsqueezePattern with main_opset=18 and min_opset=1
[PatternOptimization.enumerate_matches] start RotaryConcatPartPattern with main_opset=18 and min_opset=1
[PatternOptimization.enumerate_matches] start FunctionCausalMaskPattern with main_opset=18 and min_opset=1
[PatternOptimization.enumerate_matches] start FunctionCausalMaskMulAddPattern with main_opset=18 and min_opset=1
[PatternOptimization.enumerate_matches] start FunctionCosSinCachePattern with main_opset=18 and min_opset=1
[PatternOptimization.enumerate_matches] start FunctionHalfRotaryEmbeddingPattern with main_opset=18 and min_opset=1
[PatternOptimization.enumerate_matches] start RMSNormalizationPattern with main_opset=18 and min_opset=1
[GraphBuilderPatternOptimization-VNK.optimize] applies 1 matches, [0]=MatchResult: TransposeEqualReshapePattern replaces ['Transpose'] - time=0.001 | max_time=Reshape2Of3Pattern:0.000
[GraphBuilderPatternOptimization-VNK.optimize] apply MatchResult: TransposeEqualReshapePattern replaces ['Transpose'], inputs: ['p_layers_2_weight::T10'], outputs: ['GemmTransposePattern--p_layers_2_weight::T10']
[GraphBuilder-VNK.set_shape] init7_s2_1_-1:(2,)
[GraphBuilder-VNK.set_rank] init7_s2_1_-1:1
[GraphBuilder-VNK.set_type] init7_s2_1_-1:7
[GraphBuilder-VNK.make_initializer] init7_s2_1_-1[7:(2,)]
[GraphBuilder-VNK.update_node_constant] new constant 'init7_s2_1_-1', node=None
[GraphBuilder-VNK.update_node_constant] new constant 'GemmTransposePattern--p_layers_2_weight::T10', node=Reshape
[GraphBuilderPatternOptimization-VNK.apply_match] MatchResult: TransposeEqualReshapePattern replaces ['Transpose']
- Transpose: ['p_layers_2_weight::T10'] -> ['GemmTransposePattern--p_layers_2_weight::T10']
+ Reshape: ['p_layers_2_weight::T10', 'init7_s2_1_-1'] -> ['GemmTransposePattern--p_layers_2_weight::T10']
[GraphBuilder-VNK.update_node_constant] new constant 'GemmTransposePattern--p_layers_2_weight::T10', node=Reshape
[GraphBuilder-VNK.set_type] GemmTransposePattern--p_layers_2_weight::T10:1
[GraphBuilder-VNK.set_type] GemmTransposePattern--p_layers_2_weight::T10:1
[GraphBuilderPatternOptimization-VNK.apply_match] MatchResult: TransposeEqualReshapePattern replaces ['Transpose'] applied.
[GraphBuilderPatternOptimization-VNK.optimize] - add ['Reshape']
[GraphBuilderPatternOptimization-VNK.optimize] done MatchResult: TransposeEqualReshapePattern replaces ['Transpose']: -1 +1 nodes
[GraphBuilderPatternOptimization-VNK.optimize] done all: -1 +1 nodes
[GraphBuilderPatternOptimization-VNK.optimize] iteration 5: 5 nodes, priority=3
[PatternOptimization.enumerate_matches] start BatchNormalizationPattern with main_opset=18 and min_opset=1
[PatternOptimization.enumerate_matches] start BatchNormalizationTrainingPattern with main_opset=18 and min_opset=1
[PatternOptimization.enumerate_matches] start CastLayerNormalizationCastPattern with main_opset=18 and min_opset=1
[PatternOptimization.enumerate_matches] start CastPattern with main_opset=18 and min_opset=1
[PatternOptimization.enumerate_matches] start CastCastBinaryPattern with main_opset=18 and min_opset=1
[PatternOptimization.enumerate_matches] start CastOpCastPattern with main_opset=18 and min_opset=1
[PatternOptimization.enumerate_matches] start ClipClipPattern with main_opset=18 and min_opset=1
[PatternOptimization.enumerate_matches] start ComputationCastOpCastPattern with main_opset=18 and min_opset=1
[PatternOptimization.enumerate_matches] start ConcatEmptyPattern with main_opset=18 and min_opset=1
[PatternOptimization.enumerate_matches] start ConcatGatherPattern with main_opset=18 and min_opset=1
[PatternOptimization.enumerate_matches] start ConcatReshapePattern with main_opset=18 and min_opset=1
[ConcatReshapePattern.match] NONE - line: 552:experimental_experiment.xoptim.patterns.onnx_reshape, op_type=Reshape, name=TransposeEqualReshapePattern--B--GemmTransposePattern--MatMulAddPattern--Opset3, inputs=p_layers_2_weight::T10,init7_s2_1_-1
[PatternOptimization.enumerate_matches] start ConcatTwiceUnaryPattern with main_opset=18 and min_opset=1
[PatternOptimization.enumerate_matches] start ConvBiasNullPattern with main_opset=18 and min_opset=1
[PatternOptimization.enumerate_matches] start DropoutPattern with main_opset=18 and min_opset=1
[PatternOptimization.enumerate_matches] start ExpandPattern with main_opset=18 and min_opset=1
[PatternOptimization.enumerate_matches] start ExpandBroadcastPattern with main_opset=18 and min_opset=1
[PatternOptimization.enumerate_matches] start ExpandSwapPattern with main_opset=18 and min_opset=1
[PatternOptimization.enumerate_matches] start GeluPattern with main_opset=18 and min_opset=20
[PatternOptimization.enumerate_matches] start IdentityPattern with main_opset=18 and min_opset=1
[IdentityPattern.match] NONE - line: 258:experimental_experiment.xoptim.patterns.onnx_any, op_type=Transpose, name=GemmTransposePattern--MatMulAddPattern--Opset, inputs=p_layers_0_weight::T10
[PatternOptimization.enumerate_matches] start LayerNormalizationPattern with main_opset=18 and min_opset=1
[PatternOptimization.enumerate_matches] start LayerNormalizationScalePattern with main_opset=18 and min_opset=1
[PatternOptimization.enumerate_matches] start LeakyReluPattern with main_opset=18 and min_opset=6
[PatternOptimization.enumerate_matches] start MulMulMulScalarPattern with main_opset=18 and min_opset=1
[PatternOptimization.enumerate_matches] start ReduceReshapePattern with main_opset=18 and min_opset=1
[PatternOptimization.enumerate_matches] start ReduceSumNormalizePattern with main_opset=18 and min_opset=1
[PatternOptimization.enumerate_matches] start ReshapePattern with main_opset=18 and min_opset=1
[ReshapePattern.match] NONE - line: 37:experimental_experiment.xoptim.patterns.onnx_reshape, op_type=Reshape, name=TransposeEqualReshapePattern--B--GemmTransposePattern--MatMulAddPattern--Opset3, inputs=p_layers_2_weight::T10,init7_s2_1_-1
[PatternOptimization.enumerate_matches] start ReshapeMatMulReshapePattern with main_opset=18 and min_opset=1
[PatternOptimization.enumerate_matches] start Reshape2Of3Pattern with main_opset=18 and min_opset=1
[PatternOptimization.enumerate_matches] start ReshapeReshapeBinaryPattern with main_opset=18 and min_opset=1
[PatternOptimization.enumerate_matches] start MatMulAddPattern with main_opset=18 and min_opset=1
[MatMulAddPattern.match] NONE - line: 58:experimental_experiment.xoptim.patterns.onnx_matmul, op_type=Gemm, name=GemmTransposePattern--MatMulAddPattern--Opset2, inputs=x,GemmTransposePattern--p_layers_0_weight::T10,layers.0.bias
[MatMulAddPattern.match] NONE - line: 55:experimental_experiment.xoptim.patterns.onnx_matmul, op_type=Gemm, name=GemmTransposePattern--MatMulAddPattern--Opset32, inputs=relu,GemmTransposePattern--p_layers_2_weight::T10,layers.2.bias
[PatternOptimization.enumerate_matches] start GemmTransposePattern with main_opset=18 and min_opset=1
[GemmTransposePattern.match] NONE - line: 307:experimental_experiment.xoptim.patterns.onnx_matmul, op_type=Gemm, name=GemmTransposePattern--MatMulAddPattern--Opset2, inputs=x,GemmTransposePattern--p_layers_0_weight::T10,layers.0.bias
[GemmTransposePattern.match] NONE - line: 307:experimental_experiment.xoptim.patterns.onnx_matmul, op_type=Gemm, name=GemmTransposePattern--MatMulAddPattern--Opset32, inputs=relu,GemmTransposePattern--p_layers_2_weight::T10,layers.2.bias
[PatternOptimization.enumerate_matches] start MatMulReshape2Of3Pattern with main_opset=18 and min_opset=1
[PatternOptimization.enumerate_matches] start MulMulMatMulPattern with main_opset=18 and min_opset=1
[PatternOptimization.enumerate_matches] start ShapeBasedReshapeIsSqueezePattern with main_opset=18 and min_opset=1
[ShapeBasedReshapeIsSqueezePattern.match] NONE - line: 977:experimental_experiment.xoptim.patterns.onnx_reshape, op_type=Reshape, name=TransposeEqualReshapePattern--B--GemmTransposePattern--MatMulAddPattern--Opset3, inputs=p_layers_2_weight::T10,init7_s2_1_-1
[PatternOptimization.enumerate_matches] start ShapeBasedStaticExpandPattern with main_opset=18 and min_opset=1
[PatternOptimization.enumerate_matches] start ShapeBasedConcatExpandPattern with main_opset=18 and min_opset=1
[PatternOptimization.enumerate_matches] start ShapeBasedEditDistanceReshapePattern with main_opset=18 and min_opset=1
[ShapeBasedEditDistanceReshapePattern.match] NONE - line: 886:experimental_experiment.xoptim.patterns.onnx_reshape, op_type=Reshape, name=TransposeEqualReshapePattern--B--GemmTransposePattern--MatMulAddPattern--Opset3, inputs=p_layers_2_weight::T10,init7_s2_1_-1
[PatternOptimization.enumerate_matches] start ShapeBasedIdentityPattern with main_opset=18 and min_opset=1
[ShapeBasedIdentityPattern.match] NONE - line: 401:experimental_experiment.xoptim.patterns.onnx_any, op_type=Transpose, name=GemmTransposePattern--MatMulAddPattern--Opset, inputs=p_layers_0_weight::T10
[PatternOptimization.enumerate_matches] start ShapeBasedExpandBroadcastPattern with main_opset=18 and min_opset=1
[PatternOptimization.enumerate_matches] start ShapeBasedExpandBroadcastMatMulPattern with main_opset=18 and min_opset=1
[PatternOptimization.enumerate_matches] start ShapeBasedExpandCastWhereSwapPattern with main_opset=18 and min_opset=1
[PatternOptimization.enumerate_matches] start ShapeBasedExpandSwapPattern with main_opset=18 and min_opset=1
[PatternOptimization.enumerate_matches] start ShapeBasedMatMulToMulPattern with main_opset=18 and min_opset=1
[PatternOptimization.enumerate_matches] start ShapeBasedSameChildrenPattern with main_opset=18 and min_opset=1
[PatternOptimization.enumerate_matches] start ShapeBasedShapeShapeAddPattern with main_opset=18 and min_opset=1
[PatternOptimization.enumerate_matches] start ReshapeReshapePattern with main_opset=18 and min_opset=1
[ReshapeReshapePattern.match] NONE - line: 166:experimental_experiment.xoptim.patterns.onnx_reshape, op_type=Reshape, name=TransposeEqualReshapePattern--B--GemmTransposePattern--MatMulAddPattern--Opset3, inputs=p_layers_2_weight::T10,init7_s2_1_-1
[PatternOptimization.enumerate_matches] start RotaryEmbeddingPattern with main_opset=18 and min_opset=1
[PatternOptimization.enumerate_matches] start SameChildrenPattern with main_opset=18 and min_opset=1
[PatternOptimization.enumerate_matches] start SequenceConstructAtPattern with main_opset=18 and min_opset=1
[PatternOptimization.enumerate_matches] start SliceSlicePattern with main_opset=18 and min_opset=1
[PatternOptimization.enumerate_matches] start SlicesSplitPattern with main_opset=18 and min_opset=1
[PatternOptimization.enumerate_matches] start SoftmaxCrossEntropyLossCastPattern with main_opset=18 and min_opset=14
[PatternOptimization.enumerate_matches] start SplitConcatPattern with main_opset=18 and min_opset=1
[PatternOptimization.enumerate_matches] start SqueezeAddPattern with main_opset=18 and min_opset=1
[PatternOptimization.enumerate_matches] start SqueezeUnsqueezePattern with main_opset=18 and min_opset=1
[PatternOptimization.enumerate_matches] start StaticConcatReshapePattern with main_opset=18 and min_opset=1
[StaticConcatReshapePattern.match] NONE - line: 664:experimental_experiment.xoptim.patterns.onnx_reshape, op_type=Reshape, name=TransposeEqualReshapePattern--B--GemmTransposePattern--MatMulAddPattern--Opset3, inputs=p_layers_2_weight::T10,init7_s2_1_-1
[PatternOptimization.enumerate_matches] start Sub1MulPattern with main_opset=18 and min_opset=1
[PatternOptimization.enumerate_matches] start SwitchOrderBinaryPattern with main_opset=18 and min_opset=1
[PatternOptimization.enumerate_matches] start SwitchReshapeActivationPattern with main_opset=18 and min_opset=1
[SwitchReshapeActivationPattern.match] NONE - line: 1178:experimental_experiment.xoptim.patterns.onnx_matmul, op_type=Relu, name=relu, inputs=linear
[PatternOptimization.enumerate_matches] start TransposeEqualReshapePattern with main_opset=18 and min_opset=1
[TransposeEqualReshapePattern.match] NONE - line: 342:experimental_experiment.xoptim.patterns.onnx_transpose, op_type=Transpose, name=GemmTransposePattern--MatMulAddPattern--Opset, inputs=p_layers_0_weight::T10
[PatternOptimization.enumerate_matches] start TransposeMatMulPattern with main_opset=18 and min_opset=1
[TransposeMatMulPattern.match] NONE - line: 928:experimental_experiment.xoptim.patterns.onnx_matmul, op_type=Gemm, name=GemmTransposePattern--MatMulAddPattern--Opset2, inputs=x,GemmTransposePattern--p_layers_0_weight::T10,layers.0.bias
[TransposeMatMulPattern.match] NONE - line: 890:experimental_experiment.xoptim.patterns.onnx_matmul, op_type=Gemm, name=GemmTransposePattern--MatMulAddPattern--Opset32, inputs=relu,GemmTransposePattern--p_layers_2_weight::T10,layers.2.bias
[PatternOptimization.enumerate_matches] start TransposeReshapeMatMulPattern with main_opset=18 and min_opset=1
[PatternOptimization.enumerate_matches] start TransposeReshapeTransposePattern with main_opset=18 and min_opset=1
[TransposeReshapeTransposePattern.match] NONE - line: 140:experimental_experiment.xoptim.patterns.onnx_transpose, op_type=Transpose, name=GemmTransposePattern--MatMulAddPattern--Opset, inputs=p_layers_0_weight::T10
[PatternOptimization.enumerate_matches] start TransposeTransposePattern with main_opset=18 and min_opset=1
[TransposeTransposePattern.match] NONE - line: 51:experimental_experiment.xoptim.patterns.onnx_transpose, op_type=Transpose, name=GemmTransposePattern--MatMulAddPattern--Opset, inputs=p_layers_0_weight::T10
[PatternOptimization.enumerate_matches] start UnsqueezeEqualPattern with main_opset=18 and min_opset=1
[PatternOptimization.enumerate_matches] start UnsqueezeUnsqueezePattern with main_opset=18 and min_opset=1
[PatternOptimization.enumerate_matches] start RotaryConcatPartPattern with main_opset=18 and min_opset=1
[PatternOptimization.enumerate_matches] start FunctionCausalMaskPattern with main_opset=18 and min_opset=1
[PatternOptimization.enumerate_matches] start FunctionCausalMaskMulAddPattern with main_opset=18 and min_opset=1
[PatternOptimization.enumerate_matches] start FunctionCosSinCachePattern with main_opset=18 and min_opset=1
[PatternOptimization.enumerate_matches] start FunctionHalfRotaryEmbeddingPattern with main_opset=18 and min_opset=1
[PatternOptimization.enumerate_matches] start RMSNormalizationPattern with main_opset=18 and min_opset=1
[GraphBuilderPatternOptimization-VNK.optimize] done all: -0 +0 nodes
[GraphBuilderPatternOptimization-VNK.optimize] stops current_priority_index=3, priorities=[0, 1, 3]
[GraphBuilderPatternOptimization-VNK.optimize] done after 6 iterations with 5 nodes in 0.013
STAT apply_GemmTransposePattern +4 -2 #it=1 maxmatch=1 i=2 - time=0.0006508419992314884
STAT apply_MatMulAddPattern +2 -4 #it=1 maxmatch=1 i=2 - time=0.00040602700028102845
STAT apply_TransposeEqualReshapePattern +1 -1 #it=1 maxmatch=0 i=1 - time=0.0006819099999120226
STAT build_graph_for_pattern +0 -0 #it=6 maxmatch=0 i=0 - time=0.0002969959987240145
STAT check_pattern_00 +0 -0 #it=1 maxmatch=0 i=0 - time=2.33369992201915e-05
STAT check_pattern_A0 +0 -0 #it=3 maxmatch=0 i=0 - time=0.00020871200104011223
STAT check_pattern_B0 +0 -0 #it=6 maxmatch=0 i=0 - time=0.0002637689995026449
STAT match_BatchNormalizationPattern +0 -0 #it=6 maxmatch=0 i=0 - time=7.434100007230882e-05
STAT match_BatchNormalizationTrainingPattern +0 -0 #it=6 maxmatch=0 i=0 - time=4.760699812322855e-05
STAT match_CastCastBinaryPattern +0 -0 #it=5 maxmatch=0 i=0 - time=0.0001230689995281864
STAT match_CastLayerNormalizationCastPattern +0 -0 #it=5 maxmatch=0 i=0 - time=3.8672000300721265e-05
STAT match_CastOpCastPattern +0 -0 #it=5 maxmatch=0 i=0 - time=9.698199755803216e-05
STAT match_CastPattern +0 -0 #it=6 maxmatch=0 i=0 - time=4.120499943383038e-05
STAT match_ClipClipPattern +0 -0 #it=5 maxmatch=0 i=0 - time=3.61340007657418e-05
STAT match_ComputationCastOpCastPattern +0 -0 #it=5 maxmatch=0 i=0 - time=6.632899930991698e-05
STAT match_ConcatEmptyPattern +0 -0 #it=5 maxmatch=0 i=0 - time=3.260700214013923e-05
STAT match_ConcatGatherPattern +0 -0 #it=6 maxmatch=0 i=0 - time=4.299100146454293e-05
STAT match_ConcatReshapePattern +0 -0 #it=6 maxmatch=0 i=0 - time=5.633900218526833e-05
STAT match_ConcatTwiceUnaryPattern +0 -0 #it=5 maxmatch=0 i=0 - time=3.7579002309939824e-05
STAT match_ConvBiasNullPattern +0 -0 #it=6 maxmatch=0 i=0 - time=3.6948003980796784e-05
STAT match_DropoutPattern +0 -0 #it=5 maxmatch=0 i=0 - time=3.2151998311746866e-05
STAT match_ExpandBroadcastPattern +0 -0 #it=5 maxmatch=0 i=0 - time=3.119100074400194e-05
STAT match_ExpandPattern +0 -0 #it=6 maxmatch=0 i=0 - time=3.537799602781888e-05
STAT match_ExpandSwapPattern +0 -0 #it=5 maxmatch=0 i=0 - time=2.934900112450123e-05
STAT match_FunctionCausalMaskMulAddPattern +0 -0 #it=5 maxmatch=2 i=0 - time=8.815600085654296e-05
STAT match_FunctionCausalMaskPattern +0 -0 #it=5 maxmatch=2 i=0 - time=3.155100057483651e-05
STAT match_FunctionCosSinCachePattern +0 -0 #it=5 maxmatch=2 i=0 - time=0.0001067709999915678
STAT match_FunctionHalfRotaryEmbeddingPattern +0 -0 #it=5 maxmatch=2 i=0 - time=3.8892001612111926e-05
STAT match_GeluPattern +0 -0 #it=6 maxmatch=0 i=0 - time=1.3644998034578748e-05
STAT match_GemmTransposePattern +0 -0 #it=5 maxmatch=2 i=2 - time=0.0001653399995120708
STAT match_IdentityPattern +0 -0 #it=6 maxmatch=0 i=0 - time=0.0005031620021327399
STAT match_LayerNormalizationPattern +0 -0 #it=5 maxmatch=0 i=0 - time=3.892699896823615e-05
STAT match_LayerNormalizationScalePattern +0 -0 #it=5 maxmatch=0 i=0 - time=3.306199869257398e-05
STAT match_LeakyReluPattern +0 -0 #it=6 maxmatch=0 i=0 - time=0.0012129089955124073
STAT match_MatMulAddPattern +0 -0 #it=4 maxmatch=2 i=2 - time=0.0002485609984432813
STAT match_MatMulReshape2Of3Pattern +0 -0 #it=5 maxmatch=2 i=0 - time=0.00010799999836308416
STAT match_MulMulMatMulPattern +0 -0 #it=5 maxmatch=2 i=0 - time=6.843599840067327e-05
STAT match_MulMulMulScalarPattern +0 -0 #it=5 maxmatch=0 i=0 - time=6.651900002907496e-05
STAT match_RMSNormalizationPattern +0 -0 #it=5 maxmatch=2 i=0 - time=3.467900023679249e-05
STAT match_ReduceReshapePattern +0 -0 #it=5 maxmatch=0 i=0 - time=4.031100070278626e-05
STAT match_ReduceSumNormalizePattern +0 -0 #it=5 maxmatch=0 i=0 - time=3.764500070246868e-05
STAT match_Reshape2Of3Pattern +0 -0 #it=5 maxmatch=0 i=0 - time=0.0002217780020146165
STAT match_ReshapeMatMulReshapePattern +0 -0 #it=5 maxmatch=0 i=0 - time=7.608400119352154e-05
STAT match_ReshapePattern +0 -0 #it=6 maxmatch=0 i=0 - time=7.30459996702848e-05
STAT match_ReshapeReshapeBinaryPattern +0 -0 #it=5 maxmatch=0 i=0 - time=7.328399988182355e-05
STAT match_ReshapeReshapePattern +0 -0 #it=6 maxmatch=2 i=0 - time=4.748299943457823e-05
STAT match_RotaryConcatPartPattern +0 -0 #it=5 maxmatch=2 i=0 - time=4.104999970877543e-05
STAT match_RotaryEmbeddingPattern +0 -0 #it=5 maxmatch=2 i=0 - time=3.122300040558912e-05
STAT match_SameChildrenPattern +0 -0 #it=6 maxmatch=2 i=0 - time=6.179799856909085e-05
STAT match_SequenceConstructAtPattern +0 -0 #it=5 maxmatch=2 i=0 - time=3.66729982488323e-05
STAT match_ShapeBasedConcatExpandPattern +0 -0 #it=5 maxmatch=2 i=0 - time=3.434999962337315e-05
STAT match_ShapeBasedEditDistanceReshapePattern +0 -0 #it=6 maxmatch=2 i=0 - time=0.00011381599688320421
STAT match_ShapeBasedExpandBroadcastMatMulPattern +0 -0 #it=5 maxmatch=2 i=0 - time=7.776400161674246e-05
STAT match_ShapeBasedExpandBroadcastPattern +0 -0 #it=5 maxmatch=2 i=0 - time=8.976800017990172e-05
STAT match_ShapeBasedExpandCastWhereSwapPattern +0 -0 #it=5 maxmatch=2 i=0 - time=3.33359985233983e-05
STAT match_ShapeBasedExpandSwapPattern +0 -0 #it=5 maxmatch=2 i=0 - time=7.048799852782395e-05
STAT match_ShapeBasedIdentityPattern +0 -0 #it=6 maxmatch=2 i=0 - time=6.35130018054042e-05
STAT match_ShapeBasedMatMulToMulPattern +0 -0 #it=5 maxmatch=2 i=0 - time=7.140300112951081e-05
STAT match_ShapeBasedReshapeIsSqueezePattern +0 -0 #it=6 maxmatch=2 i=0 - time=6.6841999796452e-05
STAT match_ShapeBasedSameChildrenPattern +0 -0 #it=6 maxmatch=2 i=0 - time=4.0420003642793745e-05
STAT match_ShapeBasedShapeShapeAddPattern +0 -0 #it=6 maxmatch=2 i=0 - time=8.378799975616857e-05
STAT match_ShapeBasedStaticExpandPattern +0 -0 #it=6 maxmatch=2 i=0 - time=3.700900015246589e-05
STAT match_SliceSlicePattern +0 -0 #it=5 maxmatch=2 i=0 - time=3.178899896738585e-05
STAT match_SlicesSplitPattern +0 -0 #it=5 maxmatch=2 i=0 - time=3.501399805827532e-05
STAT match_SoftmaxCrossEntropyLossCastPattern +0 -0 #it=6 maxmatch=2 i=0 - time=0.001981257997613284
STAT match_SplitConcatPattern +0 -0 #it=5 maxmatch=2 i=0 - time=3.38849986292189e-05
STAT match_SqueezeAddPattern +0 -0 #it=6 maxmatch=2 i=0 - time=0.00011344200174789876
STAT match_SqueezeUnsqueezePattern +0 -0 #it=6 maxmatch=2 i=0 - time=6.647400186921004e-05
STAT match_StaticConcatReshapePattern +0 -0 #it=6 maxmatch=2 i=0 - time=4.6860999646014534e-05
STAT match_Sub1MulPattern +0 -0 #it=5 maxmatch=2 i=0 - time=3.114200080744922e-05
STAT match_SwitchOrderBinaryPattern +0 -0 #it=5 maxmatch=2 i=0 - time=5.62149998586392e-05
STAT match_SwitchReshapeActivationPattern +0 -0 #it=5 maxmatch=2 i=0 - time=9.913300164043903e-05
STAT match_TransposeEqualReshapePattern +0 -0 #it=5 maxmatch=2 i=1 - time=0.00012121800136810634
STAT match_TransposeMatMulPattern +0 -0 #it=5 maxmatch=2 i=0 - time=0.0002322609998373082
STAT match_TransposeReshapeMatMulPattern +0 -0 #it=5 maxmatch=2 i=0 - time=7.186100083345082e-05
STAT match_TransposeReshapeTransposePattern +0 -0 #it=6 maxmatch=2 i=0 - time=6.346800182654988e-05
STAT match_TransposeTransposePattern +0 -0 #it=6 maxmatch=2 i=0 - time=5.7784001910476945e-05
STAT match_UnsqueezeEqualPattern +0 -0 #it=5 maxmatch=2 i=0 - time=3.106499934801832e-05
STAT match_UnsqueezeUnsqueezePattern +0 -0 #it=6 maxmatch=2 i=0 - time=3.584799742384348e-05
STAT remove_identity_nodes +0 -0 #it=3 maxmatch=0 i=0 - time=0.000595396999415243
STAT remove_unused +0 -0 #it=6 maxmatch=0 i=0 - time=0.0009860719965217868
--MODEL: 5 nodes, 1 inputs, 1 outputs, 5 initializers--
INPUT: 1 x 1t
INPUT-SEQ: 1 x Falset
OUTPUT: 1 x 1t
OUTPUT-SEQ: 1 x Falset
INIT: 4 x 1t
INIT: 1 x 7t
NODE: 2 x Gemm
NODE: 1 x Relu
NODE: 1 x Reshape
NODE: 1 x Transpose
--MODEL: 5 nodes, 1 inputs, 1 outputs, 5 initializers--DETAILED--
INPUT: 1 x 1t[3x10]
OUTPUT: 1 x 1t[3x1]
INIT: 1 x 1t[10x32]
INIT: 1 x 1t[1]
INIT: 1 x 1t[32]
INIT: 1 x 1t[32x1]
INIT: 1 x 7t[2]
NODE: 1 x Gemm -SIG- 1t[3x10], 1t[32x10], 1t[32]
NODE: 1 x Gemm -SIG- 1t[3x32], 1t[1x32], 1t[1]
NODE: 1 x Relu -SIG- 1t[3x32]
NODE: 1 x Reshape -SIG- 1t[32x1], 7t[2]
NODE: 1 x Transpose -SIG- 1t[10x32]-perm=1;0
[GraphBuilder-VNK.remove_identity_nodes] -- starts with 5
[GraphBuilder-VNK.remove_identity_nodes] found 0 replacements
[GraphBuilder-VNK.remove_identity_nodes] kept 5 nodes
[GraphBuilder-VNK.remove_identity_nodes] ends with 5 nodes in 3.849400127364788e-05 seconds
[GraphBuilder-VNK.constant_folding] -- starts with 7 constants and 5 nodes.
[GraphBuilder-VNK.constant_folding] cst:: 1 :: layers.0.bias
[GraphBuilder-VNK.constant_folding] cst:: 1 :: GemmTransposePattern--p_layers_2_weight::T10
[GraphBuilder-VNK.constant_folding] cst:: . :: _onx_matmul_x
[GraphBuilder-VNK.constant_folding] cst:: . :: x
[GraphBuilder-VNK.constant_folding] cst:: . :: relu
[GraphBuilder-VNK.constant_folding] cst:: . :: output_0
[GraphBuilder-VNK.constant_folding] cst:: 1 :: layers.2.bias
[GraphBuilder-VNK.constant_folding] cst:: 1 :: init7_s2_1_-1
[GraphBuilder-VNK.constant_folding] cst:: . :: linear
[GraphBuilder-VNK.constant_folding] cst:: . :: _onx_matmul_relu
[GraphBuilder-VNK.constant_folding] cst:: 1 :: GemmTransposePattern--p_layers_0_weight::T10
[GraphBuilder-VNK.constant_folding] cst:: 1 :: p_layers_2_weight::T10
[GraphBuilder-VNK.constant_folding] cst:: 1 :: p_layers_0_weight::T10
[GraphBuilder-VNK.constant_folding] initializer: p_layers_0_weight::T10
[GraphBuilder-VNK.constant_folding] initializer: p_layers_2_weight::T10
[GraphBuilder-VNK.constant_folding] initializer: layers.0.bias
[GraphBuilder-VNK.constant_folding] initializer: layers.2.bias
[GraphBuilder-VNK.constant_folding] from: Transpose(GemmTransposePattern--p_layers_0_weight::T10)
[GraphBuilder-VNK.set_type] GemmTransposePattern--p_layers_0_weight::T10:1
[GraphBuilder-VNK.make_initializer] GemmTransposePattern--p_layers_0_weight::T10[1:(32, 10)]
[GraphBuilder-VNK.update_node_constant] new constant 'GemmTransposePattern--p_layers_0_weight::T10', node=None
[GraphBuilder-VNK.constant_folding] fold_constant:Transpose:GemmTransposePattern--p_layers_0_weight::T10[torch.float32:torch.Size([32, 10])]:from:p_layers_0_weight::T10
[GraphBuilder-VNK.constant_folding] from: Reshape(GemmTransposePattern--p_layers_2_weight::T10)
[GraphBuilder-VNK.set_type] GemmTransposePattern--p_layers_2_weight::T10:1
[GraphBuilder-VNK.make_initializer] GemmTransposePattern--p_layers_2_weight::T10[1:(1, 32)]
[GraphBuilder-VNK.update_node_constant] new constant 'GemmTransposePattern--p_layers_2_weight::T10', node=None
[GraphBuilder-VNK.constant_folding] fold_constant:Reshape:GemmTransposePattern--p_layers_2_weight::T10[float32:(1, 32)]:from:init7_s2_1_-1,p_layers_2_weight::T10
[GraphBuilder-VNK.constant_folding] initializer: init7_s2_1_-1
[GraphBuilder-VNK.update_node_constant] new constant 'GemmTransposePattern--p_layers_0_weight::T10', node=None
[GraphBuilder-VNK.update_node_constant] new constant 'GemmTransposePattern--p_layers_2_weight::T10', node=None
[GraphBuilder-VNK.constant_folding] ends with 7 constants and 3 nodes in 0.0006285919989750255 seconds
[GraphBuilder-VNK.remove_unused] remove_initializer 1:0/7:p_layers_0_weight::T10
[GraphBuilder-VNK.remove_unused] remove_initializer 2:1/7:p_layers_2_weight::T10
[GraphBuilder-VNK.remove_unused] remove_initializer 3:4/7:init7_s2_1_-1:int64[(2,)]
[GraphBuilder-VNK.optimize] done with 3 nodes in 0.017
STAT apply_GemmTransposePattern +4 -2 #it=1 maxmatch=1 i=2 - time=0.0006508419992314884
STAT apply_MatMulAddPattern +2 -4 #it=1 maxmatch=1 i=2 - time=0.00040602700028102845
STAT apply_TransposeEqualReshapePattern +1 -1 #it=1 maxmatch=0 i=1 - time=0.0006819099999120226
STAT apply_constant_folding +0 -2 #it=2 maxmatch=0 i=0 - time=0.0008475449994875817
STAT apply_constant_folding__Reshape +0 -0 #it=1 maxmatch=0 i=0 - time=0.0
STAT apply_constant_folding__Transpose +0 -0 #it=1 maxmatch=0 i=0 - time=0.0
STAT apply_constant_folding_new_inits +0 -0 #it=2 maxmatch=0 i=0 - time=0.0
STAT build_graph_for_pattern +0 -0 #it=6 maxmatch=0 i=0 - time=0.0002969959987240145
STAT check_A-dynamic_dimension_naming +0 -0 #it=0 maxmatch=0 i=0 - time=2.2875001377542503e-05
STAT check_A-opt-sub +0 -0 #it=0 maxmatch=0 i=0 - time=1.86079996638e-05
STAT check_B-remove-identity +0 -0 #it=0 maxmatch=0 i=0 - time=2.200900053139776e-05
STAT check_C-remove-unused +0 -0 #it=0 maxmatch=0 i=0 - time=1.678500120760873e-05
STAT check_Da-constant-folding +0 -0 #it=0 maxmatch=0 i=0 - time=1.5906000044196844e-05
STAT check_Db-constant-folding +0 -0 #it=0 maxmatch=0 i=0 - time=2.3223999960464425e-05
STAT check_Ea-remove-unused +0 -0 #it=0 maxmatch=0 i=0 - time=1.5442999938386492e-05
STAT check_Eb-remove-unused +0 -0 #it=0 maxmatch=0 i=0 - time=1.9225999494665302e-05
STAT check_F-patterns +0 -0 #it=0 maxmatch=0 i=0 - time=3.019500036316458e-05
STAT check_G-remove-identity +0 -0 #it=0 maxmatch=0 i=0 - time=2.3991000489331782e-05
STAT check_G-remove-unused +0 -0 #it=0 maxmatch=0 i=0 - time=2.622499960125424e-05
STAT check_H-remove-duplicated-initializer +0 -0 #it=0 maxmatch=0 i=0 - time=1.5530999007751234e-05
STAT check_H-remove-unused +0 -0 #it=0 maxmatch=0 i=0 - time=1.650000012887176e-05
STAT check_pattern_00 +0 -0 #it=1 maxmatch=0 i=0 - time=2.33369992201915e-05
STAT check_pattern_A0 +0 -0 #it=3 maxmatch=0 i=0 - time=0.00020871200104011223
STAT check_pattern_B0 +0 -0 #it=6 maxmatch=0 i=0 - time=0.0002637689995026449
STAT dynamic_dimension_naming +0 -0 #it=0 maxmatch=0 i=0 - time=2.8393000320647843e-05
STAT match_BatchNormalizationPattern +0 -0 #it=6 maxmatch=0 i=0 - time=7.434100007230882e-05
STAT match_BatchNormalizationTrainingPattern +0 -0 #it=6 maxmatch=0 i=0 - time=4.760699812322855e-05
STAT match_CastCastBinaryPattern +0 -0 #it=5 maxmatch=0 i=0 - time=0.0001230689995281864
STAT match_CastLayerNormalizationCastPattern +0 -0 #it=5 maxmatch=0 i=0 - time=3.8672000300721265e-05
STAT match_CastOpCastPattern +0 -0 #it=5 maxmatch=0 i=0 - time=9.698199755803216e-05
STAT match_CastPattern +0 -0 #it=6 maxmatch=0 i=0 - time=4.120499943383038e-05
STAT match_ClipClipPattern +0 -0 #it=5 maxmatch=0 i=0 - time=3.61340007657418e-05
STAT match_ComputationCastOpCastPattern +0 -0 #it=5 maxmatch=0 i=0 - time=6.632899930991698e-05
STAT match_ConcatEmptyPattern +0 -0 #it=5 maxmatch=0 i=0 - time=3.260700214013923e-05
STAT match_ConcatGatherPattern +0 -0 #it=6 maxmatch=0 i=0 - time=4.299100146454293e-05
STAT match_ConcatReshapePattern +0 -0 #it=6 maxmatch=0 i=0 - time=5.633900218526833e-05
STAT match_ConcatTwiceUnaryPattern +0 -0 #it=5 maxmatch=0 i=0 - time=3.7579002309939824e-05
STAT match_ConvBiasNullPattern +0 -0 #it=6 maxmatch=0 i=0 - time=3.6948003980796784e-05
STAT match_DropoutPattern +0 -0 #it=5 maxmatch=0 i=0 - time=3.2151998311746866e-05
STAT match_ExpandBroadcastPattern +0 -0 #it=5 maxmatch=0 i=0 - time=3.119100074400194e-05
STAT match_ExpandPattern +0 -0 #it=6 maxmatch=0 i=0 - time=3.537799602781888e-05
STAT match_ExpandSwapPattern +0 -0 #it=5 maxmatch=0 i=0 - time=2.934900112450123e-05
STAT match_FunctionCausalMaskMulAddPattern +0 -0 #it=5 maxmatch=2 i=0 - time=8.815600085654296e-05
STAT match_FunctionCausalMaskPattern +0 -0 #it=5 maxmatch=2 i=0 - time=3.155100057483651e-05
STAT match_FunctionCosSinCachePattern +0 -0 #it=5 maxmatch=2 i=0 - time=0.0001067709999915678
STAT match_FunctionHalfRotaryEmbeddingPattern +0 -0 #it=5 maxmatch=2 i=0 - time=3.8892001612111926e-05
STAT match_GeluPattern +0 -0 #it=6 maxmatch=0 i=0 - time=1.3644998034578748e-05
STAT match_GemmTransposePattern +0 -0 #it=5 maxmatch=2 i=2 - time=0.0001653399995120708
STAT match_IdentityPattern +0 -0 #it=6 maxmatch=0 i=0 - time=0.0005031620021327399
STAT match_LayerNormalizationPattern +0 -0 #it=5 maxmatch=0 i=0 - time=3.892699896823615e-05
STAT match_LayerNormalizationScalePattern +0 -0 #it=5 maxmatch=0 i=0 - time=3.306199869257398e-05
STAT match_LeakyReluPattern +0 -0 #it=6 maxmatch=0 i=0 - time=0.0012129089955124073
STAT match_MatMulAddPattern +0 -0 #it=4 maxmatch=2 i=2 - time=0.0002485609984432813
STAT match_MatMulReshape2Of3Pattern +0 -0 #it=5 maxmatch=2 i=0 - time=0.00010799999836308416
STAT match_MulMulMatMulPattern +0 -0 #it=5 maxmatch=2 i=0 - time=6.843599840067327e-05
STAT match_MulMulMulScalarPattern +0 -0 #it=5 maxmatch=0 i=0 - time=6.651900002907496e-05
STAT match_RMSNormalizationPattern +0 -0 #it=5 maxmatch=2 i=0 - time=3.467900023679249e-05
STAT match_ReduceReshapePattern +0 -0 #it=5 maxmatch=0 i=0 - time=4.031100070278626e-05
STAT match_ReduceSumNormalizePattern +0 -0 #it=5 maxmatch=0 i=0 - time=3.764500070246868e-05
STAT match_Reshape2Of3Pattern +0 -0 #it=5 maxmatch=0 i=0 - time=0.0002217780020146165
STAT match_ReshapeMatMulReshapePattern +0 -0 #it=5 maxmatch=0 i=0 - time=7.608400119352154e-05
STAT match_ReshapePattern +0 -0 #it=6 maxmatch=0 i=0 - time=7.30459996702848e-05
STAT match_ReshapeReshapeBinaryPattern +0 -0 #it=5 maxmatch=0 i=0 - time=7.328399988182355e-05
STAT match_ReshapeReshapePattern +0 -0 #it=6 maxmatch=2 i=0 - time=4.748299943457823e-05
STAT match_RotaryConcatPartPattern +0 -0 #it=5 maxmatch=2 i=0 - time=4.104999970877543e-05
STAT match_RotaryEmbeddingPattern +0 -0 #it=5 maxmatch=2 i=0 - time=3.122300040558912e-05
STAT match_SameChildrenPattern +0 -0 #it=6 maxmatch=2 i=0 - time=6.179799856909085e-05
STAT match_SequenceConstructAtPattern +0 -0 #it=5 maxmatch=2 i=0 - time=3.66729982488323e-05
STAT match_ShapeBasedConcatExpandPattern +0 -0 #it=5 maxmatch=2 i=0 - time=3.434999962337315e-05
STAT match_ShapeBasedEditDistanceReshapePattern +0 -0 #it=6 maxmatch=2 i=0 - time=0.00011381599688320421
STAT match_ShapeBasedExpandBroadcastMatMulPattern +0 -0 #it=5 maxmatch=2 i=0 - time=7.776400161674246e-05
STAT match_ShapeBasedExpandBroadcastPattern +0 -0 #it=5 maxmatch=2 i=0 - time=8.976800017990172e-05
STAT match_ShapeBasedExpandCastWhereSwapPattern +0 -0 #it=5 maxmatch=2 i=0 - time=3.33359985233983e-05
STAT match_ShapeBasedExpandSwapPattern +0 -0 #it=5 maxmatch=2 i=0 - time=7.048799852782395e-05
STAT match_ShapeBasedIdentityPattern +0 -0 #it=6 maxmatch=2 i=0 - time=6.35130018054042e-05
STAT match_ShapeBasedMatMulToMulPattern +0 -0 #it=5 maxmatch=2 i=0 - time=7.140300112951081e-05
STAT match_ShapeBasedReshapeIsSqueezePattern +0 -0 #it=6 maxmatch=2 i=0 - time=6.6841999796452e-05
STAT match_ShapeBasedSameChildrenPattern +0 -0 #it=6 maxmatch=2 i=0 - time=4.0420003642793745e-05
STAT match_ShapeBasedShapeShapeAddPattern +0 -0 #it=6 maxmatch=2 i=0 - time=8.378799975616857e-05
STAT match_ShapeBasedStaticExpandPattern +0 -0 #it=6 maxmatch=2 i=0 - time=3.700900015246589e-05
STAT match_SliceSlicePattern +0 -0 #it=5 maxmatch=2 i=0 - time=3.178899896738585e-05
STAT match_SlicesSplitPattern +0 -0 #it=5 maxmatch=2 i=0 - time=3.501399805827532e-05
STAT match_SoftmaxCrossEntropyLossCastPattern +0 -0 #it=6 maxmatch=2 i=0 - time=0.001981257997613284
STAT match_SplitConcatPattern +0 -0 #it=5 maxmatch=2 i=0 - time=3.38849986292189e-05
STAT match_SqueezeAddPattern +0 -0 #it=6 maxmatch=2 i=0 - time=0.00011344200174789876
STAT match_SqueezeUnsqueezePattern +0 -0 #it=6 maxmatch=2 i=0 - time=6.647400186921004e-05
STAT match_StaticConcatReshapePattern +0 -0 #it=6 maxmatch=2 i=0 - time=4.6860999646014534e-05
STAT match_Sub1MulPattern +0 -0 #it=5 maxmatch=2 i=0 - time=3.114200080744922e-05
STAT match_SwitchOrderBinaryPattern +0 -0 #it=5 maxmatch=2 i=0 - time=5.62149998586392e-05
STAT match_SwitchReshapeActivationPattern +0 -0 #it=5 maxmatch=2 i=0 - time=9.913300164043903e-05
STAT match_TransposeEqualReshapePattern +0 -0 #it=5 maxmatch=2 i=1 - time=0.00012121800136810634
STAT match_TransposeMatMulPattern +0 -0 #it=5 maxmatch=2 i=0 - time=0.0002322609998373082
STAT match_TransposeReshapeMatMulPattern +0 -0 #it=5 maxmatch=2 i=0 - time=7.186100083345082e-05
STAT match_TransposeReshapeTransposePattern +0 -0 #it=6 maxmatch=2 i=0 - time=6.346800182654988e-05
STAT match_TransposeTransposePattern +0 -0 #it=6 maxmatch=2 i=0 - time=5.7784001910476945e-05
STAT match_UnsqueezeEqualPattern +0 -0 #it=5 maxmatch=2 i=0 - time=3.106499934801832e-05
STAT match_UnsqueezeUnsqueezePattern +0 -0 #it=6 maxmatch=2 i=0 - time=3.584799742384348e-05
STAT pattern_optimization +0 -0 #it=0 maxmatch=0 i=0 - time=0.014893425999616738
STAT remove_duplicated_initializer +0 -0 #it=0 maxmatch=0 i=0 - time=4.3788999391836114e-05
STAT remove_identity_nodes +0 -0 #it=3 maxmatch=0 i=0 - time=0.0009794929992494872
STAT remove_unused +0 -0 #it=6 maxmatch=0 i=0 - time=0.0015925719944789307
--MODEL: 3 nodes, 1 inputs, 1 outputs, 4 initializers--
INPUT: 1 x 1t
INPUT-SEQ: 1 x Falset
OUTPUT: 1 x 1t
OUTPUT-SEQ: 1 x Falset
INIT: 4 x 1t
NODE: 2 x Gemm
NODE: 1 x Relu
--MODEL: 3 nodes, 1 inputs, 1 outputs, 4 initializers--DETAILED--
INPUT: 1 x 1t[3x10]
OUTPUT: 1 x 1t[3x1]
INIT: 1 x 1t[1]
INIT: 1 x 1t[1x32]
INIT: 1 x 1t[32]
INIT: 1 x 1t[32x10]
NODE: 1 x Gemm -SIG- 1t[3x10], 1t[32x10], 1t[32]
NODE: 1 x Gemm -SIG- 1t[3x32], 1t[1x32], 1t[1]
NODE: 1 x Relu -SIG- 1t[3x32]
[GraphBuilder-VNK.to_onnx] make_model 4 inits 0 params
[GraphBuilder-VNK.time_evaluation_constants_] 0
[GraphBuilder-VNK._build_initializers] start with 4 initializers, large_model=False, external_threshold=1024
[GraphBuilder-VNK._build_initializers] switch low/high order
[GraphBuilder-VNK._build_initializers] TensorProto-layers.0.bias:1[(32,)]
[GraphBuilder-VNK._build_initializers] TensorProto-layers.2.bias:1[(1,)]
[GraphBuilder-VNK._build_initializers] <Tensor>-GemmTransposePattern--p_layers_0_weight::T10:torch.float32[torch.Size([32, 10])]
[proto_from_array] 1[torch.Size([32, 10])]
[GraphBuilder-VNK._build_initializers] <ndarray>-GemmTransposePattern--p_layers_2_weight::T10:float32[(1, 32)]
[GraphBuilder-VNK._build_initializers] done in 3.3670003176666796e-06s with 4 initializers, 0 large initializers
[GraphBuilder-VNK._add_shape_information] dynamic shapes replacements={}
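The `STAT` lines in the log above follow a regular layout, so they can be collected for later analysis. The following is a minimal sketch using only the standard library. The regular expression and the field names (`added`, `removed`, ...) are assumptions inferred from the log and from the statistics tables in this page, not a format documented by the library.

```python
import re

# Layout inferred from the STAT lines of the log; this is an assumption,
# the library does not document it as a stable format.
STAT_RE = re.compile(
    r"^STAT (?P<name>\S+) \+(?P<added>\d+) -(?P<removed>\d+) "
    r"#it=(?P<it>\d+) maxmatch=(?P<maxmatch>\d+) i=(?P<i>\d+) "
    r"- time=(?P<time>[0-9.eE+-]+)$"
)


def parse_stat_lines(text):
    """Returns one dictionary per STAT line, other lines are ignored."""
    rows = []
    for line in text.splitlines():
        m = STAT_RE.match(line.strip())
        if m is None:
            continue
        row = m.groupdict()
        for key in ("added", "removed", "it", "maxmatch", "i"):
            row[key] = int(row[key])
        row["time"] = float(row["time"])
        rows.append(row)
    return rows


# A few lines copied from the log above.
log = """\
STAT apply_GemmTransposePattern +4 -2 #it=1 maxmatch=1 i=2 - time=0.0006508419992314884
STAT apply_MatMulAddPattern +2 -4 #it=1 maxmatch=1 i=2 - time=0.00040602700028102845
STAT match_GeluPattern +0 -0 #it=6 maxmatch=0 i=0 - time=1.3644998034578748e-05
--MODEL: 3 nodes, 1 inputs, 1 outputs, 4 initializers--
"""
rows = parse_stat_lines(log)
# Keep only the patterns which were really applied.
applied = [r["name"] for r in rows if r["name"].startswith("apply_")]
print(applied)
```

Rows starting with `apply_` correspond to patterns which rewrote the graph, rows starting with `match_` only measure the time spent looking for a match.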
Select the pattern to use¶
Class OptimizationOptions
is used to enable or disable patterns.
<<<
import onnx
from experimental_experiment.xbuilder import GraphBuilder, OptimizationOptions
onx = onnx.load("temp_doc_mlp.onnx")
gr = GraphBuilder(
    onx,
    infer_shapes_options=True,
    optimization_options=OptimizationOptions(
        patterns="TransposeTranspose,TransposeMatMul", verbose=1
    ),
)
opt_onx = gr.to_onnx(optimize=True)
>>>
[GraphBuilder-LXO.optimize] start with 5 nodes
[GraphBuilder-LXO.optimize] #patterns=2
[GraphBuilderPatternOptimization-LXO.optimize] start with 5 nodes, 4 initializers, 2 patterns, priorities=[0, 1], max_iter=20
[GraphBuilderPatternOptimization-LXO.optimize] iteration 0: 5 nodes, priority=0
[GraphBuilderPatternOptimization-LXO.optimize] increase priority to 1
[GraphBuilderPatternOptimization-LXO.optimize] iteration 1: 5 nodes, priority=1
[GraphBuilderPatternOptimization-LXO.optimize] stops current_priority_index=2, priorities=[0, 1]
[GraphBuilderPatternOptimization-LXO.optimize] done after 2 iterations with 5 nodes in 0.001
[GraphBuilder-LXO.optimize] done with 5 nodes in 0.002
Some predefined lists of patterns exist:
default
: includes all patterns using only standard onnx operators.
onnxruntime
: patterns specific to onnxruntime. The final model may be executed by onnxruntime, and possibly only by onnxruntime, as it may introduce operators listed in Supported Operators and Data Types.
<<<
import onnx
from experimental_experiment.xbuilder import GraphBuilder, OptimizationOptions
onx = onnx.load("temp_doc_mlp.onnx")
gr = GraphBuilder(
    onx,
    infer_shapes_options=True,
    optimization_options=OptimizationOptions(patterns="default+onnxruntime", verbose=1),
)
opt_onx = gr.to_onnx(optimize=True)
>>>
[GraphBuilder-ISM.optimize] start with 5 nodes
[GraphBuilder-ISM.optimize] #patterns=92
[GraphBuilderPatternOptimization-ISM.optimize] start with 5 nodes, 4 initializers, 92 patterns, priorities=[0, 1, 2, 3], max_iter=40
[GraphBuilderPatternOptimization-ISM.optimize] iteration 0: 5 nodes, priority=0
[GraphBuilderPatternOptimization-ISM.optimize] increase priority to 1
[GraphBuilderPatternOptimization-ISM.optimize] iteration 1: 5 nodes, priority=1
[GraphBuilderPatternOptimization-ISM.optimize] increase priority to 2
[GraphBuilderPatternOptimization-ISM.optimize] iteration 2: 5 nodes, priority=2
[GraphBuilderPatternOptimization-ISM.optimize] increase priority to 3
[GraphBuilderPatternOptimization-ISM.optimize] iteration 3: 5 nodes, priority=3
[GraphBuilderPatternOptimization-ISM.optimize] applies 2 matches, 2*MatMulAddPattern - time=0.001 | max_time=IdentityPattern:0.000
[GraphBuilderPatternOptimization-ISM.optimize] iteration 4: 3 nodes, priority=3
[GraphBuilderPatternOptimization-ISM.optimize] applies 2 matches, 2*GemmTransposePattern - time=0.001 | max_time=MulMulMatMulPattern:0.000
[GraphBuilderPatternOptimization-ISM.optimize] iteration 5: 5 nodes, priority=3
[GraphBuilderPatternOptimization-ISM.optimize] applies 1 matches, [0]=MatchResult: TransposeEqualReshapePattern replaces ['Transpose'] - time=0.001 | max_time=TransposeMatMulPattern:0.000
[GraphBuilderPatternOptimization-ISM.optimize] iteration 6: 5 nodes, priority=3
[GraphBuilderPatternOptimization-ISM.optimize] stops current_priority_index=4, priorities=[0, 1, 2, 3]
[GraphBuilderPatternOptimization-ISM.optimize] done after 7 iterations with 5 nodes in 0.016
[GraphBuilder-ISM.optimize] done with 3 nodes in 0.018
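The log above suggests a loop driven by priorities: an iteration only tries the patterns allowed at the current priority, the priority increases when nothing matches, and the loop stops once every priority is exhausted. The sketch below is one plausible reading of that log, not the library's implementation. The graph is reduced to a flat list of operator names, and `run_priority_loop`, `remove_identity` and `fuse_matmul_add` are hypothetical names introduced for the illustration. Whether patterns of a lower priority are tried again at a higher one is an assumption as well.

```python
def remove_identity(graph):
    """Toy pattern: drops every Identity node, returns True if the graph changed."""
    before = len(graph)
    graph[:] = [op for op in graph if op != "Identity"]
    return len(graph) != before


def fuse_matmul_add(graph):
    """Toy pattern: replaces every consecutive pair MatMul, Add by Gemm."""
    i, changed = 0, False
    while i + 1 < len(graph):
        if graph[i] == "MatMul" and graph[i + 1] == "Add":
            graph[i : i + 2] = ["Gemm"]
            changed = True
        i += 1
    return changed


def run_priority_loop(patterns, graph, priorities, max_iter=20):
    """patterns is a list of (priority, function).
    Returns the number of iterations actually run."""
    index = 0
    for iteration in range(max_iter):
        current = priorities[index]
        changed = False
        for priority, rewrite in patterns:
            if priority > current:
                # assumption: a pattern waits until its priority is reached
                continue
            if rewrite(graph):
                changed = True
        if not changed:
            # nothing matched at this priority, move to the next one
            index += 1
            if index >= len(priorities):
                return iteration + 1
    return max_iter


graph = ["MatMul", "Add", "Relu", "MatMul", "Add"]
patterns = [(0, remove_identity), (1, fuse_matmul_add)]
n_iter = run_priority_loop(patterns, graph, priorities=[0, 1])
print(graph, n_iter)
```

With this toy input, the first iteration finds nothing at priority 0, the second one fuses both pairs at priority 1 and the third one finds nothing left, which ends the loop with two `Gemm` and one `Relu`.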
Statistics¶
The statistics returned by the optimizer show when a pattern is applied and how long it takes.
<<<
import pandas
import onnx
from experimental_experiment.xbuilder import GraphBuilder, OptimizationOptions
onx = onnx.load("temp_doc_mlp.onnx")
gr = GraphBuilder(
    onx,
    infer_shapes_options=True,
    optimization_options=OptimizationOptions(patterns="default"),
)
stat = gr.optimize()
print(pandas.DataFrame(stat))
>>>
pattern removed added time_in iteration value instances match_index
0 dynamic_dimension_naming 0.0 0.0 0.000018 NaN NaN NaN NaN
1 check_A-dynamic_dimension_naming NaN NaN 0.000017 NaN NaN NaN NaN
2 check_A-opt-sub NaN NaN 0.000011 NaN NaN NaN NaN
3 remove_identity_nodes 0.0 0.0 0.000083 NaN NaN NaN NaN
4 check_B-remove-identity NaN NaN 0.000013 NaN NaN NaN NaN
.. ... ... ... ... ... ... ... ...
445 remove_duplicated_initializer 0.0 0.0 0.000022 NaN NaN NaN NaN
446 check_H-remove-duplicated-initializer NaN NaN 0.000008 NaN NaN NaN NaN
447 remove_identity_nodes 0.0 0.0 0.000031 NaN NaN NaN NaN
448 remove_unused 0.0 NaN 0.000036 NaN NaN NaN NaN
449 check_H-remove-unused NaN NaN 0.000009 NaN NaN NaN NaN
[450 rows x 8 columns]
It can be aggregated:
<<<
import pandas
import onnx
from experimental_experiment.xbuilder import GraphBuilder, OptimizationOptions
onx = onnx.load("temp_doc_mlp.onnx")
gr = GraphBuilder(
    onx,
    infer_shapes_options=True,
    optimization_options=OptimizationOptions(patterns="default"),
)
stat = gr.optimize()
df = pandas.DataFrame(stat)
# Missing values are replaced by 0 in every count column before aggregating.
for c in df.columns:
    if "time" not in c and "pattern" not in c:
        df[c] = df[c].fillna(0).astype(int)
aggs = {
    "time_in": "sum",
    "added": "sum",
    "removed": "sum",
    "iteration": "max",
    "match_index": "max",
    "instances": "sum",
}
print(df.groupby("pattern").agg(aggs))
>>>
time_in added removed iteration match_index instances
pattern
apply_GemmTransposePattern 0.000326 4 2 3 1 2
apply_MatMulAddPattern 0.000239 2 4 2 1 2
apply_TransposeEqualReshapePattern 0.000361 1 1 4 0 1
apply_constant_folding 0.000512 0 2 1 0 0
apply_constant_folding__Reshape 0.000000 0 0 1 0 0
... ... ... ... ... ... ...
match_UnsqueezeUnsqueezePattern 0.000023 0 0 5 2 0
pattern_optimization 0.011033 0 0 0 0 0
remove_duplicated_initializer 0.000045 0 0 0 0 0
remove_identity_nodes 0.000763 0 0 2 0 0
remove_unused 0.001222 0 0 5 0 0
[101 rows x 6 columns]
Shape inference¶
The optimizers need to know the shapes to ensure they can rewrite some nodes without producing a model that returns different results. If the shapes are missing, some patterns cannot be proven safe and they will not match.
This information can be built by running shape inference on the onnx models. That is what is done in the previous examples. However, the best case is when this information comes from torch.
Function to_onnx
converts a torch model into ONNX. While doing so, it stores the shape
information coming from torch. There is no need to run shape inference
on the onnx model it generates before optimizing it.
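To illustrate what shape information looks like once it is available, the sketch below propagates static shapes through the three nodes of the optimized model printed earlier (two `Gemm` and one `Relu`). It is a simplified stand-in written for this page, not the shape inference of onnx or torch. The names `gemm_shape` and `infer_shapes`, the result names, and the attribute `transB=1` on both `Gemm` nodes are assumptions (the latter is deduced from the shapes `3x10` and `32x10`).

```python
def gemm_shape(a, b, trans_b=False):
    """Shape of Gemm(A, B, C) for 2D inputs, C is assumed to be broadcastable."""
    m, k = a
    kb, n = (b[1], b[0]) if trans_b else b
    if k != kb:
        raise ValueError(f"incompatible shapes {a} and {b}")
    return (m, n)


def infer_shapes(nodes, shapes):
    """nodes is a list of (op_type, inputs, output, attributes) ordered so that
    every input is produced before it is consumed.
    shapes maps a result name to its shape, it is updated in place."""
    for op_type, inputs, output, attrs in nodes:
        if any(name not in shapes for name in inputs):
            continue  # an input shape is unknown, the output stays unknown
        if op_type == "Gemm":
            shapes[output] = gemm_shape(
                shapes[inputs[0]], shapes[inputs[1]], attrs.get("transB", 0) == 1
            )
        elif op_type == "Relu":
            shapes[output] = shapes[inputs[0]]
    return shapes


# Shapes of the input and the four initializers reported in the log.
shapes = {"x": (3, 10), "w0": (32, 10), "b0": (32,), "w2": (1, 32), "b2": (1,)}
nodes = [
    ("Gemm", ["x", "w0", "b0"], "h", {"transB": 1}),
    ("Relu", ["h"], "r", {}),
    ("Gemm", ["r", "w2", "b2"], "y", {"transB": 1}),
]
infer_shapes(nodes, shapes)
print(shapes["h"], shapes["r"], shapes["y"])
```

The computed shapes agree with the signatures printed in the detailed model summary: `3x32` after the first `Gemm` and `3x1` for the output.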
Available Patterns and API¶
All patterns may be found at .xoptim.patterns and .xoptim.patterns_ort.
When writing a pattern, walking along the graph or checking the shape
is very common. Class GraphBuilderPatternOptimization
provides the following methods.
Opsets¶
Patterns must rewrite using the nodes of the opset defined in the model.
main_opset
: returns the opset
Shapes, Types¶
has_type
: tells if a result type is known
get_type
: returns a result type, fails if not known
has_shape
: tells if a result shape is known
get_shape
: returns a result shape, fails if not known
has_rank
: tells if a result rank is known
get_rank
: returns a result rank, fails if not known
try_infer_type
: returns a type if it can be guessed
try_infer_shape
: returns a shape if it can be guessed
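Since the `get_*` methods fail when the information is unknown, a matching function usually calls the corresponding `has_*` method first and gives up when it returns false. The sketch below shows that guard on a condition similar in spirit to what a pattern such as TransposeEqualReshapePattern has to verify. `ShapeInfo` is a hypothetical stand-in exposing only `has_shape` and `get_shape`; it is not the real class, and the condition used here (at most one dimension different from 1) is a sufficient condition chosen for the illustration, not necessarily the one the library implements.

```python
class ShapeInfo:
    """Hypothetical stand-in for the part of the API used below."""

    def __init__(self, shapes):
        self._shapes = dict(shapes)

    def has_shape(self, name):
        return name in self._shapes

    def get_shape(self, name):
        # fails if the shape is not known, as described in the list above
        assert name in self._shapes, f"shape of {name!r} is unknown"
        return self._shapes[name]


def transpose_is_reshape(g, name):
    """Tells if transposing result `name` can be replaced by a Reshape.
    With at most one dimension different from 1, no permutation of the axes
    changes the order of the elements in memory."""
    if not g.has_shape(name):
        return False  # no shape, no proof, the pattern must not match
    shape = g.get_shape(name)
    if not all(isinstance(d, int) for d in shape):
        return False  # a dynamic dimension prevents any conclusion
    return sum(1 for d in shape if d != 1) <= 1


# Illustrative shapes, "batch" stands for a dynamic dimension.
g = ShapeInfo({"w": (32, 1), "x": (3, 10), "d": ("batch", 1)})
for name in ["w", "x", "d", "missing"]:
    print(name, transpose_is_reshape(g, name))
```

Returning false whenever the shape is missing or dynamic keeps the rewriting safe: the pattern is skipped instead of producing a model with different results.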
Constants¶
is_constant
: tells if a result is a constant (it may be a constant, an initializer or any value built on other constants)
is_constant_scalar
: checks a constant is a scalar and compares its value to a number
get_computed_constant
: returns the constant, computes it if it is a constant built from other constants
get_attribute
: returns an attribute of a node
Graph¶
next_node
: returns the next node only if there is only one
next_nodes
: returns the nodes consuming this result
node_before
: returns the node producing the result
is_output
: tells if a result is an output
is_used_by_subgraph
: tells if a result is used by a subgraph
is_used_more_than_once
: tells if a result is used more than once
is_used_only_by
: tells if a result is only used by specific nodes
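Queries of this kind can be answered in constant time once two indexes are built in a single pass over the nodes: one mapping every result to its producer, one mapping every result to its consumers. The sketch below illustrates the idea with hypothetical classes `Node` and `GraphIndex`. It reuses some method names from the list above for readability but it is not the library's implementation, and counting a graph output as one use is an assumption of the sketch.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Node:
    """Minimal stand-in for a node: a type, input names, output names."""

    op_type: str
    inputs: list
    outputs: list


class GraphIndex:
    """Indexes producers and consumers in one pass over the nodes."""

    def __init__(self, nodes, graph_outputs):
        self.graph_outputs = set(graph_outputs)
        self.producer = {}
        self.consumers = defaultdict(list)
        for node in nodes:
            for name in node.outputs:
                self.producer[name] = node
            for name in node.inputs:
                self.consumers[name].append(node)

    def node_before(self, name):
        # None for a graph input or an initializer
        return self.producer.get(name)

    def next_nodes(self, name):
        return self.consumers.get(name, [])

    def is_output(self, name):
        return name in self.graph_outputs

    def is_used_more_than_once(self, name):
        # assumption of this sketch: being a graph output counts as one use
        return len(self.next_nodes(name)) + int(self.is_output(name)) > 1


nodes = [
    Node("MatMul", ["x", "w"], ["mm"]),
    Node("Add", ["mm", "b"], ["add"]),
    Node("Relu", ["add"], ["y"]),
]
index = GraphIndex(nodes, graph_outputs=["y"])
print(index.node_before("add").op_type)
print([n.op_type for n in index.next_nodes("mm")])
print(index.is_used_more_than_once("mm"))
```

A pattern fusing `MatMul` and `Add` would typically check that the intermediate result `mm` is not used anywhere else before removing the node producing it.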
Nodes¶
make_node
: creates a node without adding it to the graph
make_node_check_opset
: creates a node without adding it to the graph, deals with some constraints related to opset version