yobx.xoptim.patterns_ort
modules
get_onnxruntime_patterns
yobx.xoptim.patterns_ort.get_onnxruntime_patterns(verbose: int = 0) → List[PatternOptimization]
Returns a default list of optimization patterns for onnxruntime. It is equal to the following list.
<<<
from yobx.xoptim.patterns_api import pattern_table_doc
from yobx.xoptim.patterns_ort import get_onnxruntime_patterns

print(pattern_table_doc(get_onnxruntime_patterns(), as_rst=True))
print()
>>>
=====  ==========================================  ===================================  ========  ========================================
index  name                                        short_name                           priority  doc
=====  ==========================================  ===================================  ========  ========================================
0      Attention3DPattern                          Attention3D                          2         Fuses nodes into Attention from com.microsoft domain. In progress.
1      BiasGeluPattern                             BiasGelu                             1         Replaces by y = BiasGelu(x, B)
2      BiasSoftmaxPattern                          BiasSoftmax                          1         Replaces Softmax(Add(x,y), axis=-1) by BiasSoftmax(x,y,axis=-1). Model with nodes to be fused…
3      ContribRotaryEmbeddingPattern               ContribRotaryEmbedding               2         Very similar to yobx.xoptim.patterns.onnx_rotary.RotaryEmbeddingPattern. Model with nodes to be fused…
4      ContribRotaryEmbedding3DPattern             ContribRotaryEmbedding3D             1         Extension to yobx.xoptim.patterns_ort.llm_optim.ContribRotaryEmbeddingPattern, turns the operator into a 3D operator including the transpose. Model with nodes to be fused…
5      GeluOrtPattern                              GeluOrt                              0         Detects the decomposed version of Gelu with Tanh.
6      GeluErfPattern                              GeluErf                              0         Detects the decomposed version of Gelu with Erf. Model with nodes to be fused…
7      GroupQueryAttention3DPattern                GroupQueryAttention3D                2         Fuses LocalAttention into GroupQueryAttention. bias is not supported by this kernel on CUDA.
8      FusedConvPattern                            FusedConv                            2         Replaces the sequence Conv + Relu with FusedConv. Model with nodes to be fused…
9      FastGeluPattern                             FastGelu                             1         Replaces Gelu by FastGelu. Model with nodes to be fused…
10     FusedMatMulPattern                          FusedMatMul                          2         Replaces the sequence Transpose, Matmul with FusedMatMul. Model with nodes to be fused…
11     FusedMatMulx2Pattern                        FusedMatMulx2                        3         Replaces the sequence Div by a scalar consumed by two FusedMatMul. Model with nodes to be fused…
12     FusedMatMulDivPattern                       FusedMatMulDiv                       2         Replaces the sequence Matmul, Div with FusedMatMul. Model with nodes to be fused…
13     FusedMatMulTransposePattern                 FusedMatMulTranspose                 3         Replaces the sequence (Fused)Matmul(A,B) + Transpose with FusedMatMul(B.T, A.T). Model with nodes to be fused…
14     MissingCosSinPattern                        MissingCosSin                        1         Replaces Cos/Sin by Cast Cos/Sin Cast because of some missing kernels.
15     MissingRangePattern                         MissingRange                         1         Replaces Range by Cast Range Cast because of some missing kernels.
16     MissingReduceMaxPattern                     MissingReduceMax                     1         Replaces ReduceMax by Cast ReduceMax Cast because of some missing kernels.
17     MissingTopKPattern                          MissingTopK                          1         Replaces TopK by Cast TopK Cast because of some missing kernels.
18     MultiHeadAttention3DPattern                 MultiHeadAttention3D                 2         Merges multiple nodes into MultiHeadAttention. It assumes pattern yobx.xoptim.patterns.onnx_attention.FunctionAttentionPattern was triggered before. Model with nodes to be fused…
19     QuickGeluPattern                            QuickGelu                            1         Replaces Mul(x, Sigmoid(x)) by QuickGelu(x, alpha=1). Model with nodes to be fused…
20     ReshapeGemmPattern                          ReshapeGemm                          3         Replaces the sequence Reshape(-1, …) + Gemm with FusedMatMul(). Model with nodes to be fused…
21     ReshapeGemmReshapePattern                   ReshapeGemmReshape                   3         Replaces the sequence Reshape + Gemm + Reshape with FusedMatMul. Model with nodes to be fused…
22     SimplifiedLayerNormalizationPattern         SimplifiedLayerNormalization         1         Fuses the nodes equivalent to SimplifiedLayerNormalization. Model with nodes to be fused…
23     SimplifiedLayerNormalizationMulPattern      SimplifiedLayerNormalizationMul      1         Replaces the sequence SimplifiedLayerNormalization + Mul by SimplifiedLayerNormalization. Model with nodes to be fused…
24     SkipLayerNormalizationPattern               SkipLayerNormalization               1         Replaces the sequence Add + LayerNormalization with SkipLayerNormalization. Model with nodes to be fused…
25     SkipSimplifiedLayerNormalizationPattern     SkipSimplifiedLayerNormalization     1         Replaces the sequence Add + SimplifiedLayerNormalization by SkipSimplifiedLayerNormalization. Model with nodes to be fused…
26     SkipSimplifiedLayerNormalizationMulPattern  SkipSimplifiedLayerNormalizationMul  1         Replaces the sequence SkipSimplifiedLayerNormalization + Mul by SkipSimplifiedLayerNormalization. Model with nodes to be fused…
27     TransposeFusedMatMulBPattern                TransposeFusedMatMulB                3         Replaces the sequence Transpose(B, [0, 2, 3, 1]) + (Fused)Matmul(A,B) with Transpose(A, [0, 2, 1, 3]) + FusedMatMul(A, B, transB=1). Model with nodes to be fused…
=====  ==========================================  ===================================  ========  ========================================
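Several of the rewrites listed above rely on simple algebraic identities, and it may help to see a few of them checked numerically. The sketch below uses only the Python standard library and does not call yobx or onnxruntime. The helpers `quick_gelu`, `fused_matmul`, `matmul`, `transpose`, `allclose` and `check_fusions` are hypothetical names introduced for illustration. It assumes the com.microsoft semantics QuickGelu(x, alpha) = x * sigmoid(alpha * x) and FusedMatMul(A, B) = alpha * op(A) @ op(B), where op is an optional transpose. The way FusedMatMulDivPattern folds the divisor into `alpha` is also an assumption here, not something the table states.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def quick_gelu(x, alpha=1.0):
    # Assumed QuickGelu semantics: x * sigmoid(alpha * x).
    return x * sigmoid(alpha * x)


def transpose(m):
    return [list(col) for col in zip(*m)]


def matmul(a, b):
    # Plain 2D matrix product on lists of lists.
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def fused_matmul(a, b, trans_a=False, trans_b=False, alpha=1.0):
    # Assumed FusedMatMul semantics: alpha * op(A) @ op(B).
    a = transpose(a) if trans_a else a
    b = transpose(b) if trans_b else b
    return [[alpha * v for v in row] for row in matmul(a, b)]


def allclose(m1, m2, tol=1e-12):
    # Element-wise comparison of two matrices of identical shape.
    if len(m1) != len(m2) or any(len(r1) != len(r2) for r1, r2 in zip(m1, m2)):
        return False
    return all(
        math.isclose(u, v, rel_tol=tol, abs_tol=tol)
        for r1, r2 in zip(m1, m2)
        for u, v in zip(r1, r2)
    )


A = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]       # shape (2, 3)
B = [[0.5, -1.0], [2.0, 0.25], [-3.0, 1.5]]  # shape (3, 2)


def check_fusions():
    xs = [-2.0, -0.5, 0.0, 0.5, 2.0]
    ab = matmul(A, B)
    return {
        # QuickGeluPattern: Mul(x, Sigmoid(x)) vs QuickGelu(x, alpha=1)
        "QuickGelu": all(math.isclose(x * sigmoid(x), quick_gelu(x, alpha=1.0)) for x in xs),
        # FusedMatMulTransposePattern: Transpose(MatMul(A, B)) vs FusedMatMul(B.T, A.T)
        "FusedMatMulTranspose": allclose(
            transpose(ab), fused_matmul(B, A, trans_a=True, trans_b=True)
        ),
        # FusedMatMulDivPattern (assumed mechanism): MatMul(A, B) / c vs alpha = 1 / c
        "FusedMatMulDiv": allclose(
            [[v / 4.0 for v in row] for row in ab], fused_matmul(A, B, alpha=1 / 4.0)
        ),
    }


if __name__ == "__main__":
    for name, ok in check_fusions().items():
        print(f"{name}: {'equivalent' if ok else 'MISMATCH'}")
```

The checks only confirm that each pair of expressions agrees on small inputs. They say nothing about how the patterns match nodes in a graph or about which kernels onnxruntime selects.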