yobx.xoptim.patterns_ort

get_onnxruntime_patterns

yobx.xoptim.patterns_ort.get_onnxruntime_patterns(verbose: int = 0) → List[PatternOptimization]

Returns the default list of optimization patterns targeting onnxruntime. The script below prints that list as a table.

<<<

from yobx.xoptim.patterns_api import pattern_table_doc
from yobx.xoptim.patterns_ort import get_onnxruntime_patterns

print(pattern_table_doc(get_onnxruntime_patterns(), as_rst=True))
print()

>>>

=====  ==========================================  ===================================  ========  ============================================================
index  name                                        short_name                           priority  doc
=====  ==========================================  ===================================  ========  ============================================================
0      Attention3DPattern                          Attention3D                          2         Fuses nodes into Attention from the com.microsoft domain. In progress.
1      BiasGeluPattern                             BiasGelu                             1         Replaces the matched nodes by y = BiasGelu(x, B).
2      BiasSoftmaxPattern                          BiasSoftmax                          1         Replaces Softmax(Add(x,y), axis=-1) by BiasSoftmax(x,y,axis=-1). Model with nodes to be fused…
3      ContribRotaryEmbeddingPattern               ContribRotaryEmbedding               2         Very similar to yobx.xoptim.patterns.onnx_rotary.RotaryEmbeddingPattern. Model with nodes to be fused…
4      ContribRotaryEmbedding3DPattern             ContribRotaryEmbedding3D             1         Extension of yobx.xoptim.patterns_ort.llm_optim.ContribRotaryEmbeddingPattern that turns the operator into a 3D operator including the transpose. Model with nodes to be fused…
5      GeluOrtPattern                              GeluOrt                              0         Detects the decomposed version of Gelu with Tanh.
6      GeluErfPattern                              GeluErf                              0         Detects the decomposed version of Gelu with Erf. Model with nodes to be fused…
7      GroupQueryAttention3DPattern                GroupQueryAttention3D                2         Fuses LocalAttention into GroupQueryAttention. Bias is not supported by this kernel on CUDA.
8      FusedConvPattern                            FusedConv                            2         Replaces the sequence Conv + Relu with FusedConv. Model with nodes to be fused…
9      FastGeluPattern                             FastGelu                             1         Replaces Gelu by FastGelu. Model with nodes to be fused…
10     FusedMatMulPattern                          FusedMatMul                          2         Replaces the sequence Transpose, MatMul with FusedMatMul. Model with nodes to be fused…
11     FusedMatMulx2Pattern                        FusedMatMulx2                        3         Replaces a Div by a scalar whose result is consumed by two FusedMatMul nodes. Model with nodes to be fused…
12     FusedMatMulDivPattern                       FusedMatMulDiv                       2         Replaces the sequence MatMul, Div with FusedMatMul. Model with nodes to be fused…
13     FusedMatMulTransposePattern                 FusedMatMulTranspose                 3         Replaces the sequence (Fused)MatMul(A,B) + Transpose with FusedMatMul(B.T, A.T). Model with nodes to be fused…
14     MissingCosSinPattern                        MissingCosSin                        1         Replaces Cos/Sin by Cast + Cos/Sin + Cast because some kernels are missing.
15     MissingRangePattern                         MissingRange                         1         Replaces Range by Cast + Range + Cast because some kernels are missing.
16     MissingReduceMaxPattern                     MissingReduceMax                     1         Replaces ReduceMax by Cast + ReduceMax + Cast because some kernels are missing.
17     MissingTopKPattern                          MissingTopK                          1         Replaces TopK by Cast + TopK + Cast because some kernels are missing.
18     MultiHeadAttention3DPattern                 MultiHeadAttention3D                 2         Merges multiple nodes into MultiHeadAttention. It assumes the pattern yobx.xoptim.patterns.onnx_attention.FunctionAttentionPattern was triggered before. Model with nodes to be fused…
19     QuickGeluPattern                            QuickGelu                            1         Replaces Mul(x, Sigmoid(x)) by QuickGelu(x, alpha=1). Model with nodes to be fused…
20     ReshapeGemmPattern                          ReshapeGemm                          3         Replaces the sequence Reshape(-1, …) + Gemm with FusedMatMul(). Model with nodes to be fused…
21     ReshapeGemmReshapePattern                   ReshapeGemmReshape                   3         Replaces the sequence Reshape + Gemm + Reshape with FusedMatMul. Model with nodes to be fused…
22     SimplifiedLayerNormalizationPattern         SimplifiedLayerNormalization         1         Fuses the nodes equivalent to SimplifiedLayerNormalization. Model with nodes to be fused…
23     SimplifiedLayerNormalizationMulPattern      SimplifiedLayerNormalizationMul      1         Replaces the sequence SimplifiedLayerNormalization + Mul by SimplifiedLayerNormalization. Model with nodes to be fused…
24     SkipLayerNormalizationPattern               SkipLayerNormalization               1         Replaces the sequence Add + LayerNormalization with SkipLayerNormalization. Model with nodes to be fused…
25     SkipSimplifiedLayerNormalizationPattern     SkipSimplifiedLayerNormalization     1         Replaces the sequence Add + SimplifiedLayerNormalization by SkipSimplifiedLayerNormalization. Model with nodes to be fused…
26     SkipSimplifiedLayerNormalizationMulPattern  SkipSimplifiedLayerNormalizationMul  1         Replaces the sequence SkipSimplifiedLayerNormalization + Mul by SkipSimplifiedLayerNormalization. Model with nodes to be fused…
27     TransposeFusedMatMulBPattern                TransposeFusedMatMulB                3         Replaces the sequence Transpose(B, [0, 2, 3, 1]) + (Fused)MatMul(A,B) with Transpose(A, [0, 2, 1, 3]) + FusedMatMul(A, B, transB=1). Model with nodes to be fused…
=====  ==========================================  ===================================  ========  ============================================================
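Several of the rewrites listed above rest on simple numerical identities. The sketch below checks three of them in plain Python, without yobx or onnxruntime: the identity behind FusedMatMulTransposePattern, the one behind QuickGeluPattern, and the closeness of the Tanh and Erf forms of Gelu that GeluOrtPattern and GeluErfPattern detect. The helper names (`matmul`, `transpose`, `quick_gelu`, `gelu_erf`, `gelu_tanh`) are hypothetical and written for this illustration only. They are not part of the yobx API and do not reflect how the patterns are implemented.

```python
import math


def matmul(a, b):
    # Plain nested-list matrix product (illustration only).
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(a):
    # Swaps the two axes of a nested-list matrix.
    return [list(row) for row in zip(*a)]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def quick_gelu(x, alpha=1.0):
    # Assumed definition for this sketch: x * sigmoid(alpha * x).
    return x * sigmoid(alpha * x)


def gelu_erf(x):
    # Gelu written with Erf.
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))


def gelu_tanh(x):
    # Gelu approximated with Tanh.
    return 0.5 * x * (1.0 + math.tanh(math.sqrt(2.0 / math.pi) * (x + 0.044715 * x**3)))


A = [[1, 2, 3], [4, 5, 6]]    # shape 2x3
B = [[1, 0], [2, 1], [0, 3]]  # shape 3x2
xs = [i / 4.0 for i in range(-16, 17)]  # sample points in [-4, 4]

# FusedMatMulTransposePattern: Transpose(MatMul(A, B)) == MatMul(B.T, A.T)
print(transpose(matmul(A, B)) == matmul(transpose(B), transpose(A)))

# QuickGeluPattern: Mul(x, Sigmoid(x)) == QuickGelu(x, alpha=1)
print(all(quick_gelu(x, alpha=1.0) == x * sigmoid(x) for x in xs))

# The Tanh and Erf forms of Gelu are close but not identical.
print(max(abs(gelu_tanh(x) - gelu_erf(x)) for x in xs))
```

The first two checks are exact identities, so the rewrite does not change the result beyond floating-point ordering effects in a real kernel. The last one is an approximation, which is worth keeping in mind when comparing outputs before and after a Gelu rewrite.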