python -m experimental_experiment optimize

This command line optimizes an ONNX model, mostly by fusing nodes matching one of the patterns defined in this library. One simple example:

../_images/fuse.png

The arguments of the command line:

    usage: optimize [-h] -i INPUT -o OUTPUT [-v VERBOSE]
                    [--infer_shapes | --no-infer_shapes]
                    [--identity | --no-identity] [--folding | --no-folding]
                    [--processor PROCESSOR] [--patterns PATTERNS]
                    [--max_iter MAX_ITER]
                    [--dump_applied_patterns DUMP_APPLIED_PATTERNS]
                    [--remove_shape_info | --no-remove_shape_info]
    
    Optimizes an onnx model by fusing nodes. It looks for patterns in the graphs
    and replaces them by the corresponding nodes. It also does basic optimization
    such as removing identity nodes or unused nodes.
    
    options:
      -h, --help            show this help message and exit
      -i INPUT, --input INPUT
                            onnx model to optimize
      -o OUTPUT, --output OUTPUT
                            onnx model to output
      -v VERBOSE, --verbose VERBOSE
                            verbosity
      --infer_shapes, --no-infer_shapes
                            infer shapes before optimizing the model
      --identity, --no-identity
                            remove identity nodes
      --folding, --no-folding
                            does constant folding
      --processor PROCESSOR
                            optimization for a specific processor, CPU, CUDA or both CPU,CUDA,
                            some operators are only available in one processor
      --patterns PATTERNS   patterns optimization to apply, see below
      --max_iter MAX_ITER   number of iterations for pattern optimization, -1 for many
      --dump_applied_patterns DUMP_APPLIED_PATTERNS
                            dumps applied patterns in this folder if specified
      --remove_shape_info, --no-remove_shape_info
                            remove shape information before outputting the model
    
    The goal is to make the model faster.
    Argument patterns defines the patterns to apply or the set of patterns.
    It defines the following sets of patterns
    
    - '' or none    : no pattern optimization
    - default       : rewrites standard onnx operators into other standard onnx operators
    - ml            : does the same with operators defined in domain 'ai.onnx.ml'
    - onnxruntime   : introduces fused nodes defined in onnxruntime
    - experimental  : introduces fused nodes defined in module onnx-extended
    
    Examples of values:
    
    - none
    - default
    - ml
    - default+ml+onnxruntime+experimental
    - default+ml+onnxruntime+experimental-ReshapeReshapePattern
    
    The last one applies all patterns but one. The list of patterns can be
    obtained by running:
    
        python -m experimental_experiment optimize --patterns=list -i '' -o ''
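
The options above can also be assembled from Python, for example to optimize several models in a loop. The following is a minimal sketch, not part of the library: the helper names build_optimize_command and run_optimize are hypothetical, and only options listed in the help text are used. The on/off switches (--infer_shapes, --identity, --folding, --remove_shape_info) are left at their defaults.

```python
import subprocess
import sys

# Options from the help text that take a value (hypothetical helper, not a library API).
VALUE_OPTIONS = ("patterns", "processor", "max_iter", "verbose", "dump_applied_patterns")


def build_optimize_command(input_path, output_path, **options):
    """Builds the argument list for `python -m experimental_experiment optimize`."""
    unknown = set(options) - set(VALUE_OPTIONS)
    if unknown:
        raise ValueError(f"Unexpected options: {sorted(unknown)}")
    cmd = [
        sys.executable, "-m", "experimental_experiment", "optimize",
        "-i", input_path,
        "-o", output_path,
    ]
    # Only options explicitly given are added, the others keep the command defaults.
    for name in VALUE_OPTIONS:
        value = options.get(name)
        if value is not None:
            cmd.append(f"--{name}={value}")
    return cmd


def run_optimize(input_path, output_path, **options):
    """Runs the command and raises CalledProcessError if it fails."""
    cmd = build_optimize_command(input_path, output_path, **options)
    return subprocess.run(cmd, check=True)
```

For instance, run_optimize("model.onnx", "model.opt.onnx", patterns="default+onnxruntime", max_iter=-1) would apply the default and onnxruntime sets to a placeholder file model.onnx. Building the list separately from running it makes the command easy to print or check before execution.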

The following code prints the list of available patterns for each set.

<<<

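from experimental_experiment.xoptim import get_pattern_list

# One section per set of patterns accepted by --patterns.
for s in ["default", "ml", "onnxruntime", "experimental"]:
    print()
    print(f"-- {s} patterns")
    pats = get_pattern_list(s)
    # Prints every pattern of the set, one per line.
    for p in pats:
        print(p)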

>>>

    
    -- default patterns
    BatchNormalizationPattern
    BatchNormalizationTrainingPattern
    CastLayerNormalizationCastPattern
    CastPattern
    CastCastBinaryPattern
    CastOpCastPattern
    ClipClipPattern
    ComputationCastOpCastPattern
    ConvBiasNullPattern
    DropoutPattern
    ExpandPattern
    ExpandBroadcastPattern
    ExpandSwapPattern
    GeluPattern
    IdentityPattern
    LayerNormalizationPattern
    LayerNormalizationScalePattern
    LeakyReluPattern
    MulMulMulScalarPattern
    ReduceReshapePattern
    ReduceSumNormalizePattern
    ReshapePattern
    ReshapeMatMulReshapePattern
    Reshape2Of3Pattern
    ReshapeReshapeBinaryPattern
    MatMulAddPattern
    GemmTransposePattern
    MatMulReshape2Of3Pattern
    MulMulMatMulPattern
    ReshapeReshapePattern
    RotaryConcatPartPattern
    SameChildrenPattern
    SequenceConstructAtPattern
    SliceSlicePattern
    SlicesSplitPattern
    SoftmaxCrossEntropyLossCastPattern
    Sub1MulPattern
    SwitchOrderBinaryPattern
    TransposeMatMulPattern
    TransposeReshapeMatMulPattern
    TransposeReshapeTransposePattern
    TransposeTransposePattern
    UnsqueezeEqualPattern
    UnsqueezeUnsqueezePattern
    
    -- ml patterns
    TreeEnsembleRegressorConcatPattern
    TreeEnsembleRegressorMulPattern
    
    -- onnxruntime patterns
    BiasGeluPattern
    BiasSoftmaxPattern
    GeluOrtPattern
    GeluErfPattern
    FusedConvPattern
    FastGeluPattern
    FusedMatMulPattern
    FusedMatMulx2Pattern
    FusedMatMulDivPattern
    FusedMatMulTransposePattern
    SimplifiedLayerNormalizationPattern
    SoftmaxGradPattern
    
    -- experimental patterns
    AddAddMulMulPattern
    AddAddMulMulBroadcastPattern
    AddMulPattern
    AddMulBroadcastPattern
    AddMulSharedInputPattern
    AddMulSharedInputBroadcastPattern
    AddMulTransposePattern
    ConstantOfShapeScatterNDPattern
    MaskedShapeScatterNDPattern
    MulSigmoidPattern
    NegXplus1Pattern
    ReplaceZeroPattern
    SimpleRotaryPattern
    SubMulPattern
    SubMulBroadcastPattern
    TransposeCastPattern
    TriMatrixPattern
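
The names listed above are the ones that can be subtracted from a set in the argument --patterns, as in default+ml+onnxruntime+experimental-ReshapeReshapePattern. The sketch below only illustrates how such a value reads: sets are joined with '+', and a pattern name following '-' is removed. The function split_pattern_spec is hypothetical and is not the parser used by the library. It assumes that neither set names nor pattern names contain '+' or '-'.

```python
def split_pattern_spec(spec):
    """Splits a --patterns value into (sets to apply, pattern names to remove).

    Illustration only: this is a reading of the syntax shown in the help text,
    not the library's own parsing code.
    """
    spec = spec.strip()
    # '' and 'none' both mean no pattern optimization.
    if spec in ("", "none"):
        return [], []
    included, excluded = [], []
    for term in spec.split("+"):
        # 'experimental-ReshapeReshapePattern' -> 'experimental', ['ReshapeReshapePattern']
        name, *removed = term.split("-")
        if name:
            included.append(name)
        excluded.extend(r for r in removed if r)
    return included, excluded


sets, removed = split_pattern_spec(
    "default+ml+onnxruntime+experimental-ReshapeReshapePattern"
)
print(sets)     # ['default', 'ml', 'onnxruntime', 'experimental']
print(removed)  # ['ReshapeReshapePattern']
```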