-m onnx_diagnostic partition … move layer nodes into local functions

The command line leverages the metadata added by the exporter. Every node is tagged with information indicating which part of the model it comes from. The key namespace is the most useful one; its value lists the nested module scopes that lead to the node, for example:

transformers.models.llama.modeling_llama.LlamaForCausalLM/model:
transformers.models.llama.modeling_llama.LlamaModel/model.layers.0:
transformers.models.llama.modeling_llama.LlamaDecoderLayer/model.layers.0.self_attn:
transformers.models.llama.modeling_llama.LlamaAttention/unsqueeze_15:
aten.unsqueeze.default
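
As a minimal sketch of how such a value can be matched against a layer regular expression, the snippet below splits the namespace shown above on ":" and returns the first scope accepted by the pattern. The helper name matching_scope and the splitting rule are assumptions made for illustration; they are not taken from onnx_diagnostic, whose implementation may differ. On a real model, the string would presumably be read from the node metadata rather than hard-coded.

```python
import re

# The namespace value shown above, joined into a single string.
NAMESPACE = (
    "transformers.models.llama.modeling_llama.LlamaForCausalLM/model: "
    "transformers.models.llama.modeling_llama.LlamaModel/model.layers.0: "
    "transformers.models.llama.modeling_llama.LlamaDecoderLayer/model.layers.0.self_attn: "
    "transformers.models.llama.modeling_llama.LlamaAttention/unsqueeze_15: "
    "aten.unsqueeze.default"
)


def matching_scope(namespace, regex=r".*[.]layers[.][0-9]+$"):
    """Return the first scope of `namespace` matching `regex`, or None.

    Hypothetical helper: scopes are assumed to be separated by ':'.
    """
    pattern = re.compile(regex)
    for scope in namespace.split(":"):
        scope = scope.strip()
        if pattern.match(scope):
            return scope
    return None


print(matching_scope(NAMESPACE))
# transformers.models.llama.modeling_llama.LlamaModel/model.layers.0
```

The scope ending in model.layers.0.self_attn is not selected because the pattern is anchored with $ right after the layer index, so only the scope naming the whole layer matches.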

Description

See onnx_diagnostic.helpers.onnx_helper.make_model_with_local_functions().

    usage: partition [-h] [-r REGEX] [-p META_PREFIX] [-v VERBOSE] input output
    
    Partitions an onnx model by moving nodes into local functions.
    Exporters may add metadata to the onnx nodes telling which part
    of the model it comes from (namespace, source, ...).
    These nodes are moved into local functions.
    
    positional arguments:
      input                 input model
      output                output model
    
    options:
      -h, --help            show this help message and exit
      -r REGEX, --regex REGEX
                            merges all nodes sharing the same value in node metadata,
                            these values must match the regular expression specified by
                            this parameter, the default value matches what transformers
                            usually uses to define a layer
      -p META_PREFIX, --meta-prefix META_PREFIX
                            allowed prefixes for keys in the metadata
      -v VERBOSE, --verbose VERBOSE
                            verbosity
    
    The regular expression may match the following values,
    'model.layers.0.forward', 'model.layers.1.forward', ...
    A local function will be created for each distinct layer.
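
The grouping described in the help text can be sketched as follows: nodes whose metadata contains the same matching value end up in the same partition, so each distinct layer yields one group. Everything in this snippet is hypothetical (the function group_by_layer, the pkg.* class names and the four namespace strings); it only illustrates the idea and does not reproduce make_model_with_local_functions().

```python
import re
from collections import defaultdict


def group_by_layer(namespaces, regex=r".*[.]layers[.][0-9]+$"):
    """Group node indices by the first scope matching `regex`.

    Hypothetical helper: each namespace is assumed to be a ':'-separated
    list of scopes, and nodes without a matching scope are left out.
    """
    pattern = re.compile(regex)
    groups = defaultdict(list)
    for index, namespace in enumerate(namespaces):
        for scope in namespace.split(":"):
            scope = scope.strip()
            if pattern.match(scope):
                groups[scope].append(index)
                break
    return dict(groups)


# Made-up namespaces for four nodes: two in layer 0, one in layer 1,
# and one outside any layer.
NAMESPACES = [
    "pkg.Top/model: pkg.Model/model.layers.0: pkg.Layer/model.layers.0.self_attn: aten.matmul.default",
    "pkg.Top/model: pkg.Model/model.layers.0: pkg.Layer/model.layers.0.mlp: aten.mul.default",
    "pkg.Top/model: pkg.Model/model.layers.1: pkg.Layer/model.layers.1.mlp: aten.mul.default",
    "pkg.Top/model: pkg.Model/model.norm: aten.mul.default",
]

for scope, indices in group_by_layer(NAMESPACES).items():
    print(scope, indices)
# pkg.Model/model.layers.0 [0, 1]
# pkg.Model/model.layers.1 [2]
```

The last node belongs to no layer, so in this sketch it stays outside every group, which corresponds to a node that remains in the main graph.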

Example

python -m onnx_diagnostic partition arnir0_Tiny-LLM-onnx-dynamo-ir-f16-cuda-op18.onnx partition.onnx -r ".*[.]layers[.][0-9]+$" -v 1

This produces the following output:

-- load 'arnir0_Tiny-LLM-onnx-dynamo-ir-f16-cuda-op18.onnx'
-- partition
[make_model_with_local_functions] matched 1 partitions
[make_model_with_local_functions] move 89 nodes in partition 'transformers_models_llama_modeling_llama_LlamaModel/model_layers_0'
-- save into 'partition.onnx'
-- done

The partitioned model includes the following node:

../_images/_img_partition.png