-m onnx_diagnostic validate … validate a model id

The command line is a wrapper around the function onnx_diagnostic.torch_models.validate.validate_model().

Description

The command line validates a model id, usually one available on HuggingFace, though not exclusively. It creates dummy inputs, runs the model on them, exports the model, and measures the discrepancies…

    usage: validate [-h] [-m MID] [-t TASK] [-e EXPORT] [--opt OPT] [-r | --run | --no-run] [-q | --quiet | --no-quiet] [--patch [PATCH ...]] [--rewrite | --no-rewrite]
                    [--stop-if-static STOP_IF_STATIC] [--same-as-trained | --no-same-as-trained] [--trained | --no-trained] [--inputs2 INPUTS2]
                    [--runtime {onnxruntime,torch,ref,orteval,orteval10}] [-o DUMP_FOLDER] [--drop DROP] [--opset OPSET] [--subfolder SUBFOLDER] [--ortfusiontype ORTFUSIONTYPE]
                    [-v VERBOSE] [--dtype DTYPE] [--device DEVICE] [--iop [KEY=VALUE ...]] [--mop [KEY=VALUE ...]] [--repeat REPEAT] [--warmup WARMUP] [--outnames OUTNAMES]
                    [--ort-logs | --no-ort-logs] [--quiet-input-sets QUIET_INPUT_SETS]
    
    Validates a model for a particular task given the model id.
    It exports the model and then validates it by computing the discrepancies
    on different input sets.
    
    options:
      -h, --help            show this help message and exit
      -m MID, --mid MID     model id, usually <author>/<name>
      -t TASK, --task TASK  force the task to use
      -e EXPORT, --export EXPORT
                            export the model with this exporter
      --opt OPT             optimization to apply after the export
      -r, --run, --no-run   Runs the model to check it runs.
      -q, --quiet, --no-quiet
                            Catches exception, reports them in the summary.
      --patch [PATCH ...]   Applies patches before exporting, it can be a boolean to enable to disable the patches or be more finetuned. It is possible to disable patch for torch by adding --patch "patch_sympy=False" --patch "patch_torch=False", default is True.
      --rewrite, --no-rewrite
                            Applies rewrite before exporting.
      --stop-if-static STOP_IF_STATIC
                            Raises an exception if a dynamic dimension becomes static.
      --same-as-trained, --no-same-as-trained
                            Validates or exports a model identical to the trained model but not trained.
      --trained, --no-trained
                            Validates or exports the trained model (requires downloading).
      --inputs2 INPUTS2     Validates or exports the model on a second set of inputs
                            to check the exported model supports dynamism. The values is used as an increment to the first set of inputs. A high value may trick a different behavior in the model and missed by the exporter.
      --runtime {onnxruntime,torch,ref,orteval,orteval10}
                            onnx runtime to use, `onnxruntime` by default
      -o DUMP_FOLDER, --dump-folder DUMP_FOLDER
                            A folder is created to dumps statistics,
                            exported program, onnx...
      --drop DROP           Drops the following inputs names, it should be a list
                            with comma separated values, example:
                            --drop position_ids
      --opset OPSET         onnx opset to use, 18 by default
      --subfolder SUBFOLDER
                            Subfolder where to find the model and the configuration.
      --ortfusiontype ORTFUSIONTYPE
                            Applies onnxruntime fusion, this parameter should contain the
                            model type or multiple values separated by `|`. `ALL` can be used
                            to run them all.
      -v VERBOSE, --verbose VERBOSE
                            verbosity
      --dtype DTYPE         Changes dtype if necessary.
      --device DEVICE       Changes the device if necessary.
      --iop [KEY=VALUE ...]
                            Additional input options, use to change the defaultinputs use to export, example:
                              --iop cls_cache=SlidingWindowCache
                              --iop cls_cache=StaticCache
      --mop [KEY=VALUE ...]
                            Additional model options, use to change some parameters of the model, example:
                              --mop attn_implementation=sdpa --mop attn_implementation=eager
                              --mop "rope_scaling={'rope_type': 'dynamic', 'factor': 10.0}"
      --repeat REPEAT       number of times to run the model to measures inference time
      --warmup WARMUP       number of times to run the model to do warmup
      --outnames OUTNAMES   This comma separated list defines the output names the onnx exporter should use.
      --ort-logs, --no-ort-logs
                            Enables onnxruntime logging when the session is created
      --quiet-input-sets QUIET_INPUT_SETS
                            Avoids raising an exception when an input sets does not work with the exported model.
                            Example: --quiet-input-sets=inputs,inputs22
    
    If the model id is specified, one untrained version of it is instantiated.
    Examples:
    
    python -m onnx_diagnostic validate -m microsoft/Phi-4-mini-reasoning \
        --run -v 1 -o dump_test --no-quiet --repeat 2 --warmup 2 \
        --dtype float16 --device cuda --patch --export onnx-dynamo --opt ir
    
    python -m onnx_diagnostic validate -m microsoft/Phi-4-mini-reasoning \
        --run -v 1 -o dump_test --no-quiet --repeat 2 --warmup 2 \
        --dtype float16 --device cuda --patch --export custom --opt default
    
    python -m onnx_diagnostic validate -m microsoft/Phi-4-mini-reasoning \
        --run -v 1 -o dump_test --no-quiet --repeat 2 --warmup 2 \
        --dtype float16 --device cuda --export modelbuilder
    
    position_ids is usually not needed, they can be removed by adding:
    
        --drop position_ids
    
    The behaviour may be modified compare the original configuration,
    the following argument can be rope_scaling to dynamic:
    
        --mop "rope_scaling={'rope_type': 'dynamic', 'factor': 10.0}""
    
    You can profile the command line by running:
    
        pyinstrument -m onnx_diagnostic validate ...
        pyinstrument -r html -o profile.html -m onnx_diagnostic validate ...
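
The --iop and --mop options listed in the help take KEY=VALUE pairs whose values may be plain words (sdpa, StaticCache) or Python literals (the rope_scaling dictionary). The sketch below shows one way such pairs could be turned into a dictionary. parse_key_value is a hypothetical helper written for this illustration, it is not part of onnx_diagnostic, and the tool's own parsing may differ.

```python
import ast


def parse_key_value(items):
    """Turns ['key=value', ...] into a dictionary (hypothetical helper).

    Values that look like Python literals are evaluated with
    ast.literal_eval, anything else is kept as a plain string.
    """
    options = {}
    for item in items:
        # Split on the first '=' only, the value itself may contain '='.
        key, sep, raw = item.partition("=")
        if not sep or not key:
            raise ValueError(f"expected KEY=VALUE, got {item!r}")
        try:
            value = ast.literal_eval(raw)
        except (ValueError, SyntaxError):
            value = raw  # plain word such as 'sdpa' or 'StaticCache'
        options[key] = value
    return options


mop = parse_key_value(
    [
        "attn_implementation=sdpa",
        "rope_scaling={'rope_type': 'dynamic', 'factor': 10.0}",
    ]
)
print(mop)
```

Quoting the whole pair on the shell side, as the help does for rope_scaling, keeps the braces and spaces in a single argument.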

Get the list of supported tasks

The tasks are the same as those defined by HuggingFace. The tool only supports a subset of them.

python -m onnx_diagnostic validate
    -- list of supported tasks:
    MoE
    automatic-speech-recognition
    feature-extraction
    fill-mask
    image-classification
    image-text-to-text
    image-to-video
    mask-generation
    object-detection
    sentence-similarity
    summarization
    text-classification
    text-generation
    text-to-image
    text2text-generation
    zero-shot-image-classification

Get the default inputs for a specific task

This returns the dummy inputs for a specific task. There may be more inputs than a given model accepts; only those the forward method defines are kept.

python -m onnx_diagnostic validate -t text-generation
    -- inputs
      + input_ids       : T7s2x3
      + attention_mask  : T7s2x33
      + position_ids    : T7s2x3
      + past_key_values : DynamicCache(key_cache=#4[T1s2x24x30x16,T1s2x24x30x16,T1s2x24x30x16,T1s2x24x30x16], value_cache=#4[T1s2x24x30x16,T1s2x24x30x16,T1s2x24x30x16,T1s2x24x30x16])
    -- dynamic_shapes
      + input_ids       : {0:DYN(batch),1:DYN(seq_length)}
      + attention_mask  : {0:DYN(batch),1:DYN(cache+seq)}
      + position_ids    : {0:DYN(batch),1:DYN(seq_length)}
      + past_key_values : #8[{0:DYN(batch),2:DYN(cache_length)},{0:DYN(batch),2:DYN(cache_length)},{0:DYN(batch),2:DYN(cache_length)},{0:DYN(batch),2:DYN(cache_length)},{0:DYN(batch),2:DYN(cache_length)},{0:DYN(batch),2:DYN(cache_length)},{0:DYN(batch),2:DYN(cache_length)},{0:DYN(batch),2:DYN(cache_length)}]
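
The inputs are printed in a compact notation. Reading it as T<element type>s<shape>, T7s2x3 is an int64 tensor of shape 2x3 and T1s2x24x30x16 a float32 tensor, assuming the integer after T is the ONNX element type (1 for float32, 7 for int64). The sketch below decodes such strings under that assumption. decode_tensors and the ONNX_DTYPES table are hypothetical names introduced for this illustration.

```python
import re

# Subset of ONNX TensorProto element types (integer code -> name).
ONNX_DTYPES = {
    1: "float32",
    6: "int32",
    7: "int64",
    9: "bool",
    10: "float16",
    11: "float64",
    16: "bfloat16",
}

# 'T', the element type, 's', then dimensions separated by 'x'.
_TENSOR = re.compile(r"T(\d+)s(\d+(?:x\d+)*)")


def decode_tensors(text):
    """Returns (dtype, shape) for every tensor description found in text."""
    found = []
    for code, dims in _TENSOR.findall(text):
        dtype = ONNX_DTYPES.get(int(code), f"onnx_type_{code}")
        shape = tuple(int(d) for d in dims.split("x"))
        found.append((dtype, shape))
    return found


print(decode_tensors("input_ids:T7s2x3,attention_mask:T7s2x33"))
# [('int64', (2, 3)), ('int64', (2, 33))]
print(decode_tensors("key_cache=#1[T1s2x1x30x96]"))
# [('float32', (2, 1, 30, 96))]
```

The #4[...] and #8[...] prefixes in the output above denote lists with that many items; the sketch ignores them and only extracts the tensors.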

Validate dummy inputs for a model

The dummy inputs may not work for a given model and task. The following command line checks that they do. There is no point in exporting if this fails.

python -m onnx_diagnostic validate -m arnir0/Tiny-LLM --run -v 1
    [validate_model] validate model id 'arnir0/Tiny-LLM'
    [validate_model] patch={'patch': True}
    [validate_model] get dummy inputs with input_options=None...
    [validate_model] rewrite=True, patch_kwargs={'patch': True, 'patch_transformers': True, 'patch_diffusers': True}, stop_if_static=0
    [validate_model] exporter=None, optimization=None
    [validate_model] dump_folder=None
    [validate_model] output_names=None
    [get_untrained_model_with_inputs] model_id='arnir0/Tiny-LLM', subfolder=None
    [get_untrained_model_with_inputs] use preinstalled 'arnir0/Tiny-LLM'
    [get_untrained_model_with_inputs] architecture='LlamaForCausalLM'
    [get_untrained_model_with_inputs] cls='LlamaConfig'
    [get_untrained_model_with_inputs] task='text-generation'
    [get_untrained_model_with_inputs] default config._attn_implementation=None
    [get_untrained_model_with_inputs] package_source=transformers from ~/github/transformers/src/transformers/__init__.py
    [get_untrained_model_with_inputs] instantiate model_id 'arnir0/Tiny-LLM', subfolder=None
    [get_untrained_model_with_inputs] -- done(2) in 3.645996912382543e-06s
    [get_untrained_model_with_inputs] instantiate_specific_model <class 'transformers.models.llama.modeling_llama.LlamaForCausalLM'>
    [get_untrained_model_with_inputs] -- done(3) in 2.0088002202101052e-05s (model is <class 'NoneType'>)
    [get_untrained_model_with_inputs] instantiate_specific_model(2) <class 'transformers.models.llama.modeling_llama.LlamaForCausalLM'>
    [get_untrained_model_with_inputs] -- done(4) in 0.15612829900055658s (model is <class 'transformers.models.llama.modeling_llama.LlamaForCausalLM'>)
    [get_untrained_model_with_inputs] use fct=<function get_inputs at 0x72ca08071e40>
    [validate_model] --
    [validate_model] task=text-generation
    [validate_model] size=49.549072265625 Mb
    [validate_model] n_weights=12.988992 millions parameters
    [validate_model] +INPUT input_ids=T7s2x3
    [validate_model] +INPUT attention_mask=T7s2x33
    [validate_model] +INPUT position_ids=T7s2x3
    [validate_model] +INPUT past_key_values=DynamicCache(key_cache=#1[T1s2x1x30x96], value_cache=#1[T1s2x1x30x96])
    [validate_model] +SHAPE input_ids={0:DYN(batch),1:DYN(seq_length)}
    [validate_model] +SHAPE attention_mask={0:DYN(batch),1:DYN(cache+seq)}
    [validate_model] +SHAPE position_ids={0:DYN(batch),1:DYN(seq_length)}
    [validate_model] +SHAPE past_key_values=#2[{0:DYN(batch),2:DYN(cache_length)},{0:DYN(batch),2:DYN(cache_length)}]
    [validate_model] second_input_keys=['inputs2', 'inputs_empty_cache', 'inputs_batch1']
    [validate_model] --
    [validate_model] -- run the model inputs='inputs'...
    [validate_model] inputs=dict(input_ids:T7s2x3,attention_mask:T7s2x33,position_ids:T7s2x3,past_key_values:DynamicCache(key_cache=#1[T1s2x1x30x96], value_cache=#1[T1s2x1x30x96]))
    [validate_model] done ([run])
    [validate_model] -- run the model inputs='inputs2'...
    [validate_model] inputs2=dict(input_ids:T7s3x4,attention_mask:T7s3x35,position_ids:T7s3x4,past_key_values:DynamicCache(key_cache=#1[T1s3x1x31x96], value_cache=#1[T1s3x1x31x96]))
    [validate_model] done ([run22])
    [validate_model] -- run the model inputs='inputs_empty_cache'...
    [validate_model] inputs_empty_cache=dict(input_ids:T7s2x3,attention_mask:T7s2x3,position_ids:T7s2x3,past_key_values:DynamicCache(key_cache=#1[T1s2x1x0x96], value_cache=#1[T1s2x1x0x96]))
    [validate_model] done ([run2_empty_cache])
    [validate_model] -- run the model inputs='inputs_batch1'...
    [validate_model] inputs_batch1=dict(input_ids:T7s1x3,attention_mask:T7s1x33,position_ids:T7s1x3,past_key_values:DynamicCache(key_cache=#1[T1s1x1x30x96], value_cache=#1[T1s1x1x30x96]))
    [validate_model] done ([run2_batch1])
    [validate_model] -- done (final)
    
    -- summary --
    :model_class,LlamaForCausalLM;
    :model_config,{'vocab_size':32000,'max_position_embeddings':1024,'hidden_size':192,'intermediate_size':1024,'num_hidden_layers':1,'num_attention_heads':2,'num_key_value_heads':1,'hidden_act':'silu','initializer_range':0.02,'rms_norm_eps':1e-05,'pretraining_tp':1,'use_cache':True,'attention_bias':False,'attention_dropout':0.0,'mlp_bias':False,'head_dim':96,'rope_parameters':{'rope_type':'default','rope_theta':10000.0},'return_dict':True,'output_hidden_states':False,'dtype':'float32','tie_word_embeddings':False,'chunk_size_feed_forward':0,'is_encoder_decoder':False,'is_decoder':False,'cross_attention_hidden_size':None,'add_cross_attention':False,'tie_encoder_decoder':False,'architectures':['LlamaForCausalLM'],'finetuning_task':None,'id2label':{0:'LABEL_0',1:'LABEL_1'},'label2id':{'LABEL_0':0,'LABEL_1':1},'task_specific_params':None,'problem_type':None,'tokenizer_class':None,'prefix':None,'bos_token_id':1,'pad_token_id':None,'eos_token_id':2,'sep_token_id':None,'decoder_start_token_id':None,'max_length':20,'min_length':0,'do_sample':False,'early_stopping':False,'num_beams':1,'temperature':1.0,'top_k':50,'top_p':1.0,'typical_p':1.0,'repetition_penalty':1.0,'length_penalty':1.0,'no_repeat_ngram_size':0,'encoder_no_repeat_ngram_size':0,'bad_words_ids':None,'num_return_sequences':1,'output_scores':False,'return_dict_in_generate':False,'forced_bos_token_id':None,'forced_eos_token_id':None,'remove_invalid_values':False,'exponential_decay_length_penalty':None,'suppress_tokens':None,'begin_suppress_tokens':None,'num_beam_groups':1,'diversity_penalty':0.0,'_name_or_path':'','transformers_version':'5.0.0.dev0','model_type':'llama','rope_theta':10000.0,'subfolder':None,'output_attentions':False};
    :model_config_class,LlamaConfig;
    :model_file,~/github/transformers/src/transformers/models/llama/modeling_llama.py;
    :model_id,arnir0/Tiny-LLM;
    :model_inputs,dict(input_ids:T7s2x3,attention_mask:T7s2x33,position_ids:T7s2x3,past_key_values:DynamicCache(key_cache=#1[T1s2x1x30x96], value_cache=#1[T1s2x1x30x96]));
    :model_inputs_options,;
    :model_module,transformers.models.llama.modeling_llama;
    :model_nweights,12988992;
    :model_shapes,dict(input_ids:{0:DYN(batch),1:DYN(seq_length)},attention_mask:{0:DYN(batch),1:DYN(cache+seq)},position_ids:{0:DYN(batch),1:DYN(seq_length)},past_key_values:#2[{0:DYN(batch),2:DYN(cache_length)},{0:DYN(batch),2:DYN(cache_length)}]);
    :model_size,51955968;
    :model_subfolder,;
    :model_task,text-generation;
    :run_expected,CausalLMOutputWithPast(logits:T1s2x3x32000,past_key_values:DynamicCache(key_cache=#1[T1s2x1x33x96], value_cache=#1[T1s2x1x33x96]));
    :run_expected22,CausalLMOutputWithPast(logits:T1s3x4x32000,past_key_values:DynamicCache(key_cache=#1[T1s3x1x35x96], value_cache=#1[T1s3x1x35x96]));
    :run_expected2_batch1,CausalLMOutputWithPast(logits:T1s1x3x32000,past_key_values:DynamicCache(key_cache=#1[T1s1x1x33x96], value_cache=#1[T1s1x1x33x96]));
    :run_expected2_empty_cache,CausalLMOutputWithPast(logits:T1s2x3x32000,past_key_values:DynamicCache(key_cache=#1[T1s2x1x3x96], value_cache=#1[T1s2x1x3x96]));
    :second_input_keys,inputs2,inputs_empty_cache,inputs_batch1;
    :time_create_torch_model,0.1617974380060332;
    :time_preprocess_model_id,3.765002475120127e-06;
    :time_run,0.014617895001720171;
    :time_run22,0.013006970002606977;
    :time_run2_batch1,0.026246571003866848;
    :time_run2_empty_cache,0.0057407230051467195;
    :time_total_validation_torch,0.06632335899485042;
    :version_date,2025-11-01T15:05:05;
    :version_device,;
    :version_do_run,True;
    :version_drop_inputs,[];
    :version_dtype,;
    :version_dump_folder,;
    :version_exporter,;
    :version_inputs2,1;
    :version_model_id,arnir0/Tiny-LLM;
    :version_numpy,2.3.4;
    :version_onnx,1.20.0;
    :version_onnx_diagnostic,0.8.0;
    :version_onnx_ir,0.1.13;
    :version_onnxruntime,1.24.0;
    :version_onnxscript,?;
    :version_opset,18;
    :version_optimization,;
    :version_ortfusiontype,;
    :version_patch,{'patch': True};
    :version_patch_kwargs,{'patch':True,'patch_transformers':True,'patch_diffusers':True};
    :version_quiet,False;
    :version_rewrite,True;
    :version_runtime,onnxruntime;
    :version_same_as_pretrained,False;
    :version_scipy,1.16.2;
    :version_stop_if_static,0;
    :version_torch,2.10.0.dev20251022+cu130;
    :version_transformers,5.0.0.dev0;
    :version_use_pretrained,False;
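
The summary is printed one entry per line, each shaped like :key,value;. That makes it easy to load into a dictionary for later comparison. The sketch below does so, assuming the layout shown above; parse_summary is a hypothetical helper, not a function of the tool. It also checks the reported sizes: 51955968 bytes for 12988992 float32 weights is 4 bytes per weight, or about 49.55 Mb as printed in the log.

```python
def parse_summary(text):
    """Parses lines shaped like ':key,value;' into a dictionary of strings."""
    summary = {}
    for line in text.splitlines():
        line = line.strip()
        if not (line.startswith(":") and line.endswith(";")):
            continue  # skips log lines and the '-- summary --' header
        # Only the first comma separates key and value, so values that
        # contain commas (model_config for example) are kept whole.
        key, _, value = line[1:-1].partition(",")
        summary[key] = value
    return summary


sample = """
-- summary --
:model_class,LlamaForCausalLM;
:model_nweights,12988992;
:model_size,51955968;
:second_input_keys,inputs2,inputs_empty_cache,inputs_batch1;
:version_dtype,;
"""
summary = parse_summary(sample)
size_mb = int(summary["model_size"]) / 2**20
bytes_per_weight = int(summary["model_size"]) / int(summary["model_nweights"])
print(summary["model_class"], size_mb, bytes_per_weight)
# LlamaForCausalLM 49.549072265625 4.0
```

All values stay strings; converting them (integers, floats, booleans) is left to the caller because the summary does not carry type information.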

Validate and export a model

Exports a model given the task and checks for discrepancies as well. The latencies reported are for a single run. They tell how long the benchmark takes, but they are far from the latency obtained by running the same model multiple times.

python -m onnx_diagnostic validate -m arnir0/Tiny-LLM --run -v 1 --export export-nostrict -o dump_models --patch
    [validate_model] dump into 'arnir0_Tiny-LLM/export-nostrict/op18'
    [validate_model] validate model id 'arnir0/Tiny-LLM'
    [validate_model] patch={'patch': True}
    [validate_model] get dummy inputs with input_options=None...
    [validate_model] rewrite=True, patch_kwargs={'patch': True, 'patch_transformers': True, 'patch_diffusers': True}, stop_if_static=0
    [validate_model] exporter='export-nostrict', optimization=None
    [validate_model] dump_folder='dump_models/arnir0_Tiny-LLM/export-nostrict/op18'
    [validate_model] output_names=None
    [get_untrained_model_with_inputs] model_id='arnir0/Tiny-LLM', subfolder=None
    [get_untrained_model_with_inputs] use preinstalled 'arnir0/Tiny-LLM'
    [get_untrained_model_with_inputs] architecture='LlamaForCausalLM'
    [get_untrained_model_with_inputs] cls='LlamaConfig'
    [get_untrained_model_with_inputs] task='text-generation'
    [get_untrained_model_with_inputs] default config._attn_implementation=None
    [get_untrained_model_with_inputs] package_source=transformers from ~/github/transformers/src/transformers/__init__.py
    [get_untrained_model_with_inputs] instantiate model_id 'arnir0/Tiny-LLM', subfolder=None
    [get_untrained_model_with_inputs] -- done(2) in 1.838000025600195e-05s
    [get_untrained_model_with_inputs] instantiate_specific_model <class 'transformers.models.llama.modeling_llama.LlamaForCausalLM'>
    [get_untrained_model_with_inputs] -- done(3) in 2.58979998761788e-05s (model is <class 'NoneType'>)
    [get_untrained_model_with_inputs] instantiate_specific_model(2) <class 'transformers.models.llama.modeling_llama.LlamaForCausalLM'>
    [get_untrained_model_with_inputs] -- done(4) in 0.18862774699664442s (model is <class 'transformers.models.llama.modeling_llama.LlamaForCausalLM'>)
    [get_untrained_model_with_inputs] use fct=<function get_inputs at 0x72ca08071e40>
    [validate_model] --
    [validate_model] task=text-generation
    [validate_model] size=49.549072265625 Mb
    [validate_model] n_weights=12.988992 millions parameters
    [validate_model] +INPUT input_ids=T7s2x3
    [validate_model] +INPUT attention_mask=T7s2x33
    [validate_model] +INPUT position_ids=T7s2x3
    [validate_model] +INPUT past_key_values=DynamicCache(key_cache=#1[T1s2x1x30x96], value_cache=#1[T1s2x1x30x96])
    [validate_model] +SHAPE input_ids={0:DYN(batch),1:DYN(seq_length)}
    [validate_model] +SHAPE attention_mask={0:DYN(batch),1:DYN(cache+seq)}
    [validate_model] +SHAPE position_ids={0:DYN(batch),1:DYN(seq_length)}
    [validate_model] +SHAPE past_key_values=#2[{0:DYN(batch),2:DYN(cache_length)},{0:DYN(batch),2:DYN(cache_length)}]
    [validate_model] second_input_keys=['inputs2', 'inputs_empty_cache', 'inputs_batch1']
    [validate_model] --
    [validate_model] -- run the model inputs='inputs'...
    [validate_model] inputs=dict(input_ids:T7s2x3,attention_mask:T7s2x33,position_ids:T7s2x3,past_key_values:DynamicCache(key_cache=#1[T1s2x1x30x96], value_cache=#1[T1s2x1x30x96]))
    [validate_model] done ([run])
    [validate_model] -- run the model inputs='inputs2'...
    [validate_model] inputs2=dict(input_ids:T7s3x4,attention_mask:T7s3x35,position_ids:T7s3x4,past_key_values:DynamicCache(key_cache=#1[T1s3x1x31x96], value_cache=#1[T1s3x1x31x96]))
    [validate_model] done ([run22])
    [validate_model] -- run the model inputs='inputs_empty_cache'...
    [validate_model] inputs_empty_cache=dict(input_ids:T7s2x3,attention_mask:T7s2x3,position_ids:T7s2x3,past_key_values:DynamicCache(key_cache=#1[T1s2x1x0x96], value_cache=#1[T1s2x1x0x96]))
    [validate_model] done ([run2_empty_cache])
    [validate_model] -- run the model inputs='inputs_batch1'...
    [validate_model] inputs_batch1=dict(input_ids:T7s1x3,attention_mask:T7s1x33,position_ids:T7s1x3,past_key_values:DynamicCache(key_cache=#1[T1s1x1x30x96], value_cache=#1[T1s1x1x30x96]))
    [validate_model] done ([run2_batch1])
    [validate_model] -- export the model with 'export-nostrict', optimization=None
    [validate_model] applies patches before exporting stop_if_static=0
    [validate_model] run patched model...
    [validate_model] patched inputs=dict(input_ids:T7s2x3,attention_mask:T7s2x33,position_ids:T7s2x3,past_key_values:DynamicCache(key_cache=#1[T1s2x1x30x96], value_cache=#1[T1s2x1x30x96]))
    [validate_model] done (patched run)
    [validate_model] patched discrepancies=abs=0, rel=0
    [call_torch_export_export] exporter='export-nostrict', strict=False, optimization=None
    [call_torch_export_export] args=()
    [call_torch_export_export] kwargs=dict(input_ids:T7s2x3,attention_mask:T7s2x33,position_ids:T7s2x3,past_key_values:DynamicCache(key_cache=#1[T1s2x1x30x96], value_cache=#1[T1s2x1x30x96]))
    [call_torch_export_export] dynamic_shapes=dict(input_ids:{0:DYN(batch),1:DYN(seq_length)},attention_mask:{0:DYN(batch),1:DYN(cache+seq)},position_ids:{0:DYN(batch),1:DYN(seq_length)},past_key_values:#2[{0:DYN(batch),2:DYN(cache_length)},{0:DYN(batch),2:DYN(cache_length)}])
    [call_torch_export_export] dynamic_shapes_export_export=dict(input_ids:{0:DYNAMIC,1:DYNAMIC},attention_mask:{0:DYNAMIC,1:DYNAMIC},position_ids:{0:DYNAMIC,1:DYNAMIC},past_key_values:#2[{0:DYNAMIC,2:DYNAMIC},{0:DYNAMIC,2:DYNAMIC}])
    [call_torch_export_export] export...
    [call_torch_export_export] done (export) with 152 nodes
    [validate_model] run exported model...
    [validate_model] patched inputs=dict(input_ids:T7s2x3,attention_mask:T7s2x33,position_ids:T7s2x3,past_key_values:DynamicCache(key_cache=#1[T1s2x1x30x96], value_cache=#1[T1s2x1x30x96]))
    [validate_model] done (exported run)
    [validate_model] exported discrepancies=abs=0, rel=0
    [validate_model] -- dumps exported program in 'dump_models/arnir0_Tiny-LLM/export-nostrict/op18'...
    [validate_model] done (dump ep)
    [validate_model] dumps statistics in 'dump_models/arnir0_Tiny-LLM/export-nostrict/op18'...
    [validate_model] done (dump)
    [validate_model] -- done (final)
    
    -- summary --
    :disc_exported_abs,0;
    :disc_exported_dnan,0;
    :disc_exported_n,204672.0;
    :disc_exported_rel,0;
    :disc_exported_sum,0.0;
    :disc_patched_abs,0;
    :disc_patched_dnan,0;
    :disc_patched_n,204672.0;
    :disc_patched_rel,0;
    :disc_patched_sum,0.0;
    :dump_folder,dump_models/arnir0_Tiny-LLM/export-nostrict/op18;
    :dump_folder_name,arnir0_Tiny-LLM/export-nostrict/op18;
    :export_args,();
    :export_dynamic_shapes,dict(input_ids:{0:DYN(batch),1:DYN(seq_length)},attention_mask:{0:DYN(batch),1:DYN(cache+seq)},position_ids:{0:DYN(batch),1:DYN(seq_length)},past_key_values:#2[{0:DYN(batch),2:DYN(cache_length)},{0:DYN(batch),2:DYN(cache_length)}]);
    :export_dynamic_shapes_export_export,dict(input_ids:{0:DYNAMIC,1:DYNAMIC},attention_mask:{0:DYNAMIC,1:DYNAMIC},position_ids:{0:DYNAMIC,1:DYNAMIC},past_key_values:#2[{0:DYNAMIC,2:DYNAMIC},{0:DYNAMIC,2:DYNAMIC}]);
    :export_exporter,export-nostrict;
    :export_graph_nodes,152;
    :export_kwargs,dict(input_ids:T7s2x3,attention_mask:T7s2x33,position_ids:T7s2x3,past_key_values:DynamicCache(key_cache=#1[T1s2x1x30x96], value_cache=#1[T1s2x1x30x96]));
    :export_optimization,;
    :export_strict,False;
    :model_class,LlamaForCausalLM;
    :model_config,{'vocab_size':32000,'max_position_embeddings':1024,'hidden_size':192,'intermediate_size':1024,'num_hidden_layers':1,'num_attention_heads':2,'num_key_value_heads':1,'hidden_act':'silu','initializer_range':0.02,'rms_norm_eps':1e-05,'pretraining_tp':1,'use_cache':True,'attention_bias':False,'attention_dropout':0.0,'mlp_bias':False,'head_dim':96,'rope_parameters':{'rope_type':'default','rope_theta':10000.0},'return_dict':True,'output_hidden_states':False,'dtype':'float32','tie_word_embeddings':False,'chunk_size_feed_forward':0,'is_encoder_decoder':False,'is_decoder':False,'cross_attention_hidden_size':None,'add_cross_attention':False,'tie_encoder_decoder':False,'architectures':['LlamaForCausalLM'],'finetuning_task':None,'id2label':{0:'LABEL_0',1:'LABEL_1'},'label2id':{'LABEL_0':0,'LABEL_1':1},'task_specific_params':None,'problem_type':None,'tokenizer_class':None,'prefix':None,'bos_token_id':1,'pad_token_id':None,'eos_token_id':2,'sep_token_id':None,'decoder_start_token_id':None,'max_length':20,'min_length':0,'do_sample':False,'early_stopping':False,'num_beams':1,'temperature':1.0,'top_k':50,'top_p':1.0,'typical_p':1.0,'repetition_penalty':1.0,'length_penalty':1.0,'no_repeat_ngram_size':0,'encoder_no_repeat_ngram_size':0,'bad_words_ids':None,'num_return_sequences':1,'output_scores':False,'return_dict_in_generate':False,'forced_bos_token_id':None,'forced_eos_token_id':None,'remove_invalid_values':False,'exponential_decay_length_penalty':None,'suppress_tokens':None,'begin_suppress_tokens':None,'num_beam_groups':1,'diversity_penalty':0.0,'_name_or_path':'','transformers_version':'5.0.0.dev0','model_type':'llama','rope_theta':10000.0,'subfolder':None,'output_attentions':False};
    :model_config_class,LlamaConfig;
    :model_file,~/github/transformers/src/transformers/models/llama/modeling_llama.py;
    :model_id,arnir0/Tiny-LLM;
    :model_inputs,dict(input_ids:T7s2x3,attention_mask:T7s2x33,position_ids:T7s2x3,past_key_values:DynamicCache(key_cache=#1[T1s2x1x30x96], value_cache=#1[T1s2x1x30x96]));
    :model_inputs_options,;
    :model_module,transformers.models.llama.modeling_llama;
    :model_nweights,12988992;
    :model_shapes,dict(input_ids:{0:DYN(batch),1:DYN(seq_length)},attention_mask:{0:DYN(batch),1:DYN(cache+seq)},position_ids:{0:DYN(batch),1:DYN(seq_length)},past_key_values:#2[{0:DYN(batch),2:DYN(cache_length)},{0:DYN(batch),2:DYN(cache_length)}]);
    :model_size,51955968;
    :model_subfolder,;
    :model_task,text-generation;
    :run_expected,CausalLMOutputWithPast(logits:T1s2x3x32000,past_key_values:DynamicCache(key_cache=#1[T1s2x1x33x96], value_cache=#1[T1s2x1x33x96]));
    :run_expected22,CausalLMOutputWithPast(logits:T1s3x4x32000,past_key_values:DynamicCache(key_cache=#1[T1s3x1x35x96], value_cache=#1[T1s3x1x35x96]));
    :run_expected2_batch1,CausalLMOutputWithPast(logits:T1s1x3x32000,past_key_values:DynamicCache(key_cache=#1[T1s1x1x33x96], value_cache=#1[T1s1x1x33x96]));
    :run_expected2_empty_cache,CausalLMOutputWithPast(logits:T1s2x3x32000,past_key_values:DynamicCache(key_cache=#1[T1s2x1x3x96], value_cache=#1[T1s2x1x3x96]));
    :second_input_keys,inputs2,inputs_empty_cache,inputs_batch1;
    :time_create_torch_model,0.19912709499476478;
    :time_export_export,2.291683233997901;
    :time_preprocess_model_id,1.2106000212952495e-05;
    :time_run,0.007511919000535272;
    :time_run22,0.016417990998888854;
    :time_run2_batch1,0.01687990299978992;
    :time_run2_empty_cache,0.006715044997690711;
    :time_run_exported,0.012370118995022494;
    :time_run_patched,0.008229642000515014;
    :time_torch_export_export,2.291659355003503;
    :time_torch_export_export_n,1;
    :time_total_exporter,2.4549274969976977;
    :time_total_validation_torch,0.05480123299639672;
    :version_date,2025-11-01T15:05:05;
    :version_device,;
    :version_do_run,True;
    :version_drop_inputs,[];
    :version_dtype,;
    :version_dump_folder,dump_models;
    :version_exporter,export-nostrict;
    :version_inputs2,1;
    :version_model_id,arnir0/Tiny-LLM;
    :version_numpy,2.3.4;
    :version_onnx,1.20.0;
    :version_onnx_diagnostic,0.8.0;
    :version_onnx_ir,0.1.13;
    :version_onnxruntime,1.24.0;
    :version_onnxscript,?;
    :version_opset,18;
    :version_optimization,;
    :version_ortfusiontype,;
    :version_patch,{'patch': True};
    :version_patch_kwargs,{'patch':True,'patch_transformers':True,'patch_diffusers':True};
    :version_quiet,False;
    :version_rewrite,True;
    :version_runtime,onnxruntime;
    :version_same_as_pretrained,False;
    :version_scipy,1.16.2;
    :version_stop_if_static,0;
    :version_torch,2.10.0.dev20251022+cu130;
    :version_transformers,5.0.0.dev0;
    :version_use_pretrained,False;
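
The disc_* entries of the summary describe how far the outputs of the patched or exported model are from the expected ones. As a rough illustration, the sketch below computes similar statistics for two flat sequences of floats. max_discrepancy is a hypothetical helper, and its formulas (in particular the relative error and its eps term) are assumptions, not necessarily what the tool computes. The last lines check that the element count of the outputs listed in run_expected (the logits plus one key and one value cache tensor) adds up to the reported disc_exported_n.

```python
import math


def max_discrepancy(expected, got, eps=1e-5):
    """Summarizes element-wise differences (hypothetical helper)."""
    if len(expected) != len(got):
        raise ValueError("both sequences must have the same length")
    abs_err, rel_err, total, dnan = 0.0, 0.0, 0.0, 0
    for e, g in zip(expected, got):
        if math.isnan(e) or math.isnan(g):
            # Counts positions where only one side is NaN.
            dnan += int(math.isnan(e) != math.isnan(g))
            continue
        d = abs(e - g)
        abs_err = max(abs_err, d)
        rel_err = max(rel_err, d / (abs(e) + eps))  # assumed definition
        total += d
    return {"abs": abs_err, "rel": rel_err, "sum": total,
            "n": len(expected), "dnan": dnan}


disc = max_discrepancy([1.0, 2.0, 4.0], [1.0, 2.5, 4.0])
print(disc["abs"], disc["sum"], disc["n"], disc["dnan"])
# 0.5 0.5 3 0

# Element count of run_expected: logits T1s2x3x32000 and a cache holding
# one key and one value tensor T1s2x1x33x96.
logits = 2 * 3 * 32000
cache = 2 * (2 * 1 * 33 * 96)
print(logits + cache)
# 204672
```

A result of abs=0 and rel=0, as in the run above, means the exported program reproduced the expected outputs exactly on this input set.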

Validate ONNX discrepancies

Let’s export to ONNX this time and check for discrepancies.

python -m onnx_diagnostic validate -m arnir0/Tiny-LLM --run -v 1 --export onnx-dynamo -o dump_models --patch --opt ir
    [validate_model] dump into 'arnir0_Tiny-LLM/onnx-dynamo/ir/op18'
    [validate_model] validate model id 'arnir0/Tiny-LLM'
    [validate_model] patch={'patch': True}
    [validate_model] get dummy inputs with input_options=None...
    [validate_model] rewrite=True, patch_kwargs={'patch': True, 'patch_transformers': True, 'patch_diffusers': True}, stop_if_static=0
    [validate_model] exporter='onnx-dynamo', optimization='ir'
    [validate_model] dump_folder='dump_models/arnir0_Tiny-LLM/onnx-dynamo/ir/op18'
    [validate_model] output_names=None
    [get_untrained_model_with_inputs] model_id='arnir0/Tiny-LLM', subfolder=None
    [get_untrained_model_with_inputs] use preinstalled 'arnir0/Tiny-LLM'
    [get_untrained_model_with_inputs] architecture='LlamaForCausalLM'
    [get_untrained_model_with_inputs] cls='LlamaConfig'
    [get_untrained_model_with_inputs] task='text-generation'
    [get_untrained_model_with_inputs] default config._attn_implementation=None
    [get_untrained_model_with_inputs] package_source=transformers from ~/github/transformers/src/transformers/__init__.py
    [get_untrained_model_with_inputs] instantiate model_id 'arnir0/Tiny-LLM', subfolder=None
    [get_untrained_model_with_inputs] -- done(2) in 4.3024003389291465e-05s
    [get_untrained_model_with_inputs] instantiate_specific_model <class 'transformers.models.llama.modeling_llama.LlamaForCausalLM'>
    [get_untrained_model_with_inputs] -- done(3) in 1.2561999028548598e-05s (model is <class 'NoneType'>)
    [get_untrained_model_with_inputs] instantiate_specific_model(2) <class 'transformers.models.llama.modeling_llama.LlamaForCausalLM'>
    [get_untrained_model_with_inputs] -- done(4) in 0.21503722799388925s (model is <class 'transformers.models.llama.modeling_llama.LlamaForCausalLM'>)
    [get_untrained_model_with_inputs] use fct=<function get_inputs at 0x7fbf10b34900>
    [validate_model] --
    [validate_model] task=text-generation
    [validate_model] size=49.549072265625 Mb
    [validate_model] n_weights=12.988992 millions parameters
    [validate_model] +INPUT input_ids=T7s2x3
    [validate_model] +INPUT attention_mask=T7s2x33
    [validate_model] +INPUT position_ids=T7s2x3
    [validate_model] +INPUT past_key_values=DynamicCache(key_cache=#1[T1s2x1x30x96], value_cache=#1[T1s2x1x30x96])
    [validate_model] +SHAPE input_ids={0:DYN(batch),1:DYN(seq_length)}
    [validate_model] +SHAPE attention_mask={0:DYN(batch),1:DYN(cache+seq)}
    [validate_model] +SHAPE position_ids={0:DYN(batch),1:DYN(seq_length)}
    [validate_model] +SHAPE past_key_values=#2[{0:DYN(batch),2:DYN(cache_length)},{0:DYN(batch),2:DYN(cache_length)}]
    [validate_model] second_input_keys=['inputs2', 'inputs_empty_cache', 'inputs_batch1']
    [validate_model] --
    [validate_model] -- run the model inputs='inputs'...
    [validate_model] inputs=dict(input_ids:T7s2x3,attention_mask:T7s2x33,position_ids:T7s2x3,past_key_values:DynamicCache(key_cache=#1[T1s2x1x30x96], value_cache=#1[T1s2x1x30x96]))
    [validate_model] done ([run])
    [validate_model] -- run the model inputs='inputs2'...
    [validate_model] inputs2=dict(input_ids:T7s3x4,attention_mask:T7s3x35,position_ids:T7s3x4,past_key_values:DynamicCache(key_cache=#1[T1s3x1x31x96], value_cache=#1[T1s3x1x31x96]))
    [validate_model] done ([run22])
    [validate_model] -- run the model inputs='inputs_empty_cache'...
    [validate_model] inputs_empty_cache=dict(input_ids:T7s2x3,attention_mask:T7s2x3,position_ids:T7s2x3,past_key_values:DynamicCache(key_cache=#1[T1s2x1x0x96], value_cache=#1[T1s2x1x0x96]))
    [validate_model] done ([run2_empty_cache])
    [validate_model] -- run the model inputs='inputs_batch1'...
    [validate_model] inputs_batch1=dict(input_ids:T7s1x3,attention_mask:T7s1x33,position_ids:T7s1x3,past_key_values:DynamicCache(key_cache=#1[T1s1x1x30x96], value_cache=#1[T1s1x1x30x96]))
    [validate_model] done ([run2_batch1])
    [validate_model] -- export the model with 'onnx-dynamo', optimization='ir'
    [validate_model] applies patches before exporting stop_if_static=0
    [validate_model] run patched model...
    [validate_model] patched inputs=dict(input_ids:T7s2x3,attention_mask:T7s2x33,position_ids:T7s2x3,past_key_values:DynamicCache(key_cache=#1[T1s2x1x30x96], value_cache=#1[T1s2x1x30x96]))
    [validate_model] done (patched run)
    [validate_model] patched discrepancies=abs=0, rel=0
    [call_torch_export_onnx] exporter='onnx-dynamo', optimization='ir'
    [call_torch_export_onnx] args=()
    [call_torch_export_onnx] kwargs=dict(input_ids:T7s2x3,attention_mask:T7s2x33,position_ids:T7s2x3,past_key_values:DynamicCache(key_cache=#1[T1s2x1x30x96], value_cache=#1[T1s2x1x30x96]))
    [call_torch_export_onnx] dynamic_shapes=dict(input_ids:{0:DYN(batch),1:DYN(seq_length)},attention_mask:{0:DYN(batch),1:DYN(cache+seq)},position_ids:{0:DYN(batch),1:DYN(seq_length)},past_key_values:#2[{0:DYN(batch),2:DYN(cache_length)},{0:DYN(batch),2:DYN(cache_length)}])
    [call_torch_export_onnx] export...
    [call_torch_export_onnx] export_export_kwargs=dict(dynamo:bool,dynamic_shapes:dict(input_ids:{0:DYN(batch),1:DYN(seq_length)},attention_mask:{0:DYN(batch),1:DYN(cache+seq)},position_ids:{0:DYN(batch),1:DYN(seq_length)},past_key_values:#2[{0:DYN(batch),2:DYN(cache_length)},{0:DYN(batch),2:DYN(cache_length)}]),opset_version:int)
    [torch.onnx] Obtain model graph for `LlamaForCausalLM([...]` with `torch.export.export(..., strict=False)`...
    [torch.onnx] Obtain model graph for `LlamaForCausalLM([...]` with `torch.export.export(..., strict=False)`... ✅
    [torch.onnx] Run decomposition...
    [torch.onnx] Run decomposition... ✅
    [torch.onnx] Translate the graph into ONNX...
    [torch.onnx] Translate the graph into ONNX... ✅
    Applied 37 of general pattern rewrite rules.
    [call_torch_export_onnx] done (export)
    [call_torch_export_onnx] starts optimization='ir'...
    [call_torch_export_onnx] done (optimization)
    [validate_model] dumps onnx program in 'dump_models/arnir0_Tiny-LLM/onnx-dynamo/ir/op18'...
    [validate_model] done (dump onnx) in 0.48013086400169414
    [validate_model] dumps statistics in 'dump_models/arnir0_Tiny-LLM/onnx-dynamo/ir/op18'...
    [validate_model] done (dump)
    [validation_model] -- delete the model
    [validation_model] -- done
    [validate_onnx_model] verify onnx model with providers ['CPUExecutionProvider']..., flavour=None
    [validate_onnx_model] runtime is onnxruntime
    [validate_onnx_model] done (ort_session) flavour=None
    [validate_onnx_model] -- keys=[('inputs', 'run_expected', ''), ('inputs2', 'run_expected22', '22'), ('inputs_empty_cache', 'run_expected2_empty_cache', '2_empty_cache'), ('inputs_batch1', 'run_expected2_batch1', '2_batch1')]
    [validate_onnx_model] -- make_feeds for 'inputs'...
    [validate_onnx_model] inputs=dict(input_ids:T7s2x3,attention_mask:T7s2x33,position_ids:T7s2x3,past_key_values:DynamicCache(key_cache=#1[T1s2x1x30x96], value_cache=#1[T1s2x1x30x96]))
    [validate_onnx_model] ort inputs=dict(input_ids:A7s2x3,attention_mask:A7s2x33,position_ids:A7s2x3,past_key_values_key_0:A1s2x1x30x96,past_key_values_value_0:A1s2x1x30x96)
    [validate_onnx_model] done (make_feeds)
    [validate_onnx_model] run session on inputs 'inputs'...
    [validate_onnx_model] done (run)
    [validate_onnx_model] got=#3[A1s2x3x32000,A1s2x1x33x96,A1s2x1x33x96]
    [validate_onnx_model] discrepancies=abs=7.748603820800781e-07, rel=0.00044309172106863606, n=204672.0
    [validate_onnx_model] -- make_feeds for 'inputs2'...
    [validate_onnx_model] inputs=dict(input_ids:T7s3x4,attention_mask:T7s3x35,position_ids:T7s3x4,past_key_values:DynamicCache(key_cache=#1[T1s3x1x31x96], value_cache=#1[T1s3x1x31x96]))
    [validate_onnx_model] ort inputs=dict(input_ids:A7s3x4,attention_mask:A7s3x35,position_ids:A7s3x4,past_key_values_key_0:A1s3x1x31x96,past_key_values_value_0:A1s3x1x31x96)
    [validate_onnx_model] done (make_feeds)
    [validate_onnx_model] run session on inputs 'inputs22'...
    [validate_onnx_model] done (run)
    [validate_onnx_model] got=#3[A1s3x4x32000,A1s3x1x35x96,A1s3x1x35x96]
    [validate_onnx_model] discrepancies=abs=8.344650268554688e-07, rel=0.00032628376058705446, n=404160.0
    [validate_onnx_model] -- make_feeds for 'inputs_empty_cache'...
    [validate_onnx_model] inputs=dict(input_ids:T7s2x3,attention_mask:T7s2x3,position_ids:T7s2x3,past_key_values:DynamicCache(key_cache=#1[T1s2x1x0x96], value_cache=#1[T1s2x1x0x96]))
    [validate_onnx_model] ort inputs=dict(input_ids:A7s2x3,attention_mask:A7s2x3,position_ids:A7s2x3,past_key_values_key_0:A1s2x1x0x96,past_key_values_value_0:A1s2x1x0x96)
    [validate_onnx_model] done (make_feeds)
    [validate_onnx_model] run session on inputs 'inputs2_empty_cache'...
    [validate_onnx_model] done (run)
    [validate_onnx_model] got=#3[A1s2x3x32000,A1s2x1x3x96,A1s2x1x3x96]
    [validate_onnx_model] discrepancies=abs=6.556510925292969e-07, rel=0.0002520801563113858, n=193152.0
    [validate_onnx_model] -- make_feeds for 'inputs_batch1'...
    [validate_onnx_model] inputs=dict(input_ids:T7s1x3,attention_mask:T7s1x33,position_ids:T7s1x3,past_key_values:DynamicCache(key_cache=#1[T1s1x1x30x96], value_cache=#1[T1s1x1x30x96]))
    [validate_onnx_model] ort inputs=dict(input_ids:A7s1x3,attention_mask:A7s1x33,position_ids:A7s1x3,past_key_values_key_0:A1s1x1x30x96,past_key_values_value_0:A1s1x1x30x96)
    [validate_onnx_model] done (make_feeds)
    [validate_onnx_model] run session on inputs 'inputs2_batch1'...
    [validate_onnx_model] done (run)
    [validate_onnx_model] got=#3[A1s1x3x32000,A1s1x1x33x96,A1s1x1x33x96]
    [validate_onnx_model] discrepancies=abs=7.748603820800781e-07, rel=0.00037692802454639585, n=102336.0
    [validate_model] -- done (final)
    
    -- summary --
    :disc_onnx_ort_run22_abs,8.344650268554688e-07;
    :disc_onnx_ort_run22_dnan,0;
    :disc_onnx_ort_run22_n,404160.0;
    :disc_onnx_ort_run22_rel,0.00032628376058705446;
    :disc_onnx_ort_run22_sum,0.039355116007072866;
    :disc_onnx_ort_run2_batch1_abs,7.748603820800781e-07;
    :disc_onnx_ort_run2_batch1_dnan,0;
    :disc_onnx_ort_run2_batch1_n,102336.0;
    :disc_onnx_ort_run2_batch1_rel,0.00037692802454639585;
    :disc_onnx_ort_run2_batch1_sum,0.011718470417235949;
    :disc_onnx_ort_run2_empty_cache_abs,6.556510925292969e-07;
    :disc_onnx_ort_run2_empty_cache_dnan,0;
    :disc_onnx_ort_run2_empty_cache_n,193152.0;
    :disc_onnx_ort_run2_empty_cache_rel,0.0002520801563113858;
    :disc_onnx_ort_run2_empty_cache_sum,0.012811366097139398;
    :disc_onnx_ort_run_abs,7.748603820800781e-07;
    :disc_onnx_ort_run_dnan,0;
    :disc_onnx_ort_run_n,204672.0;
    :disc_onnx_ort_run_rel,0.00044309172106863606;
    :disc_onnx_ort_run_sum,0.02031988672524676;
    :disc_patched_abs,0;
    :disc_patched_dnan,0;
    :disc_patched_n,204672.0;
    :disc_patched_rel,0;
    :disc_patched_sum,0.0;
    :dump_folder,dump_models/arnir0_Tiny-LLM/onnx-dynamo/ir/op18;
    :dump_folder_name,arnir0_Tiny-LLM/onnx-dynamo/ir/op18;
    :export_args,();
    :export_dynamo,True;
    :export_exporter,onnx-dynamo;
    :export_kwargs,dict(input_ids:T7s2x3,attention_mask:T7s2x33,position_ids:T7s2x3,past_key_values:DynamicCache(key_cache=#1[T1s2x1x30x96], value_cache=#1[T1s2x1x30x96]));
    :export_opset,18;
    :export_optimization,ir;
    :model_class,LlamaForCausalLM;
    :model_config,{'vocab_size':32000,'max_position_embeddings':1024,'hidden_size':192,'intermediate_size':1024,'num_hidden_layers':1,'num_attention_heads':2,'num_key_value_heads':1,'hidden_act':'silu','initializer_range':0.02,'rms_norm_eps':1e-05,'pretraining_tp':1,'use_cache':True,'attention_bias':False,'attention_dropout':0.0,'mlp_bias':False,'head_dim':96,'rope_parameters':{'rope_type':'default','rope_theta':10000.0},'return_dict':True,'output_hidden_states':False,'dtype':'float32','tie_word_embeddings':False,'chunk_size_feed_forward':0,'is_encoder_decoder':False,'is_decoder':False,'cross_attention_hidden_size':None,'add_cross_attention':False,'tie_encoder_decoder':False,'architectures':['LlamaForCausalLM'],'finetuning_task':None,'id2label':{0:'LABEL_0',1:'LABEL_1'},'label2id':{'LABEL_0':0,'LABEL_1':1},'task_specific_params':None,'problem_type':None,'tokenizer_class':None,'prefix':None,'bos_token_id':1,'pad_token_id':None,'eos_token_id':2,'sep_token_id':None,'decoder_start_token_id':None,'max_length':20,'min_length':0,'do_sample':False,'early_stopping':False,'num_beams':1,'temperature':1.0,'top_k':50,'top_p':1.0,'typical_p':1.0,'repetition_penalty':1.0,'length_penalty':1.0,'no_repeat_ngram_size':0,'encoder_no_repeat_ngram_size':0,'bad_words_ids':None,'num_return_sequences':1,'output_scores':False,'return_dict_in_generate':False,'forced_bos_token_id':None,'forced_eos_token_id':None,'remove_invalid_values':False,'exponential_decay_length_penalty':None,'suppress_tokens':None,'begin_suppress_tokens':None,'num_beam_groups':1,'diversity_penalty':0.0,'_name_or_path':'','transformers_version':'5.0.0.dev0','model_type':'llama','rope_theta':10000.0,'subfolder':None,'output_attentions':False};
    :model_config_class,LlamaConfig;
    :model_file,~/github/transformers/src/transformers/models/llama/modeling_llama.py;
    :model_id,arnir0/Tiny-LLM;
    :model_inputs,dict(input_ids:T7s2x3,attention_mask:T7s2x33,position_ids:T7s2x3,past_key_values:DynamicCache(key_cache=#1[T1s2x1x30x96], value_cache=#1[T1s2x1x30x96]));
    :model_inputs_options,;
    :model_module,transformers.models.llama.modeling_llama;
    :model_nweights,12988992;
    :model_shapes,dict(input_ids:{0:DYN(batch),1:DYN(seq_length)},attention_mask:{0:DYN(batch),1:DYN(cache+seq)},position_ids:{0:DYN(batch),1:DYN(seq_length)},past_key_values:#2[{0:DYN(batch),2:DYN(cache_length)},{0:DYN(batch),2:DYN(cache_length)}]);
    :model_size,51955968;
    :model_subfolder,;
    :model_task,text-generation;
    :n_node_Add,10;
    :n_node_And,2;
    :n_node_Cast,2;
    :n_node_Concat,16;
    :n_node_Cos,1;
    :n_node_Expand,6;
    :n_node_Gather,1;
    :n_node_GatherND,1;
    :n_node_IsNaN,1;
    :n_node_LessOrEqual,1;
    :n_node_MatMul,11;
    :n_node_Mul,14;
    :n_node_Neg,2;
    :n_node_Pow,3;
    :n_node_Range,3;
    :n_node_Reciprocal,3;
    :n_node_ReduceMean,3;
    :n_node_Reshape,13;
    :n_node_Shape,5;
    :n_node_Sigmoid,1;
    :n_node_Sin,1;
    :n_node_Slice,7;
    :n_node_Softmax,1;
    :n_node_Sqrt,3;
    :n_node_Squeeze,4;
    :n_node_Transpose,6;
    :n_node_Unsqueeze,7;
    :n_node_Where,2;
    :n_node_functions,0;
    :n_node_initializer_1,16;
    :n_node_initializer_7,15;
    :n_node_initializer_9,1;
    :n_node_nodes,130;
    :n_node_nodes_nocst,130;
    :onnx_filename,dump_models/arnir0_Tiny-LLM/onnx-dynamo/ir/op18/arnir0_Tiny-LLM-onnx-dynamo-ir-op18.onnx;
    :onnx_ort_inputs,dict(input_ids:A7s2x3,attention_mask:A7s2x33,position_ids:A7s2x3,past_key_values_key_0:A1s2x1x30x96,past_key_values_value_0:A1s2x1x30x96);
    :onnx_ort_inputs22,dict(input_ids:A7s3x4,attention_mask:A7s3x35,position_ids:A7s3x4,past_key_values_key_0:A1s3x1x31x96,past_key_values_value_0:A1s3x1x31x96);
    :onnx_ort_inputs2_batch1,dict(input_ids:A7s1x3,attention_mask:A7s1x33,position_ids:A7s1x3,past_key_values_key_0:A1s1x1x30x96,past_key_values_value_0:A1s1x1x30x96);
    :onnx_ort_inputs2_empty_cache,dict(input_ids:A7s2x3,attention_mask:A7s2x3,position_ids:A7s2x3,past_key_values_key_0:A1s2x1x0x96,past_key_values_value_0:A1s2x1x0x96);
    :onnx_size,200588;
    :run_expected,CausalLMOutputWithPast(logits:T1s2x3x32000,past_key_values:DynamicCache(key_cache=#1[T1s2x1x33x96], value_cache=#1[T1s2x1x33x96]));
    :run_expected22,CausalLMOutputWithPast(logits:T1s3x4x32000,past_key_values:DynamicCache(key_cache=#1[T1s3x1x35x96], value_cache=#1[T1s3x1x35x96]));
    :run_expected2_batch1,CausalLMOutputWithPast(logits:T1s1x3x32000,past_key_values:DynamicCache(key_cache=#1[T1s1x1x33x96], value_cache=#1[T1s1x1x33x96]));
    :run_expected2_empty_cache,CausalLMOutputWithPast(logits:T1s2x3x32000,past_key_values:DynamicCache(key_cache=#1[T1s2x1x3x96], value_cache=#1[T1s2x1x3x96]));
    :run_feeds_inputs,dict(input_ids:A7s2x3,attention_mask:A7s2x33,position_ids:A7s2x3,past_key_values_key_0:A1s2x1x30x96,past_key_values_value_0:A1s2x1x30x96);
    :run_feeds_inputs2,dict(input_ids:A7s3x4,attention_mask:A7s3x35,position_ids:A7s3x4,past_key_values_key_0:A1s3x1x31x96,past_key_values_value_0:A1s3x1x31x96);
    :run_feeds_inputs_batch1,dict(input_ids:A7s1x3,attention_mask:A7s1x33,position_ids:A7s1x3,past_key_values_key_0:A1s1x1x30x96,past_key_values_value_0:A1s1x1x30x96);
    :run_feeds_inputs_empty_cache,dict(input_ids:A7s2x3,attention_mask:A7s2x3,position_ids:A7s2x3,past_key_values_key_0:A1s2x1x0x96,past_key_values_value_0:A1s2x1x0x96);
    :run_output_inputs,#3[A1s2x3x32000,A1s2x1x33x96,A1s2x1x33x96];
    :run_output_inputs2,#3[A1s3x4x32000,A1s3x1x35x96,A1s3x1x35x96];
    :run_output_inputs_batch1,#3[A1s1x3x32000,A1s1x1x33x96,A1s1x1x33x96];
    :run_output_inputs_empty_cache,#3[A1s2x3x32000,A1s2x1x3x96,A1s2x1x3x96];
    :second_input_keys,inputs2,inputs_empty_cache,inputs_batch1;
    :time_create_onnx_ort,0.16598876799980644;
    :time_create_torch_model,0.3134976149958675;
    :time_export_onnx,9.357770736998646;
    :time_export_onnx_opt_ir,0.07243620100052794;
    :time_onnx_save,0.48013086400169414;
    :time_preprocess_model_id,3.469001967459917e-06;
    :time_run,0.021039114995801356;
    :time_run22,0.037541099001828115;
    :time_run2_batch1,0.02231757599656703;
    :time_run2_empty_cache,0.025835968997853342;
    :time_run_onnx_ort,0.015803921000042465;
    :time_run_onnx_ort22,0.002945838001323864;
    :time_run_onnx_ort2_batch1,0.0026574869989417493;
    :time_run_onnx_ort2_empty_cache,0.0017342149949399754;
    :time_run_patched,0.005344776000129059;
    :time_torch_export_export,2.5741547889992944;
    :time_torch_export_export_n,1;
    :time_total,13.848575436997635;
    :time_total_exporter,12.041747443996428;
    :time_total_validation_onnx,0.27190283500385704;
    :time_total_validation_torch,0.1268702560046222;
    :version_date,2025-11-01T15:05:21;
    :version_device,;
    :version_do_run,True;
    :version_drop_inputs,[];
    :version_dtype,;
    :version_dump_folder,dump_models;
    :version_exporter,onnx-dynamo;
    :version_inputs2,1;
    :version_model_id,arnir0/Tiny-LLM;
    :version_numpy,2.3.4;
    :version_onnx,1.20.0;
    :version_onnx_diagnostic,0.8.0;
    :version_onnx_ir,0.1.13;
    :version_onnxruntime,1.24.0;
    :version_onnxscript,?;
    :version_opset,18;
    :version_optimization,ir;
    :version_ortfusiontype,;
    :version_patch,{'patch': True};
    :version_patch_kwargs,{'patch':True,'patch_transformers':True,'patch_diffusers':True};
    :version_quiet,False;
    :version_rewrite,True;
    :version_runtime,onnxruntime;
    :version_same_as_pretrained,False;
    :version_scipy,1.16.2;
    :version_stop_if_static,0;
    :version_torch,2.10.0.dev20251022+cu130;
    :version_transformers,5.0.0.dev0;
    :version_use_pretrained,False;
    [runpythonerror]
    W1101 15:05:26.957000 74592 torch/fx/experimental/symbolic_shapes.py:6918] _maybe_guard_rel() was called on non-relation expression Eq(s72, 1) | Eq(Max(s44, s72), s72)
    W1101 15:05:26.965000 74592 torch/fx/experimental/symbolic_shapes.py:6918] _maybe_guard_rel() was called on non-relation expression Eq(s70, 1) | Eq(Max(s70, s9), s70)
    W1101 15:05:27.003000 74592 torch/fx/experimental/symbolic_shapes.py:6918] _maybe_guard_rel() was called on non-relation expression Eq(s44, 1) | Eq(Max(s44, s72), s44)
    W1101 15:05:27.009000 74592 torch/fx/experimental/symbolic_shapes.py:6918] _maybe_guard_rel() was called on non-relation expression Eq(s9, 1) | Eq(Max(s70, s9), s9)
    ~/vv/this312/lib/python3.12/site-packages/torch/onnx/_internal/exporter/_dynamic_shapes.py:272: UserWarning: # The axis name: batch will not be used, since it shares the same shape constraints with another axis: batch.
      warnings.warn(
    ~/vv/this312/lib/python3.12/site-packages/torch/onnx/_internal/exporter/_dynamic_shapes.py:272: UserWarning: # The axis name: seq_length will not be used, since it shares the same shape constraints with another axis: seq_length.
      warnings.warn(
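
The summary section of the output above prints one `:key,value;` entry per line. A minimal sketch for loading such a summary into a dictionary, for example to compare discrepancies between runs, could look as follows. The helper name `parse_summary` is hypothetical and not part of onnx_diagnostic. It assumes every entry fits on a single line and that only the first comma separates the key from the value, since values such as `second_input_keys` contain commas themselves.

```python
def parse_summary(text: str) -> dict[str, str]:
    """Collects the ':key,value;' lines of a validate summary into a dict.

    Hypothetical helper, not part of onnx_diagnostic. Lines that do not
    look like ':key,value;' (logs, warnings) are ignored.
    """
    summary: dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not (line.startswith(":") and line.endswith(";")):
            continue
        # Split on the first comma only: values may contain commas.
        key, sep, value = line[1:-1].partition(",")
        if sep:
            summary[key] = value
    return summary


# A few lines copied from the output above.
sample = """
[validate_model] -- done (final)
-- summary --
:disc_onnx_ort_run_abs,7.748603820800781e-07;
:export_opset,18;
:model_subfolder,;
:second_input_keys,inputs2,inputs_empty_cache,inputs_batch1;
"""

summary = parse_summary(sample)
# Values stay strings; convert the ones that are needed.
max_abs = float(summary["disc_onnx_ort_run_abs"])
print(summary["export_opset"], max_abs < 1e-5)
```

All values are kept as strings because the summary mixes numbers, paths, and printed Python objects; converting only the fields that are needed avoids guessing their types.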

Run onnxruntime fusions

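Option --ortfusiontype runs the transformer optimizations implemented in onnxruntime on the exported model. The supported values of model_type are listed in the documentation of function onnx_diagnostic.torch_models.validate.run_ort_fusion(). In the example below, the value ALL is used: the log shows the fusions for 'bart' and then 'bert', and each fused model is saved next to the exported one (suffix .ort.<model_type>.onnx) and validated again on the same input sets.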

python -m onnx_diagnostic validate -m arnir0/Tiny-LLM --run -v 1 --export onnx-dynamo -o dump_models --patch --opt ir --ortfusiontype ALL
    [validate_model] dump into 'arnir0_Tiny-LLM/onnx-dynamo/ir/op18'
    [validate_model] validate model id 'arnir0/Tiny-LLM'
    [validate_model] patch={'patch': True}
    [validate_model] get dummy inputs with input_options=None...
    [validate_model] rewrite=True, patch_kwargs={'patch': True, 'patch_transformers': True, 'patch_diffusers': True}, stop_if_static=0
    [validate_model] exporter='onnx-dynamo', optimization='ir'
    [validate_model] dump_folder='dump_models/arnir0_Tiny-LLM/onnx-dynamo/ir/op18'
    [validate_model] output_names=None
    [get_untrained_model_with_inputs] model_id='arnir0/Tiny-LLM', subfolder=None
    [get_untrained_model_with_inputs] use preinstalled 'arnir0/Tiny-LLM'
    [get_untrained_model_with_inputs] architecture='LlamaForCausalLM'
    [get_untrained_model_with_inputs] cls='LlamaConfig'
    [get_untrained_model_with_inputs] task='text-generation'
    [get_untrained_model_with_inputs] default config._attn_implementation=None
    [get_untrained_model_with_inputs] package_source=transformers from ~/github/transformers/src/transformers/__init__.py
    [get_untrained_model_with_inputs] instantiate model_id 'arnir0/Tiny-LLM', subfolder=None
    [get_untrained_model_with_inputs] -- done(2) in 5.611199594568461e-05s
    [get_untrained_model_with_inputs] instantiate_specific_model <class 'transformers.models.llama.modeling_llama.LlamaForCausalLM'>
    [get_untrained_model_with_inputs] -- done(3) in 1.229000190505758e-05s (model is <class 'NoneType'>)
    [get_untrained_model_with_inputs] instantiate_specific_model(2) <class 'transformers.models.llama.modeling_llama.LlamaForCausalLM'>
    [get_untrained_model_with_inputs] -- done(4) in 0.19395697700383607s (model is <class 'transformers.models.llama.modeling_llama.LlamaForCausalLM'>)
    [get_untrained_model_with_inputs] use fct=<function get_inputs at 0x754493230a40>
    [validate_model] --
    [validate_model] task=text-generation
    [validate_model] size=49.549072265625 Mb
    [validate_model] n_weights=12.988992 millions parameters
    [validate_model] +INPUT input_ids=T7s2x3
    [validate_model] +INPUT attention_mask=T7s2x33
    [validate_model] +INPUT position_ids=T7s2x3
    [validate_model] +INPUT past_key_values=DynamicCache(key_cache=#1[T1s2x1x30x96], value_cache=#1[T1s2x1x30x96])
    [validate_model] +SHAPE input_ids={0:DYN(batch),1:DYN(seq_length)}
    [validate_model] +SHAPE attention_mask={0:DYN(batch),1:DYN(cache+seq)}
    [validate_model] +SHAPE position_ids={0:DYN(batch),1:DYN(seq_length)}
    [validate_model] +SHAPE past_key_values=#2[{0:DYN(batch),2:DYN(cache_length)},{0:DYN(batch),2:DYN(cache_length)}]
    [validate_model] second_input_keys=['inputs2', 'inputs_empty_cache', 'inputs_batch1']
    [validate_model] --
    [validate_model] -- run the model inputs='inputs'...
    [validate_model] inputs=dict(input_ids:T7s2x3,attention_mask:T7s2x33,position_ids:T7s2x3,past_key_values:DynamicCache(key_cache=#1[T1s2x1x30x96], value_cache=#1[T1s2x1x30x96]))
    [validate_model] done ([run])
    [validate_model] -- run the model inputs='inputs2'...
    [validate_model] inputs2=dict(input_ids:T7s3x4,attention_mask:T7s3x35,position_ids:T7s3x4,past_key_values:DynamicCache(key_cache=#1[T1s3x1x31x96], value_cache=#1[T1s3x1x31x96]))
    [validate_model] done ([run22])
    [validate_model] -- run the model inputs='inputs_empty_cache'...
    [validate_model] inputs_empty_cache=dict(input_ids:T7s2x3,attention_mask:T7s2x3,position_ids:T7s2x3,past_key_values:DynamicCache(key_cache=#1[T1s2x1x0x96], value_cache=#1[T1s2x1x0x96]))
    [validate_model] done ([run2_empty_cache])
    [validate_model] -- run the model inputs='inputs_batch1'...
    [validate_model] inputs_batch1=dict(input_ids:T7s1x3,attention_mask:T7s1x33,position_ids:T7s1x3,past_key_values:DynamicCache(key_cache=#1[T1s1x1x30x96], value_cache=#1[T1s1x1x30x96]))
    [validate_model] done ([run2_batch1])
    [validate_model] -- export the model with 'onnx-dynamo', optimization='ir'
    [validate_model] applies patches before exporting stop_if_static=0
    [validate_model] run patched model...
    [validate_model] patched inputs=dict(input_ids:T7s2x3,attention_mask:T7s2x33,position_ids:T7s2x3,past_key_values:DynamicCache(key_cache=#1[T1s2x1x30x96], value_cache=#1[T1s2x1x30x96]))
    [validate_model] done (patched run)
    [validate_model] patched discrepancies=abs=0, rel=0
    [call_torch_export_onnx] exporter='onnx-dynamo', optimization='ir'
    [call_torch_export_onnx] args=()
    [call_torch_export_onnx] kwargs=dict(input_ids:T7s2x3,attention_mask:T7s2x33,position_ids:T7s2x3,past_key_values:DynamicCache(key_cache=#1[T1s2x1x30x96], value_cache=#1[T1s2x1x30x96]))
    [call_torch_export_onnx] dynamic_shapes=dict(input_ids:{0:DYN(batch),1:DYN(seq_length)},attention_mask:{0:DYN(batch),1:DYN(cache+seq)},position_ids:{0:DYN(batch),1:DYN(seq_length)},past_key_values:#2[{0:DYN(batch),2:DYN(cache_length)},{0:DYN(batch),2:DYN(cache_length)}])
    [call_torch_export_onnx] export...
    [call_torch_export_onnx] export_export_kwargs=dict(dynamo:bool,dynamic_shapes:dict(input_ids:{0:DYN(batch),1:DYN(seq_length)},attention_mask:{0:DYN(batch),1:DYN(cache+seq)},position_ids:{0:DYN(batch),1:DYN(seq_length)},past_key_values:#2[{0:DYN(batch),2:DYN(cache_length)},{0:DYN(batch),2:DYN(cache_length)}]),opset_version:int)
    [torch.onnx] Obtain model graph for `LlamaForCausalLM([...]` with `torch.export.export(..., strict=False)`...
    [torch.onnx] Obtain model graph for `LlamaForCausalLM([...]` with `torch.export.export(..., strict=False)`... ✅
    [torch.onnx] Run decomposition...
    [torch.onnx] Run decomposition... ✅
    [torch.onnx] Translate the graph into ONNX...
    [torch.onnx] Translate the graph into ONNX... ✅
    Applied 37 of general pattern rewrite rules.
    [call_torch_export_onnx] done (export)
    [call_torch_export_onnx] starts optimization='ir'...
    [call_torch_export_onnx] done (optimization)
    [validate_model] dumps onnx program in 'dump_models/arnir0_Tiny-LLM/onnx-dynamo/ir/op18'...
    [validate_model] done (dump onnx) in 0.3300936319938046
    [validate_model] dumps statistics in 'dump_models/arnir0_Tiny-LLM/onnx-dynamo/ir/op18'...
    [validate_model] done (dump)
    [validation_model] -- delete the model
    [validation_model] -- done
    [validate_onnx_model] verify onnx model with providers ['CPUExecutionProvider']..., flavour=None
    [validate_onnx_model] runtime is onnxruntime
    [validate_onnx_model] done (ort_session) flavour=None
    [validate_onnx_model] -- keys=[('inputs', 'run_expected', ''), ('inputs2', 'run_expected22', '22'), ('inputs_empty_cache', 'run_expected2_empty_cache', '2_empty_cache'), ('inputs_batch1', 'run_expected2_batch1', '2_batch1')]
    [validate_onnx_model] -- make_feeds for 'inputs'...
    [validate_onnx_model] inputs=dict(input_ids:T7s2x3,attention_mask:T7s2x33,position_ids:T7s2x3,past_key_values:DynamicCache(key_cache=#1[T1s2x1x30x96], value_cache=#1[T1s2x1x30x96]))
    [validate_onnx_model] ort inputs=dict(input_ids:A7s2x3,attention_mask:A7s2x33,position_ids:A7s2x3,past_key_values_key_0:A1s2x1x30x96,past_key_values_value_0:A1s2x1x30x96)
    [validate_onnx_model] done (make_feeds)
    [validate_onnx_model] run session on inputs 'inputs'...
    [validate_onnx_model] done (run)
    [validate_onnx_model] got=#3[A1s2x3x32000,A1s2x1x33x96,A1s2x1x33x96]
    [validate_onnx_model] discrepancies=abs=7.748603820800781e-07, rel=0.00044309172106863606, n=204672.0
    [validate_onnx_model] -- make_feeds for 'inputs2'...
    [validate_onnx_model] inputs=dict(input_ids:T7s3x4,attention_mask:T7s3x35,position_ids:T7s3x4,past_key_values:DynamicCache(key_cache=#1[T1s3x1x31x96], value_cache=#1[T1s3x1x31x96]))
    [validate_onnx_model] ort inputs=dict(input_ids:A7s3x4,attention_mask:A7s3x35,position_ids:A7s3x4,past_key_values_key_0:A1s3x1x31x96,past_key_values_value_0:A1s3x1x31x96)
    [validate_onnx_model] done (make_feeds)
    [validate_onnx_model] run session on inputs 'inputs22'...
    [validate_onnx_model] done (run)
    [validate_onnx_model] got=#3[A1s3x4x32000,A1s3x1x35x96,A1s3x1x35x96]
    [validate_onnx_model] discrepancies=abs=8.344650268554688e-07, rel=0.00032628376058705446, n=404160.0
    [validate_onnx_model] -- make_feeds for 'inputs_empty_cache'...
    [validate_onnx_model] inputs=dict(input_ids:T7s2x3,attention_mask:T7s2x3,position_ids:T7s2x3,past_key_values:DynamicCache(key_cache=#1[T1s2x1x0x96], value_cache=#1[T1s2x1x0x96]))
    [validate_onnx_model] ort inputs=dict(input_ids:A7s2x3,attention_mask:A7s2x3,position_ids:A7s2x3,past_key_values_key_0:A1s2x1x0x96,past_key_values_value_0:A1s2x1x0x96)
    [validate_onnx_model] done (make_feeds)
    [validate_onnx_model] run session on inputs 'inputs2_empty_cache'...
    [validate_onnx_model] done (run)
    [validate_onnx_model] got=#3[A1s2x3x32000,A1s2x1x3x96,A1s2x1x3x96]
    [validate_onnx_model] discrepancies=abs=6.556510925292969e-07, rel=0.0002520801563113858, n=193152.0
    [validate_onnx_model] -- make_feeds for 'inputs_batch1'...
    [validate_onnx_model] inputs=dict(input_ids:T7s1x3,attention_mask:T7s1x33,position_ids:T7s1x3,past_key_values:DynamicCache(key_cache=#1[T1s1x1x30x96], value_cache=#1[T1s1x1x30x96]))
    [validate_onnx_model] ort inputs=dict(input_ids:A7s1x3,attention_mask:A7s1x33,position_ids:A7s1x3,past_key_values_key_0:A1s1x1x30x96,past_key_values_value_0:A1s1x1x30x96)
    [validate_onnx_model] done (make_feeds)
    [validate_onnx_model] run session on inputs 'inputs2_batch1'...
    [validate_onnx_model] done (run)
    [validate_onnx_model] got=#3[A1s1x3x32000,A1s1x1x33x96,A1s1x1x33x96]
    [validate_onnx_model] discrepancies=abs=7.748603820800781e-07, rel=0.00037692802454639585, n=102336.0
    [validate_model] run onnxruntime fusion for 'bart'
    failed in shape inference <class 'AssertionError'>
    [validate_model] done 'bart' in 0.28666278300079284, saved into 'dump_models/arnir0_Tiny-LLM/onnx-dynamo/ir/op18/arnir0_Tiny-LLM-onnx-dynamo-ir-op18.ort.bart.onnx'
    [validate_onnx_model] verify onnx model with providers ['CPUExecutionProvider']..., flavour='ortbart'
    [validate_onnx_model] runtime is onnxruntime
    [validate_onnx_model] done (ort_session) flavour='ortbart'
    [validate_onnx_model] -- keys=[('inputs', 'run_expected', ''), ('inputs2', 'run_expected22', '22'), ('inputs_empty_cache', 'run_expected2_empty_cache', '2_empty_cache'), ('inputs_batch1', 'run_expected2_batch1', '2_batch1')]
    [validate_onnx_model] -- make_feeds for 'inputs'...
    [validate_onnx_model] inputs=dict(input_ids:T7s2x3,attention_mask:T7s2x33,position_ids:T7s2x3,past_key_values:DynamicCache(key_cache=#1[T1s2x1x30x96], value_cache=#1[T1s2x1x30x96]))
    [validate_onnx_model] ort inputs=dict(input_ids:A7s2x3,attention_mask:A7s2x33,position_ids:A7s2x3,past_key_values_key_0:A1s2x1x30x96,past_key_values_value_0:A1s2x1x30x96)
    [validate_onnx_model] done (make_feeds)
    [validate_onnx_model] run session on inputs 'inputs'...
    [validate_onnx_model] done (run)
    [validate_onnx_model] got=#3[A1s2x3x32000,A1s2x1x33x96,A1s2x1x33x96]
    [validate_onnx_model] discrepancies=abs=8.344650268554688e-07, rel=0.00038373230338287646, n=204672.0
    [validate_onnx_model] -- make_feeds for 'inputs2'...
    [validate_onnx_model] inputs=dict(input_ids:T7s3x4,attention_mask:T7s3x35,position_ids:T7s3x4,past_key_values:DynamicCache(key_cache=#1[T1s3x1x31x96], value_cache=#1[T1s3x1x31x96]))
    [validate_onnx_model] ort inputs=dict(input_ids:A7s3x4,attention_mask:A7s3x35,position_ids:A7s3x4,past_key_values_key_0:A1s3x1x31x96,past_key_values_value_0:A1s3x1x31x96)
    [validate_onnx_model] done (make_feeds)
    [validate_onnx_model] run session on inputs 'inputs22'...
    [validate_onnx_model] done (run)
    [validate_onnx_model] got=#3[A1s3x4x32000,A1s3x1x35x96,A1s3x1x35x96]
    [validate_onnx_model] discrepancies=abs=1.0132789611816406e-06, rel=0.00026781923006472165, n=404160.0
    [validate_onnx_model] -- make_feeds for 'inputs_empty_cache'...
    [validate_onnx_model] inputs=dict(input_ids:T7s2x3,attention_mask:T7s2x3,position_ids:T7s2x3,past_key_values:DynamicCache(key_cache=#1[T1s2x1x0x96], value_cache=#1[T1s2x1x0x96]))
    [validate_onnx_model] ort inputs=dict(input_ids:A7s2x3,attention_mask:A7s2x3,position_ids:A7s2x3,past_key_values_key_0:A1s2x1x0x96,past_key_values_value_0:A1s2x1x0x96)
    [validate_onnx_model] done (make_feeds)
    [validate_onnx_model] run session on inputs 'inputs2_empty_cache'...
    [validate_onnx_model] done (run)
    [validate_onnx_model] got=#3[A1s2x3x32000,A1s2x1x3x96,A1s2x1x3x96]
    [validate_onnx_model] discrepancies=abs=1.0728836059570312e-06, rel=0.00034613720184126256, n=193152.0
    [validate_onnx_model] -- make_feeds for 'inputs_batch1'...
    [validate_onnx_model] inputs=dict(input_ids:T7s1x3,attention_mask:T7s1x33,position_ids:T7s1x3,past_key_values:DynamicCache(key_cache=#1[T1s1x1x30x96], value_cache=#1[T1s1x1x30x96]))
    [validate_onnx_model] ort inputs=dict(input_ids:A7s1x3,attention_mask:A7s1x33,position_ids:A7s1x3,past_key_values_key_0:A1s1x1x30x96,past_key_values_value_0:A1s1x1x30x96)
    [validate_onnx_model] done (make_feeds)
    [validate_onnx_model] run session on inputs 'inputs2_batch1'...
    [validate_onnx_model] done (run)
    [validate_onnx_model] got=#3[A1s1x3x32000,A1s1x1x33x96,A1s1x1x33x96]
    [validate_onnx_model] discrepancies=abs=1.1026859283447266e-06, rel=0.0003551108443373418, n=102336.0
    [validate_model] run onnxruntime fusion for 'bert'
    failed in shape inference <class 'AssertionError'>
    [validate_model] done 'bert' in 0.309065601999464, saved into 'dump_models/arnir0_Tiny-LLM/onnx-dynamo/ir/op18/arnir0_Tiny-LLM-onnx-dynamo-ir-op18.ort.bert.onnx'
    [validate_onnx_model] verify onnx model with providers ['CPUExecutionProvider']..., flavour='ortbert'
    [validate_onnx_model] runtime is onnxruntime
    [validate_onnx_model] done (ort_session) flavour='ortbert'
    [validate_onnx_model] -- keys=[('inputs', 'run_expected', ''), ('inputs2', 'run_expected22', '22'), ('inputs_empty_cache', 'run_expected2_empty_cache', '2_empty_cache'), ('inputs_batch1', 'run_expected2_batch1', '2_batch1')]
    [validate_onnx_model] -- make_feeds for 'inputs'...
    [validate_onnx_model] inputs=dict(input_ids:T7s2x3,attention_mask:T7s2x33,position_ids:T7s2x3,past_key_values:DynamicCache(key_cache=#1[T1s2x1x30x96], value_cache=#1[T1s2x1x30x96]))
    [validate_onnx_model] ort inputs=dict(input_ids:A7s2x3,attention_mask:A7s2x33,position_ids:A7s2x3,past_key_values_key_0:A1s2x1x30x96,past_key_values_value_0:A1s2x1x30x96)
    [validate_onnx_model] done (make_feeds)
    [validate_onnx_model] run session on inputs 'inputs'...
    [validate_onnx_model] done (run)
    [validate_onnx_model] got=#3[A1s2x3x32000,A1s2x1x33x96,A1s2x1x33x96]
    [validate_onnx_model] discrepancies=abs=8.344650268554688e-07, rel=0.00038373230338287646, n=204672.0
    [validate_onnx_model] -- make_feeds for 'inputs2'...
    [validate_onnx_model] inputs=dict(input_ids:T7s3x4,attention_mask:T7s3x35,position_ids:T7s3x4,past_key_values:DynamicCache(key_cache=#1[T1s3x1x31x96], value_cache=#1[T1s3x1x31x96]))
    [validate_onnx_model] ort inputs=dict(input_ids:A7s3x4,attention_mask:A7s3x35,position_ids:A7s3x4,past_key_values_key_0:A1s3x1x31x96,past_key_values_value_0:A1s3x1x31x96)
    [validate_onnx_model] done (make_feeds)
    [validate_onnx_model] run session on inputs 'inputs22'...
    [validate_onnx_model] done (run)
    [validate_onnx_model] got=#3[A1s3x4x32000,A1s3x1x35x96,A1s3x1x35x96]
    [validate_onnx_model] discrepancies=abs=1.0132789611816406e-06, rel=0.00026781923006472165, n=404160.0
    [validate_onnx_model] -- make_feeds for 'inputs_empty_cache'...
    [validate_onnx_model] inputs=dict(input_ids:T7s2x3,attention_mask:T7s2x3,position_ids:T7s2x3,past_key_values:DynamicCache(key_cache=#1[T1s2x1x0x96], value_cache=#1[T1s2x1x0x96]))
    [validate_onnx_model] ort inputs=dict(input_ids:A7s2x3,attention_mask:A7s2x3,position_ids:A7s2x3,past_key_values_key_0:A1s2x1x0x96,past_key_values_value_0:A1s2x1x0x96)
    [validate_onnx_model] done (make_feeds)
    [validate_onnx_model] run session on inputs 'inputs2_empty_cache'...
    [validate_onnx_model] done (run)
    [validate_onnx_model] got=#3[A1s2x3x32000,A1s2x1x3x96,A1s2x1x3x96]
    [validate_onnx_model] discrepancies=abs=1.0728836059570312e-06, rel=0.00034613720184126256, n=193152.0
    [validate_onnx_model] -- make_feeds for 'inputs_batch1'...
    [validate_onnx_model] inputs=dict(input_ids:T7s1x3,attention_mask:T7s1x33,position_ids:T7s1x3,past_key_values:DynamicCache(key_cache=#1[T1s1x1x30x96], value_cache=#1[T1s1x1x30x96]))
    [validate_onnx_model] ort inputs=dict(input_ids:A7s1x3,attention_mask:A7s1x33,position_ids:A7s1x3,past_key_values_key_0:A1s1x1x30x96,past_key_values_value_0:A1s1x1x30x96)
    [validate_onnx_model] done (make_feeds)
    [validate_onnx_model] run session on inputs 'inputs2_batch1'...
    [validate_onnx_model] done (run)
    [validate_onnx_model] got=#3[A1s1x3x32000,A1s1x1x33x96,A1s1x1x33x96]
    [validate_onnx_model] discrepancies=abs=1.1026859283447266e-06, rel=0.0003551108443373418, n=102336.0
    [validate_model] run onnxruntime fusion for 'bert_keras'
    failed in shape inference <class 'AssertionError'>
    [validate_model] done 'bert_keras' in 0.29284356900461717, saved into 'dump_models/arnir0_Tiny-LLM/onnx-dynamo/ir/op18/arnir0_Tiny-LLM-onnx-dynamo-ir-op18.ort.bert_keras.onnx'
    [validate_onnx_model] verify onnx model with providers ['CPUExecutionProvider']..., flavour='ortbert_keras'
    [validate_onnx_model] runtime is onnxruntime
    [validate_onnx_model] done (ort_session) flavour='ortbert_keras'
    [validate_onnx_model] -- keys=[('inputs', 'run_expected', ''), ('inputs2', 'run_expected22', '22'), ('inputs_empty_cache', 'run_expected2_empty_cache', '2_empty_cache'), ('inputs_batch1', 'run_expected2_batch1', '2_batch1')]
    [validate_onnx_model] -- make_feeds for 'inputs'...
    [validate_onnx_model] inputs=dict(input_ids:T7s2x3,attention_mask:T7s2x33,position_ids:T7s2x3,past_key_values:DynamicCache(key_cache=#1[T1s2x1x30x96], value_cache=#1[T1s2x1x30x96]))
    [validate_onnx_model] ort inputs=dict(input_ids:A7s2x3,attention_mask:A7s2x33,position_ids:A7s2x3,past_key_values_key_0:A1s2x1x30x96,past_key_values_value_0:A1s2x1x30x96)
    [validate_onnx_model] done (make_feeds)
    [validate_onnx_model] run session on inputs 'inputs'...
    [validate_onnx_model] done (run)
    [validate_onnx_model] got=#3[A1s2x3x32000,A1s2x1x33x96,A1s2x1x33x96]
    [validate_onnx_model] discrepancies=abs=8.344650268554688e-07, rel=0.00038373230338287646, n=204672.0
    [validate_onnx_model] -- make_feeds for 'inputs2'...
    [validate_onnx_model] inputs=dict(input_ids:T7s3x4,attention_mask:T7s3x35,position_ids:T7s3x4,past_key_values:DynamicCache(key_cache=#1[T1s3x1x31x96], value_cache=#1[T1s3x1x31x96]))
    [validate_onnx_model] ort inputs=dict(input_ids:A7s3x4,attention_mask:A7s3x35,position_ids:A7s3x4,past_key_values_key_0:A1s3x1x31x96,past_key_values_value_0:A1s3x1x31x96)
    [validate_onnx_model] done (make_feeds)
    [validate_onnx_model] run session on inputs 'inputs22'...
    [validate_onnx_model] done (run)
    [validate_onnx_model] got=#3[A1s3x4x32000,A1s3x1x35x96,A1s3x1x35x96]
    [validate_onnx_model] discrepancies=abs=1.0132789611816406e-06, rel=0.00026781923006472165, n=404160.0
    [validate_onnx_model] -- make_feeds for 'inputs_empty_cache'...
    [validate_onnx_model] inputs=dict(input_ids:T7s2x3,attention_mask:T7s2x3,position_ids:T7s2x3,past_key_values:DynamicCache(key_cache=#1[T1s2x1x0x96], value_cache=#1[T1s2x1x0x96]))
    [validate_onnx_model] ort inputs=dict(input_ids:A7s2x3,attention_mask:A7s2x3,position_ids:A7s2x3,past_key_values_key_0:A1s2x1x0x96,past_key_values_value_0:A1s2x1x0x96)
    [validate_onnx_model] done (make_feeds)
    [validate_onnx_model] run session on inputs 'inputs2_empty_cache'...
    [validate_onnx_model] done (run)
    [validate_onnx_model] got=#3[A1s2x3x32000,A1s2x1x3x96,A1s2x1x3x96]
    [validate_onnx_model] discrepancies=abs=1.0728836059570312e-06, rel=0.00034613720184126256, n=193152.0
    [validate_onnx_model] -- make_feeds for 'inputs_batch1'...
    [validate_onnx_model] inputs=dict(input_ids:T7s1x3,attention_mask:T7s1x33,position_ids:T7s1x3,past_key_values:DynamicCache(key_cache=#1[T1s1x1x30x96], value_cache=#1[T1s1x1x30x96]))
    [validate_onnx_model] ort inputs=dict(input_ids:A7s1x3,attention_mask:A7s1x33,position_ids:A7s1x3,past_key_values_key_0:A1s1x1x30x96,past_key_values_value_0:A1s1x1x30x96)
    [validate_onnx_model] done (make_feeds)
    [validate_onnx_model] run session on inputs 'inputs2_batch1'...
    [validate_onnx_model] done (run)
    [validate_onnx_model] got=#3[A1s1x3x32000,A1s1x1x33x96,A1s1x1x33x96]
    [validate_onnx_model] discrepancies=abs=1.1026859283447266e-06, rel=0.0003551108443373418, n=102336.0
    [validate_model] run onnxruntime fusion for 'bert_tf'
    failed in shape inference <class 'AssertionError'>
    [validate_model] done 'bert_tf' in 0.1332150369998999, saved into 'dump_models/arnir0_Tiny-LLM/onnx-dynamo/ir/op18/arnir0_Tiny-LLM-onnx-dynamo-ir-op18.ort.bert_tf.onnx'
    [validate_onnx_model] verify onnx model with providers ['CPUExecutionProvider']..., flavour='ortbert_tf'
    [validate_onnx_model] runtime is onnxruntime
    [validate_onnx_model] done (ort_session) flavour='ortbert_tf'
    [validate_onnx_model] -- keys=[('inputs', 'run_expected', ''), ('inputs2', 'run_expected22', '22'), ('inputs_empty_cache', 'run_expected2_empty_cache', '2_empty_cache'), ('inputs_batch1', 'run_expected2_batch1', '2_batch1')]
    [validate_onnx_model] -- make_feeds for 'inputs'...
    [validate_onnx_model] inputs=dict(input_ids:T7s2x3,attention_mask:T7s2x33,position_ids:T7s2x3,past_key_values:DynamicCache(key_cache=#1[T1s2x1x30x96], value_cache=#1[T1s2x1x30x96]))
    [validate_onnx_model] ort inputs=dict(input_ids:A7s2x3,attention_mask:A7s2x33,position_ids:A7s2x3,past_key_values_key_0:A1s2x1x30x96,past_key_values_value_0:A1s2x1x30x96)
    [validate_onnx_model] done (make_feeds)
    [validate_onnx_model] run session on inputs 'inputs'...
    [validate_onnx_model] done (run)
    [validate_onnx_model] got=#3[A1s2x3x32000,A1s2x1x33x96,A1s2x1x33x96]
    [validate_onnx_model] discrepancies=abs=8.344650268554688e-07, rel=0.00038373230338287646, n=204672.0
    [validate_onnx_model] -- make_feeds for 'inputs2'...
    [validate_onnx_model] inputs=dict(input_ids:T7s3x4,attention_mask:T7s3x35,position_ids:T7s3x4,past_key_values:DynamicCache(key_cache=#1[T1s3x1x31x96], value_cache=#1[T1s3x1x31x96]))
    [validate_onnx_model] ort inputs=dict(input_ids:A7s3x4,attention_mask:A7s3x35,position_ids:A7s3x4,past_key_values_key_0:A1s3x1x31x96,past_key_values_value_0:A1s3x1x31x96)
    [validate_onnx_model] done (make_feeds)
    [validate_onnx_model] run session on inputs 'inputs22'...
    [validate_onnx_model] done (run)
    [validate_onnx_model] got=#3[A1s3x4x32000,A1s3x1x35x96,A1s3x1x35x96]
    [validate_onnx_model] discrepancies=abs=1.0132789611816406e-06, rel=0.00026781923006472165, n=404160.0
    [validate_onnx_model] -- make_feeds for 'inputs_empty_cache'...
    [validate_onnx_model] inputs=dict(input_ids:T7s2x3,attention_mask:T7s2x3,position_ids:T7s2x3,past_key_values:DynamicCache(key_cache=#1[T1s2x1x0x96], value_cache=#1[T1s2x1x0x96]))
    [validate_onnx_model] ort inputs=dict(input_ids:A7s2x3,attention_mask:A7s2x3,position_ids:A7s2x3,past_key_values_key_0:A1s2x1x0x96,past_key_values_value_0:A1s2x1x0x96)
    [validate_onnx_model] done (make_feeds)
    [validate_onnx_model] run session on inputs 'inputs2_empty_cache'...
    [validate_onnx_model] done (run)
    [validate_onnx_model] got=#3[A1s2x3x32000,A1s2x1x3x96,A1s2x1x3x96]
    [validate_onnx_model] discrepancies=abs=1.0728836059570312e-06, rel=0.00034613720184126256, n=193152.0
    [validate_onnx_model] -- make_feeds for 'inputs_batch1'...
    [validate_onnx_model] inputs=dict(input_ids:T7s1x3,attention_mask:T7s1x33,position_ids:T7s1x3,past_key_values:DynamicCache(key_cache=#1[T1s1x1x30x96], value_cache=#1[T1s1x1x30x96]))
    [validate_onnx_model] ort inputs=dict(input_ids:A7s1x3,attention_mask:A7s1x33,position_ids:A7s1x3,past_key_values_key_0:A1s1x1x30x96,past_key_values_value_0:A1s1x1x30x96)
    [validate_onnx_model] done (make_feeds)
    [validate_onnx_model] run session on inputs 'inputs2_batch1'...
    [validate_onnx_model] done (run)
    [validate_onnx_model] got=#3[A1s1x3x32000,A1s1x1x33x96,A1s1x1x33x96]
    [validate_onnx_model] discrepancies=abs=1.1026859283447266e-06, rel=0.0003551108443373418, n=102336.0
    [validate_model] run onnxruntime fusion for 'clip'
    failed in shape inference <class 'AssertionError'>
    [validate_model] done 'clip' in 0.3415136390030966, saved into 'dump_models/arnir0_Tiny-LLM/onnx-dynamo/ir/op18/arnir0_Tiny-LLM-onnx-dynamo-ir-op18.ort.clip.onnx'
    [validate_onnx_model] verify onnx model with providers ['CPUExecutionProvider']..., flavour='ortclip'
    [validate_onnx_model] runtime is onnxruntime
    [validate_onnx_model] done (ort_session) flavour='ortclip'
    [validate_onnx_model] -- keys=[('inputs', 'run_expected', ''), ('inputs2', 'run_expected22', '22'), ('inputs_empty_cache', 'run_expected2_empty_cache', '2_empty_cache'), ('inputs_batch1', 'run_expected2_batch1', '2_batch1')]
    [validate_onnx_model] -- make_feeds for 'inputs'...
    [validate_onnx_model] inputs=dict(input_ids:T7s2x3,attention_mask:T7s2x33,position_ids:T7s2x3,past_key_values:DynamicCache(key_cache=#1[T1s2x1x30x96], value_cache=#1[T1s2x1x30x96]))
    [validate_onnx_model] ort inputs=dict(input_ids:A7s2x3,attention_mask:A7s2x33,position_ids:A7s2x3,past_key_values_key_0:A1s2x1x30x96,past_key_values_value_0:A1s2x1x30x96)
    [validate_onnx_model] done (make_feeds)
    [validate_onnx_model] run session on inputs 'inputs'...
    [validate_onnx_model] done (run)
    [validate_onnx_model] got=#3[A1s2x3x32000,A1s2x1x33x96,A1s2x1x33x96]
    [validate_onnx_model] discrepancies=abs=8.344650268554688e-07, rel=0.00038373230338287646, n=204672.0
    [validate_onnx_model] -- make_feeds for 'inputs2'...
    [validate_onnx_model] inputs=dict(input_ids:T7s3x4,attention_mask:T7s3x35,position_ids:T7s3x4,past_key_values:DynamicCache(key_cache=#1[T1s3x1x31x96], value_cache=#1[T1s3x1x31x96]))
    [validate_onnx_model] ort inputs=dict(input_ids:A7s3x4,attention_mask:A7s3x35,position_ids:A7s3x4,past_key_values_key_0:A1s3x1x31x96,past_key_values_value_0:A1s3x1x31x96)
    [validate_onnx_model] done (make_feeds)
    [validate_onnx_model] run session on inputs 'inputs22'...
    [validate_onnx_model] done (run)
    [validate_onnx_model] got=#3[A1s3x4x32000,A1s3x1x35x96,A1s3x1x35x96]
    [validate_onnx_model] discrepancies=abs=1.0132789611816406e-06, rel=0.00026781923006472165, n=404160.0
    [validate_onnx_model] -- make_feeds for 'inputs_empty_cache'...
    [validate_onnx_model] inputs=dict(input_ids:T7s2x3,attention_mask:T7s2x3,position_ids:T7s2x3,past_key_values:DynamicCache(key_cache=#1[T1s2x1x0x96], value_cache=#1[T1s2x1x0x96]))
    [validate_onnx_model] ort inputs=dict(input_ids:A7s2x3,attention_mask:A7s2x3,position_ids:A7s2x3,past_key_values_key_0:A1s2x1x0x96,past_key_values_value_0:A1s2x1x0x96)
    [validate_onnx_model] done (make_feeds)
    [validate_onnx_model] run session on inputs 'inputs2_empty_cache'...
    [validate_onnx_model] done (run)
    [validate_onnx_model] got=#3[A1s2x3x32000,A1s2x1x3x96,A1s2x1x3x96]
    [validate_onnx_model] discrepancies=abs=1.0728836059570312e-06, rel=0.00034613720184126256, n=193152.0
    [validate_onnx_model] -- make_feeds for 'inputs_batch1'...
    [validate_onnx_model] inputs=dict(input_ids:T7s1x3,attention_mask:T7s1x33,position_ids:T7s1x3,past_key_values:DynamicCache(key_cache=#1[T1s1x1x30x96], value_cache=#1[T1s1x1x30x96]))
    [validate_onnx_model] ort inputs=dict(input_ids:A7s1x3,attention_mask:A7s1x33,position_ids:A7s1x3,past_key_values_key_0:A1s1x1x30x96,past_key_values_value_0:A1s1x1x30x96)
    [validate_onnx_model] done (make_feeds)
    [validate_onnx_model] run session on inputs 'inputs2_batch1'...
    [validate_onnx_model] done (run)
    [validate_onnx_model] got=#3[A1s1x3x32000,A1s1x1x33x96,A1s1x1x33x96]
    [validate_onnx_model] discrepancies=abs=1.1026859283447266e-06, rel=0.0003551108443373418, n=102336.0
    [validate_model] run onnxruntime fusion for 'conformer'
    failed in shape inference <class 'AssertionError'>
    [validate_model] done 'conformer' in 0.2503558129974408, saved into 'dump_models/arnir0_Tiny-LLM/onnx-dynamo/ir/op18/arnir0_Tiny-LLM-onnx-dynamo-ir-op18.ort.conformer.onnx'
    [validate_onnx_model] verify onnx model with providers ['CPUExecutionProvider']..., flavour='ortconformer'
    [validate_onnx_model] runtime is onnxruntime
    [validate_onnx_model] done (ort_session) flavour='ortconformer'
    [validate_onnx_model] -- keys=[('inputs', 'run_expected', ''), ('inputs2', 'run_expected22', '22'), ('inputs_empty_cache', 'run_expected2_empty_cache', '2_empty_cache'), ('inputs_batch1', 'run_expected2_batch1', '2_batch1')]
    [validate_onnx_model] -- make_feeds for 'inputs'...
    [validate_onnx_model] inputs=dict(input_ids:T7s2x3,attention_mask:T7s2x33,position_ids:T7s2x3,past_key_values:DynamicCache(key_cache=#1[T1s2x1x30x96], value_cache=#1[T1s2x1x30x96]))
    [validate_onnx_model] ort inputs=dict(input_ids:A7s2x3,attention_mask:A7s2x33,position_ids:A7s2x3,past_key_values_key_0:A1s2x1x30x96,past_key_values_value_0:A1s2x1x30x96)
    [validate_onnx_model] done (make_feeds)
    [validate_onnx_model] run session on inputs 'inputs'...
    [validate_onnx_model] done (run)
    [validate_onnx_model] got=#3[A1s2x3x32000,A1s2x1x33x96,A1s2x1x33x96]
    [validate_onnx_model] discrepancies=abs=8.344650268554688e-07, rel=0.00038373230338287646, n=204672.0
    [validate_onnx_model] -- make_feeds for 'inputs2'...
    [validate_onnx_model] inputs=dict(input_ids:T7s3x4,attention_mask:T7s3x35,position_ids:T7s3x4,past_key_values:DynamicCache(key_cache=#1[T1s3x1x31x96], value_cache=#1[T1s3x1x31x96]))
    [validate_onnx_model] ort inputs=dict(input_ids:A7s3x4,attention_mask:A7s3x35,position_ids:A7s3x4,past_key_values_key_0:A1s3x1x31x96,past_key_values_value_0:A1s3x1x31x96)
    [validate_onnx_model] done (make_feeds)
    [validate_onnx_model] run session on inputs 'inputs22'...
    [validate_onnx_model] done (run)
    [validate_onnx_model] got=#3[A1s3x4x32000,A1s3x1x35x96,A1s3x1x35x96]
    [validate_onnx_model] discrepancies=abs=1.0132789611816406e-06, rel=0.00026781923006472165, n=404160.0
    [validate_onnx_model] -- make_feeds for 'inputs_empty_cache'...
    [validate_onnx_model] inputs=dict(input_ids:T7s2x3,attention_mask:T7s2x3,position_ids:T7s2x3,past_key_values:DynamicCache(key_cache=#1[T1s2x1x0x96], value_cache=#1[T1s2x1x0x96]))
    [validate_onnx_model] ort inputs=dict(input_ids:A7s2x3,attention_mask:A7s2x3,position_ids:A7s2x3,past_key_values_key_0:A1s2x1x0x96,past_key_values_value_0:A1s2x1x0x96)
    [validate_onnx_model] done (make_feeds)
    [validate_onnx_model] run session on inputs 'inputs2_empty_cache'...
    [validate_onnx_model] done (run)
    [validate_onnx_model] got=#3[A1s2x3x32000,A1s2x1x3x96,A1s2x1x3x96]
    [validate_onnx_model] discrepancies=abs=1.0728836059570312e-06, rel=0.00034613720184126256, n=193152.0
    [validate_onnx_model] -- make_feeds for 'inputs_batch1'...
    [validate_onnx_model] inputs=dict(input_ids:T7s1x3,attention_mask:T7s1x33,position_ids:T7s1x3,past_key_values:DynamicCache(key_cache=#1[T1s1x1x30x96], value_cache=#1[T1s1x1x30x96]))
    [validate_onnx_model] ort inputs=dict(input_ids:A7s1x3,attention_mask:A7s1x33,position_ids:A7s1x3,past_key_values_key_0:A1s1x1x30x96,past_key_values_value_0:A1s1x1x30x96)
    [validate_onnx_model] done (make_feeds)
    [validate_onnx_model] run session on inputs 'inputs2_batch1'...
    [validate_onnx_model] done (run)
    [validate_onnx_model] got=#3[A1s1x3x32000,A1s1x1x33x96,A1s1x1x33x96]
    [validate_onnx_model] discrepancies=abs=1.1026859283447266e-06, rel=0.0003551108443373418, n=102336.0
    [validate_model] run onnxruntime fusion for 'gpt2'
    failed in shape inference <class 'AssertionError'>
    [validate_model] done 'gpt2' in 0.33758883599512046, saved into 'dump_models/arnir0_Tiny-LLM/onnx-dynamo/ir/op18/arnir0_Tiny-LLM-onnx-dynamo-ir-op18.ort.gpt2.onnx'
    [validate_onnx_model] verify onnx model with providers ['CPUExecutionProvider']..., flavour='ortgpt2'
    [validate_onnx_model] runtime is onnxruntime
    [validate_onnx_model] done (ort_session) flavour='ortgpt2'
    [validate_onnx_model] -- keys=[('inputs', 'run_expected', ''), ('inputs2', 'run_expected22', '22'), ('inputs_empty_cache', 'run_expected2_empty_cache', '2_empty_cache'), ('inputs_batch1', 'run_expected2_batch1', '2_batch1')]
    [validate_onnx_model] -- make_feeds for 'inputs'...
    [validate_onnx_model] inputs=dict(input_ids:T7s2x3,attention_mask:T7s2x33,position_ids:T7s2x3,past_key_values:DynamicCache(key_cache=#1[T1s2x1x30x96], value_cache=#1[T1s2x1x30x96]))
    [validate_onnx_model] ort inputs=dict(input_ids:A7s2x3,attention_mask:A7s2x33,position_ids:A7s2x3,past_key_values_key_0:A1s2x1x30x96,past_key_values_value_0:A1s2x1x30x96)
    [validate_onnx_model] done (make_feeds)
    [validate_onnx_model] run session on inputs 'inputs'...
    [validate_onnx_model] done (run)
    [validate_onnx_model] got=#3[A1s2x3x32000,A1s2x1x33x96,A1s2x1x33x96]
    [validate_onnx_model] discrepancies=abs=8.344650268554688e-07, rel=0.00038373230338287646, n=204672.0
    [validate_onnx_model] -- make_feeds for 'inputs2'...
    [validate_onnx_model] inputs=dict(input_ids:T7s3x4,attention_mask:T7s3x35,position_ids:T7s3x4,past_key_values:DynamicCache(key_cache=#1[T1s3x1x31x96], value_cache=#1[T1s3x1x31x96]))
    [validate_onnx_model] ort inputs=dict(input_ids:A7s3x4,attention_mask:A7s3x35,position_ids:A7s3x4,past_key_values_key_0:A1s3x1x31x96,past_key_values_value_0:A1s3x1x31x96)
    [validate_onnx_model] done (make_feeds)
    [validate_onnx_model] run session on inputs 'inputs22'...
    [validate_onnx_model] done (run)
    [validate_onnx_model] got=#3[A1s3x4x32000,A1s3x1x35x96,A1s3x1x35x96]
    [validate_onnx_model] discrepancies=abs=1.0132789611816406e-06, rel=0.00026781923006472165, n=404160.0
    [validate_onnx_model] -- make_feeds for 'inputs_empty_cache'...
    [validate_onnx_model] inputs=dict(input_ids:T7s2x3,attention_mask:T7s2x3,position_ids:T7s2x3,past_key_values:DynamicCache(key_cache=#1[T1s2x1x0x96], value_cache=#1[T1s2x1x0x96]))
    [validate_onnx_model] ort inputs=dict(input_ids:A7s2x3,attention_mask:A7s2x3,position_ids:A7s2x3,past_key_values_key_0:A1s2x1x0x96,past_key_values_value_0:A1s2x1x0x96)
    [validate_onnx_model] done (make_feeds)
    [validate_onnx_model] run session on inputs 'inputs2_empty_cache'...
    [validate_onnx_model] done (run)
    [validate_onnx_model] got=#3[A1s2x3x32000,A1s2x1x3x96,A1s2x1x3x96]
    [validate_onnx_model] discrepancies=abs=1.0728836059570312e-06, rel=0.00034613720184126256, n=193152.0
    [validate_onnx_model] -- make_feeds for 'inputs_batch1'...
    [validate_onnx_model] inputs=dict(input_ids:T7s1x3,attention_mask:T7s1x33,position_ids:T7s1x3,past_key_values:DynamicCache(key_cache=#1[T1s1x1x30x96], value_cache=#1[T1s1x1x30x96]))
    [validate_onnx_model] ort inputs=dict(input_ids:A7s1x3,attention_mask:A7s1x33,position_ids:A7s1x3,past_key_values_key_0:A1s1x1x30x96,past_key_values_value_0:A1s1x1x30x96)
    [validate_onnx_model] done (make_feeds)
    [validate_onnx_model] run session on inputs 'inputs2_batch1'...
    [validate_onnx_model] done (run)
    [validate_onnx_model] got=#3[A1s1x3x32000,A1s1x1x33x96,A1s1x1x33x96]
    [validate_onnx_model] discrepancies=abs=1.1026859283447266e-06, rel=0.0003551108443373418, n=102336.0
    [validate_model] run onnxruntime fusion for 'gpt2_tf'
    failed in shape inference <class 'AssertionError'>
    [validate_model] done 'gpt2_tf' in 0.2716229259967804, saved into 'dump_models/arnir0_Tiny-LLM/onnx-dynamo/ir/op18/arnir0_Tiny-LLM-onnx-dynamo-ir-op18.ort.gpt2_tf.onnx'
    [validate_onnx_model] verify onnx model with providers ['CPUExecutionProvider']..., flavour='ortgpt2_tf'
    [validate_onnx_model] runtime is onnxruntime
    [validate_onnx_model] done (ort_session) flavour='ortgpt2_tf'
    [validate_onnx_model] -- keys=[('inputs', 'run_expected', ''), ('inputs2', 'run_expected22', '22'), ('inputs_empty_cache', 'run_expected2_empty_cache', '2_empty_cache'), ('inputs_batch1', 'run_expected2_batch1', '2_batch1')]
    [validate_onnx_model] -- make_feeds for 'inputs'...
    [validate_onnx_model] inputs=dict(input_ids:T7s2x3,attention_mask:T7s2x33,position_ids:T7s2x3,past_key_values:DynamicCache(key_cache=#1[T1s2x1x30x96], value_cache=#1[T1s2x1x30x96]))
    [validate_onnx_model] ort inputs=dict(input_ids:A7s2x3,attention_mask:A7s2x33,position_ids:A7s2x3,past_key_values_key_0:A1s2x1x30x96,past_key_values_value_0:A1s2x1x30x96)
    [validate_onnx_model] done (make_feeds)
    [validate_onnx_model] run session on inputs 'inputs'...
    [validate_onnx_model] done (run)
    [validate_onnx_model] got=#3[A1s2x3x32000,A1s2x1x33x96,A1s2x1x33x96]
    [validate_onnx_model] discrepancies=abs=8.344650268554688e-07, rel=0.00038373230338287646, n=204672.0
    [validate_onnx_model] -- make_feeds for 'inputs2'...
    [validate_onnx_model] inputs=dict(input_ids:T7s3x4,attention_mask:T7s3x35,position_ids:T7s3x4,past_key_values:DynamicCache(key_cache=#1[T1s3x1x31x96], value_cache=#1[T1s3x1x31x96]))
    [validate_onnx_model] ort inputs=dict(input_ids:A7s3x4,attention_mask:A7s3x35,position_ids:A7s3x4,past_key_values_key_0:A1s3x1x31x96,past_key_values_value_0:A1s3x1x31x96)
    [validate_onnx_model] done (make_feeds)
    [validate_onnx_model] run session on inputs 'inputs22'...
    [validate_onnx_model] done (run)
    [validate_onnx_model] got=#3[A1s3x4x32000,A1s3x1x35x96,A1s3x1x35x96]
    [validate_onnx_model] discrepancies=abs=1.0132789611816406e-06, rel=0.00026781923006472165, n=404160.0
    [validate_onnx_model] -- make_feeds for 'inputs_empty_cache'...
    [validate_onnx_model] inputs=dict(input_ids:T7s2x3,attention_mask:T7s2x3,position_ids:T7s2x3,past_key_values:DynamicCache(key_cache=#1[T1s2x1x0x96], value_cache=#1[T1s2x1x0x96]))
    [validate_onnx_model] ort inputs=dict(input_ids:A7s2x3,attention_mask:A7s2x3,position_ids:A7s2x3,past_key_values_key_0:A1s2x1x0x96,past_key_values_value_0:A1s2x1x0x96)
    [validate_onnx_model] done (make_feeds)
    [validate_onnx_model] run session on inputs 'inputs2_empty_cache'...
    [validate_onnx_model] done (run)
    [validate_onnx_model] got=#3[A1s2x3x32000,A1s2x1x3x96,A1s2x1x3x96]
    [validate_onnx_model] discrepancies=abs=1.0728836059570312e-06, rel=0.00034613720184126256, n=193152.0
    [validate_onnx_model] -- make_feeds for 'inputs_batch1'...
    [validate_onnx_model] inputs=dict(input_ids:T7s1x3,attention_mask:T7s1x33,position_ids:T7s1x3,past_key_values:DynamicCache(key_cache=#1[T1s1x1x30x96], value_cache=#1[T1s1x1x30x96]))
    [validate_onnx_model] ort inputs=dict(input_ids:A7s1x3,attention_mask:A7s1x33,position_ids:A7s1x3,past_key_values_key_0:A1s1x1x30x96,past_key_values_value_0:A1s1x1x30x96)
    [validate_onnx_model] done (make_feeds)
    [validate_onnx_model] run session on inputs 'inputs2_batch1'...
    [validate_onnx_model] done (run)
    [validate_onnx_model] got=#3[A1s1x3x32000,A1s1x1x33x96,A1s1x1x33x96]
    [validate_onnx_model] discrepancies=abs=1.1026859283447266e-06, rel=0.0003551108443373418, n=102336.0
    [validate_model] run onnxruntime fusion for 'gpt_neox'
    failed in shape inference <class 'AssertionError'>
    [validate_model] done 'gpt_neox' in 0.298198181000771, saved into 'dump_models/arnir0_Tiny-LLM/onnx-dynamo/ir/op18/arnir0_Tiny-LLM-onnx-dynamo-ir-op18.ort.gpt_neox.onnx'
    [validate_onnx_model] verify onnx model with providers ['CPUExecutionProvider']..., flavour='ortgpt_neox'
    [validate_onnx_model] runtime is onnxruntime
    [validate_onnx_model] done (ort_session) flavour='ortgpt_neox'
    [validate_onnx_model] -- keys=[('inputs', 'run_expected', ''), ('inputs2', 'run_expected22', '22'), ('inputs_empty_cache', 'run_expected2_empty_cache', '2_empty_cache'), ('inputs_batch1', 'run_expected2_batch1', '2_batch1')]
    [validate_onnx_model] -- make_feeds for 'inputs'...
    [validate_onnx_model] inputs=dict(input_ids:T7s2x3,attention_mask:T7s2x33,position_ids:T7s2x3,past_key_values:DynamicCache(key_cache=#1[T1s2x1x30x96], value_cache=#1[T1s2x1x30x96]))
    [validate_onnx_model] ort inputs=dict(input_ids:A7s2x3,attention_mask:A7s2x33,position_ids:A7s2x3,past_key_values_key_0:A1s2x1x30x96,past_key_values_value_0:A1s2x1x30x96)
    [validate_onnx_model] done (make_feeds)
    [validate_onnx_model] run session on inputs 'inputs'...
    [validate_onnx_model] done (run)
    [validate_onnx_model] got=#3[A1s2x3x32000,A1s2x1x33x96,A1s2x1x33x96]
    [validate_onnx_model] discrepancies=abs=8.344650268554688e-07, rel=0.00038373230338287646, n=204672.0
    [validate_onnx_model] -- make_feeds for 'inputs2'...
    [validate_onnx_model] inputs=dict(input_ids:T7s3x4,attention_mask:T7s3x35,position_ids:T7s3x4,past_key_values:DynamicCache(key_cache=#1[T1s3x1x31x96], value_cache=#1[T1s3x1x31x96]))
    [validate_onnx_model] ort inputs=dict(input_ids:A7s3x4,attention_mask:A7s3x35,position_ids:A7s3x4,past_key_values_key_0:A1s3x1x31x96,past_key_values_value_0:A1s3x1x31x96)
    [validate_onnx_model] done (make_feeds)
    [validate_onnx_model] run session on inputs 'inputs22'...
    [validate_onnx_model] done (run)
    [validate_onnx_model] got=#3[A1s3x4x32000,A1s3x1x35x96,A1s3x1x35x96]
    [validate_onnx_model] discrepancies=abs=1.0132789611816406e-06, rel=0.00026781923006472165, n=404160.0
    [validate_onnx_model] -- make_feeds for 'inputs_empty_cache'...
    [validate_onnx_model] inputs=dict(input_ids:T7s2x3,attention_mask:T7s2x3,position_ids:T7s2x3,past_key_values:DynamicCache(key_cache=#1[T1s2x1x0x96], value_cache=#1[T1s2x1x0x96]))
    [validate_onnx_model] ort inputs=dict(input_ids:A7s2x3,attention_mask:A7s2x3,position_ids:A7s2x3,past_key_values_key_0:A1s2x1x0x96,past_key_values_value_0:A1s2x1x0x96)
    [validate_onnx_model] done (make_feeds)
    [validate_onnx_model] run session on inputs 'inputs2_empty_cache'...
    [validate_onnx_model] done (run)
    [validate_onnx_model] got=#3[A1s2x3x32000,A1s2x1x3x96,A1s2x1x3x96]
    [validate_onnx_model] discrepancies=abs=1.0728836059570312e-06, rel=0.00034613720184126256, n=193152.0
    [validate_onnx_model] -- make_feeds for 'inputs_batch1'...
    [validate_onnx_model] inputs=dict(input_ids:T7s1x3,attention_mask:T7s1x33,position_ids:T7s1x3,past_key_values:DynamicCache(key_cache=#1[T1s1x1x30x96], value_cache=#1[T1s1x1x30x96]))
    [validate_onnx_model] ort inputs=dict(input_ids:A7s1x3,attention_mask:A7s1x33,position_ids:A7s1x3,past_key_values_key_0:A1s1x1x30x96,past_key_values_value_0:A1s1x1x30x96)
    [validate_onnx_model] done (make_feeds)
    [validate_onnx_model] run session on inputs 'inputs2_batch1'...
    [validate_onnx_model] done (run)
    [validate_onnx_model] got=#3[A1s1x3x32000,A1s1x1x33x96,A1s1x1x33x96]
    [validate_onnx_model] discrepancies=abs=1.1026859283447266e-06, rel=0.0003551108443373418, n=102336.0
    [validate_model] run onnxruntime fusion for 'mmdit'
    failed in shape inference <class 'AssertionError'>
    [validate_model] done 'mmdit' in 0.30184246999851894, saved into 'dump_models/arnir0_Tiny-LLM/onnx-dynamo/ir/op18/arnir0_Tiny-LLM-onnx-dynamo-ir-op18.ort.mmdit.onnx'
    [validate_onnx_model] verify onnx model with providers ['CPUExecutionProvider']..., flavour='ortmmdit'
    [validate_onnx_model] runtime is onnxruntime
    [validate_onnx_model] done (ort_session) flavour='ortmmdit'
    [validate_onnx_model] -- keys=[('inputs', 'run_expected', ''), ('inputs2', 'run_expected22', '22'), ('inputs_empty_cache', 'run_expected2_empty_cache', '2_empty_cache'), ('inputs_batch1', 'run_expected2_batch1', '2_batch1')]
    [validate_onnx_model] -- make_feeds for 'inputs'...
    [validate_onnx_model] inputs=dict(input_ids:T7s2x3,attention_mask:T7s2x33,position_ids:T7s2x3,past_key_values:DynamicCache(key_cache=#1[T1s2x1x30x96], value_cache=#1[T1s2x1x30x96]))
    [validate_onnx_model] ort inputs=dict(input_ids:A7s2x3,attention_mask:A7s2x33,position_ids:A7s2x3,past_key_values_key_0:A1s2x1x30x96,past_key_values_value_0:A1s2x1x30x96)
    [validate_onnx_model] done (make_feeds)
    [validate_onnx_model] run session on inputs 'inputs'...
    [validate_onnx_model] done (run)
    [validate_onnx_model] got=#3[A1s2x3x32000,A1s2x1x33x96,A1s2x1x33x96]
    [validate_onnx_model] discrepancies=abs=8.344650268554688e-07, rel=0.00038373230338287646, n=204672.0
    [validate_onnx_model] -- make_feeds for 'inputs2'...
    [validate_onnx_model] inputs=dict(input_ids:T7s3x4,attention_mask:T7s3x35,position_ids:T7s3x4,past_key_values:DynamicCache(key_cache=#1[T1s3x1x31x96], value_cache=#1[T1s3x1x31x96]))
    [validate_onnx_model] ort inputs=dict(input_ids:A7s3x4,attention_mask:A7s3x35,position_ids:A7s3x4,past_key_values_key_0:A1s3x1x31x96,past_key_values_value_0:A1s3x1x31x96)
    [validate_onnx_model] done (make_feeds)
    [validate_onnx_model] run session on inputs 'inputs22'...
    [validate_onnx_model] done (run)
    [validate_onnx_model] got=#3[A1s3x4x32000,A1s3x1x35x96,A1s3x1x35x96]
    [validate_onnx_model] discrepancies=abs=1.0132789611816406e-06, rel=0.00026781923006472165, n=404160.0
    [validate_onnx_model] -- make_feeds for 'inputs_empty_cache'...
    [validate_onnx_model] inputs=dict(input_ids:T7s2x3,attention_mask:T7s2x3,position_ids:T7s2x3,past_key_values:DynamicCache(key_cache=#1[T1s2x1x0x96], value_cache=#1[T1s2x1x0x96]))
    [validate_onnx_model] ort inputs=dict(input_ids:A7s2x3,attention_mask:A7s2x3,position_ids:A7s2x3,past_key_values_key_0:A1s2x1x0x96,past_key_values_value_0:A1s2x1x0x96)
    [validate_onnx_model] done (make_feeds)
    [validate_onnx_model] run session on inputs 'inputs2_empty_cache'...
    [validate_onnx_model] done (run)
    [validate_onnx_model] got=#3[A1s2x3x32000,A1s2x1x3x96,A1s2x1x3x96]
    [validate_onnx_model] discrepancies=abs=1.0728836059570312e-06, rel=0.00034613720184126256, n=193152.0
    [validate_onnx_model] -- make_feeds for 'inputs_batch1'...
    [validate_onnx_model] inputs=dict(input_ids:T7s1x3,attention_mask:T7s1x33,position_ids:T7s1x3,past_key_values:DynamicCache(key_cache=#1[T1s1x1x30x96], value_cache=#1[T1s1x1x30x96]))
    [validate_onnx_model] ort inputs=dict(input_ids:A7s1x3,attention_mask:A7s1x33,position_ids:A7s1x3,past_key_values_key_0:A1s1x1x30x96,past_key_values_value_0:A1s1x1x30x96)
    [validate_onnx_model] done (make_feeds)
    [validate_onnx_model] run session on inputs 'inputs2_batch1'...
    [validate_onnx_model] done (run)
    [validate_onnx_model] got=#3[A1s1x3x32000,A1s1x1x33x96,A1s1x1x33x96]
    [validate_onnx_model] discrepancies=abs=1.1026859283447266e-06, rel=0.0003551108443373418, n=102336.0
    [validate_model] run onnxruntime fusion for 'phi'
    [validate_model] done 'phi' in 0.14227298500190955, saved into 'dump_models/arnir0_Tiny-LLM/onnx-dynamo/ir/op18/arnir0_Tiny-LLM-onnx-dynamo-ir-op18.ort.phi.onnx'
    [validate_onnx_model] missing 'dump_models/arnir0_Tiny-LLM/onnx-dynamo/ir/op18/arnir0_Tiny-LLM-onnx-dynamo-ir-op18.ort.phi.onnx'
    [validate_model] run onnxruntime fusion for 'sam2'
    failed in shape inference <class 'AssertionError'>
    [validate_model] done 'sam2' in 0.2520889220031677, saved into 'dump_models/arnir0_Tiny-LLM/onnx-dynamo/ir/op18/arnir0_Tiny-LLM-onnx-dynamo-ir-op18.ort.sam2.onnx'
    [validate_onnx_model] verify onnx model with providers ['CPUExecutionProvider']..., flavour='ortsam2'
    [validate_onnx_model] runtime is onnxruntime
    [validate_onnx_model] done (ort_session) flavour='ortsam2'
    [validate_onnx_model] -- keys=[('inputs', 'run_expected', ''), ('inputs2', 'run_expected22', '22'), ('inputs_empty_cache', 'run_expected2_empty_cache', '2_empty_cache'), ('inputs_batch1', 'run_expected2_batch1', '2_batch1')]
    [validate_onnx_model] -- make_feeds for 'inputs'...
    [validate_onnx_model] inputs=dict(input_ids:T7s2x3,attention_mask:T7s2x33,position_ids:T7s2x3,past_key_values:DynamicCache(key_cache=#1[T1s2x1x30x96], value_cache=#1[T1s2x1x30x96]))
    [validate_onnx_model] ort inputs=dict(input_ids:A7s2x3,attention_mask:A7s2x33,position_ids:A7s2x3,past_key_values_key_0:A1s2x1x30x96,past_key_values_value_0:A1s2x1x30x96)
    [validate_onnx_model] done (make_feeds)
    [validate_onnx_model] run session on inputs 'inputs'...
    [validate_onnx_model] done (run)
    [validate_onnx_model] got=#3[A1s2x3x32000,A1s2x1x33x96,A1s2x1x33x96]
    [validate_onnx_model] discrepancies=abs=7.748603820800781e-07, rel=0.00044309172106863606, n=204672.0
    [validate_onnx_model] -- make_feeds for 'inputs2'...
    [validate_onnx_model] inputs=dict(input_ids:T7s3x4,attention_mask:T7s3x35,position_ids:T7s3x4,past_key_values:DynamicCache(key_cache=#1[T1s3x1x31x96], value_cache=#1[T1s3x1x31x96]))
    [validate_onnx_model] ort inputs=dict(input_ids:A7s3x4,attention_mask:A7s3x35,position_ids:A7s3x4,past_key_values_key_0:A1s3x1x31x96,past_key_values_value_0:A1s3x1x31x96)
    [validate_onnx_model] done (make_feeds)
    [validate_onnx_model] run session on inputs 'inputs22'...
    [validate_onnx_model] done (run)
    [validate_onnx_model] got=#3[A1s3x4x32000,A1s3x1x35x96,A1s3x1x35x96]
    [validate_onnx_model] discrepancies=abs=8.344650268554688e-07, rel=0.00032628376058705446, n=404160.0
    [validate_onnx_model] -- make_feeds for 'inputs_empty_cache'...
    [validate_onnx_model] inputs=dict(input_ids:T7s2x3,attention_mask:T7s2x3,position_ids:T7s2x3,past_key_values:DynamicCache(key_cache=#1[T1s2x1x0x96], value_cache=#1[T1s2x1x0x96]))
    [validate_onnx_model] ort inputs=dict(input_ids:A7s2x3,attention_mask:A7s2x3,position_ids:A7s2x3,past_key_values_key_0:A1s2x1x0x96,past_key_values_value_0:A1s2x1x0x96)
    [validate_onnx_model] done (make_feeds)
    [validate_onnx_model] run session on inputs 'inputs2_empty_cache'...
    [validate_onnx_model] done (run)
    [validate_onnx_model] got=#3[A1s2x3x32000,A1s2x1x3x96,A1s2x1x3x96]
    [validate_onnx_model] discrepancies=abs=6.556510925292969e-07, rel=0.0002520801563113858, n=193152.0
    [validate_onnx_model] -- make_feeds for 'inputs_batch1'...
    [validate_onnx_model] inputs=dict(input_ids:T7s1x3,attention_mask:T7s1x33,position_ids:T7s1x3,past_key_values:DynamicCache(key_cache=#1[T1s1x1x30x96], value_cache=#1[T1s1x1x30x96]))
    [validate_onnx_model] ort inputs=dict(input_ids:A7s1x3,attention_mask:A7s1x33,position_ids:A7s1x3,past_key_values_key_0:A1s1x1x30x96,past_key_values_value_0:A1s1x1x30x96)
    [validate_onnx_model] done (make_feeds)
    [validate_onnx_model] run session on inputs 'inputs2_batch1'...
    [validate_onnx_model] done (run)
    [validate_onnx_model] got=#3[A1s1x3x32000,A1s1x1x33x96,A1s1x1x33x96]
    [validate_onnx_model] discrepancies=abs=7.748603820800781e-07, rel=0.00037692802454639585, n=102336.0
    [validate_model] run onnxruntime fusion for 'swin'
    failed in shape inference <class 'AssertionError'>
    [validate_model] done 'swin' in 0.2813860750029562, saved into 'dump_models/arnir0_Tiny-LLM/onnx-dynamo/ir/op18/arnir0_Tiny-LLM-onnx-dynamo-ir-op18.ort.swin.onnx'
    [validate_onnx_model] verify onnx model with providers ['CPUExecutionProvider']..., flavour='ortswin'
    [validate_onnx_model] runtime is onnxruntime
    [validate_onnx_model] done (ort_session) flavour='ortswin'
    [validate_onnx_model] -- keys=[('inputs', 'run_expected', ''), ('inputs2', 'run_expected22', '22'), ('inputs_empty_cache', 'run_expected2_empty_cache', '2_empty_cache'), ('inputs_batch1', 'run_expected2_batch1', '2_batch1')]
    [validate_onnx_model] -- make_feeds for 'inputs'...
    [validate_onnx_model] inputs=dict(input_ids:T7s2x3,attention_mask:T7s2x33,position_ids:T7s2x3,past_key_values:DynamicCache(key_cache=#1[T1s2x1x30x96], value_cache=#1[T1s2x1x30x96]))
    [validate_onnx_model] ort inputs=dict(input_ids:A7s2x3,attention_mask:A7s2x33,position_ids:A7s2x3,past_key_values_key_0:A1s2x1x30x96,past_key_values_value_0:A1s2x1x30x96)
    [validate_onnx_model] done (make_feeds)
    [validate_onnx_model] run session on inputs 'inputs'...
    [validate_onnx_model] done (run)
    [validate_onnx_model] got=#3[A1s2x3x32000,A1s2x1x33x96,A1s2x1x33x96]
    [validate_onnx_model] discrepancies=abs=8.344650268554688e-07, rel=0.00038373230338287646, n=204672.0
    [validate_onnx_model] -- make_feeds for 'inputs2'...
    [validate_onnx_model] inputs=dict(input_ids:T7s3x4,attention_mask:T7s3x35,position_ids:T7s3x4,past_key_values:DynamicCache(key_cache=#1[T1s3x1x31x96], value_cache=#1[T1s3x1x31x96]))
    [validate_onnx_model] ort inputs=dict(input_ids:A7s3x4,attention_mask:A7s3x35,position_ids:A7s3x4,past_key_values_key_0:A1s3x1x31x96,past_key_values_value_0:A1s3x1x31x96)
    [validate_onnx_model] done (make_feeds)
    [validate_onnx_model] run session on inputs 'inputs22'...
    [validate_onnx_model] done (run)
    [validate_onnx_model] got=#3[A1s3x4x32000,A1s3x1x35x96,A1s3x1x35x96]
    [validate_onnx_model] discrepancies=abs=1.0132789611816406e-06, rel=0.00026781923006472165, n=404160.0
    [validate_onnx_model] -- make_feeds for 'inputs_empty_cache'...
    [validate_onnx_model] inputs=dict(input_ids:T7s2x3,attention_mask:T7s2x3,position_ids:T7s2x3,past_key_values:DynamicCache(key_cache=#1[T1s2x1x0x96], value_cache=#1[T1s2x1x0x96]))
    [validate_onnx_model] ort inputs=dict(input_ids:A7s2x3,attention_mask:A7s2x3,position_ids:A7s2x3,past_key_values_key_0:A1s2x1x0x96,past_key_values_value_0:A1s2x1x0x96)
    [validate_onnx_model] done (make_feeds)
    [validate_onnx_model] run session on inputs 'inputs2_empty_cache'...
    [validate_onnx_model] done (run)
    [validate_onnx_model] got=#3[A1s2x3x32000,A1s2x1x3x96,A1s2x1x3x96]
    [validate_onnx_model] discrepancies=abs=1.0728836059570312e-06, rel=0.00034613720184126256, n=193152.0
    [validate_onnx_model] -- make_feeds for 'inputs_batch1'...
    [validate_onnx_model] inputs=dict(input_ids:T7s1x3,attention_mask:T7s1x33,position_ids:T7s1x3,past_key_values:DynamicCache(key_cache=#1[T1s1x1x30x96], value_cache=#1[T1s1x1x30x96]))
    [validate_onnx_model] ort inputs=dict(input_ids:A7s1x3,attention_mask:A7s1x33,position_ids:A7s1x3,past_key_values_key_0:A1s1x1x30x96,past_key_values_value_0:A1s1x1x30x96)
    [validate_onnx_model] done (make_feeds)
    [validate_onnx_model] run session on inputs 'inputs2_batch1'...
    [validate_onnx_model] done (run)
    [validate_onnx_model] got=#3[A1s1x3x32000,A1s1x1x33x96,A1s1x1x33x96]
    [validate_onnx_model] discrepancies=abs=1.1026859283447266e-06, rel=0.0003551108443373418, n=102336.0
    [validate_model] run onnxruntime fusion for 't5'
    failed in shape inference <class 'AssertionError'>
    [validate_model] done 't5' in 0.3476416880002944, saved into 'dump_models/arnir0_Tiny-LLM/onnx-dynamo/ir/op18/arnir0_Tiny-LLM-onnx-dynamo-ir-op18.ort.t5.onnx'
    [validate_onnx_model] verify onnx model with providers ['CPUExecutionProvider']..., flavour='ortt5'
    [validate_onnx_model] runtime is onnxruntime
    [validate_onnx_model] done (ort_session) flavour='ortt5'
    [validate_onnx_model] -- keys=[('inputs', 'run_expected', ''), ('inputs2', 'run_expected22', '22'), ('inputs_empty_cache', 'run_expected2_empty_cache', '2_empty_cache'), ('inputs_batch1', 'run_expected2_batch1', '2_batch1')]
    [validate_onnx_model] -- make_feeds for 'inputs'...
    [validate_onnx_model] inputs=dict(input_ids:T7s2x3,attention_mask:T7s2x33,position_ids:T7s2x3,past_key_values:DynamicCache(key_cache=#1[T1s2x1x30x96], value_cache=#1[T1s2x1x30x96]))
    [validate_onnx_model] ort inputs=dict(input_ids:A7s2x3,attention_mask:A7s2x33,position_ids:A7s2x3,past_key_values_key_0:A1s2x1x30x96,past_key_values_value_0:A1s2x1x30x96)
    [validate_onnx_model] done (make_feeds)
    [validate_onnx_model] run session on inputs 'inputs'...
    [validate_onnx_model] done (run)
    [validate_onnx_model] got=#3[A1s2x3x32000,A1s2x1x33x96,A1s2x1x33x96]
    [validate_onnx_model] discrepancies=abs=8.344650268554688e-07, rel=0.00038373230338287646, n=204672.0
    [validate_onnx_model] -- make_feeds for 'inputs2'...
    [validate_onnx_model] inputs=dict(input_ids:T7s3x4,attention_mask:T7s3x35,position_ids:T7s3x4,past_key_values:DynamicCache(key_cache=#1[T1s3x1x31x96], value_cache=#1[T1s3x1x31x96]))
    [validate_onnx_model] ort inputs=dict(input_ids:A7s3x4,attention_mask:A7s3x35,position_ids:A7s3x4,past_key_values_key_0:A1s3x1x31x96,past_key_values_value_0:A1s3x1x31x96)
    [validate_onnx_model] done (make_feeds)
    [validate_onnx_model] run session on inputs 'inputs22'...
    [validate_onnx_model] done (run)
    [validate_onnx_model] got=#3[A1s3x4x32000,A1s3x1x35x96,A1s3x1x35x96]
    [validate_onnx_model] discrepancies=abs=1.0132789611816406e-06, rel=0.00026781923006472165, n=404160.0
    [validate_onnx_model] -- make_feeds for 'inputs_empty_cache'...
    [validate_onnx_model] inputs=dict(input_ids:T7s2x3,attention_mask:T7s2x3,position_ids:T7s2x3,past_key_values:DynamicCache(key_cache=#1[T1s2x1x0x96], value_cache=#1[T1s2x1x0x96]))
    [validate_onnx_model] ort inputs=dict(input_ids:A7s2x3,attention_mask:A7s2x3,position_ids:A7s2x3,past_key_values_key_0:A1s2x1x0x96,past_key_values_value_0:A1s2x1x0x96)
    [validate_onnx_model] done (make_feeds)
    [validate_onnx_model] run session on inputs 'inputs2_empty_cache'...
    [validate_onnx_model] done (run)
    [validate_onnx_model] got=#3[A1s2x3x32000,A1s2x1x3x96,A1s2x1x3x96]
    [validate_onnx_model] discrepancies=abs=1.0728836059570312e-06, rel=0.00034613720184126256, n=193152.0
    [validate_onnx_model] -- make_feeds for 'inputs_batch1'...
    [validate_onnx_model] inputs=dict(input_ids:T7s1x3,attention_mask:T7s1x33,position_ids:T7s1x3,past_key_values:DynamicCache(key_cache=#1[T1s1x1x30x96], value_cache=#1[T1s1x1x30x96]))
    [validate_onnx_model] ort inputs=dict(input_ids:A7s1x3,attention_mask:A7s1x33,position_ids:A7s1x3,past_key_values_key_0:A1s1x1x30x96,past_key_values_value_0:A1s1x1x30x96)
    [validate_onnx_model] done (make_feeds)
    [validate_onnx_model] run session on inputs 'inputs2_batch1'...
    [validate_onnx_model] done (run)
    [validate_onnx_model] got=#3[A1s1x3x32000,A1s1x1x33x96,A1s1x1x33x96]
    [validate_onnx_model] discrepancies=abs=1.1026859283447266e-06, rel=0.0003551108443373418, n=102336.0
    [validate_model] run onnxruntime fusion for 'tnlr'
    failed in shape inference <class 'AssertionError'>
    [validate_model] done 'tnlr' in 0.28425363099813694, saved into 'dump_models/arnir0_Tiny-LLM/onnx-dynamo/ir/op18/arnir0_Tiny-LLM-onnx-dynamo-ir-op18.ort.tnlr.onnx'
    [validate_onnx_model] verify onnx model with providers ['CPUExecutionProvider']..., flavour='orttnlr'
    [validate_onnx_model] runtime is onnxruntime
    [validate_onnx_model] done (ort_session) flavour='orttnlr'
    [validate_onnx_model] -- keys=[('inputs', 'run_expected', ''), ('inputs2', 'run_expected22', '22'), ('inputs_empty_cache', 'run_expected2_empty_cache', '2_empty_cache'), ('inputs_batch1', 'run_expected2_batch1', '2_batch1')]
    [validate_onnx_model] -- make_feeds for 'inputs'...
    [validate_onnx_model] inputs=dict(input_ids:T7s2x3,attention_mask:T7s2x33,position_ids:T7s2x3,past_key_values:DynamicCache(key_cache=#1[T1s2x1x30x96], value_cache=#1[T1s2x1x30x96]))
    [validate_onnx_model] ort inputs=dict(input_ids:A7s2x3,attention_mask:A7s2x33,position_ids:A7s2x3,past_key_values_key_0:A1s2x1x30x96,past_key_values_value_0:A1s2x1x30x96)
    [validate_onnx_model] done (make_feeds)
    [validate_onnx_model] run session on inputs 'inputs'...
    [validate_onnx_model] done (run)
    [validate_onnx_model] got=#3[A1s2x3x32000,A1s2x1x33x96,A1s2x1x33x96]
    [validate_onnx_model] discrepancies=abs=8.344650268554688e-07, rel=0.00038373230338287646, n=204672.0
    [validate_onnx_model] -- make_feeds for 'inputs2'...
    [validate_onnx_model] inputs=dict(input_ids:T7s3x4,attention_mask:T7s3x35,position_ids:T7s3x4,past_key_values:DynamicCache(key_cache=#1[T1s3x1x31x96], value_cache=#1[T1s3x1x31x96]))
    [validate_onnx_model] ort inputs=dict(input_ids:A7s3x4,attention_mask:A7s3x35,position_ids:A7s3x4,past_key_values_key_0:A1s3x1x31x96,past_key_values_value_0:A1s3x1x31x96)
    [validate_onnx_model] done (make_feeds)
    [validate_onnx_model] run session on inputs 'inputs22'...
    [validate_onnx_model] done (run)
    [validate_onnx_model] got=#3[A1s3x4x32000,A1s3x1x35x96,A1s3x1x35x96]
    [validate_onnx_model] discrepancies=abs=1.0132789611816406e-06, rel=0.00026781923006472165, n=404160.0
    [validate_onnx_model] -- make_feeds for 'inputs_empty_cache'...
    [validate_onnx_model] inputs=dict(input_ids:T7s2x3,attention_mask:T7s2x3,position_ids:T7s2x3,past_key_values:DynamicCache(key_cache=#1[T1s2x1x0x96], value_cache=#1[T1s2x1x0x96]))
    [validate_onnx_model] ort inputs=dict(input_ids:A7s2x3,attention_mask:A7s2x3,position_ids:A7s2x3,past_key_values_key_0:A1s2x1x0x96,past_key_values_value_0:A1s2x1x0x96)
    [validate_onnx_model] done (make_feeds)
    [validate_onnx_model] run session on inputs 'inputs2_empty_cache'...
    [validate_onnx_model] done (run)
    [validate_onnx_model] got=#3[A1s2x3x32000,A1s2x1x3x96,A1s2x1x3x96]
    [validate_onnx_model] discrepancies=abs=1.0728836059570312e-06, rel=0.00034613720184126256, n=193152.0
    [validate_onnx_model] -- make_feeds for 'inputs_batch1'...
    [validate_onnx_model] inputs=dict(input_ids:T7s1x3,attention_mask:T7s1x33,position_ids:T7s1x3,past_key_values:DynamicCache(key_cache=#1[T1s1x1x30x96], value_cache=#1[T1s1x1x30x96]))
    [validate_onnx_model] ort inputs=dict(input_ids:A7s1x3,attention_mask:A7s1x33,position_ids:A7s1x3,past_key_values_key_0:A1s1x1x30x96,past_key_values_value_0:A1s1x1x30x96)
    [validate_onnx_model] done (make_feeds)
    [validate_onnx_model] run session on inputs 'inputs2_batch1'...
    [validate_onnx_model] done (run)
    [validate_onnx_model] got=#3[A1s1x3x32000,A1s1x1x33x96,A1s1x1x33x96]
    [validate_onnx_model] discrepancies=abs=1.1026859283447266e-06, rel=0.0003551108443373418, n=102336.0
    [validate_model] run onnxruntime fusion for 'unet'
    failed in shape inference <class 'AssertionError'>
    [validate_model] done 'unet' in 0.3744417580019217, saved into 'dump_models/arnir0_Tiny-LLM/onnx-dynamo/ir/op18/arnir0_Tiny-LLM-onnx-dynamo-ir-op18.ort.unet.onnx'
    [validate_onnx_model] verify onnx model with providers ['CPUExecutionProvider']..., flavour='ortunet'
    [validate_onnx_model] runtime is onnxruntime
    [validate_onnx_model] done (ort_session) flavour='ortunet'
    [validate_onnx_model] -- keys=[('inputs', 'run_expected', ''), ('inputs2', 'run_expected22', '22'), ('inputs_empty_cache', 'run_expected2_empty_cache', '2_empty_cache'), ('inputs_batch1', 'run_expected2_batch1', '2_batch1')]
    [validate_onnx_model] -- make_feeds for 'inputs'...
    [validate_onnx_model] inputs=dict(input_ids:T7s2x3,attention_mask:T7s2x33,position_ids:T7s2x3,past_key_values:DynamicCache(key_cache=#1[T1s2x1x30x96], value_cache=#1[T1s2x1x30x96]))
    [validate_onnx_model] ort inputs=dict(input_ids:A7s2x3,attention_mask:A7s2x33,position_ids:A7s2x3,past_key_values_key_0:A1s2x1x30x96,past_key_values_value_0:A1s2x1x30x96)
    [validate_onnx_model] done (make_feeds)
    [validate_onnx_model] run session on inputs 'inputs'...
    [validate_onnx_model] done (run)
    [validate_onnx_model] got=#3[A1s2x3x32000,A1s2x1x33x96,A1s2x1x33x96]
    [validate_onnx_model] discrepancies=abs=7.748603820800781e-07, rel=0.00044309172106863606, n=204672.0
    [validate_onnx_model] -- make_feeds for 'inputs2'...
    [validate_onnx_model] inputs=dict(input_ids:T7s3x4,attention_mask:T7s3x35,position_ids:T7s3x4,past_key_values:DynamicCache(key_cache=#1[T1s3x1x31x96], value_cache=#1[T1s3x1x31x96]))
    [validate_onnx_model] ort inputs=dict(input_ids:A7s3x4,attention_mask:A7s3x35,position_ids:A7s3x4,past_key_values_key_0:A1s3x1x31x96,past_key_values_value_0:A1s3x1x31x96)
    [validate_onnx_model] done (make_feeds)
    [validate_onnx_model] run session on inputs 'inputs22'...
    [validate_onnx_model] done (run)
    [validate_onnx_model] got=#3[A1s3x4x32000,A1s3x1x35x96,A1s3x1x35x96]
    [validate_onnx_model] discrepancies=abs=8.344650268554688e-07, rel=0.00032628376058705446, n=404160.0
    [validate_onnx_model] -- make_feeds for 'inputs_empty_cache'...
    [validate_onnx_model] inputs=dict(input_ids:T7s2x3,attention_mask:T7s2x3,position_ids:T7s2x3,past_key_values:DynamicCache(key_cache=#1[T1s2x1x0x96], value_cache=#1[T1s2x1x0x96]))
    [validate_onnx_model] ort inputs=dict(input_ids:A7s2x3,attention_mask:A7s2x3,position_ids:A7s2x3,past_key_values_key_0:A1s2x1x0x96,past_key_values_value_0:A1s2x1x0x96)
    [validate_onnx_model] done (make_feeds)
    [validate_onnx_model] run session on inputs 'inputs2_empty_cache'...
    [validate_onnx_model] done (run)
    [validate_onnx_model] got=#3[A1s2x3x32000,A1s2x1x3x96,A1s2x1x3x96]
    [validate_onnx_model] discrepancies=abs=6.556510925292969e-07, rel=0.0002520801563113858, n=193152.0
    [validate_onnx_model] -- make_feeds for 'inputs_batch1'...
    [validate_onnx_model] inputs=dict(input_ids:T7s1x3,attention_mask:T7s1x33,position_ids:T7s1x3,past_key_values:DynamicCache(key_cache=#1[T1s1x1x30x96], value_cache=#1[T1s1x1x30x96]))
    [validate_onnx_model] ort inputs=dict(input_ids:A7s1x3,attention_mask:A7s1x33,position_ids:A7s1x3,past_key_values_key_0:A1s1x1x30x96,past_key_values_value_0:A1s1x1x30x96)
    [validate_onnx_model] done (make_feeds)
    [validate_onnx_model] run session on inputs 'inputs2_batch1'...
    [validate_onnx_model] done (run)
    [validate_onnx_model] got=#3[A1s1x3x32000,A1s1x1x33x96,A1s1x1x33x96]
    [validate_onnx_model] discrepancies=abs=7.748603820800781e-07, rel=0.00037692802454639585, n=102336.0
    [validate_model] run onnxruntime fusion for 'vae'
    failed in shape inference <class 'AssertionError'>
    [validate_model] done 'vae' in 0.3916202269974747, saved into 'dump_models/arnir0_Tiny-LLM/onnx-dynamo/ir/op18/arnir0_Tiny-LLM-onnx-dynamo-ir-op18.ort.vae.onnx'
    [validate_onnx_model] verify onnx model with providers ['CPUExecutionProvider']..., flavour='ortvae'
    [validate_onnx_model] runtime is onnxruntime
    [validate_onnx_model] done (ort_session) flavour='ortvae'
    [validate_onnx_model] -- keys=[('inputs', 'run_expected', ''), ('inputs2', 'run_expected22', '22'), ('inputs_empty_cache', 'run_expected2_empty_cache', '2_empty_cache'), ('inputs_batch1', 'run_expected2_batch1', '2_batch1')]
    [validate_onnx_model] -- make_feeds for 'inputs'...
    [validate_onnx_model] inputs=dict(input_ids:T7s2x3,attention_mask:T7s2x33,position_ids:T7s2x3,past_key_values:DynamicCache(key_cache=#1[T1s2x1x30x96], value_cache=#1[T1s2x1x30x96]))
    [validate_onnx_model] ort inputs=dict(input_ids:A7s2x3,attention_mask:A7s2x33,position_ids:A7s2x3,past_key_values_key_0:A1s2x1x30x96,past_key_values_value_0:A1s2x1x30x96)
    [validate_onnx_model] done (make_feeds)
    [validate_onnx_model] run session on inputs 'inputs'...
    [validate_onnx_model] done (run)
    [validate_onnx_model] got=#3[A1s2x3x32000,A1s2x1x33x96,A1s2x1x33x96]
    [validate_onnx_model] discrepancies=abs=7.748603820800781e-07, rel=0.00044309172106863606, n=204672.0
    [validate_onnx_model] -- make_feeds for 'inputs2'...
    [validate_onnx_model] inputs=dict(input_ids:T7s3x4,attention_mask:T7s3x35,position_ids:T7s3x4,past_key_values:DynamicCache(key_cache=#1[T1s3x1x31x96], value_cache=#1[T1s3x1x31x96]))
    [validate_onnx_model] ort inputs=dict(input_ids:A7s3x4,attention_mask:A7s3x35,position_ids:A7s3x4,past_key_values_key_0:A1s3x1x31x96,past_key_values_value_0:A1s3x1x31x96)
    [validate_onnx_model] done (make_feeds)
    [validate_onnx_model] run session on inputs 'inputs22'...
    [validate_onnx_model] done (run)
    [validate_onnx_model] got=#3[A1s3x4x32000,A1s3x1x35x96,A1s3x1x35x96]
    [validate_onnx_model] discrepancies=abs=8.344650268554688e-07, rel=0.00032628376058705446, n=404160.0
    [validate_onnx_model] -- make_feeds for 'inputs_empty_cache'...
    [validate_onnx_model] inputs=dict(input_ids:T7s2x3,attention_mask:T7s2x3,position_ids:T7s2x3,past_key_values:DynamicCache(key_cache=#1[T1s2x1x0x96], value_cache=#1[T1s2x1x0x96]))
    [validate_onnx_model] ort inputs=dict(input_ids:A7s2x3,attention_mask:A7s2x3,position_ids:A7s2x3,past_key_values_key_0:A1s2x1x0x96,past_key_values_value_0:A1s2x1x0x96)
    [validate_onnx_model] done (make_feeds)
    [validate_onnx_model] run session on inputs 'inputs2_empty_cache'...
    [validate_onnx_model] done (run)
    [validate_onnx_model] got=#3[A1s2x3x32000,A1s2x1x3x96,A1s2x1x3x96]
    [validate_onnx_model] discrepancies=abs=6.556510925292969e-07, rel=0.0002520801563113858, n=193152.0
    [validate_onnx_model] -- make_feeds for 'inputs_batch1'...
    [validate_onnx_model] inputs=dict(input_ids:T7s1x3,attention_mask:T7s1x33,position_ids:T7s1x3,past_key_values:DynamicCache(key_cache=#1[T1s1x1x30x96], value_cache=#1[T1s1x1x30x96]))
    [validate_onnx_model] ort inputs=dict(input_ids:A7s1x3,attention_mask:A7s1x33,position_ids:A7s1x3,past_key_values_key_0:A1s1x1x30x96,past_key_values_value_0:A1s1x1x30x96)
    [validate_onnx_model] done (make_feeds)
    [validate_onnx_model] run session on inputs 'inputs2_batch1'...
    [validate_onnx_model] done (run)
    [validate_onnx_model] got=#3[A1s1x3x32000,A1s1x1x33x96,A1s1x1x33x96]
    [validate_onnx_model] discrepancies=abs=7.748603820800781e-07, rel=0.00037692802454639585, n=102336.0
    [validate_model] run onnxruntime fusion for 'vit'
    failed in shape inference <class 'AssertionError'>
    [validate_model] done 'vit' in 0.183626402002119, saved into 'dump_models/arnir0_Tiny-LLM/onnx-dynamo/ir/op18/arnir0_Tiny-LLM-onnx-dynamo-ir-op18.ort.vit.onnx'
    [validate_onnx_model] verify onnx model with providers ['CPUExecutionProvider']..., flavour='ortvit'
    [validate_onnx_model] runtime is onnxruntime
    [validate_onnx_model] done (ort_session) flavour='ortvit'
    [validate_onnx_model] -- keys=[('inputs', 'run_expected', ''), ('inputs2', 'run_expected22', '22'), ('inputs_empty_cache', 'run_expected2_empty_cache', '2_empty_cache'), ('inputs_batch1', 'run_expected2_batch1', '2_batch1')]
    [validate_onnx_model] -- make_feeds for 'inputs'...
    [validate_onnx_model] inputs=dict(input_ids:T7s2x3,attention_mask:T7s2x33,position_ids:T7s2x3,past_key_values:DynamicCache(key_cache=#1[T1s2x1x30x96], value_cache=#1[T1s2x1x30x96]))
    [validate_onnx_model] ort inputs=dict(input_ids:A7s2x3,attention_mask:A7s2x33,position_ids:A7s2x3,past_key_values_key_0:A1s2x1x30x96,past_key_values_value_0:A1s2x1x30x96)
    [validate_onnx_model] done (make_feeds)
    [validate_onnx_model] run session on inputs 'inputs'...
    [validate_onnx_model] done (run)
    [validate_onnx_model] got=#3[A1s2x3x32000,A1s2x1x33x96,A1s2x1x33x96]
    [validate_onnx_model] discrepancies=abs=8.344650268554688e-07, rel=0.00038373230338287646, n=204672.0
    [validate_onnx_model] -- make_feeds for 'inputs2'...
    [validate_onnx_model] inputs=dict(input_ids:T7s3x4,attention_mask:T7s3x35,position_ids:T7s3x4,past_key_values:DynamicCache(key_cache=#1[T1s3x1x31x96], value_cache=#1[T1s3x1x31x96]))
    [validate_onnx_model] ort inputs=dict(input_ids:A7s3x4,attention_mask:A7s3x35,position_ids:A7s3x4,past_key_values_key_0:A1s3x1x31x96,past_key_values_value_0:A1s3x1x31x96)
    [validate_onnx_model] done (make_feeds)
    [validate_onnx_model] run session on inputs 'inputs22'...
    [validate_onnx_model] done (run)
    [validate_onnx_model] got=#3[A1s3x4x32000,A1s3x1x35x96,A1s3x1x35x96]
    [validate_onnx_model] discrepancies=abs=1.0132789611816406e-06, rel=0.00026781923006472165, n=404160.0
    [validate_onnx_model] -- make_feeds for 'inputs_empty_cache'...
    [validate_onnx_model] inputs=dict(input_ids:T7s2x3,attention_mask:T7s2x3,position_ids:T7s2x3,past_key_values:DynamicCache(key_cache=#1[T1s2x1x0x96], value_cache=#1[T1s2x1x0x96]))
    [validate_onnx_model] ort inputs=dict(input_ids:A7s2x3,attention_mask:A7s2x3,position_ids:A7s2x3,past_key_values_key_0:A1s2x1x0x96,past_key_values_value_0:A1s2x1x0x96)
    [validate_onnx_model] done (make_feeds)
    [validate_onnx_model] run session on inputs 'inputs2_empty_cache'...
    [validate_onnx_model] done (run)
    [validate_onnx_model] got=#3[A1s2x3x32000,A1s2x1x3x96,A1s2x1x3x96]
    [validate_onnx_model] discrepancies=abs=1.0728836059570312e-06, rel=0.00034613720184126256, n=193152.0
    [validate_onnx_model] -- make_feeds for 'inputs_batch1'...
    [validate_onnx_model] inputs=dict(input_ids:T7s1x3,attention_mask:T7s1x33,position_ids:T7s1x3,past_key_values:DynamicCache(key_cache=#1[T1s1x1x30x96], value_cache=#1[T1s1x1x30x96]))
    [validate_onnx_model] ort inputs=dict(input_ids:A7s1x3,attention_mask:A7s1x33,position_ids:A7s1x3,past_key_values_key_0:A1s1x1x30x96,past_key_values_value_0:A1s1x1x30x96)
    [validate_onnx_model] done (make_feeds)
    [validate_onnx_model] run session on inputs 'inputs2_batch1'...
    [validate_onnx_model] done (run)
    [validate_onnx_model] got=#3[A1s1x3x32000,A1s1x1x33x96,A1s1x1x33x96]
    [validate_onnx_model] discrepancies=abs=1.1026859283447266e-06, rel=0.0003551108443373418, n=102336.0
    [validate_model] -- done (final)
    
    -- summary --
    :ERR_onnx_missing_ortphi,FileNotFoundError('dump_models/arnir0_Tiny-LLM/onnx-dynamo/ir/op18/arnir0_Tiny-LLM-onnx-dynamo-ir-op18.ort.phi.onnx');
    :ERR_opt_ort_phi,'method' object is not iterable;
    :disc_onnx_ort_run22_abs,8.344650268554688e-07;
    :disc_onnx_ort_run22_abs_ortbart,1.0132789611816406e-06;
    :disc_onnx_ort_run22_abs_ortbert,1.0132789611816406e-06;
    :disc_onnx_ort_run22_abs_ortbert_keras,1.0132789611816406e-06;
    :disc_onnx_ort_run22_abs_ortbert_tf,1.0132789611816406e-06;
    :disc_onnx_ort_run22_abs_ortclip,1.0132789611816406e-06;
    :disc_onnx_ort_run22_abs_ortconformer,1.0132789611816406e-06;
    :disc_onnx_ort_run22_abs_ortgpt2,1.0132789611816406e-06;
    :disc_onnx_ort_run22_abs_ortgpt2_tf,1.0132789611816406e-06;
    :disc_onnx_ort_run22_abs_ortgpt_neox,1.0132789611816406e-06;
    :disc_onnx_ort_run22_abs_ortmmdit,1.0132789611816406e-06;
    :disc_onnx_ort_run22_abs_ortsam2,8.344650268554688e-07;
    :disc_onnx_ort_run22_abs_ortswin,1.0132789611816406e-06;
    :disc_onnx_ort_run22_abs_ortt5,1.0132789611816406e-06;
    :disc_onnx_ort_run22_abs_orttnlr,1.0132789611816406e-06;
    :disc_onnx_ort_run22_abs_ortunet,8.344650268554688e-07;
    :disc_onnx_ort_run22_abs_ortvae,8.344650268554688e-07;
    :disc_onnx_ort_run22_abs_ortvit,1.0132789611816406e-06;
    :disc_onnx_ort_run22_dnan,0;
    :disc_onnx_ort_run22_dnan_ortbart,0;
    :disc_onnx_ort_run22_dnan_ortbert,0;
    :disc_onnx_ort_run22_dnan_ortbert_keras,0;
    :disc_onnx_ort_run22_dnan_ortbert_tf,0;
    :disc_onnx_ort_run22_dnan_ortclip,0;
    :disc_onnx_ort_run22_dnan_ortconformer,0;
    :disc_onnx_ort_run22_dnan_ortgpt2,0;
    :disc_onnx_ort_run22_dnan_ortgpt2_tf,0;
    :disc_onnx_ort_run22_dnan_ortgpt_neox,0;
    :disc_onnx_ort_run22_dnan_ortmmdit,0;
    :disc_onnx_ort_run22_dnan_ortsam2,0;
    :disc_onnx_ort_run22_dnan_ortswin,0;
    :disc_onnx_ort_run22_dnan_ortt5,0;
    :disc_onnx_ort_run22_dnan_orttnlr,0;
    :disc_onnx_ort_run22_dnan_ortunet,0;
    :disc_onnx_ort_run22_dnan_ortvae,0;
    :disc_onnx_ort_run22_dnan_ortvit,0;
    :disc_onnx_ort_run22_n,404160.0;
    :disc_onnx_ort_run22_n_ortbart,404160.0;
    :disc_onnx_ort_run22_n_ortbert,404160.0;
    :disc_onnx_ort_run22_n_ortbert_keras,404160.0;
    :disc_onnx_ort_run22_n_ortbert_tf,404160.0;
    :disc_onnx_ort_run22_n_ortclip,404160.0;
    :disc_onnx_ort_run22_n_ortconformer,404160.0;
    :disc_onnx_ort_run22_n_ortgpt2,404160.0;
    :disc_onnx_ort_run22_n_ortgpt2_tf,404160.0;
    :disc_onnx_ort_run22_n_ortgpt_neox,404160.0;
    :disc_onnx_ort_run22_n_ortmmdit,404160.0;
    :disc_onnx_ort_run22_n_ortsam2,404160.0;
    :disc_onnx_ort_run22_n_ortswin,404160.0;
    :disc_onnx_ort_run22_n_ortt5,404160.0;
    :disc_onnx_ort_run22_n_orttnlr,404160.0;
    :disc_onnx_ort_run22_n_ortunet,404160.0;
    :disc_onnx_ort_run22_n_ortvae,404160.0;
    :disc_onnx_ort_run22_n_ortvit,404160.0;
    :disc_onnx_ort_run22_rel,0.00032628376058705446;
    :disc_onnx_ort_run22_rel_ortbart,0.00026781923006472165;
    :disc_onnx_ort_run22_rel_ortbert,0.00026781923006472165;
    :disc_onnx_ort_run22_rel_ortbert_keras,0.00026781923006472165;
    :disc_onnx_ort_run22_rel_ortbert_tf,0.00026781923006472165;
    :disc_onnx_ort_run22_rel_ortclip,0.00026781923006472165;
    :disc_onnx_ort_run22_rel_ortconformer,0.00026781923006472165;
    :disc_onnx_ort_run22_rel_ortgpt2,0.00026781923006472165;
    :disc_onnx_ort_run22_rel_ortgpt2_tf,0.00026781923006472165;
    :disc_onnx_ort_run22_rel_ortgpt_neox,0.00026781923006472165;
    :disc_onnx_ort_run22_rel_ortmmdit,0.00026781923006472165;
    :disc_onnx_ort_run22_rel_ortsam2,0.00032628376058705446;
    :disc_onnx_ort_run22_rel_ortswin,0.00026781923006472165;
    :disc_onnx_ort_run22_rel_ortt5,0.00026781923006472165;
    :disc_onnx_ort_run22_rel_orttnlr,0.00026781923006472165;
    :disc_onnx_ort_run22_rel_ortunet,0.00032628376058705446;
    :disc_onnx_ort_run22_rel_ortvae,0.00032628376058705446;
    :disc_onnx_ort_run22_rel_ortvit,0.00026781923006472165;
    :disc_onnx_ort_run22_sum,0.039355116007072866;
    :disc_onnx_ort_run22_sum_ortbart,0.042834832951484714;
    :disc_onnx_ort_run22_sum_ortbert,0.042834832951484714;
    :disc_onnx_ort_run22_sum_ortbert_keras,0.042834832951484714;
    :disc_onnx_ort_run22_sum_ortbert_tf,0.042834832951484714;
    :disc_onnx_ort_run22_sum_ortclip,0.042834832951484714;
    :disc_onnx_ort_run22_sum_ortconformer,0.042834832951484714;
    :disc_onnx_ort_run22_sum_ortgpt2,0.042834832951484714;
    :disc_onnx_ort_run22_sum_ortgpt2_tf,0.042834832951484714;
    :disc_onnx_ort_run22_sum_ortgpt_neox,0.042834832951484714;
    :disc_onnx_ort_run22_sum_ortmmdit,0.042834832951484714;
    :disc_onnx_ort_run22_sum_ortsam2,0.039355116007072866;
    :disc_onnx_ort_run22_sum_ortswin,0.042834832951484714;
    :disc_onnx_ort_run22_sum_ortt5,0.042834832951484714;
    :disc_onnx_ort_run22_sum_orttnlr,0.042834832951484714;
    :disc_onnx_ort_run22_sum_ortunet,0.039355116007072866;
    :disc_onnx_ort_run22_sum_ortvae,0.039355116007072866;
    :disc_onnx_ort_run22_sum_ortvit,0.042834832951484714;
    :disc_onnx_ort_run2_batch1_abs,7.748603820800781e-07;
    :disc_onnx_ort_run2_batch1_abs_ortbart,1.1026859283447266e-06;
    :disc_onnx_ort_run2_batch1_abs_ortbert,1.1026859283447266e-06;
    :disc_onnx_ort_run2_batch1_abs_ortbert_keras,1.1026859283447266e-06;
    :disc_onnx_ort_run2_batch1_abs_ortbert_tf,1.1026859283447266e-06;
    :disc_onnx_ort_run2_batch1_abs_ortclip,1.1026859283447266e-06;
    :disc_onnx_ort_run2_batch1_abs_ortconformer,1.1026859283447266e-06;
    :disc_onnx_ort_run2_batch1_abs_ortgpt2,1.1026859283447266e-06;
    :disc_onnx_ort_run2_batch1_abs_ortgpt2_tf,1.1026859283447266e-06;
    :disc_onnx_ort_run2_batch1_abs_ortgpt_neox,1.1026859283447266e-06;
    :disc_onnx_ort_run2_batch1_abs_ortmmdit,1.1026859283447266e-06;
    :disc_onnx_ort_run2_batch1_abs_ortsam2,7.748603820800781e-07;
    :disc_onnx_ort_run2_batch1_abs_ortswin,1.1026859283447266e-06;
    :disc_onnx_ort_run2_batch1_abs_ortt5,1.1026859283447266e-06;
    :disc_onnx_ort_run2_batch1_abs_orttnlr,1.1026859283447266e-06;
    :disc_onnx_ort_run2_batch1_abs_ortunet,7.748603820800781e-07;
    :disc_onnx_ort_run2_batch1_abs_ortvae,7.748603820800781e-07;
    :disc_onnx_ort_run2_batch1_abs_ortvit,1.1026859283447266e-06;
    :disc_onnx_ort_run2_batch1_dnan,0;
    :disc_onnx_ort_run2_batch1_dnan_ortbart,0;
    :disc_onnx_ort_run2_batch1_dnan_ortbert,0;
    :disc_onnx_ort_run2_batch1_dnan_ortbert_keras,0;
    :disc_onnx_ort_run2_batch1_dnan_ortbert_tf,0;
    :disc_onnx_ort_run2_batch1_dnan_ortclip,0;
    :disc_onnx_ort_run2_batch1_dnan_ortconformer,0;
    :disc_onnx_ort_run2_batch1_dnan_ortgpt2,0;
    :disc_onnx_ort_run2_batch1_dnan_ortgpt2_tf,0;
    :disc_onnx_ort_run2_batch1_dnan_ortgpt_neox,0;
    :disc_onnx_ort_run2_batch1_dnan_ortmmdit,0;
    :disc_onnx_ort_run2_batch1_dnan_ortsam2,0;
    :disc_onnx_ort_run2_batch1_dnan_ortswin,0;
    :disc_onnx_ort_run2_batch1_dnan_ortt5,0;
    :disc_onnx_ort_run2_batch1_dnan_orttnlr,0;
    :disc_onnx_ort_run2_batch1_dnan_ortunet,0;
    :disc_onnx_ort_run2_batch1_dnan_ortvae,0;
    :disc_onnx_ort_run2_batch1_dnan_ortvit,0;
    :disc_onnx_ort_run2_batch1_n,102336.0;
    :disc_onnx_ort_run2_batch1_n_ortbart,102336.0;
    :disc_onnx_ort_run2_batch1_n_ortbert,102336.0;
    :disc_onnx_ort_run2_batch1_n_ortbert_keras,102336.0;
    :disc_onnx_ort_run2_batch1_n_ortbert_tf,102336.0;
    :disc_onnx_ort_run2_batch1_n_ortclip,102336.0;
    :disc_onnx_ort_run2_batch1_n_ortconformer,102336.0;
    :disc_onnx_ort_run2_batch1_n_ortgpt2,102336.0;
    :disc_onnx_ort_run2_batch1_n_ortgpt2_tf,102336.0;
    :disc_onnx_ort_run2_batch1_n_ortgpt_neox,102336.0;
    :disc_onnx_ort_run2_batch1_n_ortmmdit,102336.0;
    :disc_onnx_ort_run2_batch1_n_ortsam2,102336.0;
    :disc_onnx_ort_run2_batch1_n_ortswin,102336.0;
    :disc_onnx_ort_run2_batch1_n_ortt5,102336.0;
    :disc_onnx_ort_run2_batch1_n_orttnlr,102336.0;
    :disc_onnx_ort_run2_batch1_n_ortunet,102336.0;
    :disc_onnx_ort_run2_batch1_n_ortvae,102336.0;
    :disc_onnx_ort_run2_batch1_n_ortvit,102336.0;
    :disc_onnx_ort_run2_batch1_rel,0.00037692802454639585;
    :disc_onnx_ort_run2_batch1_rel_ortbart,0.0003551108443373418;
    :disc_onnx_ort_run2_batch1_rel_ortbert,0.0003551108443373418;
    :disc_onnx_ort_run2_batch1_rel_ortbert_keras,0.0003551108443373418;
    :disc_onnx_ort_run2_batch1_rel_ortbert_tf,0.0003551108443373418;
    :disc_onnx_ort_run2_batch1_rel_ortclip,0.0003551108443373418;
    :disc_onnx_ort_run2_batch1_rel_ortconformer,0.0003551108443373418;
    :disc_onnx_ort_run2_batch1_rel_ortgpt2,0.0003551108443373418;
    :disc_onnx_ort_run2_batch1_rel_ortgpt2_tf,0.0003551108443373418;
    :disc_onnx_ort_run2_batch1_rel_ortgpt_neox,0.0003551108443373418;
    :disc_onnx_ort_run2_batch1_rel_ortmmdit,0.0003551108443373418;
    :disc_onnx_ort_run2_batch1_rel_ortsam2,0.00037692802454639585;
    :disc_onnx_ort_run2_batch1_rel_ortswin,0.0003551108443373418;
    :disc_onnx_ort_run2_batch1_rel_ortt5,0.0003551108443373418;
    :disc_onnx_ort_run2_batch1_rel_orttnlr,0.0003551108443373418;
    :disc_onnx_ort_run2_batch1_rel_ortunet,0.00037692802454639585;
    :disc_onnx_ort_run2_batch1_rel_ortvae,0.00037692802454639585;
    :disc_onnx_ort_run2_batch1_rel_ortvit,0.0003551108443373418;
    :disc_onnx_ort_run2_batch1_sum,0.011718470417235949;
    :disc_onnx_ort_run2_batch1_sum_ortbart,0.012457452539820224;
    :disc_onnx_ort_run2_batch1_sum_ortbert,0.012457452539820224;
    :disc_onnx_ort_run2_batch1_sum_ortbert_keras,0.012457452539820224;
    :disc_onnx_ort_run2_batch1_sum_ortbert_tf,0.012457452539820224;
    :disc_onnx_ort_run2_batch1_sum_ortclip,0.012457452539820224;
    :disc_onnx_ort_run2_batch1_sum_ortconformer,0.012457452539820224;
    :disc_onnx_ort_run2_batch1_sum_ortgpt2,0.012457452539820224;
    :disc_onnx_ort_run2_batch1_sum_ortgpt2_tf,0.012457452539820224;
    :disc_onnx_ort_run2_batch1_sum_ortgpt_neox,0.012457452539820224;
    :disc_onnx_ort_run2_batch1_sum_ortmmdit,0.012457452539820224;
    :disc_onnx_ort_run2_batch1_sum_ortsam2,0.011718470417235949;
    :disc_onnx_ort_run2_batch1_sum_ortswin,0.012457452539820224;
    :disc_onnx_ort_run2_batch1_sum_ortt5,0.012457452539820224;
    :disc_onnx_ort_run2_batch1_sum_orttnlr,0.012457452539820224;
    :disc_onnx_ort_run2_batch1_sum_ortunet,0.011718470417235949;
    :disc_onnx_ort_run2_batch1_sum_ortvae,0.011718470417235949;
    :disc_onnx_ort_run2_batch1_sum_ortvit,0.012457452539820224;
    :disc_onnx_ort_run2_empty_cache_abs,6.556510925292969e-07;
    :disc_onnx_ort_run2_empty_cache_abs_ortbart,1.0728836059570312e-06;
    :disc_onnx_ort_run2_empty_cache_abs_ortbert,1.0728836059570312e-06;
    :disc_onnx_ort_run2_empty_cache_abs_ortbert_keras,1.0728836059570312e-06;
    :disc_onnx_ort_run2_empty_cache_abs_ortbert_tf,1.0728836059570312e-06;
    :disc_onnx_ort_run2_empty_cache_abs_ortclip,1.0728836059570312e-06;
    :disc_onnx_ort_run2_empty_cache_abs_ortconformer,1.0728836059570312e-06;
    :disc_onnx_ort_run2_empty_cache_abs_ortgpt2,1.0728836059570312e-06;
    :disc_onnx_ort_run2_empty_cache_abs_ortgpt2_tf,1.0728836059570312e-06;
    :disc_onnx_ort_run2_empty_cache_abs_ortgpt_neox,1.0728836059570312e-06;
    :disc_onnx_ort_run2_empty_cache_abs_ortmmdit,1.0728836059570312e-06;
    :disc_onnx_ort_run2_empty_cache_abs_ortsam2,6.556510925292969e-07;
    :disc_onnx_ort_run2_empty_cache_abs_ortswin,1.0728836059570312e-06;
    :disc_onnx_ort_run2_empty_cache_abs_ortt5,1.0728836059570312e-06;
    :disc_onnx_ort_run2_empty_cache_abs_orttnlr,1.0728836059570312e-06;
    :disc_onnx_ort_run2_empty_cache_abs_ortunet,6.556510925292969e-07;
    :disc_onnx_ort_run2_empty_cache_abs_ortvae,6.556510925292969e-07;
    :disc_onnx_ort_run2_empty_cache_abs_ortvit,1.0728836059570312e-06;
    :disc_onnx_ort_run2_empty_cache_dnan,0;
    :disc_onnx_ort_run2_empty_cache_dnan_ortbart,0;
    :disc_onnx_ort_run2_empty_cache_dnan_ortbert,0;
    :disc_onnx_ort_run2_empty_cache_dnan_ortbert_keras,0;
    :disc_onnx_ort_run2_empty_cache_dnan_ortbert_tf,0;
    :disc_onnx_ort_run2_empty_cache_dnan_ortclip,0;
    :disc_onnx_ort_run2_empty_cache_dnan_ortconformer,0;
    :disc_onnx_ort_run2_empty_cache_dnan_ortgpt2,0;
    :disc_onnx_ort_run2_empty_cache_dnan_ortgpt2_tf,0;
    :disc_onnx_ort_run2_empty_cache_dnan_ortgpt_neox,0;
    :disc_onnx_ort_run2_empty_cache_dnan_ortmmdit,0;
    :disc_onnx_ort_run2_empty_cache_dnan_ortsam2,0;
    :disc_onnx_ort_run2_empty_cache_dnan_ortswin,0;
    :disc_onnx_ort_run2_empty_cache_dnan_ortt5,0;
    :disc_onnx_ort_run2_empty_cache_dnan_orttnlr,0;
    :disc_onnx_ort_run2_empty_cache_dnan_ortunet,0;
    :disc_onnx_ort_run2_empty_cache_dnan_ortvae,0;
    :disc_onnx_ort_run2_empty_cache_dnan_ortvit,0;
    :disc_onnx_ort_run2_empty_cache_n,193152.0;
    :disc_onnx_ort_run2_empty_cache_n_ortbart,193152.0;
    :disc_onnx_ort_run2_empty_cache_n_ortbert,193152.0;
    :disc_onnx_ort_run2_empty_cache_n_ortbert_keras,193152.0;
    :disc_onnx_ort_run2_empty_cache_n_ortbert_tf,193152.0;
    :disc_onnx_ort_run2_empty_cache_n_ortclip,193152.0;
    :disc_onnx_ort_run2_empty_cache_n_ortconformer,193152.0;
    :disc_onnx_ort_run2_empty_cache_n_ortgpt2,193152.0;
    :disc_onnx_ort_run2_empty_cache_n_ortgpt2_tf,193152.0;
    :disc_onnx_ort_run2_empty_cache_n_ortgpt_neox,193152.0;
    :disc_onnx_ort_run2_empty_cache_n_ortmmdit,193152.0;
    :disc_onnx_ort_run2_empty_cache_n_ortsam2,193152.0;
    :disc_onnx_ort_run2_empty_cache_n_ortswin,193152.0;
    :disc_onnx_ort_run2_empty_cache_n_ortt5,193152.0;
    :disc_onnx_ort_run2_empty_cache_n_orttnlr,193152.0;
    :disc_onnx_ort_run2_empty_cache_n_ortunet,193152.0;
    :disc_onnx_ort_run2_empty_cache_n_ortvae,193152.0;
    :disc_onnx_ort_run2_empty_cache_n_ortvit,193152.0;
    :disc_onnx_ort_run2_empty_cache_rel,0.0002520801563113858;
    :disc_onnx_ort_run2_empty_cache_rel_ortbart,0.00034613720184126256;
    :disc_onnx_ort_run2_empty_cache_rel_ortbert,0.00034613720184126256;
    :disc_onnx_ort_run2_empty_cache_rel_ortbert_keras,0.00034613720184126256;
    :disc_onnx_ort_run2_empty_cache_rel_ortbert_tf,0.00034613720184126256;
    :disc_onnx_ort_run2_empty_cache_rel_ortclip,0.00034613720184126256;
    :disc_onnx_ort_run2_empty_cache_rel_ortconformer,0.00034613720184126256;
    :disc_onnx_ort_run2_empty_cache_rel_ortgpt2,0.00034613720184126256;
    :disc_onnx_ort_run2_empty_cache_rel_ortgpt2_tf,0.00034613720184126256;
    :disc_onnx_ort_run2_empty_cache_rel_ortgpt_neox,0.00034613720184126256;
    :disc_onnx_ort_run2_empty_cache_rel_ortmmdit,0.00034613720184126256;
    :disc_onnx_ort_run2_empty_cache_rel_ortsam2,0.0002520801563113858;
    :disc_onnx_ort_run2_empty_cache_rel_ortswin,0.00034613720184126256;
    :disc_onnx_ort_run2_empty_cache_rel_ortt5,0.00034613720184126256;
    :disc_onnx_ort_run2_empty_cache_rel_orttnlr,0.00034613720184126256;
    :disc_onnx_ort_run2_empty_cache_rel_ortunet,0.0002520801563113858;
    :disc_onnx_ort_run2_empty_cache_rel_ortvae,0.0002520801563113858;
    :disc_onnx_ort_run2_empty_cache_rel_ortvit,0.00034613720184126256;
    :disc_onnx_ort_run2_empty_cache_sum,0.012811366097139398;
    :disc_onnx_ort_run2_empty_cache_sum_ortbart,0.020510222977463854;
    :disc_onnx_ort_run2_empty_cache_sum_ortbert,0.020510222977463854;
    :disc_onnx_ort_run2_empty_cache_sum_ortbert_keras,0.020510222977463854;
    :disc_onnx_ort_run2_empty_cache_sum_ortbert_tf,0.020510222977463854;
    :disc_onnx_ort_run2_empty_cache_sum_ortclip,0.020510222977463854;
    :disc_onnx_ort_run2_empty_cache_sum_ortconformer,0.020510222977463854;
    :disc_onnx_ort_run2_empty_cache_sum_ortgpt2,0.020510222977463854;
    :disc_onnx_ort_run2_empty_cache_sum_ortgpt2_tf,0.020510222977463854;
    :disc_onnx_ort_run2_empty_cache_sum_ortgpt_neox,0.020510222977463854;
    :disc_onnx_ort_run2_empty_cache_sum_ortmmdit,0.020510222977463854;
    :disc_onnx_ort_run2_empty_cache_sum_ortsam2,0.012811366097139398;
    :disc_onnx_ort_run2_empty_cache_sum_ortswin,0.020510222977463854;
    :disc_onnx_ort_run2_empty_cache_sum_ortt5,0.020510222977463854;
    :disc_onnx_ort_run2_empty_cache_sum_orttnlr,0.020510222977463854;
    :disc_onnx_ort_run2_empty_cache_sum_ortunet,0.012811366097139398;
    :disc_onnx_ort_run2_empty_cache_sum_ortvae,0.012811366097139398;
    :disc_onnx_ort_run2_empty_cache_sum_ortvit,0.020510222977463854;
    :disc_onnx_ort_run_abs,7.748603820800781e-07;
    :disc_onnx_ort_run_abs_ortbart,8.344650268554688e-07;
    :disc_onnx_ort_run_abs_ortbert,8.344650268554688e-07;
    :disc_onnx_ort_run_abs_ortbert_keras,8.344650268554688e-07;
    :disc_onnx_ort_run_abs_ortbert_tf,8.344650268554688e-07;
    :disc_onnx_ort_run_abs_ortclip,8.344650268554688e-07;
    :disc_onnx_ort_run_abs_ortconformer,8.344650268554688e-07;
    :disc_onnx_ort_run_abs_ortgpt2,8.344650268554688e-07;
    :disc_onnx_ort_run_abs_ortgpt2_tf,8.344650268554688e-07;
    :disc_onnx_ort_run_abs_ortgpt_neox,8.344650268554688e-07;
    :disc_onnx_ort_run_abs_ortmmdit,8.344650268554688e-07;
    :disc_onnx_ort_run_abs_ortsam2,7.748603820800781e-07;
    :disc_onnx_ort_run_abs_ortswin,8.344650268554688e-07;
    :disc_onnx_ort_run_abs_ortt5,8.344650268554688e-07;
    :disc_onnx_ort_run_abs_orttnlr,8.344650268554688e-07;
    :disc_onnx_ort_run_abs_ortunet,7.748603820800781e-07;
    :disc_onnx_ort_run_abs_ortvae,7.748603820800781e-07;
    :disc_onnx_ort_run_abs_ortvit,8.344650268554688e-07;
    :disc_onnx_ort_run_dnan,0;
    :disc_onnx_ort_run_dnan_ortbart,0;
    :disc_onnx_ort_run_dnan_ortbert,0;
    :disc_onnx_ort_run_dnan_ortbert_keras,0;
    :disc_onnx_ort_run_dnan_ortbert_tf,0;
    :disc_onnx_ort_run_dnan_ortclip,0;
    :disc_onnx_ort_run_dnan_ortconformer,0;
    :disc_onnx_ort_run_dnan_ortgpt2,0;
    :disc_onnx_ort_run_dnan_ortgpt2_tf,0;
    :disc_onnx_ort_run_dnan_ortgpt_neox,0;
    :disc_onnx_ort_run_dnan_ortmmdit,0;
    :disc_onnx_ort_run_dnan_ortsam2,0;
    :disc_onnx_ort_run_dnan_ortswin,0;
    :disc_onnx_ort_run_dnan_ortt5,0;
    :disc_onnx_ort_run_dnan_orttnlr,0;
    :disc_onnx_ort_run_dnan_ortunet,0;
    :disc_onnx_ort_run_dnan_ortvae,0;
    :disc_onnx_ort_run_dnan_ortvit,0;
    :disc_onnx_ort_run_n,204672.0;
    :disc_onnx_ort_run_n_ortbart,204672.0;
    :disc_onnx_ort_run_n_ortbert,204672.0;
    :disc_onnx_ort_run_n_ortbert_keras,204672.0;
    :disc_onnx_ort_run_n_ortbert_tf,204672.0;
    :disc_onnx_ort_run_n_ortclip,204672.0;
    :disc_onnx_ort_run_n_ortconformer,204672.0;
    :disc_onnx_ort_run_n_ortgpt2,204672.0;
    :disc_onnx_ort_run_n_ortgpt2_tf,204672.0;
    :disc_onnx_ort_run_n_ortgpt_neox,204672.0;
    :disc_onnx_ort_run_n_ortmmdit,204672.0;
    :disc_onnx_ort_run_n_ortsam2,204672.0;
    :disc_onnx_ort_run_n_ortswin,204672.0;
    :disc_onnx_ort_run_n_ortt5,204672.0;
    :disc_onnx_ort_run_n_orttnlr,204672.0;
    :disc_onnx_ort_run_n_ortunet,204672.0;
    :disc_onnx_ort_run_n_ortvae,204672.0;
    :disc_onnx_ort_run_n_ortvit,204672.0;
    :disc_onnx_ort_run_rel,0.00044309172106863606;
    :disc_onnx_ort_run_rel_ortbart,0.00038373230338287646;
    :disc_onnx_ort_run_rel_ortbert,0.00038373230338287646;
    :disc_onnx_ort_run_rel_ortbert_keras,0.00038373230338287646;
    :disc_onnx_ort_run_rel_ortbert_tf,0.00038373230338287646;
    :disc_onnx_ort_run_rel_ortclip,0.00038373230338287646;
    :disc_onnx_ort_run_rel_ortconformer,0.00038373230338287646;
    :disc_onnx_ort_run_rel_ortgpt2,0.00038373230338287646;
    :disc_onnx_ort_run_rel_ortgpt2_tf,0.00038373230338287646;
    :disc_onnx_ort_run_rel_ortgpt_neox,0.00038373230338287646;
    :disc_onnx_ort_run_rel_ortmmdit,0.00038373230338287646;
    :disc_onnx_ort_run_rel_ortsam2,0.00044309172106863606;
    :disc_onnx_ort_run_rel_ortswin,0.00038373230338287646;
    :disc_onnx_ort_run_rel_ortt5,0.00038373230338287646;
    :disc_onnx_ort_run_rel_orttnlr,0.00038373230338287646;
    :disc_onnx_ort_run_rel_ortunet,0.00044309172106863606;
    :disc_onnx_ort_run_rel_ortvae,0.00044309172106863606;
    :disc_onnx_ort_run_rel_ortvit,0.00038373230338287646;
    :disc_onnx_ort_run_sum,0.02031988672524676;
    :disc_onnx_ort_run_sum_ortbart,0.022044641082175076;
    :disc_onnx_ort_run_sum_ortbert,0.022044641082175076;
    :disc_onnx_ort_run_sum_ortbert_keras,0.022044641082175076;
    :disc_onnx_ort_run_sum_ortbert_tf,0.022044641082175076;
    :disc_onnx_ort_run_sum_ortclip,0.022044641082175076;
    :disc_onnx_ort_run_sum_ortconformer,0.022044641082175076;
    :disc_onnx_ort_run_sum_ortgpt2,0.022044641082175076;
    :disc_onnx_ort_run_sum_ortgpt2_tf,0.022044641082175076;
    :disc_onnx_ort_run_sum_ortgpt_neox,0.022044641082175076;
    :disc_onnx_ort_run_sum_ortmmdit,0.022044641082175076;
    :disc_onnx_ort_run_sum_ortsam2,0.02031988672524676;
    :disc_onnx_ort_run_sum_ortswin,0.022044641082175076;
    :disc_onnx_ort_run_sum_ortt5,0.022044641082175076;
    :disc_onnx_ort_run_sum_orttnlr,0.022044641082175076;
    :disc_onnx_ort_run_sum_ortunet,0.02031988672524676;
    :disc_onnx_ort_run_sum_ortvae,0.02031988672524676;
    :disc_onnx_ort_run_sum_ortvit,0.022044641082175076;
    :disc_patched_abs,0;
    :disc_patched_dnan,0;
    :disc_patched_n,204672.0;
    :disc_patched_rel,0;
    :disc_patched_sum,0.0;
    :dump_folder,dump_models/arnir0_Tiny-LLM/onnx-dynamo/ir/op18;
    :dump_folder_name,arnir0_Tiny-LLM/onnx-dynamo/ir/op18;
    :export_args,();
    :export_dynamo,True;
    :export_exporter,onnx-dynamo;
    :export_kwargs,dict(input_ids:T7s2x3,attention_mask:T7s2x33,position_ids:T7s2x3,past_key_values:DynamicCache(key_cache=#1[T1s2x1x30x96], value_cache=#1[T1s2x1x30x96]));
    :export_opset,18;
    :export_optimization,ir;
    :model_class,LlamaForCausalLM;
    :model_config,{'vocab_size':32000,'max_position_embeddings':1024,'hidden_size':192,'intermediate_size':1024,'num_hidden_layers':1,'num_attention_heads':2,'num_key_value_heads':1,'hidden_act':'silu','initializer_range':0.02,'rms_norm_eps':1e-05,'pretraining_tp':1,'use_cache':True,'attention_bias':False,'attention_dropout':0.0,'mlp_bias':False,'head_dim':96,'rope_parameters':{'rope_type':'default','rope_theta':10000.0},'return_dict':True,'output_hidden_states':False,'dtype':'float32','tie_word_embeddings':False,'chunk_size_feed_forward':0,'is_encoder_decoder':False,'is_decoder':False,'cross_attention_hidden_size':None,'add_cross_attention':False,'tie_encoder_decoder':False,'architectures':['LlamaForCausalLM'],'finetuning_task':None,'id2label':{0:'LABEL_0',1:'LABEL_1'},'label2id':{'LABEL_0':0,'LABEL_1':1},'task_specific_params':None,'problem_type':None,'tokenizer_class':None,'prefix':None,'bos_token_id':1,'pad_token_id':None,'eos_token_id':2,'sep_token_id':None,'decoder_start_token_id':None,'max_length':20,'min_length':0,'do_sample':False,'early_stopping':False,'num_beams':1,'temperature':1.0,'top_k':50,'top_p':1.0,'typical_p':1.0,'repetition_penalty':1.0,'length_penalty':1.0,'no_repeat_ngram_size':0,'encoder_no_repeat_ngram_size':0,'bad_words_ids':None,'num_return_sequences':1,'output_scores':False,'return_dict_in_generate':False,'forced_bos_token_id':None,'forced_eos_token_id':None,'remove_invalid_values':False,'exponential_decay_length_penalty':None,'suppress_tokens':None,'begin_suppress_tokens':None,'num_beam_groups':1,'diversity_penalty':0.0,'_name_or_path':'','transformers_version':'5.0.0.dev0','model_type':'llama','rope_theta':10000.0,'subfolder':None,'output_attentions':False};
    :model_config_class,LlamaConfig;
    :model_file,~/github/transformers/src/transformers/models/llama/modeling_llama.py;
    :model_id,arnir0/Tiny-LLM;
    :model_inputs,dict(input_ids:T7s2x3,attention_mask:T7s2x33,position_ids:T7s2x3,past_key_values:DynamicCache(key_cache=#1[T1s2x1x30x96], value_cache=#1[T1s2x1x30x96]));
    :model_inputs_options,;
    :model_module,transformers.models.llama.modeling_llama;
    :model_nweights,12988992;
    :model_shapes,dict(input_ids:{0:DYN(batch),1:DYN(seq_length)},attention_mask:{0:DYN(batch),1:DYN(cache+seq)},position_ids:{0:DYN(batch),1:DYN(seq_length)},past_key_values:#2[{0:DYN(batch),2:DYN(cache_length)},{0:DYN(batch),2:DYN(cache_length)}]);
    :model_size,51955968;
    :model_subfolder,;
    :model_task,text-generation;
    :n_node_Add,10;
    :n_node_And,2;
    :n_node_Cast,2;
    :n_node_Concat,16;
    :n_node_Cos,1;
    :n_node_Expand,6;
    :n_node_Gather,1;
    :n_node_GatherND,1;
    :n_node_IsNaN,1;
    :n_node_LessOrEqual,1;
    :n_node_MatMul,11;
    :n_node_Mul,14;
    :n_node_Neg,2;
    :n_node_Pow,3;
    :n_node_Range,3;
    :n_node_Reciprocal,3;
    :n_node_ReduceMean,3;
    :n_node_Reshape,13;
    :n_node_Shape,5;
    :n_node_Sigmoid,1;
    :n_node_Sin,1;
    :n_node_Slice,7;
    :n_node_Softmax,1;
    :n_node_Sqrt,3;
    :n_node_Squeeze,4;
    :n_node_Transpose,6;
    :n_node_Unsqueeze,7;
    :n_node_Where,2;
    :n_node_functions,0;
    :n_node_initializer_1,16;
    :n_node_initializer_7,15;
    :n_node_initializer_9,1;
    :n_node_nodes,130;
    :n_node_nodes_nocst,130;
    :onnx_filename,dump_models/arnir0_Tiny-LLM/onnx-dynamo/ir/op18/arnir0_Tiny-LLM-onnx-dynamo-ir-op18.onnx;
    :onnx_filename_ortbart,dump_models/arnir0_Tiny-LLM/onnx-dynamo/ir/op18/arnir0_Tiny-LLM-onnx-dynamo-ir-op18.ort.bart.onnx;
    :onnx_filename_ortbert,dump_models/arnir0_Tiny-LLM/onnx-dynamo/ir/op18/arnir0_Tiny-LLM-onnx-dynamo-ir-op18.ort.bert.onnx;
    :onnx_filename_ortbert_keras,dump_models/arnir0_Tiny-LLM/onnx-dynamo/ir/op18/arnir0_Tiny-LLM-onnx-dynamo-ir-op18.ort.bert_keras.onnx;
    :onnx_filename_ortbert_tf,dump_models/arnir0_Tiny-LLM/onnx-dynamo/ir/op18/arnir0_Tiny-LLM-onnx-dynamo-ir-op18.ort.bert_tf.onnx;
    :onnx_filename_ortclip,dump_models/arnir0_Tiny-LLM/onnx-dynamo/ir/op18/arnir0_Tiny-LLM-onnx-dynamo-ir-op18.ort.clip.onnx;
    :onnx_filename_ortconformer,dump_models/arnir0_Tiny-LLM/onnx-dynamo/ir/op18/arnir0_Tiny-LLM-onnx-dynamo-ir-op18.ort.conformer.onnx;
    :onnx_filename_ortgpt2,dump_models/arnir0_Tiny-LLM/onnx-dynamo/ir/op18/arnir0_Tiny-LLM-onnx-dynamo-ir-op18.ort.gpt2.onnx;
    :onnx_filename_ortgpt2_tf,dump_models/arnir0_Tiny-LLM/onnx-dynamo/ir/op18/arnir0_Tiny-LLM-onnx-dynamo-ir-op18.ort.gpt2_tf.onnx;
    :onnx_filename_ortgpt_neox,dump_models/arnir0_Tiny-LLM/onnx-dynamo/ir/op18/arnir0_Tiny-LLM-onnx-dynamo-ir-op18.ort.gpt_neox.onnx;
    :onnx_filename_ortmmdit,dump_models/arnir0_Tiny-LLM/onnx-dynamo/ir/op18/arnir0_Tiny-LLM-onnx-dynamo-ir-op18.ort.mmdit.onnx;
    :onnx_filename_ortsam2,dump_models/arnir0_Tiny-LLM/onnx-dynamo/ir/op18/arnir0_Tiny-LLM-onnx-dynamo-ir-op18.ort.sam2.onnx;
    :onnx_filename_ortswin,dump_models/arnir0_Tiny-LLM/onnx-dynamo/ir/op18/arnir0_Tiny-LLM-onnx-dynamo-ir-op18.ort.swin.onnx;
    :onnx_filename_ortt5,dump_models/arnir0_Tiny-LLM/onnx-dynamo/ir/op18/arnir0_Tiny-LLM-onnx-dynamo-ir-op18.ort.t5.onnx;
    :onnx_filename_orttnlr,dump_models/arnir0_Tiny-LLM/onnx-dynamo/ir/op18/arnir0_Tiny-LLM-onnx-dynamo-ir-op18.ort.tnlr.onnx;
    :onnx_filename_ortunet,dump_models/arnir0_Tiny-LLM/onnx-dynamo/ir/op18/arnir0_Tiny-LLM-onnx-dynamo-ir-op18.ort.unet.onnx;
    :onnx_filename_ortvae,dump_models/arnir0_Tiny-LLM/onnx-dynamo/ir/op18/arnir0_Tiny-LLM-onnx-dynamo-ir-op18.ort.vae.onnx;
    :onnx_filename_ortvit,dump_models/arnir0_Tiny-LLM/onnx-dynamo/ir/op18/arnir0_Tiny-LLM-onnx-dynamo-ir-op18.ort.vit.onnx;
    :onnx_ort_inputs,dict(input_ids:A7s2x3,attention_mask:A7s2x33,position_ids:A7s2x3,past_key_values_key_0:A1s2x1x30x96,past_key_values_value_0:A1s2x1x30x96);
    :onnx_ort_inputs22,dict(input_ids:A7s3x4,attention_mask:A7s3x35,position_ids:A7s3x4,past_key_values_key_0:A1s3x1x31x96,past_key_values_value_0:A1s3x1x31x96);
    :onnx_ort_inputs22_ortbart,dict(input_ids:A7s3x4,attention_mask:A7s3x35,position_ids:A7s3x4,past_key_values_key_0:A1s3x1x31x96,past_key_values_value_0:A1s3x1x31x96);
    :onnx_ort_inputs22_ortbert,dict(input_ids:A7s3x4,attention_mask:A7s3x35,position_ids:A7s3x4,past_key_values_key_0:A1s3x1x31x96,past_key_values_value_0:A1s3x1x31x96);
    :onnx_ort_inputs22_ortbert_keras,dict(input_ids:A7s3x4,attention_mask:A7s3x35,position_ids:A7s3x4,past_key_values_key_0:A1s3x1x31x96,past_key_values_value_0:A1s3x1x31x96);
    :onnx_ort_inputs22_ortbert_tf,dict(input_ids:A7s3x4,attention_mask:A7s3x35,position_ids:A7s3x4,past_key_values_key_0:A1s3x1x31x96,past_key_values_value_0:A1s3x1x31x96);
    :onnx_ort_inputs22_ortclip,dict(input_ids:A7s3x4,attention_mask:A7s3x35,position_ids:A7s3x4,past_key_values_key_0:A1s3x1x31x96,past_key_values_value_0:A1s3x1x31x96);
    :onnx_ort_inputs22_ortconformer,dict(input_ids:A7s3x4,attention_mask:A7s3x35,position_ids:A7s3x4,past_key_values_key_0:A1s3x1x31x96,past_key_values_value_0:A1s3x1x31x96);
    :onnx_ort_inputs22_ortgpt2,dict(input_ids:A7s3x4,attention_mask:A7s3x35,position_ids:A7s3x4,past_key_values_key_0:A1s3x1x31x96,past_key_values_value_0:A1s3x1x31x96);
    :onnx_ort_inputs22_ortgpt2_tf,dict(input_ids:A7s3x4,attention_mask:A7s3x35,position_ids:A7s3x4,past_key_values_key_0:A1s3x1x31x96,past_key_values_value_0:A1s3x1x31x96);
    :onnx_ort_inputs22_ortgpt_neox,dict(input_ids:A7s3x4,attention_mask:A7s3x35,position_ids:A7s3x4,past_key_values_key_0:A1s3x1x31x96,past_key_values_value_0:A1s3x1x31x96);
    :onnx_ort_inputs22_ortmmdit,dict(input_ids:A7s3x4,attention_mask:A7s3x35,position_ids:A7s3x4,past_key_values_key_0:A1s3x1x31x96,past_key_values_value_0:A1s3x1x31x96);
    :onnx_ort_inputs22_ortsam2,dict(input_ids:A7s3x4,attention_mask:A7s3x35,position_ids:A7s3x4,past_key_values_key_0:A1s3x1x31x96,past_key_values_value_0:A1s3x1x31x96);
    :onnx_ort_inputs22_ortswin,dict(input_ids:A7s3x4,attention_mask:A7s3x35,position_ids:A7s3x4,past_key_values_key_0:A1s3x1x31x96,past_key_values_value_0:A1s3x1x31x96);
    :onnx_ort_inputs22_ortt5,dict(input_ids:A7s3x4,attention_mask:A7s3x35,position_ids:A7s3x4,past_key_values_key_0:A1s3x1x31x96,past_key_values_value_0:A1s3x1x31x96);
    :onnx_ort_inputs22_orttnlr,dict(input_ids:A7s3x4,attention_mask:A7s3x35,position_ids:A7s3x4,past_key_values_key_0:A1s3x1x31x96,past_key_values_value_0:A1s3x1x31x96);
    :onnx_ort_inputs22_ortunet,dict(input_ids:A7s3x4,attention_mask:A7s3x35,position_ids:A7s3x4,past_key_values_key_0:A1s3x1x31x96,past_key_values_value_0:A1s3x1x31x96);
    :onnx_ort_inputs22_ortvae,dict(input_ids:A7s3x4,attention_mask:A7s3x35,position_ids:A7s3x4,past_key_values_key_0:A1s3x1x31x96,past_key_values_value_0:A1s3x1x31x96);
    :onnx_ort_inputs22_ortvit,dict(input_ids:A7s3x4,attention_mask:A7s3x35,position_ids:A7s3x4,past_key_values_key_0:A1s3x1x31x96,past_key_values_value_0:A1s3x1x31x96);
    :onnx_ort_inputs2_batch1,dict(input_ids:A7s1x3,attention_mask:A7s1x33,position_ids:A7s1x3,past_key_values_key_0:A1s1x1x30x96,past_key_values_value_0:A1s1x1x30x96);
    :onnx_ort_inputs2_batch1_ortbart,dict(input_ids:A7s1x3,attention_mask:A7s1x33,position_ids:A7s1x3,past_key_values_key_0:A1s1x1x30x96,past_key_values_value_0:A1s1x1x30x96);
    :onnx_ort_inputs2_batch1_ortbert,dict(input_ids:A7s1x3,attention_mask:A7s1x33,position_ids:A7s1x3,past_key_values_key_0:A1s1x1x30x96,past_key_values_value_0:A1s1x1x30x96);
    :onnx_ort_inputs2_batch1_ortbert_keras,dict(input_ids:A7s1x3,attention_mask:A7s1x33,position_ids:A7s1x3,past_key_values_key_0:A1s1x1x30x96,past_key_values_value_0:A1s1x1x30x96);
    :onnx_ort_inputs2_batch1_ortbert_tf,dict(input_ids:A7s1x3,attention_mask:A7s1x33,position_ids:A7s1x3,past_key_values_key_0:A1s1x1x30x96,past_key_values_value_0:A1s1x1x30x96);
    :onnx_ort_inputs2_batch1_ortclip,dict(input_ids:A7s1x3,attention_mask:A7s1x33,position_ids:A7s1x3,past_key_values_key_0:A1s1x1x30x96,past_key_values_value_0:A1s1x1x30x96);
    :onnx_ort_inputs2_batch1_ortconformer,dict(input_ids:A7s1x3,attention_mask:A7s1x33,position_ids:A7s1x3,past_key_values_key_0:A1s1x1x30x96,past_key_values_value_0:A1s1x1x30x96);
    :onnx_ort_inputs2_batch1_ortgpt2,dict(input_ids:A7s1x3,attention_mask:A7s1x33,position_ids:A7s1x3,past_key_values_key_0:A1s1x1x30x96,past_key_values_value_0:A1s1x1x30x96);
    :onnx_ort_inputs2_batch1_ortgpt2_tf,dict(input_ids:A7s1x3,attention_mask:A7s1x33,position_ids:A7s1x3,past_key_values_key_0:A1s1x1x30x96,past_key_values_value_0:A1s1x1x30x96);
    :onnx_ort_inputs2_batch1_ortgpt_neox,dict(input_ids:A7s1x3,attention_mask:A7s1x33,position_ids:A7s1x3,past_key_values_key_0:A1s1x1x30x96,past_key_values_value_0:A1s1x1x30x96);
    :onnx_ort_inputs2_batch1_ortmmdit,dict(input_ids:A7s1x3,attention_mask:A7s1x33,position_ids:A7s1x3,past_key_values_key_0:A1s1x1x30x96,past_key_values_value_0:A1s1x1x30x96);
    :onnx_ort_inputs2_batch1_ortsam2,dict(input_ids:A7s1x3,attention_mask:A7s1x33,position_ids:A7s1x3,past_key_values_key_0:A1s1x1x30x96,past_key_values_value_0:A1s1x1x30x96);
    :onnx_ort_inputs2_batch1_ortswin,dict(input_ids:A7s1x3,attention_mask:A7s1x33,position_ids:A7s1x3,past_key_values_key_0:A1s1x1x30x96,past_key_values_value_0:A1s1x1x30x96);
    :onnx_ort_inputs2_batch1_ortt5,dict(input_ids:A7s1x3,attention_mask:A7s1x33,position_ids:A7s1x3,past_key_values_key_0:A1s1x1x30x96,past_key_values_value_0:A1s1x1x30x96);
    :onnx_ort_inputs2_batch1_orttnlr,dict(input_ids:A7s1x3,attention_mask:A7s1x33,position_ids:A7s1x3,past_key_values_key_0:A1s1x1x30x96,past_key_values_value_0:A1s1x1x30x96);
    :onnx_ort_inputs2_batch1_ortunet,dict(input_ids:A7s1x3,attention_mask:A7s1x33,position_ids:A7s1x3,past_key_values_key_0:A1s1x1x30x96,past_key_values_value_0:A1s1x1x30x96);
    :onnx_ort_inputs2_batch1_ortvae,dict(input_ids:A7s1x3,attention_mask:A7s1x33,position_ids:A7s1x3,past_key_values_key_0:A1s1x1x30x96,past_key_values_value_0:A1s1x1x30x96);
    :onnx_ort_inputs2_batch1_ortvit,dict(input_ids:A7s1x3,attention_mask:A7s1x33,position_ids:A7s1x3,past_key_values_key_0:A1s1x1x30x96,past_key_values_value_0:A1s1x1x30x96);
    :onnx_ort_inputs2_empty_cache,dict(input_ids:A7s2x3,attention_mask:A7s2x3,position_ids:A7s2x3,past_key_values_key_0:A1s2x1x0x96,past_key_values_value_0:A1s2x1x0x96);
    :onnx_ort_inputs2_empty_cache_ortbart,dict(input_ids:A7s2x3,attention_mask:A7s2x3,position_ids:A7s2x3,past_key_values_key_0:A1s2x1x0x96,past_key_values_value_0:A1s2x1x0x96);
    :onnx_ort_inputs2_empty_cache_ortbert,dict(input_ids:A7s2x3,attention_mask:A7s2x3,position_ids:A7s2x3,past_key_values_key_0:A1s2x1x0x96,past_key_values_value_0:A1s2x1x0x96);
    :onnx_ort_inputs2_empty_cache_ortbert_keras,dict(input_ids:A7s2x3,attention_mask:A7s2x3,position_ids:A7s2x3,past_key_values_key_0:A1s2x1x0x96,past_key_values_value_0:A1s2x1x0x96);
    :onnx_ort_inputs2_empty_cache_ortbert_tf,dict(input_ids:A7s2x3,attention_mask:A7s2x3,position_ids:A7s2x3,past_key_values_key_0:A1s2x1x0x96,past_key_values_value_0:A1s2x1x0x96);
    :onnx_ort_inputs2_empty_cache_ortclip,dict(input_ids:A7s2x3,attention_mask:A7s2x3,position_ids:A7s2x3,past_key_values_key_0:A1s2x1x0x96,past_key_values_value_0:A1s2x1x0x96);
    :onnx_ort_inputs2_empty_cache_ortconformer,dict(input_ids:A7s2x3,attention_mask:A7s2x3,position_ids:A7s2x3,past_key_values_key_0:A1s2x1x0x96,past_key_values_value_0:A1s2x1x0x96);
    :onnx_ort_inputs2_empty_cache_ortgpt2,dict(input_ids:A7s2x3,attention_mask:A7s2x3,position_ids:A7s2x3,past_key_values_key_0:A1s2x1x0x96,past_key_values_value_0:A1s2x1x0x96);
    :onnx_ort_inputs2_empty_cache_ortgpt2_tf,dict(input_ids:A7s2x3,attention_mask:A7s2x3,position_ids:A7s2x3,past_key_values_key_0:A1s2x1x0x96,past_key_values_value_0:A1s2x1x0x96);
    :onnx_ort_inputs2_empty_cache_ortgpt_neox,dict(input_ids:A7s2x3,attention_mask:A7s2x3,position_ids:A7s2x3,past_key_values_key_0:A1s2x1x0x96,past_key_values_value_0:A1s2x1x0x96);
    :onnx_ort_inputs2_empty_cache_ortmmdit,dict(input_ids:A7s2x3,attention_mask:A7s2x3,position_ids:A7s2x3,past_key_values_key_0:A1s2x1x0x96,past_key_values_value_0:A1s2x1x0x96);
    :onnx_ort_inputs2_empty_cache_ortsam2,dict(input_ids:A7s2x3,attention_mask:A7s2x3,position_ids:A7s2x3,past_key_values_key_0:A1s2x1x0x96,past_key_values_value_0:A1s2x1x0x96);
    :onnx_ort_inputs2_empty_cache_ortswin,dict(input_ids:A7s2x3,attention_mask:A7s2x3,position_ids:A7s2x3,past_key_values_key_0:A1s2x1x0x96,past_key_values_value_0:A1s2x1x0x96);
    :onnx_ort_inputs2_empty_cache_ortt5,dict(input_ids:A7s2x3,attention_mask:A7s2x3,position_ids:A7s2x3,past_key_values_key_0:A1s2x1x0x96,past_key_values_value_0:A1s2x1x0x96);
    :onnx_ort_inputs2_empty_cache_orttnlr,dict(input_ids:A7s2x3,attention_mask:A7s2x3,position_ids:A7s2x3,past_key_values_key_0:A1s2x1x0x96,past_key_values_value_0:A1s2x1x0x96);
    :onnx_ort_inputs2_empty_cache_ortunet,dict(input_ids:A7s2x3,attention_mask:A7s2x3,position_ids:A7s2x3,past_key_values_key_0:A1s2x1x0x96,past_key_values_value_0:A1s2x1x0x96);
    :onnx_ort_inputs2_empty_cache_ortvae,dict(input_ids:A7s2x3,attention_mask:A7s2x3,position_ids:A7s2x3,past_key_values_key_0:A1s2x1x0x96,past_key_values_value_0:A1s2x1x0x96);
    :onnx_ort_inputs2_empty_cache_ortvit,dict(input_ids:A7s2x3,attention_mask:A7s2x3,position_ids:A7s2x3,past_key_values_key_0:A1s2x1x0x96,past_key_values_value_0:A1s2x1x0x96);
    :onnx_ort_inputs_ortbart,dict(input_ids:A7s2x3,attention_mask:A7s2x33,position_ids:A7s2x3,past_key_values_key_0:A1s2x1x30x96,past_key_values_value_0:A1s2x1x30x96);
    :onnx_ort_inputs_ortbert,dict(input_ids:A7s2x3,attention_mask:A7s2x33,position_ids:A7s2x3,past_key_values_key_0:A1s2x1x30x96,past_key_values_value_0:A1s2x1x30x96);
    :onnx_ort_inputs_ortbert_keras,dict(input_ids:A7s2x3,attention_mask:A7s2x33,position_ids:A7s2x3,past_key_values_key_0:A1s2x1x30x96,past_key_values_value_0:A1s2x1x30x96);
    :onnx_ort_inputs_ortbert_tf,dict(input_ids:A7s2x3,attention_mask:A7s2x33,position_ids:A7s2x3,past_key_values_key_0:A1s2x1x30x96,past_key_values_value_0:A1s2x1x30x96);
    :onnx_ort_inputs_ortclip,dict(input_ids:A7s2x3,attention_mask:A7s2x33,position_ids:A7s2x3,past_key_values_key_0:A1s2x1x30x96,past_key_values_value_0:A1s2x1x30x96);
    :onnx_ort_inputs_ortconformer,dict(input_ids:A7s2x3,attention_mask:A7s2x33,position_ids:A7s2x3,past_key_values_key_0:A1s2x1x30x96,past_key_values_value_0:A1s2x1x30x96);
    :onnx_ort_inputs_ortgpt2,dict(input_ids:A7s2x3,attention_mask:A7s2x33,position_ids:A7s2x3,past_key_values_key_0:A1s2x1x30x96,past_key_values_value_0:A1s2x1x30x96);
    :onnx_ort_inputs_ortgpt2_tf,dict(input_ids:A7s2x3,attention_mask:A7s2x33,position_ids:A7s2x3,past_key_values_key_0:A1s2x1x30x96,past_key_values_value_0:A1s2x1x30x96);
    :onnx_ort_inputs_ortgpt_neox,dict(input_ids:A7s2x3,attention_mask:A7s2x33,position_ids:A7s2x3,past_key_values_key_0:A1s2x1x30x96,past_key_values_value_0:A1s2x1x30x96);
    :onnx_ort_inputs_ortmmdit,dict(input_ids:A7s2x3,attention_mask:A7s2x33,position_ids:A7s2x3,past_key_values_key_0:A1s2x1x30x96,past_key_values_value_0:A1s2x1x30x96);
    :onnx_ort_inputs_ortsam2,dict(input_ids:A7s2x3,attention_mask:A7s2x33,position_ids:A7s2x3,past_key_values_key_0:A1s2x1x30x96,past_key_values_value_0:A1s2x1x30x96);
    :onnx_ort_inputs_ortswin,dict(input_ids:A7s2x3,attention_mask:A7s2x33,position_ids:A7s2x3,past_key_values_key_0:A1s2x1x30x96,past_key_values_value_0:A1s2x1x30x96);
    :onnx_ort_inputs_ortt5,dict(input_ids:A7s2x3,attention_mask:A7s2x33,position_ids:A7s2x3,past_key_values_key_0:A1s2x1x30x96,past_key_values_value_0:A1s2x1x30x96);
    :onnx_ort_inputs_orttnlr,dict(input_ids:A7s2x3,attention_mask:A7s2x33,position_ids:A7s2x3,past_key_values_key_0:A1s2x1x30x96,past_key_values_value_0:A1s2x1x30x96);
    :onnx_ort_inputs_ortunet,dict(input_ids:A7s2x3,attention_mask:A7s2x33,position_ids:A7s2x3,past_key_values_key_0:A1s2x1x30x96,past_key_values_value_0:A1s2x1x30x96);
    :onnx_ort_inputs_ortvae,dict(input_ids:A7s2x3,attention_mask:A7s2x33,position_ids:A7s2x3,past_key_values_key_0:A1s2x1x30x96,past_key_values_value_0:A1s2x1x30x96);
    :onnx_ort_inputs_ortvit,dict(input_ids:A7s2x3,attention_mask:A7s2x33,position_ids:A7s2x3,past_key_values_key_0:A1s2x1x30x96,past_key_values_value_0:A1s2x1x30x96);
    :onnx_size,200588;
    :onnx_size_ortbart,170038;
    :onnx_size_ortbert,170038;
    :onnx_size_ortbert_keras,170101;
    :onnx_size_ortbert_tf,170072;
    :onnx_size_ortclip,170038;
    :onnx_size_ortconformer,170090;
    :onnx_size_ortgpt2,170038;
    :onnx_size_ortgpt2_tf,170070;
    :onnx_size_ortgpt_neox,170079;
    :onnx_size_ortmmdit,170047;
    :onnx_size_ortsam2,201363;
    :onnx_size_ortswin,170038;
    :onnx_size_ortt5,170019;
    :onnx_size_orttnlr,170038;
    :onnx_size_ortunet,201363;
    :onnx_size_ortvae,201353;
    :onnx_size_ortvit,170028;
    :opt_ort_bart_delta_node,-18;
    :opt_ort_bart_duration,0.10988655799883418;
    :opt_ort_bart_duration_save,0.07418263800354907;
    :opt_ort_bart_n_nodes1,130;
    :opt_ort_bart_n_nodes2,112;
    :opt_ort_bert_delta_node,-18;
    :opt_ort_bert_duration,0.12158494799950859;
    :opt_ort_bert_duration_save,0.08524328200292075;
    :opt_ort_bert_keras_delta_node,-18;
    :opt_ort_bert_keras_duration,0.11803962000703905;
    :opt_ort_bert_keras_duration_save,0.06829193900193786;
    :opt_ort_bert_keras_n_nodes1,130;
    :opt_ort_bert_keras_n_nodes2,112;
    :opt_ort_bert_n_nodes1,130;
    :opt_ort_bert_n_nodes2,112;
    :opt_ort_bert_tf_delta_node,-18;
    :opt_ort_bert_tf_duration,0.04261509099887917;
    :opt_ort_bert_tf_duration_save,0.06562444299925119;
    :opt_ort_bert_tf_n_nodes1,130;
    :opt_ort_bert_tf_n_nodes2,112;
    :opt_ort_clip_delta_node,-18;
    :opt_ort_clip_duration,0.1548455610027304;
    :opt_ort_clip_duration_save,0.08998242399684386;
    :opt_ort_clip_n_nodes1,130;
    :opt_ort_clip_n_nodes2,112;
    :opt_ort_conformer_delta_node,-18;
    :opt_ort_conformer_duration,0.12352071199711645;
    :opt_ort_conformer_duration_save,0.06759310099732829;
    :opt_ort_conformer_n_nodes1,130;
    :opt_ort_conformer_n_nodes2,112;
    :opt_ort_gpt2_delta_node,-18;
    :opt_ort_gpt2_duration,0.1223549299975275;
    :opt_ort_gpt2_duration_save,0.0801966299986816;
    :opt_ort_gpt2_n_nodes1,130;
    :opt_ort_gpt2_n_nodes2,112;
    :opt_ort_gpt2_tf_delta_node,-18;
    :opt_ort_gpt2_tf_duration,0.12270535599964205;
    :opt_ort_gpt2_tf_duration_save,0.07562460900226142;
    :opt_ort_gpt2_tf_n_nodes1,130;
    :opt_ort_gpt2_tf_n_nodes2,112;
    :opt_ort_gpt_neox_delta_node,-18;
    :opt_ort_gpt_neox_duration,0.12394867500552209;
    :opt_ort_gpt_neox_duration_save,0.07981064799969317;
    :opt_ort_gpt_neox_n_nodes1,130;
    :opt_ort_gpt_neox_n_nodes2,112;
    :opt_ort_mmdit_delta_node,-18;
    :opt_ort_mmdit_duration,0.1166878249932779;
    :opt_ort_mmdit_duration_save,0.081025243998738;
    :opt_ort_mmdit_n_nodes1,130;
    :opt_ort_mmdit_n_nodes2,112;
    :opt_ort_phi_duration,0.0001256460018339567;
    :opt_ort_sam2_delta_node,0;
    :opt_ort_sam2_duration,0.12437868699635146;
    :opt_ort_sam2_duration_save,0.05957090800075093;
    :opt_ort_sam2_n_nodes1,130;
    :opt_ort_sam2_n_nodes2,130;
    :opt_ort_swin_delta_node,-18;
    :opt_ort_swin_duration,0.13064931699773297;
    :opt_ort_swin_duration_save,0.08071336599823553;
    :opt_ort_swin_n_nodes1,130;
    :opt_ort_swin_n_nodes2,112;
    :opt_ort_t5_delta_node,-18;
    :opt_ort_t5_duration,0.12471258999721613;
    :opt_ort_t5_duration_save,0.0819742750027217;
    :opt_ort_t5_n_nodes1,130;
    :opt_ort_t5_n_nodes2,112;
    :opt_ort_tnlr_delta_node,-18;
    :opt_ort_tnlr_duration,0.12163873599638464;
    :opt_ort_tnlr_duration_save,0.08700131600198802;
    :opt_ort_tnlr_n_nodes1,130;
    :opt_ort_tnlr_n_nodes2,112;
    :opt_ort_unet_delta_node,0;
    :opt_ort_unet_duration,0.18013529300515074;
    :opt_ort_unet_duration_save,0.05856689400388859;
    :opt_ort_unet_n_nodes1,130;
    :opt_ort_unet_n_nodes2,130;
    :opt_ort_vae_delta_node,0;
    :opt_ort_vae_duration,0.24460540399741149;
    :opt_ort_vae_duration_save,0.08460753099643625;
    :opt_ort_vae_n_nodes1,130;
    :opt_ort_vae_n_nodes2,130;
    :opt_ort_vit_delta_node,-18;
    :opt_ort_vit_duration,0.03967654699954437;
    :opt_ort_vit_duration_save,0.07955189399945084;
    :opt_ort_vit_n_nodes1,130;
    :opt_ort_vit_n_nodes2,112;
    :run_expected,CausalLMOutputWithPast(logits:T1s2x3x32000,past_key_values:DynamicCache(key_cache=#1[T1s2x1x33x96], value_cache=#1[T1s2x1x33x96]));
    :run_expected22,CausalLMOutputWithPast(logits:T1s3x4x32000,past_key_values:DynamicCache(key_cache=#1[T1s3x1x35x96], value_cache=#1[T1s3x1x35x96]));
    :run_expected2_batch1,CausalLMOutputWithPast(logits:T1s1x3x32000,past_key_values:DynamicCache(key_cache=#1[T1s1x1x33x96], value_cache=#1[T1s1x1x33x96]));
    :run_expected2_empty_cache,CausalLMOutputWithPast(logits:T1s2x3x32000,past_key_values:DynamicCache(key_cache=#1[T1s2x1x3x96], value_cache=#1[T1s2x1x3x96]));
    :run_feeds_inputs,dict(input_ids:A7s2x3,attention_mask:A7s2x33,position_ids:A7s2x3,past_key_values_key_0:A1s2x1x30x96,past_key_values_value_0:A1s2x1x30x96);
    :run_feeds_inputs2,dict(input_ids:A7s3x4,attention_mask:A7s3x35,position_ids:A7s3x4,past_key_values_key_0:A1s3x1x31x96,past_key_values_value_0:A1s3x1x31x96);
    :run_feeds_inputs_batch1,dict(input_ids:A7s1x3,attention_mask:A7s1x33,position_ids:A7s1x3,past_key_values_key_0:A1s1x1x30x96,past_key_values_value_0:A1s1x1x30x96);
    :run_feeds_inputs_empty_cache,dict(input_ids:A7s2x3,attention_mask:A7s2x3,position_ids:A7s2x3,past_key_values_key_0:A1s2x1x0x96,past_key_values_value_0:A1s2x1x0x96);
    :run_output_inputs,#3[A1s2x3x32000,A1s2x1x33x96,A1s2x1x33x96];
    :run_output_inputs2,#3[A1s3x4x32000,A1s3x1x35x96,A1s3x1x35x96];
    :run_output_inputs_batch1,#3[A1s1x3x32000,A1s1x1x33x96,A1s1x1x33x96];
    :run_output_inputs_empty_cache,#3[A1s2x3x32000,A1s2x1x3x96,A1s2x1x3x96];
    :second_input_keys,inputs2,inputs_empty_cache,inputs_batch1;
    :time_create_onnx_ort,0.12267397299729055;
    :time_create_onnx_ort_ortbart,0.03460928800632246;
    :time_create_onnx_ort_ortbert,0.06534095900133252;
    :time_create_onnx_ort_ortbert_keras,0.05267170600563986;
    :time_create_onnx_ort_ortbert_tf,0.04952780799794709;
    :time_create_onnx_ort_ortclip,0.05860051700437907;
    :time_create_onnx_ort_ortconformer,0.061947920999955386;
    :time_create_onnx_ort_ortgpt2,0.06725630800065119;
    :time_create_onnx_ort_ortgpt2_tf,0.033811781999247614;
    :time_create_onnx_ort_ortgpt_neox,0.06565929200587561;
    :time_create_onnx_ort_ortmmdit,0.045122034003725275;
    :time_create_onnx_ort_ortsam2,0.0666706490010256;
    :time_create_onnx_ort_ortswin,0.04587849599920446;
    :time_create_onnx_ort_ortt5,0.059667476001777686;
    :time_create_onnx_ort_orttnlr,0.04174095299822511;
    :time_create_onnx_ort_ortunet,0.0673251159969368;
    :time_create_onnx_ort_ortvae,0.03968618499493459;
    :time_create_onnx_ort_ortvit,0.038181119998625945;
    :time_create_torch_model,0.28786411399778444;
    :time_export_onnx,7.708822669002984;
    :time_export_onnx_opt_ir,0.08836800699646119;
    :time_onnx_save,0.3300936319938046;
    :time_ortfusion_ortbart,0.28666278300079284;
    :time_ortfusion_ortbert,0.309065601999464;
    :time_ortfusion_ortbert_keras,0.29284356900461717;
    :time_ortfusion_ortbert_tf,0.1332150369998999;
    :time_ortfusion_ortclip,0.3415136390030966;
    :time_ortfusion_ortconformer,0.2503558129974408;
    :time_ortfusion_ortgpt2,0.33758883599512046;
    :time_ortfusion_ortgpt2_tf,0.2716229259967804;
    :time_ortfusion_ortgpt_neox,0.298198181000771;
    :time_ortfusion_ortmmdit,0.30184246999851894;
    :time_ortfusion_ortphi,0.14227298500190955;
    :time_ortfusion_ortsam2,0.2520889220031677;
    :time_ortfusion_ortswin,0.2813860750029562;
    :time_ortfusion_ortt5,0.3476416880002944;
    :time_ortfusion_orttnlr,0.28425363099813694;
    :time_ortfusion_ortunet,0.3744417580019217;
    :time_ortfusion_ortvae,0.3916202269974747;
    :time_ortfusion_ortvit,0.183626402002119;
    :time_preprocess_model_id,2.872002369258553e-06;
    :time_run,0.02835918199707521;
    :time_run22,0.011091956999734975;
    :time_run2_batch1,0.009190950004267506;
    :time_run2_empty_cache,0.004232790997775737;
    :time_run_onnx_ort,0.017048691006493755;
    :time_run_onnx_ort22,0.0027102509993710555;
    :time_run_onnx_ort22_ortbart,0.0019879669998772442;
    :time_run_onnx_ort22_ortbert,0.003004489997692872;
    :time_run_onnx_ort22_ortbert_keras,0.003640023001935333;
    :time_run_onnx_ort22_ortbert_tf,0.004657133998989593;
    :time_run_onnx_ort22_ortclip,0.003609908999351319;
    :time_run_onnx_ort22_ortconformer,0.0037801690050400794;
    :time_run_onnx_ort22_ortgpt2,0.004170690997852944;
    :time_run_onnx_ort22_ortgpt2_tf,0.0023289439996005967;
    :time_run_onnx_ort22_ortgpt_neox,0.0034277390004717745;
    :time_run_onnx_ort22_ortmmdit,0.004295780003303662;
    :time_run_onnx_ort22_ortsam2,0.003952176004531793;
    :time_run_onnx_ort22_ortswin,0.003211331997590605;
    :time_run_onnx_ort22_ortt5,0.0036376640055095777;
    :time_run_onnx_ort22_orttnlr,0.005827040004078299;
    :time_run_onnx_ort22_ortunet,0.002614626006106846;
    :time_run_onnx_ort22_ortvae,0.0024052830049186014;
    :time_run_onnx_ort22_ortvit,0.0027797850052593276;
    :time_run_onnx_ort2_batch1,0.0015942270038067363;
    :time_run_onnx_ort2_batch1_ortbart,0.0015145389988902025;
    :time_run_onnx_ort2_batch1_ortbert,0.0016649719982524402;
    :time_run_onnx_ort2_batch1_ortbert_keras,0.0018082179958582856;
    :time_run_onnx_ort2_batch1_ortbert_tf,0.005440061999252066;
    :time_run_onnx_ort2_batch1_ortclip,0.0011707200028467923;
    :time_run_onnx_ort2_batch1_ortconformer,0.0020624629978556186;
    :time_run_onnx_ort2_batch1_ortgpt2,0.001357957000436727;
    :time_run_onnx_ort2_batch1_ortgpt2_tf,0.0015537990038865246;
    :time_run_onnx_ort2_batch1_ortgpt_neox,0.0020024889963679016;
    :time_run_onnx_ort2_batch1_ortmmdit,0.005733900994528085;
    :time_run_onnx_ort2_batch1_ortsam2,0.0015688200001022778;
    :time_run_onnx_ort2_batch1_ortswin,0.0019128240019199438;
    :time_run_onnx_ort2_batch1_ortt5,0.0015074449984240346;
    :time_run_onnx_ort2_batch1_orttnlr,0.001558559997647535;
    :time_run_onnx_ort2_batch1_ortunet,0.0017548280011396855;
    :time_run_onnx_ort2_batch1_ortvae,0.0015175150037975982;
    :time_run_onnx_ort2_batch1_ortvit,0.0013825539936078712;
    :time_run_onnx_ort2_empty_cache,0.0017345529995509423;
    :time_run_onnx_ort2_empty_cache_ortbart,0.0017088780004996806;
    :time_run_onnx_ort2_empty_cache_ortbert,0.0026978599998983555;
    :time_run_onnx_ort2_empty_cache_ortbert_keras,0.002004301000852138;
    :time_run_onnx_ort2_empty_cache_ortbert_tf,0.00820155599649297;
    :time_run_onnx_ort2_empty_cache_ortclip,0.0015727029967820272;
    :time_run_onnx_ort2_empty_cache_ortconformer,0.0021857159954379313;
    :time_run_onnx_ort2_empty_cache_ortgpt2,0.001552637004351709;
    :time_run_onnx_ort2_empty_cache_ortgpt2_tf,0.002505014002963435;
    :time_run_onnx_ort2_empty_cache_ortgpt_neox,0.0024267900007544085;
    :time_run_onnx_ort2_empty_cache_ortmmdit,0.002377239004999865;
    :time_run_onnx_ort2_empty_cache_ortsam2,0.0018211470014648512;
    :time_run_onnx_ort2_empty_cache_ortswin,0.0019817730062641203;
    :time_run_onnx_ort2_empty_cache_ortt5,0.0018264509999426082;
    :time_run_onnx_ort2_empty_cache_orttnlr,0.002329167997231707;
    :time_run_onnx_ort2_empty_cache_ortunet,0.0023776219968567602;
    :time_run_onnx_ort2_empty_cache_ortvae,0.0018526729982113466;
    :time_run_onnx_ort2_empty_cache_ortvit,0.0017854949983302504;
    :time_run_onnx_ort_ortbart,0.001911605999339372;
    :time_run_onnx_ort_ortbert,0.0027329700023983605;
    :time_run_onnx_ort_ortbert_keras,0.0071566189944860525;
    :time_run_onnx_ort_ortbert_tf,0.0032342240010621026;
    :time_run_onnx_ort_ortclip,0.0030915400056983344;
    :time_run_onnx_ort_ortconformer,0.004361829000117723;
    :time_run_onnx_ort_ortgpt2,0.0021929089998593554;
    :time_run_onnx_ort_ortgpt2_tf,0.0022186119967955165;
    :time_run_onnx_ort_ortgpt_neox,0.002514784995582886;
    :time_run_onnx_ort_ortmmdit,0.0023678750003455207;
    :time_run_onnx_ort_ortsam2,0.003779204998863861;
    :time_run_onnx_ort_ortswin,0.005235229997197166;
    :time_run_onnx_ort_ortt5,0.002648343004693743;
    :time_run_onnx_ort_orttnlr,0.0025084980006795377;
    :time_run_onnx_ort_ortunet,0.002971082998556085;
    :time_run_onnx_ort_ortvae,0.002170382998883724;
    :time_run_onnx_ort_ortvit,0.0019910589981009252;
    :time_run_patched,0.006929533999937121;
    :time_torch_export_export,2.59028768799908;
    :time_torch_export_export_n,1;
    :time_total,18.618280788003176;
    :time_total_exporter,9.92414682800154;
    :time_total_validation_onnx,0.2071704240006511;
    :time_total_validation_torch,0.05758322899782797;
    :version_date,2025-11-01T15:05:51;
    :version_device,;
    :version_do_run,True;
    :version_drop_inputs,[];
    :version_dtype,;
    :version_dump_folder,dump_models;
    :version_exporter,onnx-dynamo;
    :version_inputs2,1;
    :version_model_id,arnir0/Tiny-LLM;
    :version_numpy,2.3.4;
    :version_onnx,1.20.0;
    :version_onnx_diagnostic,0.8.0;
    :version_onnx_ir,0.1.13;
    :version_onnxruntime,1.24.0;
    :version_onnxscript,?;
    :version_opset,18;
    :version_optimization,ir;
    :version_ortbart_hidden_size,192;
    :version_ortbart_num_attention_heads,2;
    :version_ortbert_hidden_size,192;
    :version_ortbert_keras_hidden_size,192;
    :version_ortbert_keras_num_attention_heads,2;
    :version_ortbert_num_attention_heads,2;
    :version_ortbert_tf_hidden_size,192;
    :version_ortbert_tf_num_attention_heads,2;
    :version_ortclip_hidden_size,192;
    :version_ortclip_num_attention_heads,2;
    :version_ortconformer_hidden_size,192;
    :version_ortconformer_num_attention_heads,2;
    :version_ortfusiontype,ALL;
    :version_ortgpt2_hidden_size,192;
    :version_ortgpt2_num_attention_heads,2;
    :version_ortgpt2_tf_hidden_size,192;
    :version_ortgpt2_tf_num_attention_heads,2;
    :version_ortgpt_neox_hidden_size,192;
    :version_ortgpt_neox_num_attention_heads,2;
    :version_ortmmdit_hidden_size,192;
    :version_ortmmdit_num_attention_heads,2;
    :version_ortphi_hidden_size,192;
    :version_ortphi_num_attention_heads,2;
    :version_ortsam2_hidden_size,192;
    :version_ortsam2_num_attention_heads,2;
    :version_ortswin_hidden_size,192;
    :version_ortswin_num_attention_heads,2;
    :version_ortt5_hidden_size,192;
    :version_ortt5_num_attention_heads,2;
    :version_orttnlr_hidden_size,192;
    :version_orttnlr_num_attention_heads,2;
    :version_ortunet_hidden_size,192;
    :version_ortunet_num_attention_heads,2;
    :version_ortvae_hidden_size,192;
    :version_ortvae_num_attention_heads,2;
    :version_ortvit_hidden_size,192;
    :version_ortvit_num_attention_heads,2;
    :version_patch,{'patch': True};
    :version_patch_kwargs,{'patch':True,'patch_transformers':True,'patch_diffusers':True};
    :version_quiet,False;
    :version_rewrite,True;
    :version_runtime,onnxruntime;
    :version_same_as_pretrained,False;
    :version_scipy,1.16.2;
    :version_stop_if_static,0;
    :version_torch,2.10.0.dev20251022+cu130;
    :version_transformers,5.0.0.dev0;
    :version_use_pretrained,False;
    [runpythonerror]
    W1101 15:05:54.089000 74671 torch/fx/experimental/symbolic_shapes.py:6918] _maybe_guard_rel() was called on non-relation expression Eq(s72, 1) | Eq(Max(s44, s72), s72)
    W1101 15:05:54.098000 74671 torch/fx/experimental/symbolic_shapes.py:6918] _maybe_guard_rel() was called on non-relation expression Eq(s70, 1) | Eq(Max(s70, s9), s70)
    W1101 15:05:54.139000 74671 torch/fx/experimental/symbolic_shapes.py:6918] _maybe_guard_rel() was called on non-relation expression Eq(s44, 1) | Eq(Max(s44, s72), s44)
    W1101 15:05:54.145000 74671 torch/fx/experimental/symbolic_shapes.py:6918] _maybe_guard_rel() was called on non-relation expression Eq(s9, 1) | Eq(Max(s70, s9), s9)
    ~/vv/this312/lib/python3.12/site-packages/torch/onnx/_internal/exporter/_dynamic_shapes.py:272: UserWarning: # The axis name: batch will not be used, since it shares the same shape constraints with another axis: batch.
      warnings.warn(
    ~/vv/this312/lib/python3.12/site-packages/torch/onnx/_internal/exporter/_dynamic_shapes.py:272: UserWarning: # The axis name: seq_length will not be used, since it shares the same shape constraints with another axis: seq_length.
      warnings.warn(
    symbolic shape inference disabled or failed.
    symbolic shape inference disabled or failed.
    symbolic shape inference disabled or failed.
    symbolic shape inference disabled or failed.
    Model producer not matched: Expected "keras2onnx", Got "pytorch".Please specify correct --model_type parameter.
    symbolic shape inference disabled or failed.
    symbolic shape inference disabled or failed.
    Model producer not matched: Expected "tf2onnx", Got "pytorch".Please specify correct --model_type parameter.
    symbolic shape inference disabled or failed.
    symbolic shape inference disabled or failed.
    symbolic shape inference disabled or failed.
    symbolic shape inference disabled or failed.
    symbolic shape inference disabled or failed.
    symbolic shape inference disabled or failed.
    symbolic shape inference disabled or failed.
    symbolic shape inference disabled or failed.
    Model producer not matched: Expected "tf2onnx", Got "pytorch".Please specify correct --model_type parameter.
    symbolic shape inference disabled or failed.
    symbolic shape inference disabled or failed.
    symbolic shape inference disabled or failed.
    symbolic shape inference disabled or failed.
    The optimized model requires LayerNormalization with broadcast support. Please use onnxruntime-gpu>=1.21 for inference.
    symbolic shape inference disabled or failed.
    symbolic shape inference disabled or failed.
    symbolic shape inference disabled or failed.
    symbolic shape inference disabled or failed.
    symbolic shape inference disabled or failed.
    symbolic shape inference disabled or failed.
    symbolic shape inference disabled or failed.
    symbolic shape inference disabled or failed.
    SkipGroupNorm fusion will be skipped since symbolic shape inference disabled or failed.
    symbolic shape inference disabled or failed.
    SkipGroupNorm fusion will be skipped since symbolic shape inference disabled or failed.
    symbolic shape inference disabled or failed.
    symbolic shape inference disabled or failed.
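
The summary printed above uses one `:key,value;` entry per line, which makes it easy to post-process. The following is a minimal sketch, not part of onnx_diagnostic: the helpers `parse_summary` and `node_delta` are hypothetical names, and the sample lines are copied from the output above. It reads the entries into a dictionary and recomputes the node-count change reported for a given onnxruntime fusion type.

```python
def parse_summary(text: str) -> dict[str, str]:
    """Parse ':key,value;' lines into a dict (hypothetical helper).

    Values are kept as strings; only the first comma separates the key
    from the value, so values that contain commas stay intact.
    """
    summary = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not (line.startswith(":") and line.endswith(";")):
            continue  # skip warnings, log lines, blank lines
        key, _, value = line[1:-1].partition(",")
        summary[key] = value
    return summary


def node_delta(summary: dict[str, str], fusion: str) -> int:
    """Recompute the node-count change for one fusion type (e.g. 'bert')."""
    before = int(summary[f"opt_ort_{fusion}_n_nodes1"])
    after = int(summary[f"opt_ort_{fusion}_n_nodes2"])
    return after - before


# A few entries copied from the summary above.
sample = """
    :opt_ort_bert_delta_node,-18;
    :opt_ort_bert_n_nodes1,130;
    :opt_ort_bert_n_nodes2,112;
    :opt_ort_sam2_n_nodes1,130;
    :opt_ort_sam2_n_nodes2,130;
    :second_input_keys,inputs2,inputs_empty_cache,inputs_batch1;
"""

summary = parse_summary(sample)
for fusion in ("bert", "sam2"):
    print(fusion, node_delta(summary, fusion))
```

In this run the `bert` fusion removes 18 nodes while `sam2` leaves the graph unchanged, which matches the `opt_ort_*_delta_node` entries.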

SDPA or eager attention implementation, or a StaticCache

By default the model is validated with a DynamicCache. Add --mop cache_implementation=static --iop cls_cache=StaticCache to use a StaticCache instead. Add --mop attn_implementation=eager to explicitly select the eager implementation of attention. The following command combines both:

python -m onnx_diagnostic validate \
            -m google/gemma-2b \
            --run \
            -v 1 \
            --export custom \
            -o dump_test \
            --dtype float16 \
            --device cpu \
            --patch \
            --no-quiet \
            --opt default \
            --rewrite \
            --mop attn_implementation=eager \
            --mop cache_implementation=static \
            --iop cls_cache=StaticCache
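
When the same validation is run with several option combinations, the argument list can be assembled programmatically. The sketch below is only an illustration: `build_validate_command` is a hypothetical helper, not part of onnx_diagnostic, and it uses only flags shown in the command above (`-m`, `--mop`, `--iop`, plus any extra flags passed through verbatim).

```python
import shlex


def build_validate_command(model_id, mop=None, iop=None, extra=()):
    """Build the argv list for `python -m onnx_diagnostic validate`.

    mop / iop are dicts turned into repeated `--mop KEY=VALUE` and
    `--iop KEY=VALUE` arguments (hypothetical helper, for illustration).
    """
    cmd = ["python", "-m", "onnx_diagnostic", "validate", "-m", model_id]
    cmd.extend(extra)
    for key, value in (mop or {}).items():
        cmd += ["--mop", f"{key}={value}"]
    for key, value in (iop or {}).items():
        cmd += ["--iop", f"{key}={value}"]
    return cmd


cmd = build_validate_command(
    "google/gemma-2b",
    mop={"attn_implementation": "eager", "cache_implementation": "static"},
    iop={"cls_cache": "StaticCache"},
    extra=["--run", "--export", "custom", "--dtype", "float16"],
)
# Print a shell-ready command line; pass `cmd` to subprocess.run to execute it.
print(shlex.join(cmd))
```

Keeping the options in dictionaries makes it straightforward to switch between a StaticCache and the default DynamicCache by dropping the corresponding entries.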