-m onnx_diagnostic validate … validate a model id¶
The command line is a wrapper around the function
onnx_diagnostic.torch_models.validate.validate_model().
Description¶
The command line validates a model id, usually one available on HuggingFace, although other sources are supported. It creates dummy inputs, runs the model on them, exports the model, measures the discrepancies…
usage: validate [-h] [-m MID] [-t TASK] [-e EXPORT] [--opt OPT]
[-r | --run | --no-run] [-q | --quiet | --no-quiet]
[--patch [PATCH ...]] [--rewrite | --no-rewrite]
[--stop-if-static STOP_IF_STATIC]
[--same-as-trained | --no-same-as-trained]
[--trained | --no-trained] [--inputs2 INPUTS2]
[--runtime {onnxruntime,torch,ref,orteval,orteval10}]
[-o DUMP_FOLDER] [--drop DROP] [--opset OPSET]
[--subfolder SUBFOLDER] [--ortfusiontype ORTFUSIONTYPE]
[-v VERBOSE] [--dtype DTYPE] [--device DEVICE]
[--iop [KEY=VALUE ...]] [--mop [KEY=VALUE ...]]
[--repeat REPEAT] [--warmup WARMUP] [--outnames OUTNAMES]
[--ort-logs | --no-ort-logs]
[--quiet-input-sets QUIET_INPUT_SETS]
[--expop [KEY=VALUE ...]] [--save-ep SAVE_EP]
Validates a model for a particular task given the model id.
It exports the model and then validates it by computing the discrepancies
on different input sets.
options:
-h, --help show this help message and exit
-m MID, --mid MID model id, usually <author>/<name>
-t TASK, --task TASK force the task to use
-e EXPORT, --export EXPORT
export the model with this exporter
--opt OPT optimization to apply after the export
-r, --run, --no-run Runs the model to check it runs.
-q, --quiet, --no-quiet
Catches exceptions and reports them in the summary.
--patch [PATCH ...] Applies patches before exporting. It can be a boolean
to enable or disable the patches, or be more fine-grained
(default is True). It is possible to disable the patches for torch
by adding:
--patch "patch_sympy=False" --patch "patch_torch=False"
--rewrite, --no-rewrite
Applies rewrite before exporting.
--stop-if-static STOP_IF_STATIC
Raises an exception if a dynamic dimension becomes static.
--same-as-trained, --no-same-as-trained
Validates or exports a model identical to the trained model but not trained.
--trained, --no-trained
Validates or exports the trained model (requires downloading).
--inputs2 INPUTS2 Validates or exports the model on a second set of inputs
to check that the exported model supports dynamism. The value is used
as an increment to the first set of inputs. A high value may trigger
a different behavior in the model that the exporter missed.
--runtime {onnxruntime,torch,ref,orteval,orteval10}
onnx runtime to use, `onnxruntime` by default
-o DUMP_FOLDER, --dump-folder DUMP_FOLDER
A folder is created to dump statistics,
the exported program, onnx...
--drop DROP Drops the following input names. It should be a list
of comma-separated values, for example:
--drop position_ids
--opset OPSET onnx opset to use, 18 by default
--subfolder SUBFOLDER
Subfolder where to find the model and the configuration.
--ortfusiontype ORTFUSIONTYPE
Applies onnxruntime fusion, this parameter should contain the
model type or multiple values separated by `|`. `ALL` can be used
to run them all.
-v VERBOSE, --verbose VERBOSE
verbosity
--dtype DTYPE Changes dtype if necessary.
--device DEVICE Changes the device if necessary.
--iop [KEY=VALUE ...]
Additional input options, used to change the default
inputs used to export. Examples:
--iop cls_cache=SlidingWindowCache
--iop cls_cache=StaticCache
--mop [KEY=VALUE ...]
Additional model options, used to change some parameters
of the model. Example:
--mop attn_implementation=sdpa --mop attn_implementation=eager
--mop "rope_scaling={'rope_type': 'dynamic', 'factor': 10.0}"
--repeat REPEAT number of times to run the model to measure inference time
--warmup WARMUP number of times to run the model to do warmup
--outnames OUTNAMES This comma separated list defines the output names the onnx exporter should use.
--ort-logs, --no-ort-logs
Enables onnxruntime logging when the session is created
--quiet-input-sets QUIET_INPUT_SETS
Avoids raising an exception when an input set does not work with
the exported model. Example:
--quiet-input-sets=inputs,inputs22
--expop [KEY=VALUE ...]
Additional exporter options, used to change some parameters
of the model. Examples:
--expop report=True
--expop report=True --expop verify=True
--save-ep SAVE_EP
saves the exported program with torch.export.save
and the inputs sets with torch.save,
then command line sbs can be used to look for discrepancies.
If the model id is specified, one untrained version of it is instantiated.
Examples:
python -m onnx_diagnostic validate -m microsoft/Phi-4-mini-reasoning \
--run -v 1 -o dump_test --no-quiet --repeat 2 --warmup 2 \
--dtype float16 --device cuda --patch --export onnx-dynamo --opt ir
python -m onnx_diagnostic validate -m microsoft/Phi-4-mini-reasoning \
--run -v 1 -o dump_test --no-quiet --repeat 2 --warmup 2 \
--dtype float16 --device cuda --patch --export custom --opt default
python -m onnx_diagnostic validate -m microsoft/Phi-4-mini-reasoning \
--run -v 1 -o dump_test --no-quiet --repeat 2 --warmup 2 \
--dtype float16 --device cuda --export modelbuilder
position_ids is usually not needed; it can be removed by adding:
--drop position_ids
The behaviour may be modified compared to the original configuration.
The following argument sets rope_scaling to dynamic:
--mop "rope_scaling={'rope_type': 'dynamic', 'factor': 10.0}"
You can profile the command line by running:
pyinstrument -m onnx_diagnostic validate ...
pyinstrument -r html -o profile.html -m onnx_diagnostic validate ...
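The options --iop, --mop and --expop take KEY=VALUE pairs whose values may be Python literals, as in the rope_scaling example. The sketch below shows one way such pairs could be turned into a dictionary. `parse_key_value` is a hypothetical helper written for illustration; it is not the parser onnx_diagnostic actually uses, and the real tool may interpret values differently.

```python
import ast


def parse_key_value(pairs):
    """Turns ['key=value', ...] into a dict, evaluating Python literals when possible.

    Hypothetical helper, not part of onnx_diagnostic.
    """
    options = {}
    for pair in pairs:
        # Split on the first '=' only: the value itself may contain '='.
        key, sep, raw = pair.partition("=")
        if not sep or not key:
            raise ValueError(f"expected KEY=VALUE, got {pair!r}")
        try:
            # Handles numbers, booleans, dicts, lists...
            value = ast.literal_eval(raw)
        except (ValueError, SyntaxError):
            # Plain words such as 'sdpa' or 'StaticCache' stay strings.
            value = raw
        options[key] = value
    return options


opts = parse_key_value(
    [
        "attn_implementation=sdpa",
        "rope_scaling={'rope_type': 'dynamic', 'factor': 10.0}",
        "report=True",
    ]
)
print(opts)
```

Using `ast.literal_eval` rather than `eval` keeps the parsing limited to literals, which is the safer choice for values typed on a command line.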
Get the list of supported tasks¶
The tasks are the same as those defined by HuggingFace. The tool only supports a subset of them.
python -m onnx_diagnostic validate
-- list of supported tasks:
MoE
automatic-speech-recognition
feature-extraction
fill-mask
image-classification
image-text-to-text
image-to-video
mask-generation
object-detection
sentence-similarity
summarization
text-classification
text-generation
text-to-image
text2text-generation
zero-shot-image-classification
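When the command is launched from a script, the task can be checked against this list before anything is started. The sketch below is only an illustration: `build_validate_command` and `SUPPORTED_TASKS` are hypothetical names, the task list is copied from the output above and may differ in other versions, and only flags documented in the usage section are used.

```python
import shlex
import sys

# Copied from the list printed above; other versions may support other tasks.
SUPPORTED_TASKS = {
    "MoE", "automatic-speech-recognition", "feature-extraction", "fill-mask",
    "image-classification", "image-text-to-text", "image-to-video",
    "mask-generation", "object-detection", "sentence-similarity",
    "summarization", "text-classification", "text-generation",
    "text-to-image", "text2text-generation", "zero-shot-image-classification",
}


def build_validate_command(mid=None, task=None, exporter=None,
                           dump_folder=None, run=True, verbose=1):
    """Builds the argument list for `python -m onnx_diagnostic validate`."""
    if task is not None and task not in SUPPORTED_TASKS:
        raise ValueError(f"unsupported task {task!r}")
    cmd = [sys.executable, "-m", "onnx_diagnostic", "validate"]
    if mid:
        cmd += ["-m", mid]
    if task:
        cmd += ["-t", task]
    if exporter:
        cmd += ["--export", exporter]
    if dump_folder:
        cmd += ["-o", dump_folder]
    if run:
        cmd.append("--run")
    cmd += ["-v", str(verbose)]
    return cmd


cmd = build_validate_command("arnir0/Tiny-LLM", task="text-generation")
print(shlex.join(cmd))
# To actually run it: subprocess.run(cmd, check=True)
```

Building the command as a list avoids quoting issues with values such as the rope_scaling dictionary.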
Get the default inputs for a specific task¶
This returns the dummy inputs for a specific task. The default set may contain more inputs than a given model accepts; only those the forward method defines are kept.
python -m onnx_diagnostic validate -t text-generation
-- inputs
+ input_ids : T7s2x3
+ attention_mask : T7s2x33
+ position_ids : T7s2x3
+ past_key_values : DynamicCache(key_cache=#4[T1s2x24x30x16,T1s2x24x30x16,T1s2x24x30x16,T1s2x24x30x16], value_cache=#4[T1s2x24x30x16,T1s2x24x30x16,T1s2x24x30x16,T1s2x24x30x16])
-- dynamic_shapes
+ input_ids : {0:DYN(batch),1:DYN(seq_length)}
+ attention_mask : {0:DYN(batch),1:DYN(cache+seq)}
+ position_ids : {0:DYN(batch),1:DYN(seq_length)}
+ past_key_values : #8[{0:DYN(batch),2:DYN(cache_length)},{0:DYN(batch),2:DYN(cache_length)},{0:DYN(batch),2:DYN(cache_length)},{0:DYN(batch),2:DYN(cache_length)},{0:DYN(batch),2:DYN(cache_length)},{0:DYN(batch),2:DYN(cache_length)},{0:DYN(batch),2:DYN(cache_length)},{0:DYN(batch),2:DYN(cache_length)}]
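The listing uses a compact notation for tensors. As far as can be inferred from the output, `T7s2x3` reads as a tensor with ONNX element type 7 (int64) and shape 2x3, while `T1` denotes float32. The sketch below decodes such strings. `decode_tensor_string` and the partial dtype table are assumptions made for illustration, not functions of onnx_diagnostic.

```python
import re

# Partial table of ONNX TensorProto element types.
ONNX_DTYPES = {
    1: "float32", 6: "int32", 7: "int64", 9: "bool",
    10: "float16", 11: "float64", 16: "bfloat16",
}
_PATTERN = re.compile(r"^T(\d+)s(\d+(?:x\d+)*)$")


def decode_tensor_string(text):
    """Returns (dtype name, shape) for a string such as 'T7s2x3'."""
    match = _PATTERN.match(text)
    if match is None:
        raise ValueError(f"unexpected tensor description {text!r}")
    code = int(match.group(1))
    shape = tuple(int(dim) for dim in match.group(2).split("x"))
    return ONNX_DTYPES.get(code, f"onnx-type-{code}"), shape


ids = decode_tensor_string("T7s2x3")
mask = decode_tensor_string("T7s2x33")
cache = decode_tensor_string("T1s2x24x30x16")
print(ids, mask, cache)

# The attention mask covers the cached tokens plus the new ones,
# which is what the dynamic dimension name 'cache+seq' suggests.
assert mask[1][1] == cache[1][2] + ids[1][1]
```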
Validate dummy inputs for a model¶
The dummy inputs may not work for this model and this task. The following command line checks that. There is no point in exporting if this fails.
python -m onnx_diagnostic validate -m arnir0/Tiny-LLM --run -v 1
[validate_model] validate model id 'arnir0/Tiny-LLM'
[validate_model] patch={'patch': True}
[validate_model] get dummy inputs with input_options=None...
[validate_model] rewrite=True, patch_kwargs={'patch': True, 'patch_transformers': True, 'patch_diffusers': True}, stop_if_static=0
[validate_model] exporter=None, optimization=None
[validate_model] dump_folder=None
[validate_model] output_names=None
[get_untrained_model_with_inputs] model_id='arnir0/Tiny-LLM', subfolder=None
[get_untrained_model_with_inputs] use preinstalled 'arnir0/Tiny-LLM'
[get_untrained_model_with_inputs] architecture='LlamaForCausalLM'
[get_untrained_model_with_inputs] cls='LlamaConfig'
[get_untrained_model_with_inputs] task='text-generation'
[get_untrained_model_with_inputs] default config._attn_implementation=None
[get_untrained_model_with_inputs] package_source=transformers from ~/github/transformers/src/transformers/__init__.py
[get_untrained_model_with_inputs] instantiate model_id 'arnir0/Tiny-LLM', subfolder=None
[get_untrained_model_with_inputs] -- done(2) in 2.745000529102981e-06s
[get_untrained_model_with_inputs] instantiate_specific_model <class 'transformers.models.llama.modeling_llama.LlamaForCausalLM'>
[get_untrained_model_with_inputs] -- done(3) in 5.655005224980414e-06s (model is <class 'NoneType'>)
[get_untrained_model_with_inputs] instantiate_specific_model(2) <class 'transformers.models.llama.modeling_llama.LlamaForCausalLM'>
[get_untrained_model_with_inputs] -- done(4) in 0.0995945769973332s (model is <class 'transformers.models.llama.modeling_llama.LlamaForCausalLM'>)
[get_untrained_model_with_inputs] use fct=<function get_inputs at 0x764077b689a0>
[get_untrained_model_with_inputs] model class='LlamaForCausalLM'
[validate_model] --
[validate_model] task=text-generation
[validate_model] size=49.549072265625 Mb
[validate_model] n_weights=12.988992 millions parameters
[validate_model] +INPUT input_ids=T7s2x3
[validate_model] +INPUT attention_mask=T7s2x33
[validate_model] +INPUT position_ids=T7s2x3
[validate_model] +INPUT past_key_values=DynamicCache(key_cache=#1[T1s2x1x30x96], value_cache=#1[T1s2x1x30x96])
[validate_model] +SHAPE input_ids={0:DYN(batch),1:DYN(seq_length)}
[validate_model] +SHAPE attention_mask={0:DYN(batch),1:DYN(cache+seq)}
[validate_model] +SHAPE position_ids={0:DYN(batch),1:DYN(seq_length)}
[validate_model] +SHAPE past_key_values=#2[{0:DYN(batch),2:DYN(cache_length)},{0:DYN(batch),2:DYN(cache_length)}]
[validate_model] second_input_keys=['inputs_prompt', 'inputs2', 'inputs_empty_cache', 'inputs_batch1']
[validate_model] --
[validate_model] -- run the model inputs='inputs'...
[validate_model] inputs=dict(input_ids:T7s2x3,attention_mask:T7s2x33,position_ids:T7s2x3,past_key_values:DynamicCache(key_cache=#1[T1s2x1x30x96], value_cache=#1[T1s2x1x30x96]))
[validate_model] done ([run]) - CausalLMOutputWithPast(logits:T1s2x3x32000,past_key_values:DynamicCache(key_cache=#1[T1s2x1x33x96], value_cache=#1[T1s2x1x33x96]))
[validate_model] -- run the model inputs='inputs_prompt'...
[validate_model] inputs_prompt=dict(input_ids:T7s1x11)
[validate_model] done ([run2_prompt]) - CausalLMOutputWithPast(logits:T1s1x11x32000,past_key_values:DynamicCache(key_cache=#1[T1s1x1x11x96], value_cache=#1[T1s1x1x11x96]))
[validate_model] -- run the model inputs='inputs2'...
[validate_model] inputs2=dict(input_ids:T7s3x4,attention_mask:T7s3x35,position_ids:T7s3x4,past_key_values:DynamicCache(key_cache=#1[T1s3x1x31x96], value_cache=#1[T1s3x1x31x96]))
[validate_model] done ([run22]) - CausalLMOutputWithPast(logits:T1s3x4x32000,past_key_values:DynamicCache(key_cache=#1[T1s3x1x35x96], value_cache=#1[T1s3x1x35x96]))
[validate_model] -- run the model inputs='inputs_empty_cache'...
[validate_model] inputs_empty_cache=dict(input_ids:T7s2x3,attention_mask:T7s2x3,position_ids:T7s2x3,past_key_values:DynamicCache(key_cache=#1[T1s2x1x0x96], value_cache=#1[T1s2x1x0x96]))
[validate_model] done ([run2_empty_cache]) - CausalLMOutputWithPast(logits:T1s2x3x32000,past_key_values:DynamicCache(key_cache=#1[T1s2x1x3x96], value_cache=#1[T1s2x1x3x96]))
[validate_model] -- run the model inputs='inputs_batch1'...
[validate_model] inputs_batch1=dict(input_ids:T7s1x3,attention_mask:T7s1x33,position_ids:T7s1x3,past_key_values:DynamicCache(key_cache=#1[T1s1x1x30x96], value_cache=#1[T1s1x1x30x96]))
[validate_model] done ([run2_batch1]) - CausalLMOutputWithPast(logits:T1s1x3x32000,past_key_values:DynamicCache(key_cache=#1[T1s1x1x33x96], value_cache=#1[T1s1x1x33x96]))
[validate_model] -- done (final)
-- summary --
:model_class,LlamaForCausalLM;
:model_config,{'vocab_size':32000,'max_position_embeddings':1024,'hidden_size':192,'intermediate_size':1024,'num_hidden_layers':1,'num_attention_heads':2,'num_key_value_heads':1,'hidden_act':'silu','initializer_range':0.02,'rms_norm_eps':1e-05,'pretraining_tp':1,'use_cache':True,'attention_bias':False,'attention_dropout':0.0,'mlp_bias':False,'head_dim':96,'rope_parameters':{'rope_theta':10000.0,'rope_type':'default'},'return_dict':True,'output_hidden_states':False,'dtype':'float32','tie_word_embeddings':False,'chunk_size_feed_forward':0,'is_encoder_decoder':False,'is_decoder':False,'cross_attention_hidden_size':None,'add_cross_attention':False,'tie_encoder_decoder':False,'architectures':['LlamaForCausalLM'],'finetuning_task':None,'id2label':{0:'LABEL_0',1:'LABEL_1'},'label2id':{'LABEL_0':0,'LABEL_1':1},'task_specific_params':None,'problem_type':None,'tokenizer_class':None,'prefix':None,'bos_token_id':1,'pad_token_id':None,'eos_token_id':2,'sep_token_id':None,'decoder_start_token_id':None,'_name_or_path':'','transformers_version':'5.0.0.dev0','model_type':'llama','subfolder':None,'output_attentions':False};
:model_config_class,LlamaConfig;
:model_file,~/github/transformers/src/transformers/models/llama/modeling_llama.py;
:model_id,arnir0/Tiny-LLM;
:model_inputs,dict(input_ids:T7s2x3,attention_mask:T7s2x33,position_ids:T7s2x3,past_key_values:DynamicCache(key_cache=#1[T1s2x1x30x96], value_cache=#1[T1s2x1x30x96]));
:model_inputs_options,;
:model_module,transformers.models.llama.modeling_llama;
:model_nweights,12988992;
:model_shapes,dict(input_ids:{0:DYN(batch),1:DYN(seq_length)},attention_mask:{0:DYN(batch),1:DYN(cache+seq)},position_ids:{0:DYN(batch),1:DYN(seq_length)},past_key_values:#2[{0:DYN(batch),2:DYN(cache_length)},{0:DYN(batch),2:DYN(cache_length)}]);
:model_size,51955968;
:model_subfolder,;
:model_task,text-generation;
:run_expected,CausalLMOutputWithPast(logits:T1s2x3x32000,past_key_values:DynamicCache(key_cache=#1[T1s2x1x33x96], value_cache=#1[T1s2x1x33x96]));
:run_expected22,CausalLMOutputWithPast(logits:T1s3x4x32000,past_key_values:DynamicCache(key_cache=#1[T1s3x1x35x96], value_cache=#1[T1s3x1x35x96]));
:run_expected2_batch1,CausalLMOutputWithPast(logits:T1s1x3x32000,past_key_values:DynamicCache(key_cache=#1[T1s1x1x33x96], value_cache=#1[T1s1x1x33x96]));
:run_expected2_empty_cache,CausalLMOutputWithPast(logits:T1s2x3x32000,past_key_values:DynamicCache(key_cache=#1[T1s2x1x3x96], value_cache=#1[T1s2x1x3x96]));
:run_expected2_prompt,CausalLMOutputWithPast(logits:T1s1x11x32000,past_key_values:DynamicCache(key_cache=#1[T1s1x1x11x96], value_cache=#1[T1s1x1x11x96]));
:second_input_keys,inputs_prompt,inputs2,inputs_empty_cache,inputs_batch1;
:time_create_torch_model,0.1024218149977969;
:time_preprocess_model_id,2.601009327918291e-06;
:time_run,0.018668053002329543;
:time_run22,0.004238023000652902;
:time_run2_batch1,0.0028421219903975725;
:time_run2_empty_cache,0.0030992740066722035;
:time_run2_prompt,0.0037044259952381253;
:time_total_validation_torch,0.04001080698799342;
:version_date,2025-12-05T18:53:16;
:version_device,;
:version_do_run,True;
:version_drop_input,None;
:version_drop_inputs,[];
:version_dtype,;
:version_dump_folder,;
:version_exporter,;
:version_exporter_options,None;
:version_input_options,None;
:version_inputs2,1;
:version_model_id,arnir0/Tiny-LLM;
:version_model_options,None;
:version_numpy,2.3.5;
:version_onnx,1.21.0;
:version_onnx_diagnostic,0.8.4;
:version_onnx_ir,0.1.13;
:version_onnxruntime,1.24.0;
:version_onnxscript,?;
:version_opset,18;
:version_optimization,;
:version_ortfusiontype,;
:version_patch,{'patch': True};
:version_patch_kwargs,{'patch':True,'patch_transformers':True,'patch_diffusers':True};
:version_quiet,False;
:version_rewrite,True;
:version_runtime,onnxruntime;
:version_same_as_pretrained,False;
:version_scipy,1.16.2;
:version_stop_if_static,0;
:version_submodule,None;
:version_torch,2.10.0.dev20251123+cu130;
:version_transformers,5.0.0.dev0;
:version_use_pretrained,False;
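The summary is printed as `:key,value;` lines, which makes it easy to post-process. Here is a minimal sketch, assuming every entry sits on its own line; `parse_summary` is a hypothetical helper and the sample lines are copied from the output above. It also checks that model_size is four bytes per weight, which is consistent with the float32 dtype reported in model_config.

```python
def parse_summary(text):
    """Parses ':key,value;' lines into a dict of strings."""
    summary = {}
    for line in text.splitlines():
        line = line.strip()
        if not (line.startswith(":") and line.endswith(";")):
            continue  # skips '-- summary --' and log lines
        # Only the first comma separates the key from the value.
        key, _, value = line[1:-1].partition(",")
        summary[key] = value
    return summary


sample = """
-- summary --
:model_nweights,12988992;
:model_size,51955968;
:model_task,text-generation;
:second_input_keys,inputs_prompt,inputs2,inputs_empty_cache,inputs_batch1;
:version_dtype,;
"""

summary = parse_summary(sample)
n_weights = int(summary["model_nweights"])
size_bytes = int(summary["model_size"])

# float32 weights take 4 bytes each.
assert size_bytes == 4 * n_weights
size_mb = size_bytes / 2**20
print(size_mb)
```

The printed value is the size in Mb shown in the log, 49.549072265625.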
Validate and export a model¶
Exports a model for the given task and checks for discrepancies as well. The latencies reported come from a single run: they tell how long the benchmark takes but are far from the latency one would measure by running the same model multiple times.
python -m onnx_diagnostic validate -m arnir0/Tiny-LLM --run -v 1 --export export-nostrict -o dump_models --patch
[validate_model] dump into 'arnir0_Tiny-LLM/export-nostrict/op18'
[validate_model] validate model id 'arnir0/Tiny-LLM'
[validate_model] patch={'patch': True}
[validate_model] get dummy inputs with input_options=None...
[validate_model] rewrite=True, patch_kwargs={'patch': True, 'patch_transformers': True, 'patch_diffusers': True}, stop_if_static=0
[validate_model] exporter='export-nostrict', optimization=None
[validate_model] dump_folder='dump_models/arnir0_Tiny-LLM/export-nostrict/op18'
[validate_model] output_names=None
[get_untrained_model_with_inputs] model_id='arnir0/Tiny-LLM', subfolder=None
[get_untrained_model_with_inputs] use preinstalled 'arnir0/Tiny-LLM'
[get_untrained_model_with_inputs] architecture='LlamaForCausalLM'
[get_untrained_model_with_inputs] cls='LlamaConfig'
[get_untrained_model_with_inputs] task='text-generation'
[get_untrained_model_with_inputs] default config._attn_implementation=None
[get_untrained_model_with_inputs] package_source=transformers from ~/github/transformers/src/transformers/__init__.py
[get_untrained_model_with_inputs] instantiate model_id 'arnir0/Tiny-LLM', subfolder=None
[get_untrained_model_with_inputs] -- done(2) in 2.5260087568312883e-06s
[get_untrained_model_with_inputs] instantiate_specific_model <class 'transformers.models.llama.modeling_llama.LlamaForCausalLM'>
[get_untrained_model_with_inputs] -- done(3) in 5.426991265267134e-06s (model is <class 'NoneType'>)
[get_untrained_model_with_inputs] instantiate_specific_model(2) <class 'transformers.models.llama.modeling_llama.LlamaForCausalLM'>
[get_untrained_model_with_inputs] -- done(4) in 0.094723055997747s (model is <class 'transformers.models.llama.modeling_llama.LlamaForCausalLM'>)
[get_untrained_model_with_inputs] use fct=<function get_inputs at 0x764077b689a0>
[get_untrained_model_with_inputs] model class='LlamaForCausalLM'
[validate_model] --
[validate_model] task=text-generation
[validate_model] size=49.549072265625 Mb
[validate_model] n_weights=12.988992 millions parameters
[validate_model] +INPUT input_ids=T7s2x3
[validate_model] +INPUT attention_mask=T7s2x33
[validate_model] +INPUT position_ids=T7s2x3
[validate_model] +INPUT past_key_values=DynamicCache(key_cache=#1[T1s2x1x30x96], value_cache=#1[T1s2x1x30x96])
[validate_model] +SHAPE input_ids={0:DYN(batch),1:DYN(seq_length)}
[validate_model] +SHAPE attention_mask={0:DYN(batch),1:DYN(cache+seq)}
[validate_model] +SHAPE position_ids={0:DYN(batch),1:DYN(seq_length)}
[validate_model] +SHAPE past_key_values=#2[{0:DYN(batch),2:DYN(cache_length)},{0:DYN(batch),2:DYN(cache_length)}]
[validate_model] second_input_keys=['inputs_prompt', 'inputs2', 'inputs_empty_cache', 'inputs_batch1']
[validate_model] --
[validate_model] -- run the model inputs='inputs'...
[validate_model] inputs=dict(input_ids:T7s2x3,attention_mask:T7s2x33,position_ids:T7s2x3,past_key_values:DynamicCache(key_cache=#1[T1s2x1x30x96], value_cache=#1[T1s2x1x30x96]))
[validate_model] done ([run]) - CausalLMOutputWithPast(logits:T1s2x3x32000,past_key_values:DynamicCache(key_cache=#1[T1s2x1x33x96], value_cache=#1[T1s2x1x33x96]))
[validate_model] -- run the model inputs='inputs_prompt'...
[validate_model] inputs_prompt=dict(input_ids:T7s1x11)
[validate_model] done ([run2_prompt]) - CausalLMOutputWithPast(logits:T1s1x11x32000,past_key_values:DynamicCache(key_cache=#1[T1s1x1x11x96], value_cache=#1[T1s1x1x11x96]))
[validate_model] -- run the model inputs='inputs2'...
[validate_model] inputs2=dict(input_ids:T7s3x4,attention_mask:T7s3x35,position_ids:T7s3x4,past_key_values:DynamicCache(key_cache=#1[T1s3x1x31x96], value_cache=#1[T1s3x1x31x96]))
[validate_model] done ([run22]) - CausalLMOutputWithPast(logits:T1s3x4x32000,past_key_values:DynamicCache(key_cache=#1[T1s3x1x35x96], value_cache=#1[T1s3x1x35x96]))
[validate_model] -- run the model inputs='inputs_empty_cache'...
[validate_model] inputs_empty_cache=dict(input_ids:T7s2x3,attention_mask:T7s2x3,position_ids:T7s2x3,past_key_values:DynamicCache(key_cache=#1[T1s2x1x0x96], value_cache=#1[T1s2x1x0x96]))
[validate_model] done ([run2_empty_cache]) - CausalLMOutputWithPast(logits:T1s2x3x32000,past_key_values:DynamicCache(key_cache=#1[T1s2x1x3x96], value_cache=#1[T1s2x1x3x96]))
[validate_model] -- run the model inputs='inputs_batch1'...
[validate_model] inputs_batch1=dict(input_ids:T7s1x3,attention_mask:T7s1x33,position_ids:T7s1x3,past_key_values:DynamicCache(key_cache=#1[T1s1x1x30x96], value_cache=#1[T1s1x1x30x96]))
[validate_model] done ([run2_batch1]) - CausalLMOutputWithPast(logits:T1s1x3x32000,past_key_values:DynamicCache(key_cache=#1[T1s1x1x33x96], value_cache=#1[T1s1x1x33x96]))
[validate_model] -- export the model with 'export-nostrict', optimization=None
[validate_model] applies patches before exporting stop_if_static=0
[validate_model] run patched model...
[validate_model] patched inputs=dict(input_ids:T7s2x3,attention_mask:T7s2x33,position_ids:T7s2x3,past_key_values:DynamicCache(key_cache=#1[T1s2x1x30x96], value_cache=#1[T1s2x1x30x96]))
[validate_model] done (patched run)
[validate_model] patched discrepancies=abs=0, rel=0, dev=0
[call_torch_export_export] exporter='export-nostrict', strict=False, optimization=None
[call_torch_export_export] args=()
[call_torch_export_export] kwargs=dict(input_ids:T7s2x3,attention_mask:T7s2x33,position_ids:T7s2x3,past_key_values:DynamicCache(key_cache=#1[T1s2x1x30x96], value_cache=#1[T1s2x1x30x96]))
[call_torch_export_export] dynamic_shapes=dict(input_ids:{0:DYN(batch),1:DYN(seq_length)},attention_mask:{0:DYN(batch),1:DYN(cache+seq)},position_ids:{0:DYN(batch),1:DYN(seq_length)},past_key_values:#2[{0:DYN(batch),2:DYN(cache_length)},{0:DYN(batch),2:DYN(cache_length)}])
[call_torch_export_export] dynamic_shapes_export_export=dict(input_ids:{0:DYNAMIC,1:DYNAMIC},attention_mask:{0:DYNAMIC,1:DYNAMIC},position_ids:{0:DYNAMIC,1:DYNAMIC},past_key_values:#2[{0:DYNAMIC,2:DYNAMIC},{0:DYNAMIC,2:DYNAMIC}])
[call_torch_export_export] export...
[call_torch_export_export] done (export) with 160 nodes
[validate_model] run exported model...
[validate_model] patched inputs=dict(input_ids:T7s2x3,attention_mask:T7s2x33,position_ids:T7s2x3,past_key_values:DynamicCache(key_cache=#1[T1s2x1x30x96], value_cache=#1[T1s2x1x30x96]))
[validate_model] done (exported run)
[validate_model] exported discrepancies=abs=0, rel=0, dev=0
[validate_model] -- dumps exported program in 'dump_models/arnir0_Tiny-LLM/export-nostrict/op18'...
[validate_model] done (dump ep)
[validate_model] dumps statistics in 'dump_models/arnir0_Tiny-LLM/export-nostrict/op18'...
[validate_model] done (dump)
[validate_model] -- done (final)
-- summary --
:disc_exported_abs,0;
:disc_exported_dev,0;
:disc_exported_dnan,0;
:disc_exported_n,204672.0;
:disc_exported_rel,0;
:disc_exported_sum,0.0;
:disc_patched_abs,0;
:disc_patched_dev,0;
:disc_patched_dnan,0;
:disc_patched_n,204672.0;
:disc_patched_rel,0;
:disc_patched_sum,0.0;
:dump_folder,dump_models/arnir0_Tiny-LLM/export-nostrict/op18;
:dump_folder_name,arnir0_Tiny-LLM/export-nostrict/op18;
:export_args,();
:export_dynamic_shapes,dict(input_ids:{0:DYN(batch),1:DYN(seq_length)},attention_mask:{0:DYN(batch),1:DYN(cache+seq)},position_ids:{0:DYN(batch),1:DYN(seq_length)},past_key_values:#2[{0:DYN(batch),2:DYN(cache_length)},{0:DYN(batch),2:DYN(cache_length)}]);
:export_dynamic_shapes_export_export,dict(input_ids:{0:DYNAMIC,1:DYNAMIC},attention_mask:{0:DYNAMIC,1:DYNAMIC},position_ids:{0:DYNAMIC,1:DYNAMIC},past_key_values:#2[{0:DYNAMIC,2:DYNAMIC},{0:DYNAMIC,2:DYNAMIC}]);
:export_exporter,export-nostrict;
:export_graph_nodes,160;
:export_kwargs,dict(input_ids:T7s2x3,attention_mask:T7s2x33,position_ids:T7s2x3,past_key_values:DynamicCache(key_cache=#1[T1s2x1x30x96], value_cache=#1[T1s2x1x30x96]));
:export_optimization,;
:export_options,{};
:export_strict,False;
:model_class,LlamaForCausalLM;
:model_config,{'vocab_size':32000,'max_position_embeddings':1024,'hidden_size':192,'intermediate_size':1024,'num_hidden_layers':1,'num_attention_heads':2,'num_key_value_heads':1,'hidden_act':'silu','initializer_range':0.02,'rms_norm_eps':1e-05,'pretraining_tp':1,'use_cache':True,'attention_bias':False,'attention_dropout':0.0,'mlp_bias':False,'head_dim':96,'rope_parameters':{'rope_theta':10000.0,'rope_type':'default'},'return_dict':True,'output_hidden_states':False,'dtype':'float32','tie_word_embeddings':False,'chunk_size_feed_forward':0,'is_encoder_decoder':False,'is_decoder':False,'cross_attention_hidden_size':None,'add_cross_attention':False,'tie_encoder_decoder':False,'architectures':['LlamaForCausalLM'],'finetuning_task':None,'id2label':{0:'LABEL_0',1:'LABEL_1'},'label2id':{'LABEL_0':0,'LABEL_1':1},'task_specific_params':None,'problem_type':None,'tokenizer_class':None,'prefix':None,'bos_token_id':1,'pad_token_id':None,'eos_token_id':2,'sep_token_id':None,'decoder_start_token_id':None,'_name_or_path':'','transformers_version':'5.0.0.dev0','model_type':'llama','subfolder':None,'output_attentions':False};
:model_config_class,LlamaConfig;
:model_file,~/github/transformers/src/transformers/models/llama/modeling_llama.py;
:model_id,arnir0/Tiny-LLM;
:model_inputs,dict(input_ids:T7s2x3,attention_mask:T7s2x33,position_ids:T7s2x3,past_key_values:DynamicCache(key_cache=#1[T1s2x1x30x96], value_cache=#1[T1s2x1x30x96]));
:model_inputs_options,;
:model_module,transformers.models.llama.modeling_llama;
:model_nweights,12988992;
:model_shapes,dict(input_ids:{0:DYN(batch),1:DYN(seq_length)},attention_mask:{0:DYN(batch),1:DYN(cache+seq)},position_ids:{0:DYN(batch),1:DYN(seq_length)},past_key_values:#2[{0:DYN(batch),2:DYN(cache_length)},{0:DYN(batch),2:DYN(cache_length)}]);
:model_size,51955968;
:model_subfolder,;
:model_task,text-generation;
:run_expected,CausalLMOutputWithPast(logits:T1s2x3x32000,past_key_values:DynamicCache(key_cache=#1[T1s2x1x33x96], value_cache=#1[T1s2x1x33x96]));
:run_expected22,CausalLMOutputWithPast(logits:T1s3x4x32000,past_key_values:DynamicCache(key_cache=#1[T1s3x1x35x96], value_cache=#1[T1s3x1x35x96]));
:run_expected2_batch1,CausalLMOutputWithPast(logits:T1s1x3x32000,past_key_values:DynamicCache(key_cache=#1[T1s1x1x33x96], value_cache=#1[T1s1x1x33x96]));
:run_expected2_empty_cache,CausalLMOutputWithPast(logits:T1s2x3x32000,past_key_values:DynamicCache(key_cache=#1[T1s2x1x3x96], value_cache=#1[T1s2x1x3x96]));
:run_expected2_prompt,CausalLMOutputWithPast(logits:T1s1x11x32000,past_key_values:DynamicCache(key_cache=#1[T1s1x1x11x96], value_cache=#1[T1s1x1x11x96]));
:second_input_keys,inputs_prompt,inputs2,inputs_empty_cache,inputs_batch1;
:time_create_torch_model,0.09754282199719455;
:time_export_export,1.4269442619988695;
:time_preprocess_model_id,2.9039947548881173e-06;
:time_run,0.013379272000747733;
:time_run22,0.00484548500389792;
:time_run2_batch1,0.004225794997182675;
:time_run2_empty_cache,0.0040254789928440005;
:time_run2_prompt,0.003381318994797766;
:time_run_exported,0.03103167399240192;
:time_run_patched,0.013351233006687835;
:time_torch_export_export,1.4269368669920368;
:time_torch_export_export_n,1;
:time_total_exporter,2.7121016349992715;
:time_total_validation_torch,0.034585519999382086;
:version_date,2025-12-05T18:53:16;
:version_device,;
:version_do_run,True;
:version_drop_input,None;
:version_drop_inputs,[];
:version_dtype,;
:version_dump_folder,dump_models;
:version_exporter,export-nostrict;
:version_exporter_options,None;
:version_input_options,None;
:version_inputs2,1;
:version_model_id,arnir0/Tiny-LLM;
:version_model_options,None;
:version_numpy,2.3.5;
:version_onnx,1.21.0;
:version_onnx_diagnostic,0.8.4;
:version_onnx_ir,0.1.13;
:version_onnxruntime,1.24.0;
:version_onnxscript,?;
:version_opset,18;
:version_optimization,;
:version_ortfusiontype,;
:version_patch,{'patch': True};
:version_patch_kwargs,{'patch':True,'patch_transformers':True,'patch_diffusers':True};
:version_quiet,False;
:version_rewrite,True;
:version_runtime,onnxruntime;
:version_same_as_pretrained,False;
:version_scipy,1.16.2;
:version_stop_if_static,0;
:version_submodule,None;
:version_torch,2.10.0.dev20251123+cu130;
:version_transformers,5.0.0.dev0;
:version_use_pretrained,False;
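The `disc_*` entries summarize how far the outputs of the patched or exported model are from the original ones. The sketch below is an assumption about what such metrics usually look like (maximum absolute difference, maximum relative difference, sum of differences, number of compared values). It is not the tool's implementation: the helper `discrepancies` and the epsilon are hypothetical. The second part checks that disc_exported_n matches the number of elements in the outputs listed in run_expected.

```python
import math


def discrepancies(expected, got, eps=1e-5):
    """Compares two flat sequences of floats (hypothetical metric definitions)."""
    if len(expected) != len(got):
        raise ValueError("both sequences must have the same length")
    diffs = [abs(e - g) for e, g in zip(expected, got)]
    # eps avoids a division by zero when an expected value is 0.
    rels = [d / (abs(e) + eps) for d, e in zip(diffs, expected)]
    return {
        "abs": max(diffs, default=0.0),
        "rel": max(rels, default=0.0),
        "sum": sum(diffs),
        "n": len(diffs),
    }


d = discrepancies([1.0, 2.0, 3.0], [1.0, 2.5, 3.0])
print(d)

# Shapes taken from run_expected: logits, key cache, value cache.
output_shapes = [(2, 3, 32000), (2, 1, 33, 96), (2, 1, 33, 96)]
n_values = sum(math.prod(shape) for shape in output_shapes)
assert n_values == 204672  # same as disc_exported_n in the summary
```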
Validate ONNX discrepancies¶
Let’s export to ONNX this time and check for discrepancies.
python -m onnx_diagnostic validate -m arnir0/Tiny-LLM --run -v 1 --export onnx-dynamo -o dump_models --patch --opt ir
[validate_model] dump into 'arnir0_Tiny-LLM/onnx-dynamo/ir/op18'
[validate_model] validate model id 'arnir0/Tiny-LLM'
[validate_model] patch={'patch': True}
[validate_model] get dummy inputs with input_options=None...
[validate_model] rewrite=True, patch_kwargs={'patch': True, 'patch_transformers': True, 'patch_diffusers': True}, stop_if_static=0
[validate_model] exporter='onnx-dynamo', optimization='ir'
[validate_model] dump_folder='dump_models/arnir0_Tiny-LLM/onnx-dynamo/ir/op18'
[validate_model] output_names=None
[get_untrained_model_with_inputs] model_id='arnir0/Tiny-LLM', subfolder=None
[get_untrained_model_with_inputs] use preinstalled 'arnir0/Tiny-LLM'
[get_untrained_model_with_inputs] architecture='LlamaForCausalLM'
[get_untrained_model_with_inputs] cls='LlamaConfig'
[get_untrained_model_with_inputs] task='text-generation'
[get_untrained_model_with_inputs] default config._attn_implementation=None
[get_untrained_model_with_inputs] package_source=transformers from ~/github/transformers/src/transformers/__init__.py
[get_untrained_model_with_inputs] instantiate model_id 'arnir0/Tiny-LLM', subfolder=None
[get_untrained_model_with_inputs] -- done(2) in 1.9601007807068527e-05s
[get_untrained_model_with_inputs] instantiate_specific_model <class 'transformers.models.llama.modeling_llama.LlamaForCausalLM'>
[get_untrained_model_with_inputs] -- done(3) in 5.323003279045224e-06s (model is <class 'NoneType'>)
[get_untrained_model_with_inputs] instantiate_specific_model(2) <class 'transformers.models.llama.modeling_llama.LlamaForCausalLM'>
[get_untrained_model_with_inputs] -- done(4) in 0.10771863799891435s (model is <class 'transformers.models.llama.modeling_llama.LlamaForCausalLM'>)
[get_untrained_model_with_inputs] use fct=<function get_inputs at 0x70903910b880>
[get_untrained_model_with_inputs] model class='LlamaForCausalLM'
[validate_model] --
[validate_model] task=text-generation
[validate_model] size=49.549072265625 Mb
[validate_model] n_weights=12.988992 millions parameters
[validate_model] +INPUT input_ids=T7s2x3
[validate_model] +INPUT attention_mask=T7s2x33
[validate_model] +INPUT position_ids=T7s2x3
[validate_model] +INPUT past_key_values=DynamicCache(key_cache=#1[T1s2x1x30x96], value_cache=#1[T1s2x1x30x96])
[validate_model] +SHAPE input_ids={0:DYN(batch),1:DYN(seq_length)}
[validate_model] +SHAPE attention_mask={0:DYN(batch),1:DYN(cache+seq)}
[validate_model] +SHAPE position_ids={0:DYN(batch),1:DYN(seq_length)}
[validate_model] +SHAPE past_key_values=#2[{0:DYN(batch),2:DYN(cache_length)},{0:DYN(batch),2:DYN(cache_length)}]
[validate_model] second_input_keys=['inputs_prompt', 'inputs2', 'inputs_empty_cache', 'inputs_batch1']
[validate_model] --
[validate_model] -- run the model inputs='inputs'...
[validate_model] inputs=dict(input_ids:T7s2x3,attention_mask:T7s2x33,position_ids:T7s2x3,past_key_values:DynamicCache(key_cache=#1[T1s2x1x30x96], value_cache=#1[T1s2x1x30x96]))
[validate_model] done ([run]) - CausalLMOutputWithPast(logits:T1s2x3x32000,past_key_values:DynamicCache(key_cache=#1[T1s2x1x33x96], value_cache=#1[T1s2x1x33x96]))
[validate_model] -- run the model inputs='inputs_prompt'...
[validate_model] inputs_prompt=dict(input_ids:T7s1x11)
[validate_model] done ([run2_prompt]) - CausalLMOutputWithPast(logits:T1s1x11x32000,past_key_values:DynamicCache(key_cache=#1[T1s1x1x11x96], value_cache=#1[T1s1x1x11x96]))
[validate_model] -- run the model inputs='inputs2'...
[validate_model] inputs2=dict(input_ids:T7s3x4,attention_mask:T7s3x35,position_ids:T7s3x4,past_key_values:DynamicCache(key_cache=#1[T1s3x1x31x96], value_cache=#1[T1s3x1x31x96]))
[validate_model] done ([run22]) - CausalLMOutputWithPast(logits:T1s3x4x32000,past_key_values:DynamicCache(key_cache=#1[T1s3x1x35x96], value_cache=#1[T1s3x1x35x96]))
[validate_model] -- run the model inputs='inputs_empty_cache'...
[validate_model] inputs_empty_cache=dict(input_ids:T7s2x3,attention_mask:T7s2x3,position_ids:T7s2x3,past_key_values:DynamicCache(key_cache=#1[T1s2x1x0x96], value_cache=#1[T1s2x1x0x96]))
[validate_model] done ([run2_empty_cache]) - CausalLMOutputWithPast(logits:T1s2x3x32000,past_key_values:DynamicCache(key_cache=#1[T1s2x1x3x96], value_cache=#1[T1s2x1x3x96]))
[validate_model] -- run the model inputs='inputs_batch1'...
[validate_model] inputs_batch1=dict(input_ids:T7s1x3,attention_mask:T7s1x33,position_ids:T7s1x3,past_key_values:DynamicCache(key_cache=#1[T1s1x1x30x96], value_cache=#1[T1s1x1x30x96]))
[validate_model] done ([run2_batch1]) - CausalLMOutputWithPast(logits:T1s1x3x32000,past_key_values:DynamicCache(key_cache=#1[T1s1x1x33x96], value_cache=#1[T1s1x1x33x96]))
[validate_model] -- export the model with 'onnx-dynamo', optimization='ir'
[validate_model] applies patches before exporting stop_if_static=0
[validate_model] run patched model...
[validate_model] patched inputs=dict(input_ids:T7s2x3,attention_mask:T7s2x33,position_ids:T7s2x3,past_key_values:DynamicCache(key_cache=#1[T1s2x1x30x96], value_cache=#1[T1s2x1x30x96]))
[validate_model] done (patched run)
[validate_model] patched discrepancies=abs=0, rel=0, dev=0
[call_torch_export_onnx] exporter='onnx-dynamo', optimization='ir'
[call_torch_export_onnx] args=()
[call_torch_export_onnx] kwargs=dict(input_ids:T7s2x3,attention_mask:T7s2x33,position_ids:T7s2x3,past_key_values:DynamicCache(key_cache=#1[T1s2x1x30x96], value_cache=#1[T1s2x1x30x96]))
[call_torch_export_onnx] dynamic_shapes=dict(input_ids:{0:DYN(batch),1:DYN(seq_length)},attention_mask:{0:DYN(batch),1:DYN(cache+seq)},position_ids:{0:DYN(batch),1:DYN(seq_length)},past_key_values:#2[{0:DYN(batch),2:DYN(cache_length)},{0:DYN(batch),2:DYN(cache_length)}])
[call_torch_export_onnx] export...
[call_torch_export_onnx] export_export_kwargs=dict(dynamo:bool,dynamic_shapes:dict(input_ids:{0:DYN(batch),1:DYN(seq_length)},attention_mask:{0:DYN(batch),1:DYN(cache+seq)},position_ids:{0:DYN(batch),1:DYN(seq_length)},past_key_values:#2[{0:DYN(batch),2:DYN(cache_length)},{0:DYN(batch),2:DYN(cache_length)}]),opset_version:int)
[torch.onnx] Obtain model graph for `LlamaForCausalLM([...]` with `torch.export.export(..., strict=False)`...
[torch.onnx] Obtain model graph for `LlamaForCausalLM([...]` with `torch.export.export(..., strict=False)`... ✅
[torch.onnx] Run decomposition...
[torch.onnx] Run decomposition... ✅
[torch.onnx] Translate the graph into ONNX...
[torch.onnx] Translate the graph into ONNX... ✅
Applied 38 of general pattern rewrite rules.
[call_torch_export_onnx] done (export)
[call_torch_export_onnx] starts optimization='ir'...
[call_torch_export_onnx] done (optimization)
[validate_model] dumps onnx program in 'dump_models/arnir0_Tiny-LLM/onnx-dynamo/ir/op18'...
[validate_model] done (dump onnx) in 0.17530186199292075
[validate_model] dumps statistics in 'dump_models/arnir0_Tiny-LLM/onnx-dynamo/ir/op18'...
[validate_model] done (dump)
[validate_onnx_model] verify onnx model with providers ['CPUExecutionProvider']..., flavour=None
[validate_onnx_model] runtime is onnxruntime
[validate_onnx_model] done (ort_session) flavour=None
[validate_onnx_model] -- keys=[('inputs', 'run_expected', ''), ('inputs_prompt', 'run_expected2_prompt', '2_prompt'), ('inputs2', 'run_expected22', '22'), ('inputs_empty_cache', 'run_expected2_empty_cache', '2_empty_cache'), ('inputs_batch1', 'run_expected2_batch1', '2_batch1')]
[validate_onnx_model] -- make_feeds for 'inputs'...
[validate_onnx_model] inputs=dict(input_ids:T7s2x3,attention_mask:T7s2x33,position_ids:T7s2x3,past_key_values:DynamicCache(key_cache=#1[T1s2x1x30x96], value_cache=#1[T1s2x1x30x96]))
[validate_onnx_model] ort inputs=dict(input_ids:A7s2x3,attention_mask:A7s2x33,position_ids:A7s2x3,past_key_values_key_0:A1s2x1x30x96,past_key_values_value_0:A1s2x1x30x96)
[validate_onnx_model] done (make_feeds)
[validate_onnx_model] run session on inputs 'inputs'...
[validate_onnx_model] done (run)
[validate_onnx_model] got=#3[A1s2x3x32000,A1s2x1x33x96,A1s2x1x33x96]
[validate_onnx_model] discrepancies=abs=7.748603820800781e-07, rel=0.00044309172106863606, n=204672.0, dev=0
[validate_onnx_model] -- make_feeds for 'inputs2'...
[validate_onnx_model] inputs=dict(input_ids:T7s3x4,attention_mask:T7s3x35,position_ids:T7s3x4,past_key_values:DynamicCache(key_cache=#1[T1s3x1x31x96], value_cache=#1[T1s3x1x31x96]))
[validate_onnx_model] ort inputs=dict(input_ids:A7s3x4,attention_mask:A7s3x35,position_ids:A7s3x4,past_key_values_key_0:A1s3x1x31x96,past_key_values_value_0:A1s3x1x31x96)
[validate_onnx_model] done (make_feeds)
[validate_onnx_model] run session on inputs 'inputs22'...
[validate_onnx_model] done (run)
[validate_onnx_model] got=#3[A1s3x4x32000,A1s3x1x35x96,A1s3x1x35x96]
[validate_onnx_model] discrepancies=abs=8.344650268554688e-07, rel=0.0003670204921167059, n=404160.0, dev=0
[validate_onnx_model] -- make_feeds for 'inputs_empty_cache'...
[validate_onnx_model] inputs=dict(input_ids:T7s2x3,attention_mask:T7s2x3,position_ids:T7s2x3,past_key_values:DynamicCache(key_cache=#1[T1s2x1x0x96], value_cache=#1[T1s2x1x0x96]))
[validate_onnx_model] ort inputs=dict(input_ids:A7s2x3,attention_mask:A7s2x3,position_ids:A7s2x3,past_key_values_key_0:A1s2x1x0x96,past_key_values_value_0:A1s2x1x0x96)
[validate_onnx_model] done (make_feeds)
[validate_onnx_model] run session on inputs 'inputs2_empty_cache'...
[validate_onnx_model] done (run)
[validate_onnx_model] got=#3[A1s2x3x32000,A1s2x1x3x96,A1s2x1x3x96]
[validate_onnx_model] discrepancies=abs=7.152557373046875e-07, rel=0.00028247341543503955, n=193152.0, dev=0
[validate_onnx_model] -- make_feeds for 'inputs_batch1'...
[validate_onnx_model] inputs=dict(input_ids:T7s1x3,attention_mask:T7s1x33,position_ids:T7s1x3,past_key_values:DynamicCache(key_cache=#1[T1s1x1x30x96], value_cache=#1[T1s1x1x30x96]))
[validate_onnx_model] ort inputs=dict(input_ids:A7s1x3,attention_mask:A7s1x33,position_ids:A7s1x3,past_key_values_key_0:A1s1x1x30x96,past_key_values_value_0:A1s1x1x30x96)
[validate_onnx_model] done (make_feeds)
[validate_onnx_model] run session on inputs 'inputs2_batch1'...
[validate_onnx_model] done (run)
[validate_onnx_model] got=#3[A1s1x3x32000,A1s1x1x33x96,A1s1x1x33x96]
[validate_onnx_model] discrepancies=abs=9.5367431640625e-07, rel=0.00030987364736661046, n=102336.0, dev=0
[validate_model] -- done (final)
-- summary --
:disc_onnx_ort_run22_abs,8.344650268554688e-07;
:disc_onnx_ort_run22_dev,0;
:disc_onnx_ort_run22_dnan,0;
:disc_onnx_ort_run22_n,404160.0;
:disc_onnx_ort_run22_rel,0.0003670204921167059;
:disc_onnx_ort_run22_sum,0.037561870639599704;
:disc_onnx_ort_run2_batch1_abs,9.5367431640625e-07;
:disc_onnx_ort_run2_batch1_dev,0;
:disc_onnx_ort_run2_batch1_dnan,0;
:disc_onnx_ort_run2_batch1_n,102336.0;
:disc_onnx_ort_run2_batch1_rel,0.00030987364736661046;
:disc_onnx_ort_run2_batch1_sum,0.011194461939794564;
:disc_onnx_ort_run2_empty_cache_abs,7.152557373046875e-07;
:disc_onnx_ort_run2_empty_cache_dev,0;
:disc_onnx_ort_run2_empty_cache_dnan,0;
:disc_onnx_ort_run2_empty_cache_n,193152.0;
:disc_onnx_ort_run2_empty_cache_rel,0.00028247341543503955;
:disc_onnx_ort_run2_empty_cache_sum,0.01621216703074424;
:disc_onnx_ort_run_abs,7.748603820800781e-07;
:disc_onnx_ort_run_dev,0;
:disc_onnx_ort_run_dnan,0;
:disc_onnx_ort_run_n,204672.0;
:disc_onnx_ort_run_rel,0.00044309172106863606;
:disc_onnx_ort_run_sum,0.02031988672524676;
:disc_patched_abs,0;
:disc_patched_dev,0;
:disc_patched_dnan,0;
:disc_patched_n,204672.0;
:disc_patched_rel,0;
:disc_patched_sum,0.0;
:dump_folder,dump_models/arnir0_Tiny-LLM/onnx-dynamo/ir/op18;
:dump_folder_name,arnir0_Tiny-LLM/onnx-dynamo/ir/op18;
:export_args,();
:export_dynamo,True;
:export_exporter,{};
:export_kwargs,dict(input_ids:T7s2x3,attention_mask:T7s2x33,position_ids:T7s2x3,past_key_values:DynamicCache(key_cache=#1[T1s2x1x30x96], value_cache=#1[T1s2x1x30x96]));
:export_opset,18;
:export_optimization,ir;
:model_class,LlamaForCausalLM;
:model_config,{'vocab_size':32000,'max_position_embeddings':1024,'hidden_size':192,'intermediate_size':1024,'num_hidden_layers':1,'num_attention_heads':2,'num_key_value_heads':1,'hidden_act':'silu','initializer_range':0.02,'rms_norm_eps':1e-05,'pretraining_tp':1,'use_cache':True,'attention_bias':False,'attention_dropout':0.0,'mlp_bias':False,'head_dim':96,'rope_parameters':{'rope_theta':10000.0,'rope_type':'default'},'return_dict':True,'output_hidden_states':False,'dtype':'float32','tie_word_embeddings':False,'chunk_size_feed_forward':0,'is_encoder_decoder':False,'is_decoder':False,'cross_attention_hidden_size':None,'add_cross_attention':False,'tie_encoder_decoder':False,'architectures':['LlamaForCausalLM'],'finetuning_task':None,'id2label':{0:'LABEL_0',1:'LABEL_1'},'label2id':{'LABEL_0':0,'LABEL_1':1},'task_specific_params':None,'problem_type':None,'tokenizer_class':None,'prefix':None,'bos_token_id':1,'pad_token_id':None,'eos_token_id':2,'sep_token_id':None,'decoder_start_token_id':None,'_name_or_path':'','transformers_version':'5.0.0.dev0','model_type':'llama','subfolder':None,'output_attentions':False};
:model_config_class,LlamaConfig;
:model_file,~/github/transformers/src/transformers/models/llama/modeling_llama.py;
:model_id,arnir0/Tiny-LLM;
:model_inputs,dict(input_ids:T7s2x3,attention_mask:T7s2x33,position_ids:T7s2x3,past_key_values:DynamicCache(key_cache=#1[T1s2x1x30x96], value_cache=#1[T1s2x1x30x96]));
:model_inputs_options,;
:model_module,transformers.models.llama.modeling_llama;
:model_nweights,12988992;
:model_shapes,dict(input_ids:{0:DYN(batch),1:DYN(seq_length)},attention_mask:{0:DYN(batch),1:DYN(cache+seq)},position_ids:{0:DYN(batch),1:DYN(seq_length)},past_key_values:#2[{0:DYN(batch),2:DYN(cache_length)},{0:DYN(batch),2:DYN(cache_length)}]);
:model_size,51955968;
:model_subfolder,;
:model_task,text-generation;
:n_node_Add,11;
:n_node_And,2;
:n_node_Cast,2;
:n_node_Concat,16;
:n_node_Cos,1;
:n_node_Expand,6;
:n_node_Gather,1;
:n_node_GatherND,1;
:n_node_IsNaN,1;
:n_node_LessOrEqual,1;
:n_node_MatMul,11;
:n_node_Max,2;
:n_node_Mul,14;
:n_node_Neg,2;
:n_node_Pow,3;
:n_node_Range,3;
:n_node_Reciprocal,3;
:n_node_ReduceMean,3;
:n_node_Reshape,11;
:n_node_Shape,7;
:n_node_Sigmoid,1;
:n_node_Sin,1;
:n_node_Slice,8;
:n_node_Softmax,1;
:n_node_Sqrt,3;
:n_node_Squeeze,5;
:n_node_Transpose,6;
:n_node_Unsqueeze,13;
:n_node_Where,2;
:n_node_functions,0;
:n_node_initializer_1,16;
:n_node_initializer_7,14;
:n_node_initializer_9,1;
:n_node_nodes,141;
:n_node_nodes_nocst,141;
:onnx_filename,dump_models/arnir0_Tiny-LLM/onnx-dynamo/ir/op18/arnir0_Tiny-LLM-onnx-dynamo-ir-op18.onnx;
:onnx_ort_inputs,dict(input_ids:A7s2x3,attention_mask:A7s2x33,position_ids:A7s2x3,past_key_values_key_0:A1s2x1x30x96,past_key_values_value_0:A1s2x1x30x96);
:onnx_ort_inputs22,dict(input_ids:A7s3x4,attention_mask:A7s3x35,position_ids:A7s3x4,past_key_values_key_0:A1s3x1x31x96,past_key_values_value_0:A1s3x1x31x96);
:onnx_ort_inputs2_batch1,dict(input_ids:A7s1x3,attention_mask:A7s1x33,position_ids:A7s1x3,past_key_values_key_0:A1s1x1x30x96,past_key_values_value_0:A1s1x1x30x96);
:onnx_ort_inputs2_empty_cache,dict(input_ids:A7s2x3,attention_mask:A7s2x3,position_ids:A7s2x3,past_key_values_key_0:A1s2x1x0x96,past_key_values_value_0:A1s2x1x0x96);
:onnx_size,210585;
:run_expected,CausalLMOutputWithPast(logits:T1s2x3x32000,past_key_values:DynamicCache(key_cache=#1[T1s2x1x33x96], value_cache=#1[T1s2x1x33x96]));
:run_expected22,CausalLMOutputWithPast(logits:T1s3x4x32000,past_key_values:DynamicCache(key_cache=#1[T1s3x1x35x96], value_cache=#1[T1s3x1x35x96]));
:run_expected2_batch1,CausalLMOutputWithPast(logits:T1s1x3x32000,past_key_values:DynamicCache(key_cache=#1[T1s1x1x33x96], value_cache=#1[T1s1x1x33x96]));
:run_expected2_empty_cache,CausalLMOutputWithPast(logits:T1s2x3x32000,past_key_values:DynamicCache(key_cache=#1[T1s2x1x3x96], value_cache=#1[T1s2x1x3x96]));
:run_expected2_prompt,CausalLMOutputWithPast(logits:T1s1x11x32000,past_key_values:DynamicCache(key_cache=#1[T1s1x1x11x96], value_cache=#1[T1s1x1x11x96]));
:run_feeds_inputs,dict(input_ids:A7s2x3,attention_mask:A7s2x33,position_ids:A7s2x3,past_key_values_key_0:A1s2x1x30x96,past_key_values_value_0:A1s2x1x30x96);
:run_feeds_inputs2,dict(input_ids:A7s3x4,attention_mask:A7s3x35,position_ids:A7s3x4,past_key_values_key_0:A1s3x1x31x96,past_key_values_value_0:A1s3x1x31x96);
:run_feeds_inputs_batch1,dict(input_ids:A7s1x3,attention_mask:A7s1x33,position_ids:A7s1x3,past_key_values_key_0:A1s1x1x30x96,past_key_values_value_0:A1s1x1x30x96);
:run_feeds_inputs_empty_cache,dict(input_ids:A7s2x3,attention_mask:A7s2x3,position_ids:A7s2x3,past_key_values_key_0:A1s2x1x0x96,past_key_values_value_0:A1s2x1x0x96);
:run_output_inputs,#3[A1s2x3x32000,A1s2x1x33x96,A1s2x1x33x96];
:run_output_inputs2,#3[A1s3x4x32000,A1s3x1x35x96,A1s3x1x35x96];
:run_output_inputs_batch1,#3[A1s1x3x32000,A1s1x1x33x96,A1s1x1x33x96];
:run_output_inputs_empty_cache,#3[A1s2x3x32000,A1s2x1x3x96,A1s2x1x3x96];
:second_input_keys,inputs_prompt,inputs2,inputs_empty_cache,inputs_batch1;
:time_create_onnx_ort,0.11510201999044511;
:time_create_torch_model,0.6331655669928296;
:time_export_onnx,4.336783115009894;
:time_export_onnx_opt_ir,0.040906337002525106;
:time_onnx_save,0.17530186199292075;
:time_preprocess_model_id,1.4429970178753138e-06;
:time_run,0.023689802997978404;
:time_run22,0.0039477279933635145;
:time_run2_batch1,0.00456619200122077;
:time_run2_empty_cache,0.004006414994364604;
:time_run2_prompt,0.0063429930014535785;
:time_run_onnx_ort,0.013160804999643005;
:time_run_onnx_ort22,0.0018779730016831309;
:time_run_onnx_ort2_batch1,0.0010899170010816306;
:time_run_onnx_ort2_empty_cache,0.0012567609956022352;
:time_run_patched,0.004444217003765516;
:time_torch_export_export,1.5782458460016642;
:time_torch_export_export_n,1;
:time_total,6.780866973000229;
:time_total_exporter,5.541651061997982;
:time_total_validation_onnx,0.16928344599728007;
:time_total_validation_torch,0.04719909999403171;
:version_date,2025-12-05T18:53:33;
:version_device,;
:version_do_run,True;
:version_drop_input,None;
:version_drop_inputs,[];
:version_dtype,;
:version_dump_folder,dump_models;
:version_exporter,onnx-dynamo;
:version_exporter_options,None;
:version_input_options,None;
:version_inputs2,1;
:version_model_id,arnir0/Tiny-LLM;
:version_model_options,None;
:version_numpy,2.3.5;
:version_onnx,1.21.0;
:version_onnx_diagnostic,0.8.4;
:version_onnx_ir,0.1.13;
:version_onnxruntime,1.24.0;
:version_onnxscript,?;
:version_opset,18;
:version_optimization,ir;
:version_ortfusiontype,;
:version_patch,{'patch': True};
:version_patch_kwargs,{'patch':True,'patch_transformers':True,'patch_diffusers':True};
:version_quiet,False;
:version_rewrite,True;
:version_runtime,onnxruntime;
:version_same_as_pretrained,False;
:version_scipy,1.16.2;
:version_stop_if_static,0;
:version_submodule,None;
:version_torch,2.10.0.dev20251123+cu130;
:version_transformers,5.0.0.dev0;
:version_use_pretrained,False;
[runpythonerror]
/usr/lib/python3.12/copyreg.py:99: FutureWarning: `isinstance(treespec, LeafSpec)` is deprecated, use `isinstance(treespec, TreeSpec) and treespec.is_leaf()` instead.
return cls.__new__(cls, *args)
~/vv/this312/lib/python3.12/site-packages/torch/onnx/_internal/exporter/_onnx_program.py:460: UserWarning: # The axis name: batch will not be used, since it shares the same shape constraints with another axis: batch.
rename_mapping = _dynamic_shapes.create_rename_mapping(
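The summary printed above is a list of `:key,value;` lines, which makes it easy to post-process. The following is a minimal sketch, not part of onnx_diagnostic: the helpers `parse_summary` and `count_elements` are hypothetical names introduced here. It also assumes a reading of the compact tensor notation (a letter `T` or `A`, an ONNX element type code, then `s` followed by the dimensions separated by `x`) and uses it to cross-check the reported `n` against the output shapes.

```python
import math
import re


def parse_summary(text):
    """Turn ':key,value;' lines into a dict of strings."""
    summary = {}
    for line in text.splitlines():
        line = line.strip()
        if not (line.startswith(":") and line.endswith(";")):
            continue  # skip log lines and warnings
        # Split on the first comma only: values may contain commas.
        key, _, value = line[1:-1].partition(",")
        summary[key] = value
    return summary


# Assumed reading of the compact notation: a letter (T or A), an ONNX
# element type code, then 's' and the dimensions separated by 'x'.
TENSOR_RE = re.compile(r"[TA](\d+)s(\d+(?:x\d+)*)")


def count_elements(description):
    """Sum the number of elements over every tensor found in description."""
    return sum(
        math.prod(int(d) for d in dims.split("x"))
        for _, dims in TENSOR_RE.findall(description)
    )


# A few lines copied from the summary above.
sample = """\
:disc_onnx_ort_run_abs,7.748603820800781e-07;
:disc_onnx_ort_run_n,204672.0;
:run_output_inputs,#3[A1s2x3x32000,A1s2x1x33x96,A1s2x1x33x96];
:second_input_keys,inputs_prompt,inputs2,inputs_empty_cache,inputs_batch1;
"""
summary = parse_summary(sample)
n_reported = float(summary["disc_onnx_ort_run_n"])
n_counted = count_elements(summary["run_output_inputs"])
print(n_reported == n_counted)
```

For the values shown, the element count derived from the three output shapes (2x3x32000 plus twice 2x1x33x96) matches the reported `n`, which supports the assumed reading of the notation. Values are kept as strings because some of them, such as `model_config`, are not plain numbers.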
Run onnxruntime fusions
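Option --ortfusiontype runs the transformer optimizations (fusions)
implemented in onnxruntime on the exported model. The supported values for model_type are listed in the documentation
of function onnx_diagnostic.torch_models.validate.run_ort_fusion().
With ALL, as in the command below, the fusions are applied for one model type after another ('bart', 'bert', 'bert_keras', ...),
and each fused model is saved next to the exported one, then validated on the same input sets.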
python -m onnx_diagnostic validate -m arnir0/Tiny-LLM --run -v 1 --export onnx-dynamo -o dump_models --patch --opt ir --ortfusiontype ALL
[validate_model] dump into 'arnir0_Tiny-LLM/onnx-dynamo/ir/op18'
[validate_model] validate model id 'arnir0/Tiny-LLM'
[validate_model] patch={'patch': True}
[validate_model] get dummy inputs with input_options=None...
[validate_model] rewrite=True, patch_kwargs={'patch': True, 'patch_transformers': True, 'patch_diffusers': True}, stop_if_static=0
[validate_model] exporter='onnx-dynamo', optimization='ir'
[validate_model] dump_folder='dump_models/arnir0_Tiny-LLM/onnx-dynamo/ir/op18'
[validate_model] output_names=None
[get_untrained_model_with_inputs] model_id='arnir0/Tiny-LLM', subfolder=None
[get_untrained_model_with_inputs] use preinstalled 'arnir0/Tiny-LLM'
[get_untrained_model_with_inputs] architecture='LlamaForCausalLM'
[get_untrained_model_with_inputs] cls='LlamaConfig'
[get_untrained_model_with_inputs] task='text-generation'
[get_untrained_model_with_inputs] default config._attn_implementation=None
[get_untrained_model_with_inputs] package_source=transformers from ~/github/transformers/src/transformers/__init__.py
[get_untrained_model_with_inputs] instantiate model_id 'arnir0/Tiny-LLM', subfolder=None
[get_untrained_model_with_inputs] -- done(2) in 2.5716988602653146e-05s
[get_untrained_model_with_inputs] instantiate_specific_model <class 'transformers.models.llama.modeling_llama.LlamaForCausalLM'>
[get_untrained_model_with_inputs] -- done(3) in 6.684000254608691e-06s (model is <class 'NoneType'>)
[get_untrained_model_with_inputs] instantiate_specific_model(2) <class 'transformers.models.llama.modeling_llama.LlamaForCausalLM'>
[get_untrained_model_with_inputs] -- done(4) in 0.12057524400006514s (model is <class 'transformers.models.llama.modeling_llama.LlamaForCausalLM'>)
[get_untrained_model_with_inputs] use fct=<function get_inputs at 0x73e1d750b9c0>
[get_untrained_model_with_inputs] model class='LlamaForCausalLM'
[validate_model] --
[validate_model] task=text-generation
[validate_model] size=49.549072265625 Mb
[validate_model] n_weights=12.988992 millions parameters
[validate_model] +INPUT input_ids=T7s2x3
[validate_model] +INPUT attention_mask=T7s2x33
[validate_model] +INPUT position_ids=T7s2x3
[validate_model] +INPUT past_key_values=DynamicCache(key_cache=#1[T1s2x1x30x96], value_cache=#1[T1s2x1x30x96])
[validate_model] +SHAPE input_ids={0:DYN(batch),1:DYN(seq_length)}
[validate_model] +SHAPE attention_mask={0:DYN(batch),1:DYN(cache+seq)}
[validate_model] +SHAPE position_ids={0:DYN(batch),1:DYN(seq_length)}
[validate_model] +SHAPE past_key_values=#2[{0:DYN(batch),2:DYN(cache_length)},{0:DYN(batch),2:DYN(cache_length)}]
[validate_model] second_input_keys=['inputs_prompt', 'inputs2', 'inputs_empty_cache', 'inputs_batch1']
[validate_model] --
[validate_model] -- run the model inputs='inputs'...
[validate_model] inputs=dict(input_ids:T7s2x3,attention_mask:T7s2x33,position_ids:T7s2x3,past_key_values:DynamicCache(key_cache=#1[T1s2x1x30x96], value_cache=#1[T1s2x1x30x96]))
[validate_model] done ([run]) - CausalLMOutputWithPast(logits:T1s2x3x32000,past_key_values:DynamicCache(key_cache=#1[T1s2x1x33x96], value_cache=#1[T1s2x1x33x96]))
[validate_model] -- run the model inputs='inputs_prompt'...
[validate_model] inputs_prompt=dict(input_ids:T7s1x11)
[validate_model] done ([run2_prompt]) - CausalLMOutputWithPast(logits:T1s1x11x32000,past_key_values:DynamicCache(key_cache=#1[T1s1x1x11x96], value_cache=#1[T1s1x1x11x96]))
[validate_model] -- run the model inputs='inputs2'...
[validate_model] inputs2=dict(input_ids:T7s3x4,attention_mask:T7s3x35,position_ids:T7s3x4,past_key_values:DynamicCache(key_cache=#1[T1s3x1x31x96], value_cache=#1[T1s3x1x31x96]))
[validate_model] done ([run22]) - CausalLMOutputWithPast(logits:T1s3x4x32000,past_key_values:DynamicCache(key_cache=#1[T1s3x1x35x96], value_cache=#1[T1s3x1x35x96]))
[validate_model] -- run the model inputs='inputs_empty_cache'...
[validate_model] inputs_empty_cache=dict(input_ids:T7s2x3,attention_mask:T7s2x3,position_ids:T7s2x3,past_key_values:DynamicCache(key_cache=#1[T1s2x1x0x96], value_cache=#1[T1s2x1x0x96]))
[validate_model] done ([run2_empty_cache]) - CausalLMOutputWithPast(logits:T1s2x3x32000,past_key_values:DynamicCache(key_cache=#1[T1s2x1x3x96], value_cache=#1[T1s2x1x3x96]))
[validate_model] -- run the model inputs='inputs_batch1'...
[validate_model] inputs_batch1=dict(input_ids:T7s1x3,attention_mask:T7s1x33,position_ids:T7s1x3,past_key_values:DynamicCache(key_cache=#1[T1s1x1x30x96], value_cache=#1[T1s1x1x30x96]))
[validate_model] done ([run2_batch1]) - CausalLMOutputWithPast(logits:T1s1x3x32000,past_key_values:DynamicCache(key_cache=#1[T1s1x1x33x96], value_cache=#1[T1s1x1x33x96]))
[validate_model] -- export the model with 'onnx-dynamo', optimization='ir'
[validate_model] applies patches before exporting stop_if_static=0
[validate_model] run patched model...
[validate_model] patched inputs=dict(input_ids:T7s2x3,attention_mask:T7s2x33,position_ids:T7s2x3,past_key_values:DynamicCache(key_cache=#1[T1s2x1x30x96], value_cache=#1[T1s2x1x30x96]))
[validate_model] done (patched run)
[validate_model] patched discrepancies=abs=0, rel=0, dev=0
[call_torch_export_onnx] exporter='onnx-dynamo', optimization='ir'
[call_torch_export_onnx] args=()
[call_torch_export_onnx] kwargs=dict(input_ids:T7s2x3,attention_mask:T7s2x33,position_ids:T7s2x3,past_key_values:DynamicCache(key_cache=#1[T1s2x1x30x96], value_cache=#1[T1s2x1x30x96]))
[call_torch_export_onnx] dynamic_shapes=dict(input_ids:{0:DYN(batch),1:DYN(seq_length)},attention_mask:{0:DYN(batch),1:DYN(cache+seq)},position_ids:{0:DYN(batch),1:DYN(seq_length)},past_key_values:#2[{0:DYN(batch),2:DYN(cache_length)},{0:DYN(batch),2:DYN(cache_length)}])
[call_torch_export_onnx] export...
[call_torch_export_onnx] export_export_kwargs=dict(dynamo:bool,dynamic_shapes:dict(input_ids:{0:DYN(batch),1:DYN(seq_length)},attention_mask:{0:DYN(batch),1:DYN(cache+seq)},position_ids:{0:DYN(batch),1:DYN(seq_length)},past_key_values:#2[{0:DYN(batch),2:DYN(cache_length)},{0:DYN(batch),2:DYN(cache_length)}]),opset_version:int)
[torch.onnx] Obtain model graph for `LlamaForCausalLM([...]` with `torch.export.export(..., strict=False)`...
[torch.onnx] Obtain model graph for `LlamaForCausalLM([...]` with `torch.export.export(..., strict=False)`... ✅
[torch.onnx] Run decomposition...
[torch.onnx] Run decomposition... ✅
[torch.onnx] Translate the graph into ONNX...
[torch.onnx] Translate the graph into ONNX... ✅
Applied 38 of general pattern rewrite rules.
[call_torch_export_onnx] done (export)
[call_torch_export_onnx] starts optimization='ir'...
[call_torch_export_onnx] done (optimization)
[validate_model] dumps onnx program in 'dump_models/arnir0_Tiny-LLM/onnx-dynamo/ir/op18'...
[validate_model] done (dump onnx) in 0.20543576800264418
[validate_model] dumps statistics in 'dump_models/arnir0_Tiny-LLM/onnx-dynamo/ir/op18'...
[validate_model] done (dump)
[validate_onnx_model] verify onnx model with providers ['CPUExecutionProvider']..., flavour=None
[validate_onnx_model] runtime is onnxruntime
[validate_onnx_model] done (ort_session) flavour=None
[validate_onnx_model] -- keys=[('inputs', 'run_expected', ''), ('inputs_prompt', 'run_expected2_prompt', '2_prompt'), ('inputs2', 'run_expected22', '22'), ('inputs_empty_cache', 'run_expected2_empty_cache', '2_empty_cache'), ('inputs_batch1', 'run_expected2_batch1', '2_batch1')]
[validate_onnx_model] -- make_feeds for 'inputs'...
[validate_onnx_model] inputs=dict(input_ids:T7s2x3,attention_mask:T7s2x33,position_ids:T7s2x3,past_key_values:DynamicCache(key_cache=#1[T1s2x1x30x96], value_cache=#1[T1s2x1x30x96]))
[validate_onnx_model] ort inputs=dict(input_ids:A7s2x3,attention_mask:A7s2x33,position_ids:A7s2x3,past_key_values_key_0:A1s2x1x30x96,past_key_values_value_0:A1s2x1x30x96)
[validate_onnx_model] done (make_feeds)
[validate_onnx_model] run session on inputs 'inputs'...
[validate_onnx_model] done (run)
[validate_onnx_model] got=#3[A1s2x3x32000,A1s2x1x33x96,A1s2x1x33x96]
[validate_onnx_model] discrepancies=abs=7.748603820800781e-07, rel=0.00044309172106863606, n=204672.0, dev=0
[validate_onnx_model] -- make_feeds for 'inputs2'...
[validate_onnx_model] inputs=dict(input_ids:T7s3x4,attention_mask:T7s3x35,position_ids:T7s3x4,past_key_values:DynamicCache(key_cache=#1[T1s3x1x31x96], value_cache=#1[T1s3x1x31x96]))
[validate_onnx_model] ort inputs=dict(input_ids:A7s3x4,attention_mask:A7s3x35,position_ids:A7s3x4,past_key_values_key_0:A1s3x1x31x96,past_key_values_value_0:A1s3x1x31x96)
[validate_onnx_model] done (make_feeds)
[validate_onnx_model] run session on inputs 'inputs22'...
[validate_onnx_model] done (run)
[validate_onnx_model] got=#3[A1s3x4x32000,A1s3x1x35x96,A1s3x1x35x96]
[validate_onnx_model] discrepancies=abs=8.344650268554688e-07, rel=0.0003670204921167059, n=404160.0, dev=0
[validate_onnx_model] -- make_feeds for 'inputs_empty_cache'...
[validate_onnx_model] inputs=dict(input_ids:T7s2x3,attention_mask:T7s2x3,position_ids:T7s2x3,past_key_values:DynamicCache(key_cache=#1[T1s2x1x0x96], value_cache=#1[T1s2x1x0x96]))
[validate_onnx_model] ort inputs=dict(input_ids:A7s2x3,attention_mask:A7s2x3,position_ids:A7s2x3,past_key_values_key_0:A1s2x1x0x96,past_key_values_value_0:A1s2x1x0x96)
[validate_onnx_model] done (make_feeds)
[validate_onnx_model] run session on inputs 'inputs2_empty_cache'...
[validate_onnx_model] done (run)
[validate_onnx_model] got=#3[A1s2x3x32000,A1s2x1x3x96,A1s2x1x3x96]
[validate_onnx_model] discrepancies=abs=7.152557373046875e-07, rel=0.00028247341543503955, n=193152.0, dev=0
[validate_onnx_model] -- make_feeds for 'inputs_batch1'...
[validate_onnx_model] inputs=dict(input_ids:T7s1x3,attention_mask:T7s1x33,position_ids:T7s1x3,past_key_values:DynamicCache(key_cache=#1[T1s1x1x30x96], value_cache=#1[T1s1x1x30x96]))
[validate_onnx_model] ort inputs=dict(input_ids:A7s1x3,attention_mask:A7s1x33,position_ids:A7s1x3,past_key_values_key_0:A1s1x1x30x96,past_key_values_value_0:A1s1x1x30x96)
[validate_onnx_model] done (make_feeds)
[validate_onnx_model] run session on inputs 'inputs2_batch1'...
[validate_onnx_model] done (run)
[validate_onnx_model] got=#3[A1s1x3x32000,A1s1x1x33x96,A1s1x1x33x96]
[validate_onnx_model] discrepancies=abs=9.5367431640625e-07, rel=0.00030987364736661046, n=102336.0, dev=0
[validate_model] run onnxruntime fusion for 'bart'
failed in shape inference <class 'AssertionError'>
[validate_model] done 'bart' in 0.20438803400611505, saved into 'dump_models/arnir0_Tiny-LLM/onnx-dynamo/ir/op18/arnir0_Tiny-LLM-onnx-dynamo-ir-op18.ort.bart.onnx'
[validate_onnx_model] verify onnx model with providers ['CPUExecutionProvider']..., flavour='ortbart'
[validate_onnx_model] runtime is onnxruntime
[validate_onnx_model] done (ort_session) flavour='ortbart'
[validate_onnx_model] -- keys=[('inputs', 'run_expected', ''), ('inputs_prompt', 'run_expected2_prompt', '2_prompt'), ('inputs2', 'run_expected22', '22'), ('inputs_empty_cache', 'run_expected2_empty_cache', '2_empty_cache'), ('inputs_batch1', 'run_expected2_batch1', '2_batch1')]
[validate_onnx_model] -- make_feeds for 'inputs'...
[validate_onnx_model] inputs=dict(input_ids:T7s2x3,attention_mask:T7s2x33,position_ids:T7s2x3,past_key_values:DynamicCache(key_cache=#1[T1s2x1x30x96], value_cache=#1[T1s2x1x30x96]))
[validate_onnx_model] ort inputs=dict(input_ids:A7s2x3,attention_mask:A7s2x33,position_ids:A7s2x3,past_key_values_key_0:A1s2x1x30x96,past_key_values_value_0:A1s2x1x30x96)
[validate_onnx_model] done (make_feeds)
[validate_onnx_model] run session on inputs 'inputs'...
[validate_onnx_model] done (run)
[validate_onnx_model] got=#3[A1s2x3x32000,A1s2x1x33x96,A1s2x1x33x96]
[validate_onnx_model] discrepancies=abs=8.344650268554688e-07, rel=0.00038373230338287646, n=204672.0, dev=0
[validate_onnx_model] -- make_feeds for 'inputs2'...
[validate_onnx_model] inputs=dict(input_ids:T7s3x4,attention_mask:T7s3x35,position_ids:T7s3x4,past_key_values:DynamicCache(key_cache=#1[T1s3x1x31x96], value_cache=#1[T1s3x1x31x96]))
[validate_onnx_model] ort inputs=dict(input_ids:A7s3x4,attention_mask:A7s3x35,position_ids:A7s3x4,past_key_values_key_0:A1s3x1x31x96,past_key_values_value_0:A1s3x1x31x96)
[validate_onnx_model] done (make_feeds)
[validate_onnx_model] run session on inputs 'inputs22'...
[validate_onnx_model] done (run)
[validate_onnx_model] got=#3[A1s3x4x32000,A1s3x1x35x96,A1s3x1x35x96]
[validate_onnx_model] discrepancies=abs=9.5367431640625e-07, rel=0.00033374451371688606, n=404160.0, dev=0
[validate_onnx_model] -- make_feeds for 'inputs_empty_cache'...
[validate_onnx_model] inputs=dict(input_ids:T7s2x3,attention_mask:T7s2x3,position_ids:T7s2x3,past_key_values:DynamicCache(key_cache=#1[T1s2x1x0x96], value_cache=#1[T1s2x1x0x96]))
[validate_onnx_model] ort inputs=dict(input_ids:A7s2x3,attention_mask:A7s2x3,position_ids:A7s2x3,past_key_values_key_0:A1s2x1x0x96,past_key_values_value_0:A1s2x1x0x96)
[validate_onnx_model] done (make_feeds)
[validate_onnx_model] run session on inputs 'inputs2_empty_cache'...
[validate_onnx_model] done (run)
[validate_onnx_model] got=#3[A1s2x3x32000,A1s2x1x3x96,A1s2x1x3x96]
[validate_onnx_model] discrepancies=abs=7.152557373046875e-07, rel=0.0002758755967644076, n=193152.0, dev=0
[validate_onnx_model] -- make_feeds for 'inputs_batch1'...
[validate_onnx_model] inputs=dict(input_ids:T7s1x3,attention_mask:T7s1x33,position_ids:T7s1x3,past_key_values:DynamicCache(key_cache=#1[T1s1x1x30x96], value_cache=#1[T1s1x1x30x96]))
[validate_onnx_model] ort inputs=dict(input_ids:A7s1x3,attention_mask:A7s1x33,position_ids:A7s1x3,past_key_values_key_0:A1s1x1x30x96,past_key_values_value_0:A1s1x1x30x96)
[validate_onnx_model] done (make_feeds)
[validate_onnx_model] run session on inputs 'inputs2_batch1'...
[validate_onnx_model] done (run)
[validate_onnx_model] got=#3[A1s1x3x32000,A1s1x1x33x96,A1s1x1x33x96]
[validate_onnx_model] discrepancies=abs=1.1324882507324219e-06, rel=0.00031306966585304286, n=102336.0, dev=0
[validate_model] run onnxruntime fusion for 'bert'
failed in shape inference <class 'AssertionError'>
[validate_model] done 'bert' in 0.20850337999581825, saved into 'dump_models/arnir0_Tiny-LLM/onnx-dynamo/ir/op18/arnir0_Tiny-LLM-onnx-dynamo-ir-op18.ort.bert.onnx'
[validate_onnx_model] verify onnx model with providers ['CPUExecutionProvider']..., flavour='ortbert'
[validate_onnx_model] runtime is onnxruntime
[validate_onnx_model] done (ort_session) flavour='ortbert'
[validate_onnx_model] -- keys=[('inputs', 'run_expected', ''), ('inputs_prompt', 'run_expected2_prompt', '2_prompt'), ('inputs2', 'run_expected22', '22'), ('inputs_empty_cache', 'run_expected2_empty_cache', '2_empty_cache'), ('inputs_batch1', 'run_expected2_batch1', '2_batch1')]
[validate_onnx_model] -- make_feeds for 'inputs'...
[validate_onnx_model] inputs=dict(input_ids:T7s2x3,attention_mask:T7s2x33,position_ids:T7s2x3,past_key_values:DynamicCache(key_cache=#1[T1s2x1x30x96], value_cache=#1[T1s2x1x30x96]))
[validate_onnx_model] ort inputs=dict(input_ids:A7s2x3,attention_mask:A7s2x33,position_ids:A7s2x3,past_key_values_key_0:A1s2x1x30x96,past_key_values_value_0:A1s2x1x30x96)
[validate_onnx_model] done (make_feeds)
[validate_onnx_model] run session on inputs 'inputs'...
[validate_onnx_model] done (run)
[validate_onnx_model] got=#3[A1s2x3x32000,A1s2x1x33x96,A1s2x1x33x96]
[validate_onnx_model] discrepancies=abs=8.344650268554688e-07, rel=0.00038373230338287646, n=204672.0, dev=0
[validate_onnx_model] -- make_feeds for 'inputs2'...
[validate_onnx_model] inputs=dict(input_ids:T7s3x4,attention_mask:T7s3x35,position_ids:T7s3x4,past_key_values:DynamicCache(key_cache=#1[T1s3x1x31x96], value_cache=#1[T1s3x1x31x96]))
[validate_onnx_model] ort inputs=dict(input_ids:A7s3x4,attention_mask:A7s3x35,position_ids:A7s3x4,past_key_values_key_0:A1s3x1x31x96,past_key_values_value_0:A1s3x1x31x96)
[validate_onnx_model] done (make_feeds)
[validate_onnx_model] run session on inputs 'inputs22'...
[validate_onnx_model] done (run)
[validate_onnx_model] got=#3[A1s3x4x32000,A1s3x1x35x96,A1s3x1x35x96]
[validate_onnx_model] discrepancies=abs=9.5367431640625e-07, rel=0.00033374451371688606, n=404160.0, dev=0
[validate_onnx_model] -- make_feeds for 'inputs_empty_cache'...
[validate_onnx_model] inputs=dict(input_ids:T7s2x3,attention_mask:T7s2x3,position_ids:T7s2x3,past_key_values:DynamicCache(key_cache=#1[T1s2x1x0x96], value_cache=#1[T1s2x1x0x96]))
[validate_onnx_model] ort inputs=dict(input_ids:A7s2x3,attention_mask:A7s2x3,position_ids:A7s2x3,past_key_values_key_0:A1s2x1x0x96,past_key_values_value_0:A1s2x1x0x96)
[validate_onnx_model] done (make_feeds)
[validate_onnx_model] run session on inputs 'inputs2_empty_cache'...
[validate_onnx_model] done (run)
[validate_onnx_model] got=#3[A1s2x3x32000,A1s2x1x3x96,A1s2x1x3x96]
[validate_onnx_model] discrepancies=abs=7.152557373046875e-07, rel=0.0002758755967644076, n=193152.0, dev=0
[validate_onnx_model] -- make_feeds for 'inputs_batch1'...
[validate_onnx_model] inputs=dict(input_ids:T7s1x3,attention_mask:T7s1x33,position_ids:T7s1x3,past_key_values:DynamicCache(key_cache=#1[T1s1x1x30x96], value_cache=#1[T1s1x1x30x96]))
[validate_onnx_model] ort inputs=dict(input_ids:A7s1x3,attention_mask:A7s1x33,position_ids:A7s1x3,past_key_values_key_0:A1s1x1x30x96,past_key_values_value_0:A1s1x1x30x96)
[validate_onnx_model] done (make_feeds)
[validate_onnx_model] run session on inputs 'inputs2_batch1'...
[validate_onnx_model] done (run)
[validate_onnx_model] got=#3[A1s1x3x32000,A1s1x1x33x96,A1s1x1x33x96]
[validate_onnx_model] discrepancies=abs=1.1324882507324219e-06, rel=0.00031306966585304286, n=102336.0, dev=0
[validate_model] run onnxruntime fusion for 'bert_keras'
failed in shape inference <class 'AssertionError'>
[validate_model] done 'bert_keras' in 0.23246258700964972, saved into 'dump_models/arnir0_Tiny-LLM/onnx-dynamo/ir/op18/arnir0_Tiny-LLM-onnx-dynamo-ir-op18.ort.bert_keras.onnx'
[validate_onnx_model] verify onnx model with providers ['CPUExecutionProvider']..., flavour='ortbert_keras'
[validate_onnx_model] runtime is onnxruntime
[validate_onnx_model] done (ort_session) flavour='ortbert_keras'
[validate_onnx_model] -- keys=[('inputs', 'run_expected', ''), ('inputs_prompt', 'run_expected2_prompt', '2_prompt'), ('inputs2', 'run_expected22', '22'), ('inputs_empty_cache', 'run_expected2_empty_cache', '2_empty_cache'), ('inputs_batch1', 'run_expected2_batch1', '2_batch1')]
[validate_onnx_model] -- make_feeds for 'inputs'...
[validate_onnx_model] inputs=dict(input_ids:T7s2x3,attention_mask:T7s2x33,position_ids:T7s2x3,past_key_values:DynamicCache(key_cache=#1[T1s2x1x30x96], value_cache=#1[T1s2x1x30x96]))
[validate_onnx_model] ort inputs=dict(input_ids:A7s2x3,attention_mask:A7s2x33,position_ids:A7s2x3,past_key_values_key_0:A1s2x1x30x96,past_key_values_value_0:A1s2x1x30x96)
[validate_onnx_model] done (make_feeds)
[validate_onnx_model] run session on inputs 'inputs'...
[validate_onnx_model] done (run)
[validate_onnx_model] got=#3[A1s2x3x32000,A1s2x1x33x96,A1s2x1x33x96]
[validate_onnx_model] discrepancies=abs=8.344650268554688e-07, rel=0.00038373230338287646, n=204672.0, dev=0
[validate_onnx_model] -- make_feeds for 'inputs2'...
[validate_onnx_model] inputs=dict(input_ids:T7s3x4,attention_mask:T7s3x35,position_ids:T7s3x4,past_key_values:DynamicCache(key_cache=#1[T1s3x1x31x96], value_cache=#1[T1s3x1x31x96]))
[validate_onnx_model] ort inputs=dict(input_ids:A7s3x4,attention_mask:A7s3x35,position_ids:A7s3x4,past_key_values_key_0:A1s3x1x31x96,past_key_values_value_0:A1s3x1x31x96)
[validate_onnx_model] done (make_feeds)
[validate_onnx_model] run session on inputs 'inputs22'...
[validate_onnx_model] done (run)
[validate_onnx_model] got=#3[A1s3x4x32000,A1s3x1x35x96,A1s3x1x35x96]
[validate_onnx_model] discrepancies=abs=9.5367431640625e-07, rel=0.00033374451371688606, n=404160.0, dev=0
[validate_onnx_model] -- make_feeds for 'inputs_empty_cache'...
[validate_onnx_model] inputs=dict(input_ids:T7s2x3,attention_mask:T7s2x3,position_ids:T7s2x3,past_key_values:DynamicCache(key_cache=#1[T1s2x1x0x96], value_cache=#1[T1s2x1x0x96]))
[validate_onnx_model] ort inputs=dict(input_ids:A7s2x3,attention_mask:A7s2x3,position_ids:A7s2x3,past_key_values_key_0:A1s2x1x0x96,past_key_values_value_0:A1s2x1x0x96)
[validate_onnx_model] done (make_feeds)
[validate_onnx_model] run session on inputs 'inputs2_empty_cache'...
[validate_onnx_model] done (run)
[validate_onnx_model] got=#3[A1s2x3x32000,A1s2x1x3x96,A1s2x1x3x96]
[validate_onnx_model] discrepancies=abs=7.152557373046875e-07, rel=0.0002758755967644076, n=193152.0, dev=0
[validate_onnx_model] -- make_feeds for 'inputs_batch1'...
[validate_onnx_model] inputs=dict(input_ids:T7s1x3,attention_mask:T7s1x33,position_ids:T7s1x3,past_key_values:DynamicCache(key_cache=#1[T1s1x1x30x96], value_cache=#1[T1s1x1x30x96]))
[validate_onnx_model] ort inputs=dict(input_ids:A7s1x3,attention_mask:A7s1x33,position_ids:A7s1x3,past_key_values_key_0:A1s1x1x30x96,past_key_values_value_0:A1s1x1x30x96)
[validate_onnx_model] done (make_feeds)
[validate_onnx_model] run session on inputs 'inputs2_batch1'...
[validate_onnx_model] done (run)
[validate_onnx_model] got=#3[A1s1x3x32000,A1s1x1x33x96,A1s1x1x33x96]
[validate_onnx_model] discrepancies=abs=1.1324882507324219e-06, rel=0.00031306966585304286, n=102336.0, dev=0
[validate_model] run onnxruntime fusion for 'bert_tf'
failed in shape inference <class 'AssertionError'>
[validate_model] done 'bert_tf' in 0.13830022100592032, saved into 'dump_models/arnir0_Tiny-LLM/onnx-dynamo/ir/op18/arnir0_Tiny-LLM-onnx-dynamo-ir-op18.ort.bert_tf.onnx'
[validate_onnx_model] verify onnx model with providers ['CPUExecutionProvider']..., flavour='ortbert_tf'
[validate_onnx_model] runtime is onnxruntime
[validate_onnx_model] done (ort_session) flavour='ortbert_tf'
[validate_onnx_model] -- keys=[('inputs', 'run_expected', ''), ('inputs_prompt', 'run_expected2_prompt', '2_prompt'), ('inputs2', 'run_expected22', '22'), ('inputs_empty_cache', 'run_expected2_empty_cache', '2_empty_cache'), ('inputs_batch1', 'run_expected2_batch1', '2_batch1')]
[validate_onnx_model] -- make_feeds for 'inputs'...
[validate_onnx_model] inputs=dict(input_ids:T7s2x3,attention_mask:T7s2x33,position_ids:T7s2x3,past_key_values:DynamicCache(key_cache=#1[T1s2x1x30x96], value_cache=#1[T1s2x1x30x96]))
[validate_onnx_model] ort inputs=dict(input_ids:A7s2x3,attention_mask:A7s2x33,position_ids:A7s2x3,past_key_values_key_0:A1s2x1x30x96,past_key_values_value_0:A1s2x1x30x96)
[validate_onnx_model] done (make_feeds)
[validate_onnx_model] run session on inputs 'inputs'...
[validate_onnx_model] done (run)
[validate_onnx_model] got=#3[A1s2x3x32000,A1s2x1x33x96,A1s2x1x33x96]
[validate_onnx_model] discrepancies=abs=8.344650268554688e-07, rel=0.00038373230338287646, n=204672.0, dev=0
[validate_onnx_model] -- make_feeds for 'inputs2'...
[validate_onnx_model] inputs=dict(input_ids:T7s3x4,attention_mask:T7s3x35,position_ids:T7s3x4,past_key_values:DynamicCache(key_cache=#1[T1s3x1x31x96], value_cache=#1[T1s3x1x31x96]))
[validate_onnx_model] ort inputs=dict(input_ids:A7s3x4,attention_mask:A7s3x35,position_ids:A7s3x4,past_key_values_key_0:A1s3x1x31x96,past_key_values_value_0:A1s3x1x31x96)
[validate_onnx_model] done (make_feeds)
[validate_onnx_model] run session on inputs 'inputs22'...
[validate_onnx_model] done (run)
[validate_onnx_model] got=#3[A1s3x4x32000,A1s3x1x35x96,A1s3x1x35x96]
[validate_onnx_model] discrepancies=abs=9.5367431640625e-07, rel=0.00033374451371688606, n=404160.0, dev=0
[validate_onnx_model] -- make_feeds for 'inputs_empty_cache'...
[validate_onnx_model] inputs=dict(input_ids:T7s2x3,attention_mask:T7s2x3,position_ids:T7s2x3,past_key_values:DynamicCache(key_cache=#1[T1s2x1x0x96], value_cache=#1[T1s2x1x0x96]))
[validate_onnx_model] ort inputs=dict(input_ids:A7s2x3,attention_mask:A7s2x3,position_ids:A7s2x3,past_key_values_key_0:A1s2x1x0x96,past_key_values_value_0:A1s2x1x0x96)
[validate_onnx_model] done (make_feeds)
[validate_onnx_model] run session on inputs 'inputs2_empty_cache'...
[validate_onnx_model] done (run)
[validate_onnx_model] got=#3[A1s2x3x32000,A1s2x1x3x96,A1s2x1x3x96]
[validate_onnx_model] discrepancies=abs=7.152557373046875e-07, rel=0.0002758755967644076, n=193152.0, dev=0
[validate_onnx_model] -- make_feeds for 'inputs_batch1'...
[validate_onnx_model] inputs=dict(input_ids:T7s1x3,attention_mask:T7s1x33,position_ids:T7s1x3,past_key_values:DynamicCache(key_cache=#1[T1s1x1x30x96], value_cache=#1[T1s1x1x30x96]))
[validate_onnx_model] ort inputs=dict(input_ids:A7s1x3,attention_mask:A7s1x33,position_ids:A7s1x3,past_key_values_key_0:A1s1x1x30x96,past_key_values_value_0:A1s1x1x30x96)
[validate_onnx_model] done (make_feeds)
[validate_onnx_model] run session on inputs 'inputs2_batch1'...
[validate_onnx_model] done (run)
[validate_onnx_model] got=#3[A1s1x3x32000,A1s1x1x33x96,A1s1x1x33x96]
[validate_onnx_model] discrepancies=abs=1.1324882507324219e-06, rel=0.00031306966585304286, n=102336.0, dev=0
[validate_model] run onnxruntime fusion for 'clip'
failed in shape inference <class 'AssertionError'>
[validate_model] done 'clip' in 0.20571301701420452, saved into 'dump_models/arnir0_Tiny-LLM/onnx-dynamo/ir/op18/arnir0_Tiny-LLM-onnx-dynamo-ir-op18.ort.clip.onnx'
[validate_onnx_model] verify onnx model with providers ['CPUExecutionProvider']..., flavour='ortclip'
[validate_onnx_model] runtime is onnxruntime
[validate_onnx_model] done (ort_session) flavour='ortclip'
[validate_onnx_model] -- keys=[('inputs', 'run_expected', ''), ('inputs_prompt', 'run_expected2_prompt', '2_prompt'), ('inputs2', 'run_expected22', '22'), ('inputs_empty_cache', 'run_expected2_empty_cache', '2_empty_cache'), ('inputs_batch1', 'run_expected2_batch1', '2_batch1')]
[validate_onnx_model] -- make_feeds for 'inputs'...
[validate_onnx_model] inputs=dict(input_ids:T7s2x3,attention_mask:T7s2x33,position_ids:T7s2x3,past_key_values:DynamicCache(key_cache=#1[T1s2x1x30x96], value_cache=#1[T1s2x1x30x96]))
[validate_onnx_model] ort inputs=dict(input_ids:A7s2x3,attention_mask:A7s2x33,position_ids:A7s2x3,past_key_values_key_0:A1s2x1x30x96,past_key_values_value_0:A1s2x1x30x96)
[validate_onnx_model] done (make_feeds)
[validate_onnx_model] run session on inputs 'inputs'...
[validate_onnx_model] done (run)
[validate_onnx_model] got=#3[A1s2x3x32000,A1s2x1x33x96,A1s2x1x33x96]
[validate_onnx_model] discrepancies=abs=8.344650268554688e-07, rel=0.00038373230338287646, n=204672.0, dev=0
[validate_onnx_model] -- make_feeds for 'inputs2'...
[validate_onnx_model] inputs=dict(input_ids:T7s3x4,attention_mask:T7s3x35,position_ids:T7s3x4,past_key_values:DynamicCache(key_cache=#1[T1s3x1x31x96], value_cache=#1[T1s3x1x31x96]))
[validate_onnx_model] ort inputs=dict(input_ids:A7s3x4,attention_mask:A7s3x35,position_ids:A7s3x4,past_key_values_key_0:A1s3x1x31x96,past_key_values_value_0:A1s3x1x31x96)
[validate_onnx_model] done (make_feeds)
[validate_onnx_model] run session on inputs 'inputs22'...
[validate_onnx_model] done (run)
[validate_onnx_model] got=#3[A1s3x4x32000,A1s3x1x35x96,A1s3x1x35x96]
[validate_onnx_model] discrepancies=abs=9.5367431640625e-07, rel=0.00033374451371688606, n=404160.0, dev=0
[validate_onnx_model] -- make_feeds for 'inputs_empty_cache'...
[validate_onnx_model] inputs=dict(input_ids:T7s2x3,attention_mask:T7s2x3,position_ids:T7s2x3,past_key_values:DynamicCache(key_cache=#1[T1s2x1x0x96], value_cache=#1[T1s2x1x0x96]))
[validate_onnx_model] ort inputs=dict(input_ids:A7s2x3,attention_mask:A7s2x3,position_ids:A7s2x3,past_key_values_key_0:A1s2x1x0x96,past_key_values_value_0:A1s2x1x0x96)
[validate_onnx_model] done (make_feeds)
[validate_onnx_model] run session on inputs 'inputs2_empty_cache'...
[validate_onnx_model] done (run)
[validate_onnx_model] got=#3[A1s2x3x32000,A1s2x1x3x96,A1s2x1x3x96]
[validate_onnx_model] discrepancies=abs=7.152557373046875e-07, rel=0.0002758755967644076, n=193152.0, dev=0
[validate_onnx_model] -- make_feeds for 'inputs_batch1'...
[validate_onnx_model] inputs=dict(input_ids:T7s1x3,attention_mask:T7s1x33,position_ids:T7s1x3,past_key_values:DynamicCache(key_cache=#1[T1s1x1x30x96], value_cache=#1[T1s1x1x30x96]))
[validate_onnx_model] ort inputs=dict(input_ids:A7s1x3,attention_mask:A7s1x33,position_ids:A7s1x3,past_key_values_key_0:A1s1x1x30x96,past_key_values_value_0:A1s1x1x30x96)
[validate_onnx_model] done (make_feeds)
[validate_onnx_model] run session on inputs 'inputs2_batch1'...
[validate_onnx_model] done (run)
[validate_onnx_model] got=#3[A1s1x3x32000,A1s1x1x33x96,A1s1x1x33x96]
[validate_onnx_model] discrepancies=abs=1.1324882507324219e-06, rel=0.00031306966585304286, n=102336.0, dev=0
[validate_model] run onnxruntime fusion for 'conformer'
failed in shape inference <class 'AssertionError'>
[validate_model] done 'conformer' in 0.22301872800744604, saved into 'dump_models/arnir0_Tiny-LLM/onnx-dynamo/ir/op18/arnir0_Tiny-LLM-onnx-dynamo-ir-op18.ort.conformer.onnx'
[validate_onnx_model] verify onnx model with providers ['CPUExecutionProvider']..., flavour='ortconformer'
[validate_onnx_model] runtime is onnxruntime
[validate_onnx_model] done (ort_session) flavour='ortconformer'
[validate_onnx_model] -- keys=[('inputs', 'run_expected', ''), ('inputs_prompt', 'run_expected2_prompt', '2_prompt'), ('inputs2', 'run_expected22', '22'), ('inputs_empty_cache', 'run_expected2_empty_cache', '2_empty_cache'), ('inputs_batch1', 'run_expected2_batch1', '2_batch1')]
[validate_onnx_model] -- make_feeds for 'inputs'...
[validate_onnx_model] inputs=dict(input_ids:T7s2x3,attention_mask:T7s2x33,position_ids:T7s2x3,past_key_values:DynamicCache(key_cache=#1[T1s2x1x30x96], value_cache=#1[T1s2x1x30x96]))
[validate_onnx_model] ort inputs=dict(input_ids:A7s2x3,attention_mask:A7s2x33,position_ids:A7s2x3,past_key_values_key_0:A1s2x1x30x96,past_key_values_value_0:A1s2x1x30x96)
[validate_onnx_model] done (make_feeds)
[validate_onnx_model] run session on inputs 'inputs'...
[validate_onnx_model] done (run)
[validate_onnx_model] got=#3[A1s2x3x32000,A1s2x1x33x96,A1s2x1x33x96]
[validate_onnx_model] discrepancies=abs=8.344650268554688e-07, rel=0.00038373230338287646, n=204672.0, dev=0
[validate_onnx_model] -- make_feeds for 'inputs2'...
[validate_onnx_model] inputs=dict(input_ids:T7s3x4,attention_mask:T7s3x35,position_ids:T7s3x4,past_key_values:DynamicCache(key_cache=#1[T1s3x1x31x96], value_cache=#1[T1s3x1x31x96]))
[validate_onnx_model] ort inputs=dict(input_ids:A7s3x4,attention_mask:A7s3x35,position_ids:A7s3x4,past_key_values_key_0:A1s3x1x31x96,past_key_values_value_0:A1s3x1x31x96)
[validate_onnx_model] done (make_feeds)
[validate_onnx_model] run session on inputs 'inputs22'...
[validate_onnx_model] done (run)
[validate_onnx_model] got=#3[A1s3x4x32000,A1s3x1x35x96,A1s3x1x35x96]
[validate_onnx_model] discrepancies=abs=9.5367431640625e-07, rel=0.00033374451371688606, n=404160.0, dev=0
[validate_onnx_model] -- make_feeds for 'inputs_empty_cache'...
[validate_onnx_model] inputs=dict(input_ids:T7s2x3,attention_mask:T7s2x3,position_ids:T7s2x3,past_key_values:DynamicCache(key_cache=#1[T1s2x1x0x96], value_cache=#1[T1s2x1x0x96]))
[validate_onnx_model] ort inputs=dict(input_ids:A7s2x3,attention_mask:A7s2x3,position_ids:A7s2x3,past_key_values_key_0:A1s2x1x0x96,past_key_values_value_0:A1s2x1x0x96)
[validate_onnx_model] done (make_feeds)
[validate_onnx_model] run session on inputs 'inputs2_empty_cache'...
[validate_onnx_model] done (run)
[validate_onnx_model] got=#3[A1s2x3x32000,A1s2x1x3x96,A1s2x1x3x96]
[validate_onnx_model] discrepancies=abs=7.152557373046875e-07, rel=0.0002758755967644076, n=193152.0, dev=0
[validate_onnx_model] -- make_feeds for 'inputs_batch1'...
[validate_onnx_model] inputs=dict(input_ids:T7s1x3,attention_mask:T7s1x33,position_ids:T7s1x3,past_key_values:DynamicCache(key_cache=#1[T1s1x1x30x96], value_cache=#1[T1s1x1x30x96]))
[validate_onnx_model] ort inputs=dict(input_ids:A7s1x3,attention_mask:A7s1x33,position_ids:A7s1x3,past_key_values_key_0:A1s1x1x30x96,past_key_values_value_0:A1s1x1x30x96)
[validate_onnx_model] done (make_feeds)
[validate_onnx_model] run session on inputs 'inputs2_batch1'...
[validate_onnx_model] done (run)
[validate_onnx_model] got=#3[A1s1x3x32000,A1s1x1x33x96,A1s1x1x33x96]
[validate_onnx_model] discrepancies=abs=1.1324882507324219e-06, rel=0.00031306966585304286, n=102336.0, dev=0
[validate_model] run onnxruntime fusion for 'gpt2'
failed in shape inference <class 'AssertionError'>
[validate_model] done 'gpt2' in 0.25617580900143366, saved into 'dump_models/arnir0_Tiny-LLM/onnx-dynamo/ir/op18/arnir0_Tiny-LLM-onnx-dynamo-ir-op18.ort.gpt2.onnx'
[validate_onnx_model] verify onnx model with providers ['CPUExecutionProvider']..., flavour='ortgpt2'
[validate_onnx_model] runtime is onnxruntime
[validate_onnx_model] done (ort_session) flavour='ortgpt2'
[validate_onnx_model] -- keys=[('inputs', 'run_expected', ''), ('inputs_prompt', 'run_expected2_prompt', '2_prompt'), ('inputs2', 'run_expected22', '22'), ('inputs_empty_cache', 'run_expected2_empty_cache', '2_empty_cache'), ('inputs_batch1', 'run_expected2_batch1', '2_batch1')]
[validate_onnx_model] -- make_feeds for 'inputs'...
[validate_onnx_model] inputs=dict(input_ids:T7s2x3,attention_mask:T7s2x33,position_ids:T7s2x3,past_key_values:DynamicCache(key_cache=#1[T1s2x1x30x96], value_cache=#1[T1s2x1x30x96]))
[validate_onnx_model] ort inputs=dict(input_ids:A7s2x3,attention_mask:A7s2x33,position_ids:A7s2x3,past_key_values_key_0:A1s2x1x30x96,past_key_values_value_0:A1s2x1x30x96)
[validate_onnx_model] done (make_feeds)
[validate_onnx_model] run session on inputs 'inputs'...
[validate_onnx_model] done (run)
[validate_onnx_model] got=#3[A1s2x3x32000,A1s2x1x33x96,A1s2x1x33x96]
[validate_onnx_model] discrepancies=abs=8.344650268554688e-07, rel=0.00038373230338287646, n=204672.0, dev=0
[validate_onnx_model] -- make_feeds for 'inputs2'...
[validate_onnx_model] inputs=dict(input_ids:T7s3x4,attention_mask:T7s3x35,position_ids:T7s3x4,past_key_values:DynamicCache(key_cache=#1[T1s3x1x31x96], value_cache=#1[T1s3x1x31x96]))
[validate_onnx_model] ort inputs=dict(input_ids:A7s3x4,attention_mask:A7s3x35,position_ids:A7s3x4,past_key_values_key_0:A1s3x1x31x96,past_key_values_value_0:A1s3x1x31x96)
[validate_onnx_model] done (make_feeds)
[validate_onnx_model] run session on inputs 'inputs22'...
[validate_onnx_model] done (run)
[validate_onnx_model] got=#3[A1s3x4x32000,A1s3x1x35x96,A1s3x1x35x96]
[validate_onnx_model] discrepancies=abs=9.5367431640625e-07, rel=0.00033374451371688606, n=404160.0, dev=0
[validate_onnx_model] -- make_feeds for 'inputs_empty_cache'...
[validate_onnx_model] inputs=dict(input_ids:T7s2x3,attention_mask:T7s2x3,position_ids:T7s2x3,past_key_values:DynamicCache(key_cache=#1[T1s2x1x0x96], value_cache=#1[T1s2x1x0x96]))
[validate_onnx_model] ort inputs=dict(input_ids:A7s2x3,attention_mask:A7s2x3,position_ids:A7s2x3,past_key_values_key_0:A1s2x1x0x96,past_key_values_value_0:A1s2x1x0x96)
[validate_onnx_model] done (make_feeds)
[validate_onnx_model] run session on inputs 'inputs2_empty_cache'...
[validate_onnx_model] done (run)
[validate_onnx_model] got=#3[A1s2x3x32000,A1s2x1x3x96,A1s2x1x3x96]
[validate_onnx_model] discrepancies=abs=7.152557373046875e-07, rel=0.0002758755967644076, n=193152.0, dev=0
[validate_onnx_model] -- make_feeds for 'inputs_batch1'...
[validate_onnx_model] inputs=dict(input_ids:T7s1x3,attention_mask:T7s1x33,position_ids:T7s1x3,past_key_values:DynamicCache(key_cache=#1[T1s1x1x30x96], value_cache=#1[T1s1x1x30x96]))
[validate_onnx_model] ort inputs=dict(input_ids:A7s1x3,attention_mask:A7s1x33,position_ids:A7s1x3,past_key_values_key_0:A1s1x1x30x96,past_key_values_value_0:A1s1x1x30x96)
[validate_onnx_model] done (make_feeds)
[validate_onnx_model] run session on inputs 'inputs2_batch1'...
[validate_onnx_model] done (run)
[validate_onnx_model] got=#3[A1s1x3x32000,A1s1x1x33x96,A1s1x1x33x96]
[validate_onnx_model] discrepancies=abs=1.1324882507324219e-06, rel=0.00031306966585304286, n=102336.0, dev=0
[validate_model] run onnxruntime fusion for 'gpt2_tf'
failed in shape inference <class 'AssertionError'>
[validate_model] done 'gpt2_tf' in 0.21968804800417274, saved into 'dump_models/arnir0_Tiny-LLM/onnx-dynamo/ir/op18/arnir0_Tiny-LLM-onnx-dynamo-ir-op18.ort.gpt2_tf.onnx'
[validate_onnx_model] verify onnx model with providers ['CPUExecutionProvider']..., flavour='ortgpt2_tf'
[validate_onnx_model] runtime is onnxruntime
[validate_onnx_model] done (ort_session) flavour='ortgpt2_tf'
[validate_onnx_model] -- keys=[('inputs', 'run_expected', ''), ('inputs_prompt', 'run_expected2_prompt', '2_prompt'), ('inputs2', 'run_expected22', '22'), ('inputs_empty_cache', 'run_expected2_empty_cache', '2_empty_cache'), ('inputs_batch1', 'run_expected2_batch1', '2_batch1')]
[validate_onnx_model] -- make_feeds for 'inputs'...
[validate_onnx_model] inputs=dict(input_ids:T7s2x3,attention_mask:T7s2x33,position_ids:T7s2x3,past_key_values:DynamicCache(key_cache=#1[T1s2x1x30x96], value_cache=#1[T1s2x1x30x96]))
[validate_onnx_model] ort inputs=dict(input_ids:A7s2x3,attention_mask:A7s2x33,position_ids:A7s2x3,past_key_values_key_0:A1s2x1x30x96,past_key_values_value_0:A1s2x1x30x96)
[validate_onnx_model] done (make_feeds)
[validate_onnx_model] run session on inputs 'inputs'...
[validate_onnx_model] done (run)
[validate_onnx_model] got=#3[A1s2x3x32000,A1s2x1x33x96,A1s2x1x33x96]
[validate_onnx_model] discrepancies=abs=8.344650268554688e-07, rel=0.00038373230338287646, n=204672.0, dev=0
[validate_onnx_model] -- make_feeds for 'inputs2'...
[validate_onnx_model] inputs=dict(input_ids:T7s3x4,attention_mask:T7s3x35,position_ids:T7s3x4,past_key_values:DynamicCache(key_cache=#1[T1s3x1x31x96], value_cache=#1[T1s3x1x31x96]))
[validate_onnx_model] ort inputs=dict(input_ids:A7s3x4,attention_mask:A7s3x35,position_ids:A7s3x4,past_key_values_key_0:A1s3x1x31x96,past_key_values_value_0:A1s3x1x31x96)
[validate_onnx_model] done (make_feeds)
[validate_onnx_model] run session on inputs 'inputs22'...
[validate_onnx_model] done (run)
[validate_onnx_model] got=#3[A1s3x4x32000,A1s3x1x35x96,A1s3x1x35x96]
[validate_onnx_model] discrepancies=abs=9.5367431640625e-07, rel=0.00033374451371688606, n=404160.0, dev=0
[validate_onnx_model] -- make_feeds for 'inputs_empty_cache'...
[validate_onnx_model] inputs=dict(input_ids:T7s2x3,attention_mask:T7s2x3,position_ids:T7s2x3,past_key_values:DynamicCache(key_cache=#1[T1s2x1x0x96], value_cache=#1[T1s2x1x0x96]))
[validate_onnx_model] ort inputs=dict(input_ids:A7s2x3,attention_mask:A7s2x3,position_ids:A7s2x3,past_key_values_key_0:A1s2x1x0x96,past_key_values_value_0:A1s2x1x0x96)
[validate_onnx_model] done (make_feeds)
[validate_onnx_model] run session on inputs 'inputs2_empty_cache'...
[validate_onnx_model] done (run)
[validate_onnx_model] got=#3[A1s2x3x32000,A1s2x1x3x96,A1s2x1x3x96]
[validate_onnx_model] discrepancies=abs=7.152557373046875e-07, rel=0.0002758755967644076, n=193152.0, dev=0
[validate_onnx_model] -- make_feeds for 'inputs_batch1'...
[validate_onnx_model] inputs=dict(input_ids:T7s1x3,attention_mask:T7s1x33,position_ids:T7s1x3,past_key_values:DynamicCache(key_cache=#1[T1s1x1x30x96], value_cache=#1[T1s1x1x30x96]))
[validate_onnx_model] ort inputs=dict(input_ids:A7s1x3,attention_mask:A7s1x33,position_ids:A7s1x3,past_key_values_key_0:A1s1x1x30x96,past_key_values_value_0:A1s1x1x30x96)
[validate_onnx_model] done (make_feeds)
[validate_onnx_model] run session on inputs 'inputs2_batch1'...
[validate_onnx_model] done (run)
[validate_onnx_model] got=#3[A1s1x3x32000,A1s1x1x33x96,A1s1x1x33x96]
[validate_onnx_model] discrepancies=abs=1.1324882507324219e-06, rel=0.00031306966585304286, n=102336.0, dev=0
[validate_model] run onnxruntime fusion for 'gpt_neox'
failed in shape inference <class 'AssertionError'>
[validate_model] done 'gpt_neox' in 0.227634588998626, saved into 'dump_models/arnir0_Tiny-LLM/onnx-dynamo/ir/op18/arnir0_Tiny-LLM-onnx-dynamo-ir-op18.ort.gpt_neox.onnx'
[validate_onnx_model] verify onnx model with providers ['CPUExecutionProvider']..., flavour='ortgpt_neox'
[validate_onnx_model] runtime is onnxruntime
[validate_onnx_model] done (ort_session) flavour='ortgpt_neox'
[validate_onnx_model] -- keys=[('inputs', 'run_expected', ''), ('inputs_prompt', 'run_expected2_prompt', '2_prompt'), ('inputs2', 'run_expected22', '22'), ('inputs_empty_cache', 'run_expected2_empty_cache', '2_empty_cache'), ('inputs_batch1', 'run_expected2_batch1', '2_batch1')]
[validate_onnx_model] -- make_feeds for 'inputs'...
[validate_onnx_model] inputs=dict(input_ids:T7s2x3,attention_mask:T7s2x33,position_ids:T7s2x3,past_key_values:DynamicCache(key_cache=#1[T1s2x1x30x96], value_cache=#1[T1s2x1x30x96]))
[validate_onnx_model] ort inputs=dict(input_ids:A7s2x3,attention_mask:A7s2x33,position_ids:A7s2x3,past_key_values_key_0:A1s2x1x30x96,past_key_values_value_0:A1s2x1x30x96)
[validate_onnx_model] done (make_feeds)
[validate_onnx_model] run session on inputs 'inputs'...
[validate_onnx_model] done (run)
[validate_onnx_model] got=#3[A1s2x3x32000,A1s2x1x33x96,A1s2x1x33x96]
[validate_onnx_model] discrepancies=abs=8.344650268554688e-07, rel=0.00038373230338287646, n=204672.0, dev=0
[validate_onnx_model] -- make_feeds for 'inputs2'...
[validate_onnx_model] inputs=dict(input_ids:T7s3x4,attention_mask:T7s3x35,position_ids:T7s3x4,past_key_values:DynamicCache(key_cache=#1[T1s3x1x31x96], value_cache=#1[T1s3x1x31x96]))
[validate_onnx_model] ort inputs=dict(input_ids:A7s3x4,attention_mask:A7s3x35,position_ids:A7s3x4,past_key_values_key_0:A1s3x1x31x96,past_key_values_value_0:A1s3x1x31x96)
[validate_onnx_model] done (make_feeds)
[validate_onnx_model] run session on inputs 'inputs22'...
[validate_onnx_model] done (run)
[validate_onnx_model] got=#3[A1s3x4x32000,A1s3x1x35x96,A1s3x1x35x96]
[validate_onnx_model] discrepancies=abs=9.5367431640625e-07, rel=0.00033374451371688606, n=404160.0, dev=0
[validate_onnx_model] -- make_feeds for 'inputs_empty_cache'...
[validate_onnx_model] inputs=dict(input_ids:T7s2x3,attention_mask:T7s2x3,position_ids:T7s2x3,past_key_values:DynamicCache(key_cache=#1[T1s2x1x0x96], value_cache=#1[T1s2x1x0x96]))
[validate_onnx_model] ort inputs=dict(input_ids:A7s2x3,attention_mask:A7s2x3,position_ids:A7s2x3,past_key_values_key_0:A1s2x1x0x96,past_key_values_value_0:A1s2x1x0x96)
[validate_onnx_model] done (make_feeds)
[validate_onnx_model] run session on inputs 'inputs2_empty_cache'...
[validate_onnx_model] done (run)
[validate_onnx_model] got=#3[A1s2x3x32000,A1s2x1x3x96,A1s2x1x3x96]
[validate_onnx_model] discrepancies=abs=7.152557373046875e-07, rel=0.0002758755967644076, n=193152.0, dev=0
[validate_onnx_model] -- make_feeds for 'inputs_batch1'...
[validate_onnx_model] inputs=dict(input_ids:T7s1x3,attention_mask:T7s1x33,position_ids:T7s1x3,past_key_values:DynamicCache(key_cache=#1[T1s1x1x30x96], value_cache=#1[T1s1x1x30x96]))
[validate_onnx_model] ort inputs=dict(input_ids:A7s1x3,attention_mask:A7s1x33,position_ids:A7s1x3,past_key_values_key_0:A1s1x1x30x96,past_key_values_value_0:A1s1x1x30x96)
[validate_onnx_model] done (make_feeds)
[validate_onnx_model] run session on inputs 'inputs2_batch1'...
[validate_onnx_model] done (run)
[validate_onnx_model] got=#3[A1s1x3x32000,A1s1x1x33x96,A1s1x1x33x96]
[validate_onnx_model] discrepancies=abs=1.1324882507324219e-06, rel=0.00031306966585304286, n=102336.0, dev=0
[validate_model] run onnxruntime fusion for 'mmdit'
failed in shape inference <class 'AssertionError'>
[validate_model] done 'mmdit' in 0.20451642200350761, saved into 'dump_models/arnir0_Tiny-LLM/onnx-dynamo/ir/op18/arnir0_Tiny-LLM-onnx-dynamo-ir-op18.ort.mmdit.onnx'
[validate_onnx_model] verify onnx model with providers ['CPUExecutionProvider']..., flavour='ortmmdit'
[validate_onnx_model] runtime is onnxruntime
[validate_onnx_model] done (ort_session) flavour='ortmmdit'
[validate_onnx_model] -- keys=[('inputs', 'run_expected', ''), ('inputs_prompt', 'run_expected2_prompt', '2_prompt'), ('inputs2', 'run_expected22', '22'), ('inputs_empty_cache', 'run_expected2_empty_cache', '2_empty_cache'), ('inputs_batch1', 'run_expected2_batch1', '2_batch1')]
[validate_onnx_model] -- make_feeds for 'inputs'...
[validate_onnx_model] inputs=dict(input_ids:T7s2x3,attention_mask:T7s2x33,position_ids:T7s2x3,past_key_values:DynamicCache(key_cache=#1[T1s2x1x30x96], value_cache=#1[T1s2x1x30x96]))
[validate_onnx_model] ort inputs=dict(input_ids:A7s2x3,attention_mask:A7s2x33,position_ids:A7s2x3,past_key_values_key_0:A1s2x1x30x96,past_key_values_value_0:A1s2x1x30x96)
[validate_onnx_model] done (make_feeds)
[validate_onnx_model] run session on inputs 'inputs'...
[validate_onnx_model] done (run)
[validate_onnx_model] got=#3[A1s2x3x32000,A1s2x1x33x96,A1s2x1x33x96]
[validate_onnx_model] discrepancies=abs=8.344650268554688e-07, rel=0.00038373230338287646, n=204672.0, dev=0
[validate_onnx_model] -- make_feeds for 'inputs2'...
[validate_onnx_model] inputs=dict(input_ids:T7s3x4,attention_mask:T7s3x35,position_ids:T7s3x4,past_key_values:DynamicCache(key_cache=#1[T1s3x1x31x96], value_cache=#1[T1s3x1x31x96]))
[validate_onnx_model] ort inputs=dict(input_ids:A7s3x4,attention_mask:A7s3x35,position_ids:A7s3x4,past_key_values_key_0:A1s3x1x31x96,past_key_values_value_0:A1s3x1x31x96)
[validate_onnx_model] done (make_feeds)
[validate_onnx_model] run session on inputs 'inputs22'...
[validate_onnx_model] done (run)
[validate_onnx_model] got=#3[A1s3x4x32000,A1s3x1x35x96,A1s3x1x35x96]
[validate_onnx_model] discrepancies=abs=9.5367431640625e-07, rel=0.00033374451371688606, n=404160.0, dev=0
[validate_onnx_model] -- make_feeds for 'inputs_empty_cache'...
[validate_onnx_model] inputs=dict(input_ids:T7s2x3,attention_mask:T7s2x3,position_ids:T7s2x3,past_key_values:DynamicCache(key_cache=#1[T1s2x1x0x96], value_cache=#1[T1s2x1x0x96]))
[validate_onnx_model] ort inputs=dict(input_ids:A7s2x3,attention_mask:A7s2x3,position_ids:A7s2x3,past_key_values_key_0:A1s2x1x0x96,past_key_values_value_0:A1s2x1x0x96)
[validate_onnx_model] done (make_feeds)
[validate_onnx_model] run session on inputs 'inputs2_empty_cache'...
[validate_onnx_model] done (run)
[validate_onnx_model] got=#3[A1s2x3x32000,A1s2x1x3x96,A1s2x1x3x96]
[validate_onnx_model] discrepancies=abs=7.152557373046875e-07, rel=0.0002758755967644076, n=193152.0, dev=0
[validate_onnx_model] -- make_feeds for 'inputs_batch1'...
[validate_onnx_model] inputs=dict(input_ids:T7s1x3,attention_mask:T7s1x33,position_ids:T7s1x3,past_key_values:DynamicCache(key_cache=#1[T1s1x1x30x96], value_cache=#1[T1s1x1x30x96]))
[validate_onnx_model] ort inputs=dict(input_ids:A7s1x3,attention_mask:A7s1x33,position_ids:A7s1x3,past_key_values_key_0:A1s1x1x30x96,past_key_values_value_0:A1s1x1x30x96)
[validate_onnx_model] done (make_feeds)
[validate_onnx_model] run session on inputs 'inputs2_batch1'...
[validate_onnx_model] done (run)
[validate_onnx_model] got=#3[A1s1x3x32000,A1s1x1x33x96,A1s1x1x33x96]
[validate_onnx_model] discrepancies=abs=1.1324882507324219e-06, rel=0.00031306966585304286, n=102336.0, dev=0
[validate_model] run onnxruntime fusion for 'phi'
[validate_model] done 'phi' in 0.1024268359906273, saved into 'dump_models/arnir0_Tiny-LLM/onnx-dynamo/ir/op18/arnir0_Tiny-LLM-onnx-dynamo-ir-op18.ort.phi.onnx'
[validate_onnx_model] missing 'dump_models/arnir0_Tiny-LLM/onnx-dynamo/ir/op18/arnir0_Tiny-LLM-onnx-dynamo-ir-op18.ort.phi.onnx'
[validate_model] run onnxruntime fusion for 'sam2'
failed in shape inference <class 'AssertionError'>
[validate_model] done 'sam2' in 0.17527699800848495, saved into 'dump_models/arnir0_Tiny-LLM/onnx-dynamo/ir/op18/arnir0_Tiny-LLM-onnx-dynamo-ir-op18.ort.sam2.onnx'
[validate_onnx_model] verify onnx model with providers ['CPUExecutionProvider']..., flavour='ortsam2'
[validate_onnx_model] runtime is onnxruntime
[validate_onnx_model] done (ort_session) flavour='ortsam2'
[validate_onnx_model] -- keys=[('inputs', 'run_expected', ''), ('inputs_prompt', 'run_expected2_prompt', '2_prompt'), ('inputs2', 'run_expected22', '22'), ('inputs_empty_cache', 'run_expected2_empty_cache', '2_empty_cache'), ('inputs_batch1', 'run_expected2_batch1', '2_batch1')]
[validate_onnx_model] -- make_feeds for 'inputs'...
[validate_onnx_model] inputs=dict(input_ids:T7s2x3,attention_mask:T7s2x33,position_ids:T7s2x3,past_key_values:DynamicCache(key_cache=#1[T1s2x1x30x96], value_cache=#1[T1s2x1x30x96]))
[validate_onnx_model] ort inputs=dict(input_ids:A7s2x3,attention_mask:A7s2x33,position_ids:A7s2x3,past_key_values_key_0:A1s2x1x30x96,past_key_values_value_0:A1s2x1x30x96)
[validate_onnx_model] done (make_feeds)
[validate_onnx_model] run session on inputs 'inputs'...
[validate_onnx_model] done (run)
[validate_onnx_model] got=#3[A1s2x3x32000,A1s2x1x33x96,A1s2x1x33x96]
[validate_onnx_model] discrepancies=abs=7.748603820800781e-07, rel=0.00044309172106863606, n=204672.0, dev=0
[validate_onnx_model] -- make_feeds for 'inputs2'...
[validate_onnx_model] inputs=dict(input_ids:T7s3x4,attention_mask:T7s3x35,position_ids:T7s3x4,past_key_values:DynamicCache(key_cache=#1[T1s3x1x31x96], value_cache=#1[T1s3x1x31x96]))
[validate_onnx_model] ort inputs=dict(input_ids:A7s3x4,attention_mask:A7s3x35,position_ids:A7s3x4,past_key_values_key_0:A1s3x1x31x96,past_key_values_value_0:A1s3x1x31x96)
[validate_onnx_model] done (make_feeds)
[validate_onnx_model] run session on inputs 'inputs22'...
[validate_onnx_model] done (run)
[validate_onnx_model] got=#3[A1s3x4x32000,A1s3x1x35x96,A1s3x1x35x96]
[validate_onnx_model] discrepancies=abs=8.344650268554688e-07, rel=0.0003670204921167059, n=404160.0, dev=0
[validate_onnx_model] -- make_feeds for 'inputs_empty_cache'...
[validate_onnx_model] inputs=dict(input_ids:T7s2x3,attention_mask:T7s2x3,position_ids:T7s2x3,past_key_values:DynamicCache(key_cache=#1[T1s2x1x0x96], value_cache=#1[T1s2x1x0x96]))
[validate_onnx_model] ort inputs=dict(input_ids:A7s2x3,attention_mask:A7s2x3,position_ids:A7s2x3,past_key_values_key_0:A1s2x1x0x96,past_key_values_value_0:A1s2x1x0x96)
[validate_onnx_model] done (make_feeds)
[validate_onnx_model] run session on inputs 'inputs2_empty_cache'...
[validate_onnx_model] done (run)
[validate_onnx_model] got=#3[A1s2x3x32000,A1s2x1x3x96,A1s2x1x3x96]
[validate_onnx_model] discrepancies=abs=7.152557373046875e-07, rel=0.00028247341543503955, n=193152.0, dev=0
[validate_onnx_model] -- make_feeds for 'inputs_batch1'...
[validate_onnx_model] inputs=dict(input_ids:T7s1x3,attention_mask:T7s1x33,position_ids:T7s1x3,past_key_values:DynamicCache(key_cache=#1[T1s1x1x30x96], value_cache=#1[T1s1x1x30x96]))
[validate_onnx_model] ort inputs=dict(input_ids:A7s1x3,attention_mask:A7s1x33,position_ids:A7s1x3,past_key_values_key_0:A1s1x1x30x96,past_key_values_value_0:A1s1x1x30x96)
[validate_onnx_model] done (make_feeds)
[validate_onnx_model] run session on inputs 'inputs2_batch1'...
[validate_onnx_model] done (run)
[validate_onnx_model] got=#3[A1s1x3x32000,A1s1x1x33x96,A1s1x1x33x96]
[validate_onnx_model] discrepancies=abs=9.5367431640625e-07, rel=0.00030987364736661046, n=102336.0, dev=0
[validate_model] run onnxruntime fusion for 'swin'
failed in shape inference <class 'AssertionError'>
[validate_model] done 'swin' in 0.16543381300289184, saved into 'dump_models/arnir0_Tiny-LLM/onnx-dynamo/ir/op18/arnir0_Tiny-LLM-onnx-dynamo-ir-op18.ort.swin.onnx'
[validate_onnx_model] verify onnx model with providers ['CPUExecutionProvider']..., flavour='ortswin'
[validate_onnx_model] runtime is onnxruntime
[validate_onnx_model] done (ort_session) flavour='ortswin'
[validate_onnx_model] -- keys=[('inputs', 'run_expected', ''), ('inputs_prompt', 'run_expected2_prompt', '2_prompt'), ('inputs2', 'run_expected22', '22'), ('inputs_empty_cache', 'run_expected2_empty_cache', '2_empty_cache'), ('inputs_batch1', 'run_expected2_batch1', '2_batch1')]
[validate_onnx_model] -- make_feeds for 'inputs'...
[validate_onnx_model] inputs=dict(input_ids:T7s2x3,attention_mask:T7s2x33,position_ids:T7s2x3,past_key_values:DynamicCache(key_cache=#1[T1s2x1x30x96], value_cache=#1[T1s2x1x30x96]))
[validate_onnx_model] ort inputs=dict(input_ids:A7s2x3,attention_mask:A7s2x33,position_ids:A7s2x3,past_key_values_key_0:A1s2x1x30x96,past_key_values_value_0:A1s2x1x30x96)
[validate_onnx_model] done (make_feeds)
[validate_onnx_model] run session on inputs 'inputs'...
[validate_onnx_model] done (run)
[validate_onnx_model] got=#3[A1s2x3x32000,A1s2x1x33x96,A1s2x1x33x96]
[validate_onnx_model] discrepancies=abs=8.344650268554688e-07, rel=0.00038373230338287646, n=204672.0, dev=0
[validate_onnx_model] -- make_feeds for 'inputs2'...
[validate_onnx_model] inputs=dict(input_ids:T7s3x4,attention_mask:T7s3x35,position_ids:T7s3x4,past_key_values:DynamicCache(key_cache=#1[T1s3x1x31x96], value_cache=#1[T1s3x1x31x96]))
[validate_onnx_model] ort inputs=dict(input_ids:A7s3x4,attention_mask:A7s3x35,position_ids:A7s3x4,past_key_values_key_0:A1s3x1x31x96,past_key_values_value_0:A1s3x1x31x96)
[validate_onnx_model] done (make_feeds)
[validate_onnx_model] run session on inputs 'inputs22'...
[validate_onnx_model] done (run)
[validate_onnx_model] got=#3[A1s3x4x32000,A1s3x1x35x96,A1s3x1x35x96]
[validate_onnx_model] discrepancies=abs=9.5367431640625e-07, rel=0.00033374451371688606, n=404160.0, dev=0
[validate_onnx_model] -- make_feeds for 'inputs_empty_cache'...
[validate_onnx_model] inputs=dict(input_ids:T7s2x3,attention_mask:T7s2x3,position_ids:T7s2x3,past_key_values:DynamicCache(key_cache=#1[T1s2x1x0x96], value_cache=#1[T1s2x1x0x96]))
[validate_onnx_model] ort inputs=dict(input_ids:A7s2x3,attention_mask:A7s2x3,position_ids:A7s2x3,past_key_values_key_0:A1s2x1x0x96,past_key_values_value_0:A1s2x1x0x96)
[validate_onnx_model] done (make_feeds)
[validate_onnx_model] run session on inputs 'inputs2_empty_cache'...
[validate_onnx_model] done (run)
[validate_onnx_model] got=#3[A1s2x3x32000,A1s2x1x3x96,A1s2x1x3x96]
[validate_onnx_model] discrepancies=abs=7.152557373046875e-07, rel=0.0002758755967644076, n=193152.0, dev=0
[validate_onnx_model] -- make_feeds for 'inputs_batch1'...
[validate_onnx_model] inputs=dict(input_ids:T7s1x3,attention_mask:T7s1x33,position_ids:T7s1x3,past_key_values:DynamicCache(key_cache=#1[T1s1x1x30x96], value_cache=#1[T1s1x1x30x96]))
[validate_onnx_model] ort inputs=dict(input_ids:A7s1x3,attention_mask:A7s1x33,position_ids:A7s1x3,past_key_values_key_0:A1s1x1x30x96,past_key_values_value_0:A1s1x1x30x96)
[validate_onnx_model] done (make_feeds)
[validate_onnx_model] run session on inputs 'inputs2_batch1'...
[validate_onnx_model] done (run)
[validate_onnx_model] got=#3[A1s1x3x32000,A1s1x1x33x96,A1s1x1x33x96]
[validate_onnx_model] discrepancies=abs=1.1324882507324219e-06, rel=0.00031306966585304286, n=102336.0, dev=0
[validate_model] run onnxruntime fusion for 't5'
failed in shape inference <class 'AssertionError'>
[validate_model] done 't5' in 0.15775669300637674, saved into 'dump_models/arnir0_Tiny-LLM/onnx-dynamo/ir/op18/arnir0_Tiny-LLM-onnx-dynamo-ir-op18.ort.t5.onnx'
[validate_onnx_model] verify onnx model with providers ['CPUExecutionProvider']..., flavour='ortt5'
[validate_onnx_model] runtime is onnxruntime
[validate_onnx_model] done (ort_session) flavour='ortt5'
[validate_onnx_model] -- keys=[('inputs', 'run_expected', ''), ('inputs_prompt', 'run_expected2_prompt', '2_prompt'), ('inputs2', 'run_expected22', '22'), ('inputs_empty_cache', 'run_expected2_empty_cache', '2_empty_cache'), ('inputs_batch1', 'run_expected2_batch1', '2_batch1')]
[validate_onnx_model] -- make_feeds for 'inputs'...
[validate_onnx_model] inputs=dict(input_ids:T7s2x3,attention_mask:T7s2x33,position_ids:T7s2x3,past_key_values:DynamicCache(key_cache=#1[T1s2x1x30x96], value_cache=#1[T1s2x1x30x96]))
[validate_onnx_model] ort inputs=dict(input_ids:A7s2x3,attention_mask:A7s2x33,position_ids:A7s2x3,past_key_values_key_0:A1s2x1x30x96,past_key_values_value_0:A1s2x1x30x96)
[validate_onnx_model] done (make_feeds)
[validate_onnx_model] run session on inputs 'inputs'...
[validate_onnx_model] done (run)
[validate_onnx_model] got=#3[A1s2x3x32000,A1s2x1x33x96,A1s2x1x33x96]
[validate_onnx_model] discrepancies=abs=8.344650268554688e-07, rel=0.00038373230338287646, n=204672.0, dev=0
[validate_onnx_model] -- make_feeds for 'inputs2'...
[validate_onnx_model] inputs=dict(input_ids:T7s3x4,attention_mask:T7s3x35,position_ids:T7s3x4,past_key_values:DynamicCache(key_cache=#1[T1s3x1x31x96], value_cache=#1[T1s3x1x31x96]))
[validate_onnx_model] ort inputs=dict(input_ids:A7s3x4,attention_mask:A7s3x35,position_ids:A7s3x4,past_key_values_key_0:A1s3x1x31x96,past_key_values_value_0:A1s3x1x31x96)
[validate_onnx_model] done (make_feeds)
[validate_onnx_model] run session on inputs 'inputs22'...
[validate_onnx_model] done (run)
[validate_onnx_model] got=#3[A1s3x4x32000,A1s3x1x35x96,A1s3x1x35x96]
[validate_onnx_model] discrepancies=abs=9.5367431640625e-07, rel=0.00033374451371688606, n=404160.0, dev=0
[validate_onnx_model] -- make_feeds for 'inputs_empty_cache'...
[validate_onnx_model] inputs=dict(input_ids:T7s2x3,attention_mask:T7s2x3,position_ids:T7s2x3,past_key_values:DynamicCache(key_cache=#1[T1s2x1x0x96], value_cache=#1[T1s2x1x0x96]))
[validate_onnx_model] ort inputs=dict(input_ids:A7s2x3,attention_mask:A7s2x3,position_ids:A7s2x3,past_key_values_key_0:A1s2x1x0x96,past_key_values_value_0:A1s2x1x0x96)
[validate_onnx_model] done (make_feeds)
[validate_onnx_model] run session on inputs 'inputs2_empty_cache'...
[validate_onnx_model] done (run)
[validate_onnx_model] got=#3[A1s2x3x32000,A1s2x1x3x96,A1s2x1x3x96]
[validate_onnx_model] discrepancies=abs=7.152557373046875e-07, rel=0.0002758755967644076, n=193152.0, dev=0
[validate_onnx_model] -- make_feeds for 'inputs_batch1'...
[validate_onnx_model] inputs=dict(input_ids:T7s1x3,attention_mask:T7s1x33,position_ids:T7s1x3,past_key_values:DynamicCache(key_cache=#1[T1s1x1x30x96], value_cache=#1[T1s1x1x30x96]))
[validate_onnx_model] ort inputs=dict(input_ids:A7s1x3,attention_mask:A7s1x33,position_ids:A7s1x3,past_key_values_key_0:A1s1x1x30x96,past_key_values_value_0:A1s1x1x30x96)
[validate_onnx_model] done (make_feeds)
[validate_onnx_model] run session on inputs 'inputs2_batch1'...
[validate_onnx_model] done (run)
[validate_onnx_model] got=#3[A1s1x3x32000,A1s1x1x33x96,A1s1x1x33x96]
[validate_onnx_model] discrepancies=abs=1.1324882507324219e-06, rel=0.00031306966585304286, n=102336.0, dev=0
[validate_model] run onnxruntime fusion for 'tnlr'
failed in shape inference <class 'AssertionError'>
[validate_model] done 'tnlr' in 0.14180395398580004, saved into 'dump_models/arnir0_Tiny-LLM/onnx-dynamo/ir/op18/arnir0_Tiny-LLM-onnx-dynamo-ir-op18.ort.tnlr.onnx'
[validate_onnx_model] verify onnx model with providers ['CPUExecutionProvider']..., flavour='orttnlr'
[validate_onnx_model] runtime is onnxruntime
[validate_onnx_model] done (ort_session) flavour='orttnlr'
[validate_onnx_model] -- keys=[('inputs', 'run_expected', ''), ('inputs_prompt', 'run_expected2_prompt', '2_prompt'), ('inputs2', 'run_expected22', '22'), ('inputs_empty_cache', 'run_expected2_empty_cache', '2_empty_cache'), ('inputs_batch1', 'run_expected2_batch1', '2_batch1')]
[validate_onnx_model] -- make_feeds for 'inputs'...
[validate_onnx_model] inputs=dict(input_ids:T7s2x3,attention_mask:T7s2x33,position_ids:T7s2x3,past_key_values:DynamicCache(key_cache=#1[T1s2x1x30x96], value_cache=#1[T1s2x1x30x96]))
[validate_onnx_model] ort inputs=dict(input_ids:A7s2x3,attention_mask:A7s2x33,position_ids:A7s2x3,past_key_values_key_0:A1s2x1x30x96,past_key_values_value_0:A1s2x1x30x96)
[validate_onnx_model] done (make_feeds)
[validate_onnx_model] run session on inputs 'inputs'...
[validate_onnx_model] done (run)
[validate_onnx_model] got=#3[A1s2x3x32000,A1s2x1x33x96,A1s2x1x33x96]
[validate_onnx_model] discrepancies=abs=8.344650268554688e-07, rel=0.00038373230338287646, n=204672.0, dev=0
[validate_onnx_model] -- make_feeds for 'inputs2'...
[validate_onnx_model] inputs=dict(input_ids:T7s3x4,attention_mask:T7s3x35,position_ids:T7s3x4,past_key_values:DynamicCache(key_cache=#1[T1s3x1x31x96], value_cache=#1[T1s3x1x31x96]))
[validate_onnx_model] ort inputs=dict(input_ids:A7s3x4,attention_mask:A7s3x35,position_ids:A7s3x4,past_key_values_key_0:A1s3x1x31x96,past_key_values_value_0:A1s3x1x31x96)
[validate_onnx_model] done (make_feeds)
[validate_onnx_model] run session on inputs 'inputs22'...
[validate_onnx_model] done (run)
[validate_onnx_model] got=#3[A1s3x4x32000,A1s3x1x35x96,A1s3x1x35x96]
[validate_onnx_model] discrepancies=abs=9.5367431640625e-07, rel=0.00033374451371688606, n=404160.0, dev=0
[validate_onnx_model] -- make_feeds for 'inputs_empty_cache'...
[validate_onnx_model] inputs=dict(input_ids:T7s2x3,attention_mask:T7s2x3,position_ids:T7s2x3,past_key_values:DynamicCache(key_cache=#1[T1s2x1x0x96], value_cache=#1[T1s2x1x0x96]))
[validate_onnx_model] ort inputs=dict(input_ids:A7s2x3,attention_mask:A7s2x3,position_ids:A7s2x3,past_key_values_key_0:A1s2x1x0x96,past_key_values_value_0:A1s2x1x0x96)
[validate_onnx_model] done (make_feeds)
[validate_onnx_model] run session on inputs 'inputs2_empty_cache'...
[validate_onnx_model] done (run)
[validate_onnx_model] got=#3[A1s2x3x32000,A1s2x1x3x96,A1s2x1x3x96]
[validate_onnx_model] discrepancies=abs=7.152557373046875e-07, rel=0.0002758755967644076, n=193152.0, dev=0
[validate_onnx_model] -- make_feeds for 'inputs_batch1'...
[validate_onnx_model] inputs=dict(input_ids:T7s1x3,attention_mask:T7s1x33,position_ids:T7s1x3,past_key_values:DynamicCache(key_cache=#1[T1s1x1x30x96], value_cache=#1[T1s1x1x30x96]))
[validate_onnx_model] ort inputs=dict(input_ids:A7s1x3,attention_mask:A7s1x33,position_ids:A7s1x3,past_key_values_key_0:A1s1x1x30x96,past_key_values_value_0:A1s1x1x30x96)
[validate_onnx_model] done (make_feeds)
[validate_onnx_model] run session on inputs 'inputs2_batch1'...
[validate_onnx_model] done (run)
[validate_onnx_model] got=#3[A1s1x3x32000,A1s1x1x33x96,A1s1x1x33x96]
[validate_onnx_model] discrepancies=abs=1.1324882507324219e-06, rel=0.00031306966585304286, n=102336.0, dev=0
[validate_model] run onnxruntime fusion for 'unet'
failed in shape inference <class 'AssertionError'>
[validate_model] done 'unet' in 0.13654210600361694, saved into 'dump_models/arnir0_Tiny-LLM/onnx-dynamo/ir/op18/arnir0_Tiny-LLM-onnx-dynamo-ir-op18.ort.unet.onnx'
[validate_onnx_model] verify onnx model with providers ['CPUExecutionProvider']..., flavour='ortunet'
[validate_onnx_model] runtime is onnxruntime
[validate_onnx_model] done (ort_session) flavour='ortunet'
[validate_onnx_model] -- keys=[('inputs', 'run_expected', ''), ('inputs_prompt', 'run_expected2_prompt', '2_prompt'), ('inputs2', 'run_expected22', '22'), ('inputs_empty_cache', 'run_expected2_empty_cache', '2_empty_cache'), ('inputs_batch1', 'run_expected2_batch1', '2_batch1')]
[validate_onnx_model] -- make_feeds for 'inputs'...
[validate_onnx_model] inputs=dict(input_ids:T7s2x3,attention_mask:T7s2x33,position_ids:T7s2x3,past_key_values:DynamicCache(key_cache=#1[T1s2x1x30x96], value_cache=#1[T1s2x1x30x96]))
[validate_onnx_model] ort inputs=dict(input_ids:A7s2x3,attention_mask:A7s2x33,position_ids:A7s2x3,past_key_values_key_0:A1s2x1x30x96,past_key_values_value_0:A1s2x1x30x96)
[validate_onnx_model] done (make_feeds)
[validate_onnx_model] run session on inputs 'inputs'...
[validate_onnx_model] done (run)
[validate_onnx_model] got=#3[A1s2x3x32000,A1s2x1x33x96,A1s2x1x33x96]
[validate_onnx_model] discrepancies=abs=7.748603820800781e-07, rel=0.00044309172106863606, n=204672.0, dev=0
[validate_onnx_model] -- make_feeds for 'inputs2'...
[validate_onnx_model] inputs=dict(input_ids:T7s3x4,attention_mask:T7s3x35,position_ids:T7s3x4,past_key_values:DynamicCache(key_cache=#1[T1s3x1x31x96], value_cache=#1[T1s3x1x31x96]))
[validate_onnx_model] ort inputs=dict(input_ids:A7s3x4,attention_mask:A7s3x35,position_ids:A7s3x4,past_key_values_key_0:A1s3x1x31x96,past_key_values_value_0:A1s3x1x31x96)
[validate_onnx_model] done (make_feeds)
[validate_onnx_model] run session on inputs 'inputs22'...
[validate_onnx_model] done (run)
[validate_onnx_model] got=#3[A1s3x4x32000,A1s3x1x35x96,A1s3x1x35x96]
[validate_onnx_model] discrepancies=abs=8.344650268554688e-07, rel=0.0003670204921167059, n=404160.0, dev=0
[validate_onnx_model] -- make_feeds for 'inputs_empty_cache'...
[validate_onnx_model] inputs=dict(input_ids:T7s2x3,attention_mask:T7s2x3,position_ids:T7s2x3,past_key_values:DynamicCache(key_cache=#1[T1s2x1x0x96], value_cache=#1[T1s2x1x0x96]))
[validate_onnx_model] ort inputs=dict(input_ids:A7s2x3,attention_mask:A7s2x3,position_ids:A7s2x3,past_key_values_key_0:A1s2x1x0x96,past_key_values_value_0:A1s2x1x0x96)
[validate_onnx_model] done (make_feeds)
[validate_onnx_model] run session on inputs 'inputs2_empty_cache'...
[validate_onnx_model] done (run)
[validate_onnx_model] got=#3[A1s2x3x32000,A1s2x1x3x96,A1s2x1x3x96]
[validate_onnx_model] discrepancies=abs=7.152557373046875e-07, rel=0.00028247341543503955, n=193152.0, dev=0
[validate_onnx_model] -- make_feeds for 'inputs_batch1'...
[validate_onnx_model] inputs=dict(input_ids:T7s1x3,attention_mask:T7s1x33,position_ids:T7s1x3,past_key_values:DynamicCache(key_cache=#1[T1s1x1x30x96], value_cache=#1[T1s1x1x30x96]))
[validate_onnx_model] ort inputs=dict(input_ids:A7s1x3,attention_mask:A7s1x33,position_ids:A7s1x3,past_key_values_key_0:A1s1x1x30x96,past_key_values_value_0:A1s1x1x30x96)
[validate_onnx_model] done (make_feeds)
[validate_onnx_model] run session on inputs 'inputs2_batch1'...
[validate_onnx_model] done (run)
[validate_onnx_model] got=#3[A1s1x3x32000,A1s1x1x33x96,A1s1x1x33x96]
[validate_onnx_model] discrepancies=abs=9.5367431640625e-07, rel=0.00030987364736661046, n=102336.0, dev=0
[validate_model] run onnxruntime fusion for 'vae'
failed in shape inference <class 'AssertionError'>
[validate_model] done 'vae' in 0.11423224800091702, saved into 'dump_models/arnir0_Tiny-LLM/onnx-dynamo/ir/op18/arnir0_Tiny-LLM-onnx-dynamo-ir-op18.ort.vae.onnx'
[validate_onnx_model] verify onnx model with providers ['CPUExecutionProvider']..., flavour='ortvae'
[validate_onnx_model] runtime is onnxruntime
[validate_onnx_model] done (ort_session) flavour='ortvae'
[validate_onnx_model] -- keys=[('inputs', 'run_expected', ''), ('inputs_prompt', 'run_expected2_prompt', '2_prompt'), ('inputs2', 'run_expected22', '22'), ('inputs_empty_cache', 'run_expected2_empty_cache', '2_empty_cache'), ('inputs_batch1', 'run_expected2_batch1', '2_batch1')]
[validate_onnx_model] -- make_feeds for 'inputs'...
[validate_onnx_model] inputs=dict(input_ids:T7s2x3,attention_mask:T7s2x33,position_ids:T7s2x3,past_key_values:DynamicCache(key_cache=#1[T1s2x1x30x96], value_cache=#1[T1s2x1x30x96]))
[validate_onnx_model] ort inputs=dict(input_ids:A7s2x3,attention_mask:A7s2x33,position_ids:A7s2x3,past_key_values_key_0:A1s2x1x30x96,past_key_values_value_0:A1s2x1x30x96)
[validate_onnx_model] done (make_feeds)
[validate_onnx_model] run session on inputs 'inputs'...
[validate_onnx_model] done (run)
[validate_onnx_model] got=#3[A1s2x3x32000,A1s2x1x33x96,A1s2x1x33x96]
[validate_onnx_model] discrepancies=abs=7.748603820800781e-07, rel=0.00044309172106863606, n=204672.0, dev=0
[validate_onnx_model] -- make_feeds for 'inputs2'...
[validate_onnx_model] inputs=dict(input_ids:T7s3x4,attention_mask:T7s3x35,position_ids:T7s3x4,past_key_values:DynamicCache(key_cache=#1[T1s3x1x31x96], value_cache=#1[T1s3x1x31x96]))
[validate_onnx_model] ort inputs=dict(input_ids:A7s3x4,attention_mask:A7s3x35,position_ids:A7s3x4,past_key_values_key_0:A1s3x1x31x96,past_key_values_value_0:A1s3x1x31x96)
[validate_onnx_model] done (make_feeds)
[validate_onnx_model] run session on inputs 'inputs22'...
[validate_onnx_model] done (run)
[validate_onnx_model] got=#3[A1s3x4x32000,A1s3x1x35x96,A1s3x1x35x96]
[validate_onnx_model] discrepancies=abs=8.344650268554688e-07, rel=0.0003670204921167059, n=404160.0, dev=0
[validate_onnx_model] -- make_feeds for 'inputs_empty_cache'...
[validate_onnx_model] inputs=dict(input_ids:T7s2x3,attention_mask:T7s2x3,position_ids:T7s2x3,past_key_values:DynamicCache(key_cache=#1[T1s2x1x0x96], value_cache=#1[T1s2x1x0x96]))
[validate_onnx_model] ort inputs=dict(input_ids:A7s2x3,attention_mask:A7s2x3,position_ids:A7s2x3,past_key_values_key_0:A1s2x1x0x96,past_key_values_value_0:A1s2x1x0x96)
[validate_onnx_model] done (make_feeds)
[validate_onnx_model] run session on inputs 'inputs2_empty_cache'...
[validate_onnx_model] done (run)
[validate_onnx_model] got=#3[A1s2x3x32000,A1s2x1x3x96,A1s2x1x3x96]
[validate_onnx_model] discrepancies=abs=7.152557373046875e-07, rel=0.00028247341543503955, n=193152.0, dev=0
[validate_onnx_model] -- make_feeds for 'inputs_batch1'...
[validate_onnx_model] inputs=dict(input_ids:T7s1x3,attention_mask:T7s1x33,position_ids:T7s1x3,past_key_values:DynamicCache(key_cache=#1[T1s1x1x30x96], value_cache=#1[T1s1x1x30x96]))
[validate_onnx_model] ort inputs=dict(input_ids:A7s1x3,attention_mask:A7s1x33,position_ids:A7s1x3,past_key_values_key_0:A1s1x1x30x96,past_key_values_value_0:A1s1x1x30x96)
[validate_onnx_model] done (make_feeds)
[validate_onnx_model] run session on inputs 'inputs2_batch1'...
[validate_onnx_model] done (run)
[validate_onnx_model] got=#3[A1s1x3x32000,A1s1x1x33x96,A1s1x1x33x96]
[validate_onnx_model] discrepancies=abs=9.5367431640625e-07, rel=0.00030987364736661046, n=102336.0, dev=0
[validate_model] run onnxruntime fusion for 'vit'
failed in shape inference <class 'AssertionError'>
[validate_model] done 'vit' in 0.10131322499364614, saved into 'dump_models/arnir0_Tiny-LLM/onnx-dynamo/ir/op18/arnir0_Tiny-LLM-onnx-dynamo-ir-op18.ort.vit.onnx'
[validate_onnx_model] verify onnx model with providers ['CPUExecutionProvider']..., flavour='ortvit'
[validate_onnx_model] runtime is onnxruntime
[validate_onnx_model] done (ort_session) flavour='ortvit'
[validate_onnx_model] -- keys=[('inputs', 'run_expected', ''), ('inputs_prompt', 'run_expected2_prompt', '2_prompt'), ('inputs2', 'run_expected22', '22'), ('inputs_empty_cache', 'run_expected2_empty_cache', '2_empty_cache'), ('inputs_batch1', 'run_expected2_batch1', '2_batch1')]
[validate_onnx_model] -- make_feeds for 'inputs'...
[validate_onnx_model] inputs=dict(input_ids:T7s2x3,attention_mask:T7s2x33,position_ids:T7s2x3,past_key_values:DynamicCache(key_cache=#1[T1s2x1x30x96], value_cache=#1[T1s2x1x30x96]))
[validate_onnx_model] ort inputs=dict(input_ids:A7s2x3,attention_mask:A7s2x33,position_ids:A7s2x3,past_key_values_key_0:A1s2x1x30x96,past_key_values_value_0:A1s2x1x30x96)
[validate_onnx_model] done (make_feeds)
[validate_onnx_model] run session on inputs 'inputs'...
[validate_onnx_model] done (run)
[validate_onnx_model] got=#3[A1s2x3x32000,A1s2x1x33x96,A1s2x1x33x96]
[validate_onnx_model] discrepancies=abs=8.344650268554688e-07, rel=0.00038373230338287646, n=204672.0, dev=0
[validate_onnx_model] -- make_feeds for 'inputs2'...
[validate_onnx_model] inputs=dict(input_ids:T7s3x4,attention_mask:T7s3x35,position_ids:T7s3x4,past_key_values:DynamicCache(key_cache=#1[T1s3x1x31x96], value_cache=#1[T1s3x1x31x96]))
[validate_onnx_model] ort inputs=dict(input_ids:A7s3x4,attention_mask:A7s3x35,position_ids:A7s3x4,past_key_values_key_0:A1s3x1x31x96,past_key_values_value_0:A1s3x1x31x96)
[validate_onnx_model] done (make_feeds)
[validate_onnx_model] run session on inputs 'inputs22'...
[validate_onnx_model] done (run)
[validate_onnx_model] got=#3[A1s3x4x32000,A1s3x1x35x96,A1s3x1x35x96]
[validate_onnx_model] discrepancies=abs=9.5367431640625e-07, rel=0.00033374451371688606, n=404160.0, dev=0
[validate_onnx_model] -- make_feeds for 'inputs_empty_cache'...
[validate_onnx_model] inputs=dict(input_ids:T7s2x3,attention_mask:T7s2x3,position_ids:T7s2x3,past_key_values:DynamicCache(key_cache=#1[T1s2x1x0x96], value_cache=#1[T1s2x1x0x96]))
[validate_onnx_model] ort inputs=dict(input_ids:A7s2x3,attention_mask:A7s2x3,position_ids:A7s2x3,past_key_values_key_0:A1s2x1x0x96,past_key_values_value_0:A1s2x1x0x96)
[validate_onnx_model] done (make_feeds)
[validate_onnx_model] run session on inputs 'inputs2_empty_cache'...
[validate_onnx_model] done (run)
[validate_onnx_model] got=#3[A1s2x3x32000,A1s2x1x3x96,A1s2x1x3x96]
[validate_onnx_model] discrepancies=abs=7.152557373046875e-07, rel=0.0002758755967644076, n=193152.0, dev=0
[validate_onnx_model] -- make_feeds for 'inputs_batch1'...
[validate_onnx_model] inputs=dict(input_ids:T7s1x3,attention_mask:T7s1x33,position_ids:T7s1x3,past_key_values:DynamicCache(key_cache=#1[T1s1x1x30x96], value_cache=#1[T1s1x1x30x96]))
[validate_onnx_model] ort inputs=dict(input_ids:A7s1x3,attention_mask:A7s1x33,position_ids:A7s1x3,past_key_values_key_0:A1s1x1x30x96,past_key_values_value_0:A1s1x1x30x96)
[validate_onnx_model] done (make_feeds)
[validate_onnx_model] run session on inputs 'inputs2_batch1'...
[validate_onnx_model] done (run)
[validate_onnx_model] got=#3[A1s1x3x32000,A1s1x1x33x96,A1s1x1x33x96]
[validate_onnx_model] discrepancies=abs=1.1324882507324219e-06, rel=0.00031306966585304286, n=102336.0, dev=0
[validate_model] -- done (final)
-- summary --
:ERR_onnx_missing_ortphi,FileNotFoundError('dump_models/arnir0_Tiny-LLM/onnx-dynamo/ir/op18/arnir0_Tiny-LLM-onnx-dynamo-ir-op18.ort.phi.onnx');
:ERR_opt_ort_phi,'method' object is not iterable;
:disc_onnx_ort_run22_abs,8.344650268554688e-07;
:disc_onnx_ort_run22_abs_ortbart,9.5367431640625e-07;
:disc_onnx_ort_run22_abs_ortbert,9.5367431640625e-07;
:disc_onnx_ort_run22_abs_ortbert_keras,9.5367431640625e-07;
:disc_onnx_ort_run22_abs_ortbert_tf,9.5367431640625e-07;
:disc_onnx_ort_run22_abs_ortclip,9.5367431640625e-07;
:disc_onnx_ort_run22_abs_ortconformer,9.5367431640625e-07;
:disc_onnx_ort_run22_abs_ortgpt2,9.5367431640625e-07;
:disc_onnx_ort_run22_abs_ortgpt2_tf,9.5367431640625e-07;
:disc_onnx_ort_run22_abs_ortgpt_neox,9.5367431640625e-07;
:disc_onnx_ort_run22_abs_ortmmdit,9.5367431640625e-07;
:disc_onnx_ort_run22_abs_ortsam2,8.344650268554688e-07;
:disc_onnx_ort_run22_abs_ortswin,9.5367431640625e-07;
:disc_onnx_ort_run22_abs_ortt5,9.5367431640625e-07;
:disc_onnx_ort_run22_abs_orttnlr,9.5367431640625e-07;
:disc_onnx_ort_run22_abs_ortunet,8.344650268554688e-07;
:disc_onnx_ort_run22_abs_ortvae,8.344650268554688e-07;
:disc_onnx_ort_run22_abs_ortvit,9.5367431640625e-07;
:disc_onnx_ort_run22_dev,0;
:disc_onnx_ort_run22_dev_ortbart,0;
:disc_onnx_ort_run22_dev_ortbert,0;
:disc_onnx_ort_run22_dev_ortbert_keras,0;
:disc_onnx_ort_run22_dev_ortbert_tf,0;
:disc_onnx_ort_run22_dev_ortclip,0;
:disc_onnx_ort_run22_dev_ortconformer,0;
:disc_onnx_ort_run22_dev_ortgpt2,0;
:disc_onnx_ort_run22_dev_ortgpt2_tf,0;
:disc_onnx_ort_run22_dev_ortgpt_neox,0;
:disc_onnx_ort_run22_dev_ortmmdit,0;
:disc_onnx_ort_run22_dev_ortsam2,0;
:disc_onnx_ort_run22_dev_ortswin,0;
:disc_onnx_ort_run22_dev_ortt5,0;
:disc_onnx_ort_run22_dev_orttnlr,0;
:disc_onnx_ort_run22_dev_ortunet,0;
:disc_onnx_ort_run22_dev_ortvae,0;
:disc_onnx_ort_run22_dev_ortvit,0;
:disc_onnx_ort_run22_dnan,0;
:disc_onnx_ort_run22_dnan_ortbart,0;
:disc_onnx_ort_run22_dnan_ortbert,0;
:disc_onnx_ort_run22_dnan_ortbert_keras,0;
:disc_onnx_ort_run22_dnan_ortbert_tf,0;
:disc_onnx_ort_run22_dnan_ortclip,0;
:disc_onnx_ort_run22_dnan_ortconformer,0;
:disc_onnx_ort_run22_dnan_ortgpt2,0;
:disc_onnx_ort_run22_dnan_ortgpt2_tf,0;
:disc_onnx_ort_run22_dnan_ortgpt_neox,0;
:disc_onnx_ort_run22_dnan_ortmmdit,0;
:disc_onnx_ort_run22_dnan_ortsam2,0;
:disc_onnx_ort_run22_dnan_ortswin,0;
:disc_onnx_ort_run22_dnan_ortt5,0;
:disc_onnx_ort_run22_dnan_orttnlr,0;
:disc_onnx_ort_run22_dnan_ortunet,0;
:disc_onnx_ort_run22_dnan_ortvae,0;
:disc_onnx_ort_run22_dnan_ortvit,0;
:disc_onnx_ort_run22_n,404160.0;
:disc_onnx_ort_run22_n_ortbart,404160.0;
:disc_onnx_ort_run22_n_ortbert,404160.0;
:disc_onnx_ort_run22_n_ortbert_keras,404160.0;
:disc_onnx_ort_run22_n_ortbert_tf,404160.0;
:disc_onnx_ort_run22_n_ortclip,404160.0;
:disc_onnx_ort_run22_n_ortconformer,404160.0;
:disc_onnx_ort_run22_n_ortgpt2,404160.0;
:disc_onnx_ort_run22_n_ortgpt2_tf,404160.0;
:disc_onnx_ort_run22_n_ortgpt_neox,404160.0;
:disc_onnx_ort_run22_n_ortmmdit,404160.0;
:disc_onnx_ort_run22_n_ortsam2,404160.0;
:disc_onnx_ort_run22_n_ortswin,404160.0;
:disc_onnx_ort_run22_n_ortt5,404160.0;
:disc_onnx_ort_run22_n_orttnlr,404160.0;
:disc_onnx_ort_run22_n_ortunet,404160.0;
:disc_onnx_ort_run22_n_ortvae,404160.0;
:disc_onnx_ort_run22_n_ortvit,404160.0;
:disc_onnx_ort_run22_rel,0.0003670204921167059;
:disc_onnx_ort_run22_rel_ortbart,0.00033374451371688606;
:disc_onnx_ort_run22_rel_ortbert,0.00033374451371688606;
:disc_onnx_ort_run22_rel_ortbert_keras,0.00033374451371688606;
:disc_onnx_ort_run22_rel_ortbert_tf,0.00033374451371688606;
:disc_onnx_ort_run22_rel_ortclip,0.00033374451371688606;
:disc_onnx_ort_run22_rel_ortconformer,0.00033374451371688606;
:disc_onnx_ort_run22_rel_ortgpt2,0.00033374451371688606;
:disc_onnx_ort_run22_rel_ortgpt2_tf,0.00033374451371688606;
:disc_onnx_ort_run22_rel_ortgpt_neox,0.00033374451371688606;
:disc_onnx_ort_run22_rel_ortmmdit,0.00033374451371688606;
:disc_onnx_ort_run22_rel_ortsam2,0.0003670204921167059;
:disc_onnx_ort_run22_rel_ortswin,0.00033374451371688606;
:disc_onnx_ort_run22_rel_ortt5,0.00033374451371688606;
:disc_onnx_ort_run22_rel_orttnlr,0.00033374451371688606;
:disc_onnx_ort_run22_rel_ortunet,0.0003670204921167059;
:disc_onnx_ort_run22_rel_ortvae,0.0003670204921167059;
:disc_onnx_ort_run22_rel_ortvit,0.00033374451371688606;
:disc_onnx_ort_run22_sum,0.037561870639599704;
:disc_onnx_ort_run22_sum_ortbart,0.04027932227860731;
:disc_onnx_ort_run22_sum_ortbert,0.04027932227860731;
:disc_onnx_ort_run22_sum_ortbert_keras,0.04027932227860731;
:disc_onnx_ort_run22_sum_ortbert_tf,0.04027932227860731;
:disc_onnx_ort_run22_sum_ortclip,0.04027932227860731;
:disc_onnx_ort_run22_sum_ortconformer,0.04027932227860731;
:disc_onnx_ort_run22_sum_ortgpt2,0.04027932227860731;
:disc_onnx_ort_run22_sum_ortgpt2_tf,0.04027932227860731;
:disc_onnx_ort_run22_sum_ortgpt_neox,0.04027932227860731;
:disc_onnx_ort_run22_sum_ortmmdit,0.04027932227860731;
:disc_onnx_ort_run22_sum_ortsam2,0.037561870639599704;
:disc_onnx_ort_run22_sum_ortswin,0.04027932227860731;
:disc_onnx_ort_run22_sum_ortt5,0.04027932227860731;
:disc_onnx_ort_run22_sum_orttnlr,0.04027932227860731;
:disc_onnx_ort_run22_sum_ortunet,0.037561870639599704;
:disc_onnx_ort_run22_sum_ortvae,0.037561870639599704;
:disc_onnx_ort_run22_sum_ortvit,0.04027932227860731;
:disc_onnx_ort_run2_batch1_abs,9.5367431640625e-07;
:disc_onnx_ort_run2_batch1_abs_ortbart,1.1324882507324219e-06;
:disc_onnx_ort_run2_batch1_abs_ortbert,1.1324882507324219e-06;
:disc_onnx_ort_run2_batch1_abs_ortbert_keras,1.1324882507324219e-06;
:disc_onnx_ort_run2_batch1_abs_ortbert_tf,1.1324882507324219e-06;
:disc_onnx_ort_run2_batch1_abs_ortclip,1.1324882507324219e-06;
:disc_onnx_ort_run2_batch1_abs_ortconformer,1.1324882507324219e-06;
:disc_onnx_ort_run2_batch1_abs_ortgpt2,1.1324882507324219e-06;
:disc_onnx_ort_run2_batch1_abs_ortgpt2_tf,1.1324882507324219e-06;
:disc_onnx_ort_run2_batch1_abs_ortgpt_neox,1.1324882507324219e-06;
:disc_onnx_ort_run2_batch1_abs_ortmmdit,1.1324882507324219e-06;
:disc_onnx_ort_run2_batch1_abs_ortsam2,9.5367431640625e-07;
:disc_onnx_ort_run2_batch1_abs_ortswin,1.1324882507324219e-06;
:disc_onnx_ort_run2_batch1_abs_ortt5,1.1324882507324219e-06;
:disc_onnx_ort_run2_batch1_abs_orttnlr,1.1324882507324219e-06;
:disc_onnx_ort_run2_batch1_abs_ortunet,9.5367431640625e-07;
:disc_onnx_ort_run2_batch1_abs_ortvae,9.5367431640625e-07;
:disc_onnx_ort_run2_batch1_abs_ortvit,1.1324882507324219e-06;
:disc_onnx_ort_run2_batch1_dev,0;
:disc_onnx_ort_run2_batch1_dev_ortbart,0;
:disc_onnx_ort_run2_batch1_dev_ortbert,0;
:disc_onnx_ort_run2_batch1_dev_ortbert_keras,0;
:disc_onnx_ort_run2_batch1_dev_ortbert_tf,0;
:disc_onnx_ort_run2_batch1_dev_ortclip,0;
:disc_onnx_ort_run2_batch1_dev_ortconformer,0;
:disc_onnx_ort_run2_batch1_dev_ortgpt2,0;
:disc_onnx_ort_run2_batch1_dev_ortgpt2_tf,0;
:disc_onnx_ort_run2_batch1_dev_ortgpt_neox,0;
:disc_onnx_ort_run2_batch1_dev_ortmmdit,0;
:disc_onnx_ort_run2_batch1_dev_ortsam2,0;
:disc_onnx_ort_run2_batch1_dev_ortswin,0;
:disc_onnx_ort_run2_batch1_dev_ortt5,0;
:disc_onnx_ort_run2_batch1_dev_orttnlr,0;
:disc_onnx_ort_run2_batch1_dev_ortunet,0;
:disc_onnx_ort_run2_batch1_dev_ortvae,0;
:disc_onnx_ort_run2_batch1_dev_ortvit,0;
:disc_onnx_ort_run2_batch1_dnan,0;
:disc_onnx_ort_run2_batch1_dnan_ortbart,0;
:disc_onnx_ort_run2_batch1_dnan_ortbert,0;
:disc_onnx_ort_run2_batch1_dnan_ortbert_keras,0;
:disc_onnx_ort_run2_batch1_dnan_ortbert_tf,0;
:disc_onnx_ort_run2_batch1_dnan_ortclip,0;
:disc_onnx_ort_run2_batch1_dnan_ortconformer,0;
:disc_onnx_ort_run2_batch1_dnan_ortgpt2,0;
:disc_onnx_ort_run2_batch1_dnan_ortgpt2_tf,0;
:disc_onnx_ort_run2_batch1_dnan_ortgpt_neox,0;
:disc_onnx_ort_run2_batch1_dnan_ortmmdit,0;
:disc_onnx_ort_run2_batch1_dnan_ortsam2,0;
:disc_onnx_ort_run2_batch1_dnan_ortswin,0;
:disc_onnx_ort_run2_batch1_dnan_ortt5,0;
:disc_onnx_ort_run2_batch1_dnan_orttnlr,0;
:disc_onnx_ort_run2_batch1_dnan_ortunet,0;
:disc_onnx_ort_run2_batch1_dnan_ortvae,0;
:disc_onnx_ort_run2_batch1_dnan_ortvit,0;
:disc_onnx_ort_run2_batch1_n,102336.0;
:disc_onnx_ort_run2_batch1_n_ortbart,102336.0;
:disc_onnx_ort_run2_batch1_n_ortbert,102336.0;
:disc_onnx_ort_run2_batch1_n_ortbert_keras,102336.0;
:disc_onnx_ort_run2_batch1_n_ortbert_tf,102336.0;
:disc_onnx_ort_run2_batch1_n_ortclip,102336.0;
:disc_onnx_ort_run2_batch1_n_ortconformer,102336.0;
:disc_onnx_ort_run2_batch1_n_ortgpt2,102336.0;
:disc_onnx_ort_run2_batch1_n_ortgpt2_tf,102336.0;
:disc_onnx_ort_run2_batch1_n_ortgpt_neox,102336.0;
:disc_onnx_ort_run2_batch1_n_ortmmdit,102336.0;
:disc_onnx_ort_run2_batch1_n_ortsam2,102336.0;
:disc_onnx_ort_run2_batch1_n_ortswin,102336.0;
:disc_onnx_ort_run2_batch1_n_ortt5,102336.0;
:disc_onnx_ort_run2_batch1_n_orttnlr,102336.0;
:disc_onnx_ort_run2_batch1_n_ortunet,102336.0;
:disc_onnx_ort_run2_batch1_n_ortvae,102336.0;
:disc_onnx_ort_run2_batch1_n_ortvit,102336.0;
:disc_onnx_ort_run2_batch1_rel,0.00030987364736661046;
:disc_onnx_ort_run2_batch1_rel_ortbart,0.00031306966585304286;
:disc_onnx_ort_run2_batch1_rel_ortbert,0.00031306966585304286;
:disc_onnx_ort_run2_batch1_rel_ortbert_keras,0.00031306966585304286;
:disc_onnx_ort_run2_batch1_rel_ortbert_tf,0.00031306966585304286;
:disc_onnx_ort_run2_batch1_rel_ortclip,0.00031306966585304286;
:disc_onnx_ort_run2_batch1_rel_ortconformer,0.00031306966585304286;
:disc_onnx_ort_run2_batch1_rel_ortgpt2,0.00031306966585304286;
:disc_onnx_ort_run2_batch1_rel_ortgpt2_tf,0.00031306966585304286;
:disc_onnx_ort_run2_batch1_rel_ortgpt_neox,0.00031306966585304286;
:disc_onnx_ort_run2_batch1_rel_ortmmdit,0.00031306966585304286;
:disc_onnx_ort_run2_batch1_rel_ortsam2,0.00030987364736661046;
:disc_onnx_ort_run2_batch1_rel_ortswin,0.00031306966585304286;
:disc_onnx_ort_run2_batch1_rel_ortt5,0.00031306966585304286;
:disc_onnx_ort_run2_batch1_rel_orttnlr,0.00031306966585304286;
:disc_onnx_ort_run2_batch1_rel_ortunet,0.00030987364736661046;
:disc_onnx_ort_run2_batch1_rel_ortvae,0.00030987364736661046;
:disc_onnx_ort_run2_batch1_rel_ortvit,0.00031306966585304286;
:disc_onnx_ort_run2_batch1_sum,0.011194461939794564;
:disc_onnx_ort_run2_batch1_sum_ortbart,0.011881725947318955;
:disc_onnx_ort_run2_batch1_sum_ortbert,0.011881725947318955;
:disc_onnx_ort_run2_batch1_sum_ortbert_keras,0.011881725947318955;
:disc_onnx_ort_run2_batch1_sum_ortbert_tf,0.011881725947318955;
:disc_onnx_ort_run2_batch1_sum_ortclip,0.011881725947318955;
:disc_onnx_ort_run2_batch1_sum_ortconformer,0.011881725947318955;
:disc_onnx_ort_run2_batch1_sum_ortgpt2,0.011881725947318955;
:disc_onnx_ort_run2_batch1_sum_ortgpt2_tf,0.011881725947318955;
:disc_onnx_ort_run2_batch1_sum_ortgpt_neox,0.011881725947318955;
:disc_onnx_ort_run2_batch1_sum_ortmmdit,0.011881725947318955;
:disc_onnx_ort_run2_batch1_sum_ortsam2,0.011194461939794564;
:disc_onnx_ort_run2_batch1_sum_ortswin,0.011881725947318955;
:disc_onnx_ort_run2_batch1_sum_ortt5,0.011881725947318955;
:disc_onnx_ort_run2_batch1_sum_orttnlr,0.011881725947318955;
:disc_onnx_ort_run2_batch1_sum_ortunet,0.011194461939794564;
:disc_onnx_ort_run2_batch1_sum_ortvae,0.011194461939794564;
:disc_onnx_ort_run2_batch1_sum_ortvit,0.011881725947318955;
:disc_onnx_ort_run2_empty_cache_abs,7.152557373046875e-07;
:disc_onnx_ort_run2_empty_cache_abs_ortbart,7.152557373046875e-07;
:disc_onnx_ort_run2_empty_cache_abs_ortbert,7.152557373046875e-07;
:disc_onnx_ort_run2_empty_cache_abs_ortbert_keras,7.152557373046875e-07;
:disc_onnx_ort_run2_empty_cache_abs_ortbert_tf,7.152557373046875e-07;
:disc_onnx_ort_run2_empty_cache_abs_ortclip,7.152557373046875e-07;
:disc_onnx_ort_run2_empty_cache_abs_ortconformer,7.152557373046875e-07;
:disc_onnx_ort_run2_empty_cache_abs_ortgpt2,7.152557373046875e-07;
:disc_onnx_ort_run2_empty_cache_abs_ortgpt2_tf,7.152557373046875e-07;
:disc_onnx_ort_run2_empty_cache_abs_ortgpt_neox,7.152557373046875e-07;
:disc_onnx_ort_run2_empty_cache_abs_ortmmdit,7.152557373046875e-07;
:disc_onnx_ort_run2_empty_cache_abs_ortsam2,7.152557373046875e-07;
:disc_onnx_ort_run2_empty_cache_abs_ortswin,7.152557373046875e-07;
:disc_onnx_ort_run2_empty_cache_abs_ortt5,7.152557373046875e-07;
:disc_onnx_ort_run2_empty_cache_abs_orttnlr,7.152557373046875e-07;
:disc_onnx_ort_run2_empty_cache_abs_ortunet,7.152557373046875e-07;
:disc_onnx_ort_run2_empty_cache_abs_ortvae,7.152557373046875e-07;
:disc_onnx_ort_run2_empty_cache_abs_ortvit,7.152557373046875e-07;
:disc_onnx_ort_run2_empty_cache_dev,0;
:disc_onnx_ort_run2_empty_cache_dev_ortbart,0;
:disc_onnx_ort_run2_empty_cache_dev_ortbert,0;
:disc_onnx_ort_run2_empty_cache_dev_ortbert_keras,0;
:disc_onnx_ort_run2_empty_cache_dev_ortbert_tf,0;
:disc_onnx_ort_run2_empty_cache_dev_ortclip,0;
:disc_onnx_ort_run2_empty_cache_dev_ortconformer,0;
:disc_onnx_ort_run2_empty_cache_dev_ortgpt2,0;
:disc_onnx_ort_run2_empty_cache_dev_ortgpt2_tf,0;
:disc_onnx_ort_run2_empty_cache_dev_ortgpt_neox,0;
:disc_onnx_ort_run2_empty_cache_dev_ortmmdit,0;
:disc_onnx_ort_run2_empty_cache_dev_ortsam2,0;
:disc_onnx_ort_run2_empty_cache_dev_ortswin,0;
:disc_onnx_ort_run2_empty_cache_dev_ortt5,0;
:disc_onnx_ort_run2_empty_cache_dev_orttnlr,0;
:disc_onnx_ort_run2_empty_cache_dev_ortunet,0;
:disc_onnx_ort_run2_empty_cache_dev_ortvae,0;
:disc_onnx_ort_run2_empty_cache_dev_ortvit,0;
:disc_onnx_ort_run2_empty_cache_dnan,0;
:disc_onnx_ort_run2_empty_cache_dnan_ortbart,0;
:disc_onnx_ort_run2_empty_cache_dnan_ortbert,0;
:disc_onnx_ort_run2_empty_cache_dnan_ortbert_keras,0;
:disc_onnx_ort_run2_empty_cache_dnan_ortbert_tf,0;
:disc_onnx_ort_run2_empty_cache_dnan_ortclip,0;
:disc_onnx_ort_run2_empty_cache_dnan_ortconformer,0;
:disc_onnx_ort_run2_empty_cache_dnan_ortgpt2,0;
:disc_onnx_ort_run2_empty_cache_dnan_ortgpt2_tf,0;
:disc_onnx_ort_run2_empty_cache_dnan_ortgpt_neox,0;
:disc_onnx_ort_run2_empty_cache_dnan_ortmmdit,0;
:disc_onnx_ort_run2_empty_cache_dnan_ortsam2,0;
:disc_onnx_ort_run2_empty_cache_dnan_ortswin,0;
:disc_onnx_ort_run2_empty_cache_dnan_ortt5,0;
:disc_onnx_ort_run2_empty_cache_dnan_orttnlr,0;
:disc_onnx_ort_run2_empty_cache_dnan_ortunet,0;
:disc_onnx_ort_run2_empty_cache_dnan_ortvae,0;
:disc_onnx_ort_run2_empty_cache_dnan_ortvit,0;
:disc_onnx_ort_run2_empty_cache_n,193152.0;
:disc_onnx_ort_run2_empty_cache_n_ortbart,193152.0;
:disc_onnx_ort_run2_empty_cache_n_ortbert,193152.0;
:disc_onnx_ort_run2_empty_cache_n_ortbert_keras,193152.0;
:disc_onnx_ort_run2_empty_cache_n_ortbert_tf,193152.0;
:disc_onnx_ort_run2_empty_cache_n_ortclip,193152.0;
:disc_onnx_ort_run2_empty_cache_n_ortconformer,193152.0;
:disc_onnx_ort_run2_empty_cache_n_ortgpt2,193152.0;
:disc_onnx_ort_run2_empty_cache_n_ortgpt2_tf,193152.0;
:disc_onnx_ort_run2_empty_cache_n_ortgpt_neox,193152.0;
:disc_onnx_ort_run2_empty_cache_n_ortmmdit,193152.0;
:disc_onnx_ort_run2_empty_cache_n_ortsam2,193152.0;
:disc_onnx_ort_run2_empty_cache_n_ortswin,193152.0;
:disc_onnx_ort_run2_empty_cache_n_ortt5,193152.0;
:disc_onnx_ort_run2_empty_cache_n_orttnlr,193152.0;
:disc_onnx_ort_run2_empty_cache_n_ortunet,193152.0;
:disc_onnx_ort_run2_empty_cache_n_ortvae,193152.0;
:disc_onnx_ort_run2_empty_cache_n_ortvit,193152.0;
:disc_onnx_ort_run2_empty_cache_rel,0.00028247341543503955;
:disc_onnx_ort_run2_empty_cache_rel_ortbart,0.0002758755967644076;
:disc_onnx_ort_run2_empty_cache_rel_ortbert,0.0002758755967644076;
:disc_onnx_ort_run2_empty_cache_rel_ortbert_keras,0.0002758755967644076;
:disc_onnx_ort_run2_empty_cache_rel_ortbert_tf,0.0002758755967644076;
:disc_onnx_ort_run2_empty_cache_rel_ortclip,0.0002758755967644076;
:disc_onnx_ort_run2_empty_cache_rel_ortconformer,0.0002758755967644076;
:disc_onnx_ort_run2_empty_cache_rel_ortgpt2,0.0002758755967644076;
:disc_onnx_ort_run2_empty_cache_rel_ortgpt2_tf,0.0002758755967644076;
:disc_onnx_ort_run2_empty_cache_rel_ortgpt_neox,0.0002758755967644076;
:disc_onnx_ort_run2_empty_cache_rel_ortmmdit,0.0002758755967644076;
:disc_onnx_ort_run2_empty_cache_rel_ortsam2,0.00028247341543503955;
:disc_onnx_ort_run2_empty_cache_rel_ortswin,0.0002758755967644076;
:disc_onnx_ort_run2_empty_cache_rel_ortt5,0.0002758755967644076;
:disc_onnx_ort_run2_empty_cache_rel_orttnlr,0.0002758755967644076;
:disc_onnx_ort_run2_empty_cache_rel_ortunet,0.00028247341543503955;
:disc_onnx_ort_run2_empty_cache_rel_ortvae,0.00028247341543503955;
:disc_onnx_ort_run2_empty_cache_rel_ortvit,0.0002758755967644076;
:disc_onnx_ort_run2_empty_cache_sum,0.01621216703074424;
:disc_onnx_ort_run2_empty_cache_sum_ortbart,0.01956161782959498;
:disc_onnx_ort_run2_empty_cache_sum_ortbert,0.01956161782959498;
:disc_onnx_ort_run2_empty_cache_sum_ortbert_keras,0.01956161782959498;
:disc_onnx_ort_run2_empty_cache_sum_ortbert_tf,0.01956161782959498;
:disc_onnx_ort_run2_empty_cache_sum_ortclip,0.01956161782959498;
:disc_onnx_ort_run2_empty_cache_sum_ortconformer,0.01956161782959498;
:disc_onnx_ort_run2_empty_cache_sum_ortgpt2,0.01956161782959498;
:disc_onnx_ort_run2_empty_cache_sum_ortgpt2_tf,0.01956161782959498;
:disc_onnx_ort_run2_empty_cache_sum_ortgpt_neox,0.01956161782959498;
:disc_onnx_ort_run2_empty_cache_sum_ortmmdit,0.01956161782959498;
:disc_onnx_ort_run2_empty_cache_sum_ortsam2,0.01621216703074424;
:disc_onnx_ort_run2_empty_cache_sum_ortswin,0.01956161782959498;
:disc_onnx_ort_run2_empty_cache_sum_ortt5,0.01956161782959498;
:disc_onnx_ort_run2_empty_cache_sum_orttnlr,0.01956161782959498;
:disc_onnx_ort_run2_empty_cache_sum_ortunet,0.01621216703074424;
:disc_onnx_ort_run2_empty_cache_sum_ortvae,0.01621216703074424;
:disc_onnx_ort_run2_empty_cache_sum_ortvit,0.01956161782959498;
:disc_onnx_ort_run_abs,7.748603820800781e-07;
:disc_onnx_ort_run_abs_ortbart,8.344650268554688e-07;
:disc_onnx_ort_run_abs_ortbert,8.344650268554688e-07;
:disc_onnx_ort_run_abs_ortbert_keras,8.344650268554688e-07;
:disc_onnx_ort_run_abs_ortbert_tf,8.344650268554688e-07;
:disc_onnx_ort_run_abs_ortclip,8.344650268554688e-07;
:disc_onnx_ort_run_abs_ortconformer,8.344650268554688e-07;
:disc_onnx_ort_run_abs_ortgpt2,8.344650268554688e-07;
:disc_onnx_ort_run_abs_ortgpt2_tf,8.344650268554688e-07;
:disc_onnx_ort_run_abs_ortgpt_neox,8.344650268554688e-07;
:disc_onnx_ort_run_abs_ortmmdit,8.344650268554688e-07;
:disc_onnx_ort_run_abs_ortsam2,7.748603820800781e-07;
:disc_onnx_ort_run_abs_ortswin,8.344650268554688e-07;
:disc_onnx_ort_run_abs_ortt5,8.344650268554688e-07;
:disc_onnx_ort_run_abs_orttnlr,8.344650268554688e-07;
:disc_onnx_ort_run_abs_ortunet,7.748603820800781e-07;
:disc_onnx_ort_run_abs_ortvae,7.748603820800781e-07;
:disc_onnx_ort_run_abs_ortvit,8.344650268554688e-07;
:disc_onnx_ort_run_dev,0;
:disc_onnx_ort_run_dev_ortbart,0;
:disc_onnx_ort_run_dev_ortbert,0;
:disc_onnx_ort_run_dev_ortbert_keras,0;
:disc_onnx_ort_run_dev_ortbert_tf,0;
:disc_onnx_ort_run_dev_ortclip,0;
:disc_onnx_ort_run_dev_ortconformer,0;
:disc_onnx_ort_run_dev_ortgpt2,0;
:disc_onnx_ort_run_dev_ortgpt2_tf,0;
:disc_onnx_ort_run_dev_ortgpt_neox,0;
:disc_onnx_ort_run_dev_ortmmdit,0;
:disc_onnx_ort_run_dev_ortsam2,0;
:disc_onnx_ort_run_dev_ortswin,0;
:disc_onnx_ort_run_dev_ortt5,0;
:disc_onnx_ort_run_dev_orttnlr,0;
:disc_onnx_ort_run_dev_ortunet,0;
:disc_onnx_ort_run_dev_ortvae,0;
:disc_onnx_ort_run_dev_ortvit,0;
:disc_onnx_ort_run_dnan,0;
:disc_onnx_ort_run_dnan_ortbart,0;
:disc_onnx_ort_run_dnan_ortbert,0;
:disc_onnx_ort_run_dnan_ortbert_keras,0;
:disc_onnx_ort_run_dnan_ortbert_tf,0;
:disc_onnx_ort_run_dnan_ortclip,0;
:disc_onnx_ort_run_dnan_ortconformer,0;
:disc_onnx_ort_run_dnan_ortgpt2,0;
:disc_onnx_ort_run_dnan_ortgpt2_tf,0;
:disc_onnx_ort_run_dnan_ortgpt_neox,0;
:disc_onnx_ort_run_dnan_ortmmdit,0;
:disc_onnx_ort_run_dnan_ortsam2,0;
:disc_onnx_ort_run_dnan_ortswin,0;
:disc_onnx_ort_run_dnan_ortt5,0;
:disc_onnx_ort_run_dnan_orttnlr,0;
:disc_onnx_ort_run_dnan_ortunet,0;
:disc_onnx_ort_run_dnan_ortvae,0;
:disc_onnx_ort_run_dnan_ortvit,0;
:disc_onnx_ort_run_n,204672.0;
:disc_onnx_ort_run_n_ortbart,204672.0;
:disc_onnx_ort_run_n_ortbert,204672.0;
:disc_onnx_ort_run_n_ortbert_keras,204672.0;
:disc_onnx_ort_run_n_ortbert_tf,204672.0;
:disc_onnx_ort_run_n_ortclip,204672.0;
:disc_onnx_ort_run_n_ortconformer,204672.0;
:disc_onnx_ort_run_n_ortgpt2,204672.0;
:disc_onnx_ort_run_n_ortgpt2_tf,204672.0;
:disc_onnx_ort_run_n_ortgpt_neox,204672.0;
:disc_onnx_ort_run_n_ortmmdit,204672.0;
:disc_onnx_ort_run_n_ortsam2,204672.0;
:disc_onnx_ort_run_n_ortswin,204672.0;
:disc_onnx_ort_run_n_ortt5,204672.0;
:disc_onnx_ort_run_n_orttnlr,204672.0;
:disc_onnx_ort_run_n_ortunet,204672.0;
:disc_onnx_ort_run_n_ortvae,204672.0;
:disc_onnx_ort_run_n_ortvit,204672.0;
:disc_onnx_ort_run_rel,0.00044309172106863606;
:disc_onnx_ort_run_rel_ortbart,0.00038373230338287646;
:disc_onnx_ort_run_rel_ortbert,0.00038373230338287646;
:disc_onnx_ort_run_rel_ortbert_keras,0.00038373230338287646;
:disc_onnx_ort_run_rel_ortbert_tf,0.00038373230338287646;
:disc_onnx_ort_run_rel_ortclip,0.00038373230338287646;
:disc_onnx_ort_run_rel_ortconformer,0.00038373230338287646;
:disc_onnx_ort_run_rel_ortgpt2,0.00038373230338287646;
:disc_onnx_ort_run_rel_ortgpt2_tf,0.00038373230338287646;
:disc_onnx_ort_run_rel_ortgpt_neox,0.00038373230338287646;
:disc_onnx_ort_run_rel_ortmmdit,0.00038373230338287646;
:disc_onnx_ort_run_rel_ortsam2,0.00044309172106863606;
:disc_onnx_ort_run_rel_ortswin,0.00038373230338287646;
:disc_onnx_ort_run_rel_ortt5,0.00038373230338287646;
:disc_onnx_ort_run_rel_orttnlr,0.00038373230338287646;
:disc_onnx_ort_run_rel_ortunet,0.00044309172106863606;
:disc_onnx_ort_run_rel_ortvae,0.00044309172106863606;
:disc_onnx_ort_run_rel_ortvit,0.00038373230338287646;
:disc_onnx_ort_run_sum,0.02031988672524676;
:disc_onnx_ort_run_sum_ortbart,0.022044641082175076;
:disc_onnx_ort_run_sum_ortbert,0.022044641082175076;
:disc_onnx_ort_run_sum_ortbert_keras,0.022044641082175076;
:disc_onnx_ort_run_sum_ortbert_tf,0.022044641082175076;
:disc_onnx_ort_run_sum_ortclip,0.022044641082175076;
:disc_onnx_ort_run_sum_ortconformer,0.022044641082175076;
:disc_onnx_ort_run_sum_ortgpt2,0.022044641082175076;
:disc_onnx_ort_run_sum_ortgpt2_tf,0.022044641082175076;
:disc_onnx_ort_run_sum_ortgpt_neox,0.022044641082175076;
:disc_onnx_ort_run_sum_ortmmdit,0.022044641082175076;
:disc_onnx_ort_run_sum_ortsam2,0.02031988672524676;
:disc_onnx_ort_run_sum_ortswin,0.022044641082175076;
:disc_onnx_ort_run_sum_ortt5,0.022044641082175076;
:disc_onnx_ort_run_sum_orttnlr,0.022044641082175076;
:disc_onnx_ort_run_sum_ortunet,0.02031988672524676;
:disc_onnx_ort_run_sum_ortvae,0.02031988672524676;
:disc_onnx_ort_run_sum_ortvit,0.022044641082175076;
:disc_patched_abs,0;
:disc_patched_dev,0;
:disc_patched_dnan,0;
:disc_patched_n,204672.0;
:disc_patched_rel,0;
:disc_patched_sum,0.0;
:dump_folder,dump_models/arnir0_Tiny-LLM/onnx-dynamo/ir/op18;
:dump_folder_name,arnir0_Tiny-LLM/onnx-dynamo/ir/op18;
:export_args,();
:export_dynamo,True;
:export_exporter,{};
:export_kwargs,dict(input_ids:T7s2x3,attention_mask:T7s2x33,position_ids:T7s2x3,past_key_values:DynamicCache(key_cache=#1[T1s2x1x30x96], value_cache=#1[T1s2x1x30x96]));
:export_opset,18;
:export_optimization,ir;
:model_class,LlamaForCausalLM;
:model_config,{'vocab_size':32000,'max_position_embeddings':1024,'hidden_size':192,'intermediate_size':1024,'num_hidden_layers':1,'num_attention_heads':2,'num_key_value_heads':1,'hidden_act':'silu','initializer_range':0.02,'rms_norm_eps':1e-05,'pretraining_tp':1,'use_cache':True,'attention_bias':False,'attention_dropout':0.0,'mlp_bias':False,'head_dim':96,'rope_parameters':{'rope_theta':10000.0,'rope_type':'default'},'return_dict':True,'output_hidden_states':False,'dtype':'float32','tie_word_embeddings':False,'chunk_size_feed_forward':0,'is_encoder_decoder':False,'is_decoder':False,'cross_attention_hidden_size':None,'add_cross_attention':False,'tie_encoder_decoder':False,'architectures':['LlamaForCausalLM'],'finetuning_task':None,'id2label':{0:'LABEL_0',1:'LABEL_1'},'label2id':{'LABEL_0':0,'LABEL_1':1},'task_specific_params':None,'problem_type':None,'tokenizer_class':None,'prefix':None,'bos_token_id':1,'pad_token_id':None,'eos_token_id':2,'sep_token_id':None,'decoder_start_token_id':None,'_name_or_path':'','transformers_version':'5.0.0.dev0','model_type':'llama','subfolder':None,'output_attentions':False};
:model_config_class,LlamaConfig;
:model_file,~/github/transformers/src/transformers/models/llama/modeling_llama.py;
:model_id,arnir0/Tiny-LLM;
:model_inputs,dict(input_ids:T7s2x3,attention_mask:T7s2x33,position_ids:T7s2x3,past_key_values:DynamicCache(key_cache=#1[T1s2x1x30x96], value_cache=#1[T1s2x1x30x96]));
:model_inputs_options,;
:model_module,transformers.models.llama.modeling_llama;
:model_nweights,12988992;
:model_shapes,dict(input_ids:{0:DYN(batch),1:DYN(seq_length)},attention_mask:{0:DYN(batch),1:DYN(cache+seq)},position_ids:{0:DYN(batch),1:DYN(seq_length)},past_key_values:#2[{0:DYN(batch),2:DYN(cache_length)},{0:DYN(batch),2:DYN(cache_length)}]);
:model_size,51955968;
:model_subfolder,;
:model_task,text-generation;
:n_node_Add,11;
:n_node_And,2;
:n_node_Cast,2;
:n_node_Concat,16;
:n_node_Cos,1;
:n_node_Expand,6;
:n_node_Gather,1;
:n_node_GatherND,1;
:n_node_IsNaN,1;
:n_node_LessOrEqual,1;
:n_node_MatMul,11;
:n_node_Max,2;
:n_node_Mul,14;
:n_node_Neg,2;
:n_node_Pow,3;
:n_node_Range,3;
:n_node_Reciprocal,3;
:n_node_ReduceMean,3;
:n_node_Reshape,11;
:n_node_Shape,7;
:n_node_Sigmoid,1;
:n_node_Sin,1;
:n_node_Slice,8;
:n_node_Softmax,1;
:n_node_Sqrt,3;
:n_node_Squeeze,5;
:n_node_Transpose,6;
:n_node_Unsqueeze,13;
:n_node_Where,2;
:n_node_functions,0;
:n_node_initializer_1,16;
:n_node_initializer_7,14;
:n_node_initializer_9,1;
:n_node_nodes,141;
:n_node_nodes_nocst,141;
:onnx_filename,dump_models/arnir0_Tiny-LLM/onnx-dynamo/ir/op18/arnir0_Tiny-LLM-onnx-dynamo-ir-op18.onnx;
:onnx_filename_ortbart,dump_models/arnir0_Tiny-LLM/onnx-dynamo/ir/op18/arnir0_Tiny-LLM-onnx-dynamo-ir-op18.ort.bart.onnx;
:onnx_filename_ortbert,dump_models/arnir0_Tiny-LLM/onnx-dynamo/ir/op18/arnir0_Tiny-LLM-onnx-dynamo-ir-op18.ort.bert.onnx;
:onnx_filename_ortbert_keras,dump_models/arnir0_Tiny-LLM/onnx-dynamo/ir/op18/arnir0_Tiny-LLM-onnx-dynamo-ir-op18.ort.bert_keras.onnx;
:onnx_filename_ortbert_tf,dump_models/arnir0_Tiny-LLM/onnx-dynamo/ir/op18/arnir0_Tiny-LLM-onnx-dynamo-ir-op18.ort.bert_tf.onnx;
:onnx_filename_ortclip,dump_models/arnir0_Tiny-LLM/onnx-dynamo/ir/op18/arnir0_Tiny-LLM-onnx-dynamo-ir-op18.ort.clip.onnx;
:onnx_filename_ortconformer,dump_models/arnir0_Tiny-LLM/onnx-dynamo/ir/op18/arnir0_Tiny-LLM-onnx-dynamo-ir-op18.ort.conformer.onnx;
:onnx_filename_ortgpt2,dump_models/arnir0_Tiny-LLM/onnx-dynamo/ir/op18/arnir0_Tiny-LLM-onnx-dynamo-ir-op18.ort.gpt2.onnx;
:onnx_filename_ortgpt2_tf,dump_models/arnir0_Tiny-LLM/onnx-dynamo/ir/op18/arnir0_Tiny-LLM-onnx-dynamo-ir-op18.ort.gpt2_tf.onnx;
:onnx_filename_ortgpt_neox,dump_models/arnir0_Tiny-LLM/onnx-dynamo/ir/op18/arnir0_Tiny-LLM-onnx-dynamo-ir-op18.ort.gpt_neox.onnx;
:onnx_filename_ortmmdit,dump_models/arnir0_Tiny-LLM/onnx-dynamo/ir/op18/arnir0_Tiny-LLM-onnx-dynamo-ir-op18.ort.mmdit.onnx;
:onnx_filename_ortsam2,dump_models/arnir0_Tiny-LLM/onnx-dynamo/ir/op18/arnir0_Tiny-LLM-onnx-dynamo-ir-op18.ort.sam2.onnx;
:onnx_filename_ortswin,dump_models/arnir0_Tiny-LLM/onnx-dynamo/ir/op18/arnir0_Tiny-LLM-onnx-dynamo-ir-op18.ort.swin.onnx;
:onnx_filename_ortt5,dump_models/arnir0_Tiny-LLM/onnx-dynamo/ir/op18/arnir0_Tiny-LLM-onnx-dynamo-ir-op18.ort.t5.onnx;
:onnx_filename_orttnlr,dump_models/arnir0_Tiny-LLM/onnx-dynamo/ir/op18/arnir0_Tiny-LLM-onnx-dynamo-ir-op18.ort.tnlr.onnx;
:onnx_filename_ortunet,dump_models/arnir0_Tiny-LLM/onnx-dynamo/ir/op18/arnir0_Tiny-LLM-onnx-dynamo-ir-op18.ort.unet.onnx;
:onnx_filename_ortvae,dump_models/arnir0_Tiny-LLM/onnx-dynamo/ir/op18/arnir0_Tiny-LLM-onnx-dynamo-ir-op18.ort.vae.onnx;
:onnx_filename_ortvit,dump_models/arnir0_Tiny-LLM/onnx-dynamo/ir/op18/arnir0_Tiny-LLM-onnx-dynamo-ir-op18.ort.vit.onnx;
:onnx_ort_inputs,dict(input_ids:A7s2x3,attention_mask:A7s2x33,position_ids:A7s2x3,past_key_values_key_0:A1s2x1x30x96,past_key_values_value_0:A1s2x1x30x96);
:onnx_ort_inputs22,dict(input_ids:A7s3x4,attention_mask:A7s3x35,position_ids:A7s3x4,past_key_values_key_0:A1s3x1x31x96,past_key_values_value_0:A1s3x1x31x96);
:onnx_ort_inputs22_ortbart,dict(input_ids:A7s3x4,attention_mask:A7s3x35,position_ids:A7s3x4,past_key_values_key_0:A1s3x1x31x96,past_key_values_value_0:A1s3x1x31x96);
:onnx_ort_inputs22_ortbert,dict(input_ids:A7s3x4,attention_mask:A7s3x35,position_ids:A7s3x4,past_key_values_key_0:A1s3x1x31x96,past_key_values_value_0:A1s3x1x31x96);
:onnx_ort_inputs22_ortbert_keras,dict(input_ids:A7s3x4,attention_mask:A7s3x35,position_ids:A7s3x4,past_key_values_key_0:A1s3x1x31x96,past_key_values_value_0:A1s3x1x31x96);
:onnx_ort_inputs22_ortbert_tf,dict(input_ids:A7s3x4,attention_mask:A7s3x35,position_ids:A7s3x4,past_key_values_key_0:A1s3x1x31x96,past_key_values_value_0:A1s3x1x31x96);
:onnx_ort_inputs22_ortclip,dict(input_ids:A7s3x4,attention_mask:A7s3x35,position_ids:A7s3x4,past_key_values_key_0:A1s3x1x31x96,past_key_values_value_0:A1s3x1x31x96);
:onnx_ort_inputs22_ortconformer,dict(input_ids:A7s3x4,attention_mask:A7s3x35,position_ids:A7s3x4,past_key_values_key_0:A1s3x1x31x96,past_key_values_value_0:A1s3x1x31x96);
:onnx_ort_inputs22_ortgpt2,dict(input_ids:A7s3x4,attention_mask:A7s3x35,position_ids:A7s3x4,past_key_values_key_0:A1s3x1x31x96,past_key_values_value_0:A1s3x1x31x96);
:onnx_ort_inputs22_ortgpt2_tf,dict(input_ids:A7s3x4,attention_mask:A7s3x35,position_ids:A7s3x4,past_key_values_key_0:A1s3x1x31x96,past_key_values_value_0:A1s3x1x31x96);
:onnx_ort_inputs22_ortgpt_neox,dict(input_ids:A7s3x4,attention_mask:A7s3x35,position_ids:A7s3x4,past_key_values_key_0:A1s3x1x31x96,past_key_values_value_0:A1s3x1x31x96);
:onnx_ort_inputs22_ortmmdit,dict(input_ids:A7s3x4,attention_mask:A7s3x35,position_ids:A7s3x4,past_key_values_key_0:A1s3x1x31x96,past_key_values_value_0:A1s3x1x31x96);
:onnx_ort_inputs22_ortsam2,dict(input_ids:A7s3x4,attention_mask:A7s3x35,position_ids:A7s3x4,past_key_values_key_0:A1s3x1x31x96,past_key_values_value_0:A1s3x1x31x96);
:onnx_ort_inputs22_ortswin,dict(input_ids:A7s3x4,attention_mask:A7s3x35,position_ids:A7s3x4,past_key_values_key_0:A1s3x1x31x96,past_key_values_value_0:A1s3x1x31x96);
:onnx_ort_inputs22_ortt5,dict(input_ids:A7s3x4,attention_mask:A7s3x35,position_ids:A7s3x4,past_key_values_key_0:A1s3x1x31x96,past_key_values_value_0:A1s3x1x31x96);
:onnx_ort_inputs22_orttnlr,dict(input_ids:A7s3x4,attention_mask:A7s3x35,position_ids:A7s3x4,past_key_values_key_0:A1s3x1x31x96,past_key_values_value_0:A1s3x1x31x96);
:onnx_ort_inputs22_ortunet,dict(input_ids:A7s3x4,attention_mask:A7s3x35,position_ids:A7s3x4,past_key_values_key_0:A1s3x1x31x96,past_key_values_value_0:A1s3x1x31x96);
:onnx_ort_inputs22_ortvae,dict(input_ids:A7s3x4,attention_mask:A7s3x35,position_ids:A7s3x4,past_key_values_key_0:A1s3x1x31x96,past_key_values_value_0:A1s3x1x31x96);
:onnx_ort_inputs22_ortvit,dict(input_ids:A7s3x4,attention_mask:A7s3x35,position_ids:A7s3x4,past_key_values_key_0:A1s3x1x31x96,past_key_values_value_0:A1s3x1x31x96);
:onnx_ort_inputs2_batch1,dict(input_ids:A7s1x3,attention_mask:A7s1x33,position_ids:A7s1x3,past_key_values_key_0:A1s1x1x30x96,past_key_values_value_0:A1s1x1x30x96);
:onnx_ort_inputs2_batch1_ortbart,dict(input_ids:A7s1x3,attention_mask:A7s1x33,position_ids:A7s1x3,past_key_values_key_0:A1s1x1x30x96,past_key_values_value_0:A1s1x1x30x96);
:onnx_ort_inputs2_batch1_ortbert,dict(input_ids:A7s1x3,attention_mask:A7s1x33,position_ids:A7s1x3,past_key_values_key_0:A1s1x1x30x96,past_key_values_value_0:A1s1x1x30x96);
:onnx_ort_inputs2_batch1_ortbert_keras,dict(input_ids:A7s1x3,attention_mask:A7s1x33,position_ids:A7s1x3,past_key_values_key_0:A1s1x1x30x96,past_key_values_value_0:A1s1x1x30x96);
:onnx_ort_inputs2_batch1_ortbert_tf,dict(input_ids:A7s1x3,attention_mask:A7s1x33,position_ids:A7s1x3,past_key_values_key_0:A1s1x1x30x96,past_key_values_value_0:A1s1x1x30x96);
:onnx_ort_inputs2_batch1_ortclip,dict(input_ids:A7s1x3,attention_mask:A7s1x33,position_ids:A7s1x3,past_key_values_key_0:A1s1x1x30x96,past_key_values_value_0:A1s1x1x30x96);
:onnx_ort_inputs2_batch1_ortconformer,dict(input_ids:A7s1x3,attention_mask:A7s1x33,position_ids:A7s1x3,past_key_values_key_0:A1s1x1x30x96,past_key_values_value_0:A1s1x1x30x96);
:onnx_ort_inputs2_batch1_ortgpt2,dict(input_ids:A7s1x3,attention_mask:A7s1x33,position_ids:A7s1x3,past_key_values_key_0:A1s1x1x30x96,past_key_values_value_0:A1s1x1x30x96);
:onnx_ort_inputs2_batch1_ortgpt2_tf,dict(input_ids:A7s1x3,attention_mask:A7s1x33,position_ids:A7s1x3,past_key_values_key_0:A1s1x1x30x96,past_key_values_value_0:A1s1x1x30x96);
:onnx_ort_inputs2_batch1_ortgpt_neox,dict(input_ids:A7s1x3,attention_mask:A7s1x33,position_ids:A7s1x3,past_key_values_key_0:A1s1x1x30x96,past_key_values_value_0:A1s1x1x30x96);
:onnx_ort_inputs2_batch1_ortmmdit,dict(input_ids:A7s1x3,attention_mask:A7s1x33,position_ids:A7s1x3,past_key_values_key_0:A1s1x1x30x96,past_key_values_value_0:A1s1x1x30x96);
:onnx_ort_inputs2_batch1_ortsam2,dict(input_ids:A7s1x3,attention_mask:A7s1x33,position_ids:A7s1x3,past_key_values_key_0:A1s1x1x30x96,past_key_values_value_0:A1s1x1x30x96);
:onnx_ort_inputs2_batch1_ortswin,dict(input_ids:A7s1x3,attention_mask:A7s1x33,position_ids:A7s1x3,past_key_values_key_0:A1s1x1x30x96,past_key_values_value_0:A1s1x1x30x96);
:onnx_ort_inputs2_batch1_ortt5,dict(input_ids:A7s1x3,attention_mask:A7s1x33,position_ids:A7s1x3,past_key_values_key_0:A1s1x1x30x96,past_key_values_value_0:A1s1x1x30x96);
:onnx_ort_inputs2_batch1_orttnlr,dict(input_ids:A7s1x3,attention_mask:A7s1x33,position_ids:A7s1x3,past_key_values_key_0:A1s1x1x30x96,past_key_values_value_0:A1s1x1x30x96);
:onnx_ort_inputs2_batch1_ortunet,dict(input_ids:A7s1x3,attention_mask:A7s1x33,position_ids:A7s1x3,past_key_values_key_0:A1s1x1x30x96,past_key_values_value_0:A1s1x1x30x96);
:onnx_ort_inputs2_batch1_ortvae,dict(input_ids:A7s1x3,attention_mask:A7s1x33,position_ids:A7s1x3,past_key_values_key_0:A1s1x1x30x96,past_key_values_value_0:A1s1x1x30x96);
:onnx_ort_inputs2_batch1_ortvit,dict(input_ids:A7s1x3,attention_mask:A7s1x33,position_ids:A7s1x3,past_key_values_key_0:A1s1x1x30x96,past_key_values_value_0:A1s1x1x30x96);
:onnx_ort_inputs2_empty_cache,dict(input_ids:A7s2x3,attention_mask:A7s2x3,position_ids:A7s2x3,past_key_values_key_0:A1s2x1x0x96,past_key_values_value_0:A1s2x1x0x96);
:onnx_ort_inputs2_empty_cache_ortbart,dict(input_ids:A7s2x3,attention_mask:A7s2x3,position_ids:A7s2x3,past_key_values_key_0:A1s2x1x0x96,past_key_values_value_0:A1s2x1x0x96);
:onnx_ort_inputs2_empty_cache_ortbert,dict(input_ids:A7s2x3,attention_mask:A7s2x3,position_ids:A7s2x3,past_key_values_key_0:A1s2x1x0x96,past_key_values_value_0:A1s2x1x0x96);
:onnx_ort_inputs2_empty_cache_ortbert_keras,dict(input_ids:A7s2x3,attention_mask:A7s2x3,position_ids:A7s2x3,past_key_values_key_0:A1s2x1x0x96,past_key_values_value_0:A1s2x1x0x96);
:onnx_ort_inputs2_empty_cache_ortbert_tf,dict(input_ids:A7s2x3,attention_mask:A7s2x3,position_ids:A7s2x3,past_key_values_key_0:A1s2x1x0x96,past_key_values_value_0:A1s2x1x0x96);
:onnx_ort_inputs2_empty_cache_ortclip,dict(input_ids:A7s2x3,attention_mask:A7s2x3,position_ids:A7s2x3,past_key_values_key_0:A1s2x1x0x96,past_key_values_value_0:A1s2x1x0x96);
:onnx_ort_inputs2_empty_cache_ortconformer,dict(input_ids:A7s2x3,attention_mask:A7s2x3,position_ids:A7s2x3,past_key_values_key_0:A1s2x1x0x96,past_key_values_value_0:A1s2x1x0x96);
:onnx_ort_inputs2_empty_cache_ortgpt2,dict(input_ids:A7s2x3,attention_mask:A7s2x3,position_ids:A7s2x3,past_key_values_key_0:A1s2x1x0x96,past_key_values_value_0:A1s2x1x0x96);
:onnx_ort_inputs2_empty_cache_ortgpt2_tf,dict(input_ids:A7s2x3,attention_mask:A7s2x3,position_ids:A7s2x3,past_key_values_key_0:A1s2x1x0x96,past_key_values_value_0:A1s2x1x0x96);
:onnx_ort_inputs2_empty_cache_ortgpt_neox,dict(input_ids:A7s2x3,attention_mask:A7s2x3,position_ids:A7s2x3,past_key_values_key_0:A1s2x1x0x96,past_key_values_value_0:A1s2x1x0x96);
:onnx_ort_inputs2_empty_cache_ortmmdit,dict(input_ids:A7s2x3,attention_mask:A7s2x3,position_ids:A7s2x3,past_key_values_key_0:A1s2x1x0x96,past_key_values_value_0:A1s2x1x0x96);
:onnx_ort_inputs2_empty_cache_ortsam2,dict(input_ids:A7s2x3,attention_mask:A7s2x3,position_ids:A7s2x3,past_key_values_key_0:A1s2x1x0x96,past_key_values_value_0:A1s2x1x0x96);
:onnx_ort_inputs2_empty_cache_ortswin,dict(input_ids:A7s2x3,attention_mask:A7s2x3,position_ids:A7s2x3,past_key_values_key_0:A1s2x1x0x96,past_key_values_value_0:A1s2x1x0x96);
:onnx_ort_inputs2_empty_cache_ortt5,dict(input_ids:A7s2x3,attention_mask:A7s2x3,position_ids:A7s2x3,past_key_values_key_0:A1s2x1x0x96,past_key_values_value_0:A1s2x1x0x96);
:onnx_ort_inputs2_empty_cache_orttnlr,dict(input_ids:A7s2x3,attention_mask:A7s2x3,position_ids:A7s2x3,past_key_values_key_0:A1s2x1x0x96,past_key_values_value_0:A1s2x1x0x96);
:onnx_ort_inputs2_empty_cache_ortunet,dict(input_ids:A7s2x3,attention_mask:A7s2x3,position_ids:A7s2x3,past_key_values_key_0:A1s2x1x0x96,past_key_values_value_0:A1s2x1x0x96);
:onnx_ort_inputs2_empty_cache_ortvae,dict(input_ids:A7s2x3,attention_mask:A7s2x3,position_ids:A7s2x3,past_key_values_key_0:A1s2x1x0x96,past_key_values_value_0:A1s2x1x0x96);
:onnx_ort_inputs2_empty_cache_ortvit,dict(input_ids:A7s2x3,attention_mask:A7s2x3,position_ids:A7s2x3,past_key_values_key_0:A1s2x1x0x96,past_key_values_value_0:A1s2x1x0x96);
:onnx_ort_inputs_ortbart,dict(input_ids:A7s2x3,attention_mask:A7s2x33,position_ids:A7s2x3,past_key_values_key_0:A1s2x1x30x96,past_key_values_value_0:A1s2x1x30x96);
:onnx_ort_inputs_ortbert,dict(input_ids:A7s2x3,attention_mask:A7s2x33,position_ids:A7s2x3,past_key_values_key_0:A1s2x1x30x96,past_key_values_value_0:A1s2x1x30x96);
:onnx_ort_inputs_ortbert_keras,dict(input_ids:A7s2x3,attention_mask:A7s2x33,position_ids:A7s2x3,past_key_values_key_0:A1s2x1x30x96,past_key_values_value_0:A1s2x1x30x96);
:onnx_ort_inputs_ortbert_tf,dict(input_ids:A7s2x3,attention_mask:A7s2x33,position_ids:A7s2x3,past_key_values_key_0:A1s2x1x30x96,past_key_values_value_0:A1s2x1x30x96);
:onnx_ort_inputs_ortclip,dict(input_ids:A7s2x3,attention_mask:A7s2x33,position_ids:A7s2x3,past_key_values_key_0:A1s2x1x30x96,past_key_values_value_0:A1s2x1x30x96);
:onnx_ort_inputs_ortconformer,dict(input_ids:A7s2x3,attention_mask:A7s2x33,position_ids:A7s2x3,past_key_values_key_0:A1s2x1x30x96,past_key_values_value_0:A1s2x1x30x96);
:onnx_ort_inputs_ortgpt2,dict(input_ids:A7s2x3,attention_mask:A7s2x33,position_ids:A7s2x3,past_key_values_key_0:A1s2x1x30x96,past_key_values_value_0:A1s2x1x30x96);
:onnx_ort_inputs_ortgpt2_tf,dict(input_ids:A7s2x3,attention_mask:A7s2x33,position_ids:A7s2x3,past_key_values_key_0:A1s2x1x30x96,past_key_values_value_0:A1s2x1x30x96);
:onnx_ort_inputs_ortgpt_neox,dict(input_ids:A7s2x3,attention_mask:A7s2x33,position_ids:A7s2x3,past_key_values_key_0:A1s2x1x30x96,past_key_values_value_0:A1s2x1x30x96);
:onnx_ort_inputs_ortmmdit,dict(input_ids:A7s2x3,attention_mask:A7s2x33,position_ids:A7s2x3,past_key_values_key_0:A1s2x1x30x96,past_key_values_value_0:A1s2x1x30x96);
:onnx_ort_inputs_ortsam2,dict(input_ids:A7s2x3,attention_mask:A7s2x33,position_ids:A7s2x3,past_key_values_key_0:A1s2x1x30x96,past_key_values_value_0:A1s2x1x30x96);
:onnx_ort_inputs_ortswin,dict(input_ids:A7s2x3,attention_mask:A7s2x33,position_ids:A7s2x3,past_key_values_key_0:A1s2x1x30x96,past_key_values_value_0:A1s2x1x30x96);
:onnx_ort_inputs_ortt5,dict(input_ids:A7s2x3,attention_mask:A7s2x33,position_ids:A7s2x3,past_key_values_key_0:A1s2x1x30x96,past_key_values_value_0:A1s2x1x30x96);
:onnx_ort_inputs_orttnlr,dict(input_ids:A7s2x3,attention_mask:A7s2x33,position_ids:A7s2x3,past_key_values_key_0:A1s2x1x30x96,past_key_values_value_0:A1s2x1x30x96);
:onnx_ort_inputs_ortunet,dict(input_ids:A7s2x3,attention_mask:A7s2x33,position_ids:A7s2x3,past_key_values_key_0:A1s2x1x30x96,past_key_values_value_0:A1s2x1x30x96);
:onnx_ort_inputs_ortvae,dict(input_ids:A7s2x3,attention_mask:A7s2x33,position_ids:A7s2x3,past_key_values_key_0:A1s2x1x30x96,past_key_values_value_0:A1s2x1x30x96);
:onnx_ort_inputs_ortvit,dict(input_ids:A7s2x3,attention_mask:A7s2x33,position_ids:A7s2x3,past_key_values_key_0:A1s2x1x30x96,past_key_values_value_0:A1s2x1x30x96);
:onnx_size,210585;
:onnx_size_ortbart,180032;
:onnx_size_ortbert,180032;
:onnx_size_ortbert_keras,180095;
:onnx_size_ortbert_tf,180066;
:onnx_size_ortclip,180032;
:onnx_size_ortconformer,180084;
:onnx_size_ortgpt2,180032;
:onnx_size_ortgpt2_tf,180064;
:onnx_size_ortgpt_neox,180073;
:onnx_size_ortmmdit,180041;
:onnx_size_ortsam2,211360;
:onnx_size_ortswin,180032;
:onnx_size_ortt5,180013;
:onnx_size_orttnlr,180032;
:onnx_size_ortunet,211360;
:onnx_size_ortvae,211350;
:onnx_size_ortvit,180022;
:opt_ort_bart_delta_node,-18;
:opt_ort_bart_duration,0.08297411901003215;
:opt_ort_bart_duration_save,0.047770720993867144;
:opt_ort_bart_n_nodes1,141;
:opt_ort_bart_n_nodes2,123;
:opt_ort_bert_delta_node,-18;
:opt_ort_bert_duration,0.08401308600150514;
:opt_ort_bert_duration_save,0.056851212997571565;
:opt_ort_bert_keras_delta_node,-18;
:opt_ort_bert_keras_duration,0.10752764700737316;
:opt_ort_bert_keras_duration_save,0.04921507299877703;
:opt_ort_bert_keras_n_nodes1,141;
:opt_ort_bert_keras_n_nodes2,123;
:opt_ort_bert_n_nodes1,141;
:opt_ort_bert_n_nodes2,123;
:opt_ort_bert_tf_delta_node,-18;
:opt_ort_bert_tf_duration,0.04518127199844457;
:opt_ort_bert_tf_duration_save,0.07118142800754867;
:opt_ort_bert_tf_n_nodes1,141;
:opt_ort_bert_tf_n_nodes2,123;
:opt_ort_clip_delta_node,-18;
:opt_ort_clip_duration,0.10149331200227607;
:opt_ort_clip_duration_save,0.07260508400213439;
:opt_ort_clip_n_nodes1,141;
:opt_ort_clip_n_nodes2,123;
:opt_ort_conformer_delta_node,-18;
:opt_ort_conformer_duration,0.09832391400414053;
:opt_ort_conformer_duration_save,0.0609840230026748;
:opt_ort_conformer_n_nodes1,141;
:opt_ort_conformer_n_nodes2,123;
:opt_ort_gpt2_delta_node,-18;
:opt_ort_gpt2_duration,0.130085319004138;
:opt_ort_gpt2_duration_save,0.05191460200876463;
:opt_ort_gpt2_n_nodes1,141;
:opt_ort_gpt2_n_nodes2,123;
:opt_ort_gpt2_tf_delta_node,-18;
:opt_ort_gpt2_tf_duration,0.09538381401216611;
:opt_ort_gpt2_tf_duration_save,0.05597821800620295;
:opt_ort_gpt2_tf_n_nodes1,141;
:opt_ort_gpt2_tf_n_nodes2,123;
:opt_ort_gpt_neox_delta_node,-18;
:opt_ort_gpt_neox_duration,0.0873840189888142;
:opt_ort_gpt_neox_duration_save,0.05203647101006936;
:opt_ort_gpt_neox_n_nodes1,141;
:opt_ort_gpt_neox_n_nodes2,123;
:opt_ort_mmdit_delta_node,-18;
:opt_ort_mmdit_duration,0.09216256899526343;
:opt_ort_mmdit_duration_save,0.04880255399621092;
:opt_ort_mmdit_n_nodes1,141;
:opt_ort_mmdit_n_nodes2,123;
:opt_ort_phi_duration,0.00010872900020331144;
:opt_ort_sam2_delta_node,0;
:opt_ort_sam2_duration,0.06746330100577325;
:opt_ort_sam2_duration_save,0.047538498998619616;
:opt_ort_sam2_n_nodes1,141;
:opt_ort_sam2_n_nodes2,141;
:opt_ort_swin_delta_node,-18;
:opt_ort_swin_duration,0.08259028800239321;
:opt_ort_swin_duration_save,0.04213672700279858;
:opt_ort_swin_n_nodes1,141;
:opt_ort_swin_n_nodes2,123;
:opt_ort_t5_delta_node,-18;
:opt_ort_t5_duration,0.05394438600342255;
:opt_ort_t5_duration_save,0.05140690399275627;
:opt_ort_t5_n_nodes1,141;
:opt_ort_t5_n_nodes2,123;
:opt_ort_tnlr_delta_node,-18;
:opt_ort_tnlr_duration,0.056767304995446466;
:opt_ort_tnlr_duration_save,0.05638380200252868;
:opt_ort_tnlr_n_nodes1,141;
:opt_ort_tnlr_n_nodes2,123;
:opt_ort_unet_delta_node,0;
:opt_ort_unet_duration,0.049544366003829055;
:opt_ort_unet_duration_save,0.06670014100382105;
:opt_ort_unet_n_nodes1,141;
:opt_ort_unet_n_nodes2,141;
:opt_ort_vae_delta_node,0;
:opt_ort_vae_duration,0.03521541800000705;
:opt_ort_vae_duration_save,0.06050746901019011;
:opt_ort_vae_n_nodes1,141;
:opt_ort_vae_n_nodes2,141;
:opt_ort_vit_delta_node,-18;
:opt_ort_vit_duration,0.03267149400198832;
:opt_ort_vit_duration_save,0.05234583199489862;
:opt_ort_vit_n_nodes1,141;
:opt_ort_vit_n_nodes2,123;
:run_expected,CausalLMOutputWithPast(logits:T1s2x3x32000,past_key_values:DynamicCache(key_cache=#1[T1s2x1x33x96], value_cache=#1[T1s2x1x33x96]));
:run_expected22,CausalLMOutputWithPast(logits:T1s3x4x32000,past_key_values:DynamicCache(key_cache=#1[T1s3x1x35x96], value_cache=#1[T1s3x1x35x96]));
:run_expected2_batch1,CausalLMOutputWithPast(logits:T1s1x3x32000,past_key_values:DynamicCache(key_cache=#1[T1s1x1x33x96], value_cache=#1[T1s1x1x33x96]));
:run_expected2_empty_cache,CausalLMOutputWithPast(logits:T1s2x3x32000,past_key_values:DynamicCache(key_cache=#1[T1s2x1x3x96], value_cache=#1[T1s2x1x3x96]));
:run_expected2_prompt,CausalLMOutputWithPast(logits:T1s1x11x32000,past_key_values:DynamicCache(key_cache=#1[T1s1x1x11x96], value_cache=#1[T1s1x1x11x96]));
:run_feeds_inputs,dict(input_ids:A7s2x3,attention_mask:A7s2x33,position_ids:A7s2x3,past_key_values_key_0:A1s2x1x30x96,past_key_values_value_0:A1s2x1x30x96);
:run_feeds_inputs2,dict(input_ids:A7s3x4,attention_mask:A7s3x35,position_ids:A7s3x4,past_key_values_key_0:A1s3x1x31x96,past_key_values_value_0:A1s3x1x31x96);
:run_feeds_inputs_batch1,dict(input_ids:A7s1x3,attention_mask:A7s1x33,position_ids:A7s1x3,past_key_values_key_0:A1s1x1x30x96,past_key_values_value_0:A1s1x1x30x96);
:run_feeds_inputs_empty_cache,dict(input_ids:A7s2x3,attention_mask:A7s2x3,position_ids:A7s2x3,past_key_values_key_0:A1s2x1x0x96,past_key_values_value_0:A1s2x1x0x96);
:run_output_inputs,#3[A1s2x3x32000,A1s2x1x33x96,A1s2x1x33x96];
:run_output_inputs2,#3[A1s3x4x32000,A1s3x1x35x96,A1s3x1x35x96];
:run_output_inputs_batch1,#3[A1s1x3x32000,A1s1x1x33x96,A1s1x1x33x96];
:run_output_inputs_empty_cache,#3[A1s2x3x32000,A1s2x1x3x96,A1s2x1x3x96];
:second_input_keys,inputs_prompt,inputs2,inputs_empty_cache,inputs_batch1;
:time_create_onnx_ort,0.07741460800752975;
:time_create_onnx_ort_ortbart,0.020932808009092696;
:time_create_onnx_ort_ortbert,0.030709683007444255;
:time_create_onnx_ort_ortbert_keras,0.027471473993500695;
:time_create_onnx_ort_ortbert_tf,0.026594423994538374;
:time_create_onnx_ort_ortclip,0.026577240001643077;
:time_create_onnx_ort_ortconformer,0.030501294997520745;
:time_create_onnx_ort_ortgpt2,0.027767417006543837;
:time_create_onnx_ort_ortgpt2_tf,0.026658016999135725;
:time_create_onnx_ort_ortgpt_neox,0.02321686700452119;
:time_create_onnx_ort_ortmmdit,0.02426723101234529;
:time_create_onnx_ort_ortsam2,0.027043735011829995;
:time_create_onnx_ort_ortswin,0.027480111006298102;
:time_create_onnx_ort_ortt5,0.023572851991048083;
:time_create_onnx_ort_orttnlr,0.02451149000262376;
:time_create_onnx_ort_ortunet,0.026049893000163138;
:time_create_onnx_ort_ortvae,0.03880558500532061;
:time_create_onnx_ort_ortvit,0.02470322500448674;
:time_create_torch_model,0.6635595109983115;
:time_export_onnx,5.040008892989135;
:time_export_onnx_opt_ir,0.045477776002371684;
:time_onnx_save,0.20543576800264418;
:time_ortfusion_ortbart,0.20438803400611505;
:time_ortfusion_ortbert,0.20850337999581825;
:time_ortfusion_ortbert_keras,0.23246258700964972;
:time_ortfusion_ortbert_tf,0.13830022100592032;
:time_ortfusion_ortclip,0.20571301701420452;
:time_ortfusion_ortconformer,0.22301872800744604;
:time_ortfusion_ortgpt2,0.25617580900143366;
:time_ortfusion_ortgpt2_tf,0.21968804800417274;
:time_ortfusion_ortgpt_neox,0.227634588998626;
:time_ortfusion_ortmmdit,0.20451642200350761;
:time_ortfusion_ortphi,0.1024268359906273;
:time_ortfusion_ortsam2,0.17527699800848495;
:time_ortfusion_ortswin,0.16543381300289184;
:time_ortfusion_ortt5,0.15775669300637674;
:time_ortfusion_orttnlr,0.14180395398580004;
:time_ortfusion_ortunet,0.13654210600361694;
:time_ortfusion_ortvae,0.11423224800091702;
:time_ortfusion_ortvit,0.10131322499364614;
:time_preprocess_model_id,1.3310054782778025e-06;
:time_run,0.035858607006957754;
:time_run22,0.010248131002299488;
:time_run2_batch1,0.008123010004055686;
:time_run2_empty_cache,0.009585687002982013;
:time_run2_prompt,0.007149825993110426;
:time_run_onnx_ort,0.009677398003987037;
:time_run_onnx_ort22,0.0021916800033068284;
:time_run_onnx_ort22_ortbart,0.0017463749973103404;
:time_run_onnx_ort22_ortbert,0.0019641160033643246;
:time_run_onnx_ort22_ortbert_keras,0.005258456993033178;
:time_run_onnx_ort22_ortbert_tf,0.002029839000897482;
:time_run_onnx_ort22_ortclip,0.002028086004429497;
:time_run_onnx_ort22_ortconformer,0.00435827299952507;
:time_run_onnx_ort22_ortgpt2,0.0028779879939975217;
:time_run_onnx_ort22_ortgpt2_tf,0.0020335840090410784;
:time_run_onnx_ort22_ortgpt_neox,0.0018745749985100701;
:time_run_onnx_ort22_ortmmdit,0.001641594004468061;
:time_run_onnx_ort22_ortsam2,0.001894708999316208;
:time_run_onnx_ort22_ortswin,0.001976540996110998;
:time_run_onnx_ort22_ortt5,0.0030711730069015175;
:time_run_onnx_ort22_orttnlr,0.0019813159888144583;
:time_run_onnx_ort22_ortunet,0.00246389998937957;
:time_run_onnx_ort22_ortvae,0.006217503992957063;
:time_run_onnx_ort22_ortvit,0.003523936989950016;
:time_run_onnx_ort2_batch1,0.00126797599659767;
:time_run_onnx_ort2_batch1_ortbart,0.001054506006767042;
:time_run_onnx_ort2_batch1_ortbert,0.0012853380030719563;
:time_run_onnx_ort2_batch1_ortbert_keras,0.0012882300070486963;
:time_run_onnx_ort2_batch1_ortbert_tf,0.0012633190053747967;
:time_run_onnx_ort2_batch1_ortclip,0.0028724610019708052;
:time_run_onnx_ort2_batch1_ortconformer,0.0011740210029529408;
:time_run_onnx_ort2_batch1_ortgpt2,0.001359064001007937;
:time_run_onnx_ort2_batch1_ortgpt2_tf,0.0023170749918790534;
:time_run_onnx_ort2_batch1_ortgpt_neox,0.0011724209907697514;
:time_run_onnx_ort2_batch1_ortmmdit,0.0012524219928309321;
:time_run_onnx_ort2_batch1_ortsam2,0.002098712997394614;
:time_run_onnx_ort2_batch1_ortswin,0.002436121998471208;
:time_run_onnx_ort2_batch1_ortt5,0.0011752529972000048;
:time_run_onnx_ort2_batch1_orttnlr,0.001843570003984496;
:time_run_onnx_ort2_batch1_ortunet,0.001195201009977609;
:time_run_onnx_ort2_batch1_ortvae,0.0019111150031676516;
:time_run_onnx_ort2_batch1_ortvit,0.0012628300028154626;
:time_run_onnx_ort2_empty_cache,0.0015880630089668557;
:time_run_onnx_ort2_empty_cache_ortbart,0.001397468993673101;
:time_run_onnx_ort2_empty_cache_ortbert,0.0014710670075146481;
:time_run_onnx_ort2_empty_cache_ortbert_keras,0.005367598991142586;
:time_run_onnx_ort2_empty_cache_ortbert_tf,0.002064050000626594;
:time_run_onnx_ort2_empty_cache_ortclip,0.00453544499760028;
:time_run_onnx_ort2_empty_cache_ortconformer,0.001356937995296903;
:time_run_onnx_ort2_empty_cache_ortgpt2,0.0015854680095799267;
:time_run_onnx_ort2_empty_cache_ortgpt2_tf,0.005080394010292366;
:time_run_onnx_ort2_empty_cache_ortgpt_neox,0.0014738650061190128;
:time_run_onnx_ort2_empty_cache_ortmmdit,0.0014118510007392615;
:time_run_onnx_ort2_empty_cache_ortsam2,0.003949125006329268;
:time_run_onnx_ort2_empty_cache_ortswin,0.0013364439946599305;
:time_run_onnx_ort2_empty_cache_ortt5,0.0016086779942270368;
:time_run_onnx_ort2_empty_cache_orttnlr,0.001975916005903855;
:time_run_onnx_ort2_empty_cache_ortunet,0.0019854299898725003;
:time_run_onnx_ort2_empty_cache_ortvae,0.005959759990219027;
:time_run_onnx_ort2_empty_cache_ortvit,0.0017919550009537488;
:time_run_onnx_ort_ortbart,0.0014371069992193952;
:time_run_onnx_ort_ortbert,0.0016436549922218546;
:time_run_onnx_ort_ortbert_keras,0.006459237993112765;
:time_run_onnx_ort_ortbert_tf,0.001558246003696695;
:time_run_onnx_ort_ortclip,0.006575727995368652;
:time_run_onnx_ort_ortconformer,0.0021537959983106703;
:time_run_onnx_ort_ortgpt2,0.001700600012554787;
:time_run_onnx_ort_ortgpt2_tf,0.0016756219993112609;
:time_run_onnx_ort_ortgpt_neox,0.0017503739945823327;
:time_run_onnx_ort_ortmmdit,0.002993850997881964;
:time_run_onnx_ort_ortsam2,0.001677417996688746;
:time_run_onnx_ort_ortswin,0.0035737880098167807;
:time_run_onnx_ort_ortt5,0.0017029750015353784;
:time_run_onnx_ort_orttnlr,0.0020534849900286645;
:time_run_onnx_ort_ortunet,0.0014894949999870732;
:time_run_onnx_ort_ortvae,0.0018428639887133613;
:time_run_onnx_ort_ortvit,0.002759365990641527;
:time_run_patched,0.03014032798819244;
:time_torch_export_export,1.9438328840042232;
:time_torch_export_export_n,1;
:time_total,12.28180674900068;
:time_total_exporter,6.429626975004794;
:time_total_validation_onnx,0.13685088300553616;
:time_total_validation_torch,0.07885151599475648;
:version_date,2025-12-05T18:53:55;
:version_device,;
:version_do_run,True;
:version_drop_input,None;
:version_drop_inputs,[];
:version_dtype,;
:version_dump_folder,dump_models;
:version_exporter,onnx-dynamo;
:version_exporter_options,None;
:version_input_options,None;
:version_inputs2,1;
:version_model_id,arnir0/Tiny-LLM;
:version_model_options,None;
:version_numpy,2.3.5;
:version_onnx,1.21.0;
:version_onnx_diagnostic,0.8.4;
:version_onnx_ir,0.1.13;
:version_onnxruntime,1.24.0;
:version_onnxscript,?;
:version_opset,18;
:version_optimization,ir;
:version_ortbart_hidden_size,192;
:version_ortbart_num_attention_heads,2;
:version_ortbert_hidden_size,192;
:version_ortbert_keras_hidden_size,192;
:version_ortbert_keras_num_attention_heads,2;
:version_ortbert_num_attention_heads,2;
:version_ortbert_tf_hidden_size,192;
:version_ortbert_tf_num_attention_heads,2;
:version_ortclip_hidden_size,192;
:version_ortclip_num_attention_heads,2;
:version_ortconformer_hidden_size,192;
:version_ortconformer_num_attention_heads,2;
:version_ortfusiontype,ALL;
:version_ortgpt2_hidden_size,192;
:version_ortgpt2_num_attention_heads,2;
:version_ortgpt2_tf_hidden_size,192;
:version_ortgpt2_tf_num_attention_heads,2;
:version_ortgpt_neox_hidden_size,192;
:version_ortgpt_neox_num_attention_heads,2;
:version_ortmmdit_hidden_size,192;
:version_ortmmdit_num_attention_heads,2;
:version_ortphi_hidden_size,192;
:version_ortphi_num_attention_heads,2;
:version_ortsam2_hidden_size,192;
:version_ortsam2_num_attention_heads,2;
:version_ortswin_hidden_size,192;
:version_ortswin_num_attention_heads,2;
:version_ortt5_hidden_size,192;
:version_ortt5_num_attention_heads,2;
:version_orttnlr_hidden_size,192;
:version_orttnlr_num_attention_heads,2;
:version_ortunet_hidden_size,192;
:version_ortunet_num_attention_heads,2;
:version_ortvae_hidden_size,192;
:version_ortvae_num_attention_heads,2;
:version_ortvit_hidden_size,192;
:version_ortvit_num_attention_heads,2;
:version_patch,{'patch': True};
:version_patch_kwargs,{'patch':True,'patch_transformers':True,'patch_diffusers':True};
:version_quiet,False;
:version_rewrite,True;
:version_runtime,onnxruntime;
:version_same_as_pretrained,False;
:version_scipy,1.16.2;
:version_stop_if_static,0;
:version_submodule,None;
:version_torch,2.10.0.dev20251123+cu130;
:version_transformers,5.0.0.dev0;
:version_use_pretrained,False;
[runpythonerror]
/usr/lib/python3.12/copyreg.py:99: FutureWarning: `isinstance(treespec, LeafSpec)` is deprecated, use `isinstance(treespec, TreeSpec) and treespec.is_leaf()` instead.
return cls.__new__(cls, *args)
~/vv/this312/lib/python3.12/site-packages/torch/onnx/_internal/exporter/_onnx_program.py:460: UserWarning: # The axis name: batch will not be used, since it shares the same shape constraints with another axis: batch.
rename_mapping = _dynamic_shapes.create_rename_mapping(
symbolic shape inference disabled or failed.
symbolic shape inference disabled or failed.
symbolic shape inference disabled or failed.
symbolic shape inference disabled or failed.
Model producer not matched: Expected "keras2onnx", Got "pytorch".Please specify correct --model_type parameter.
symbolic shape inference disabled or failed.
symbolic shape inference disabled or failed.
Model producer not matched: Expected "tf2onnx", Got "pytorch".Please specify correct --model_type parameter.
symbolic shape inference disabled or failed.
symbolic shape inference disabled or failed.
symbolic shape inference disabled or failed.
symbolic shape inference disabled or failed.
symbolic shape inference disabled or failed.
symbolic shape inference disabled or failed.
symbolic shape inference disabled or failed.
symbolic shape inference disabled or failed.
Model producer not matched: Expected "tf2onnx", Got "pytorch".Please specify correct --model_type parameter.
symbolic shape inference disabled or failed.
symbolic shape inference disabled or failed.
symbolic shape inference disabled or failed.
symbolic shape inference disabled or failed.
fusion: 0%| | 0/5 [00:00<?, ?it/s]
The optimized model requires LayerNormalization with broadcast support. Please use onnxruntime-gpu>=1.21 for inference.
fusion: 20%|██ | 1/5 [00:00<00:00, 12.38it/s]
fusion: 100%|██████████| 5/5 [00:00<00:00, 57.57it/s]
sam2 fusion: 0%| | 0/12 [00:00<?, ?it/s]
symbolic shape inference disabled or failed.
sam2 fusion: 50%|█████ | 6/12 [00:00<00:00, 100.65it/s]
sam2 fusion: 100%|██████████| 12/12 [00:00<00:00, 192.24it/s]
symbolic shape inference disabled or failed.
symbolic shape inference disabled or failed.
symbolic shape inference disabled or failed.
symbolic shape inference disabled or failed.
symbolic shape inference disabled or failed.
symbolic shape inference disabled or failed.
fusion: 0%| | 0/18 [00:00<?, ?it/s]
symbolic shape inference disabled or failed.
fusion: 50%|█████ | 9/18 [00:00<00:00, 296.34it/s]
SkipGroupNorm fusion will be skipped since symbolic shape inference disabled or failed.
fusion: 67%|██████▋ | 12/18 [00:00<00:00, 371.08it/s]
fusion: 100%|██████████| 18/18 [00:00<00:00, 454.75it/s]
fusion: 0%| | 0/18 [00:00<?, ?it/s]
symbolic shape inference disabled or failed.
fusion: 50%|█████ | 9/18 [00:00<00:00, 389.81it/s]
SkipGroupNorm fusion will be skipped since symbolic shape inference disabled or failed.
fusion: 67%|██████▋ | 12/18 [00:00<00:00, 493.24it/s]
fusion: 100%|██████████| 18/18 [00:00<00:00, 622.01it/s]
symbolic shape inference disabled or failed.
symbolic shape inference disabled or failed.
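The summary printed above uses one `:key,value;` entry per line, which makes it easy to post-process. The following is a minimal sketch, not part of onnx_diagnostic: the helper name `parse_summary` is hypothetical, and it assumes every entry starts with `:`, ends with `;`, and that the key never contains a comma (values such as `second_input_keys` may).

```python
def parse_summary(text: str) -> dict[str, str]:
    """Parse ':key,value;' lines into a dictionary of strings."""
    stats = {}
    for line in text.splitlines():
        line = line.strip()
        if not (line.startswith(":") and line.endswith(";")):
            continue  # not a summary entry (for instance, stderr output)
        # Split on the first comma only: values may contain commas themselves.
        key, _, value = line[1:-1].partition(",")
        stats[key] = value
    return stats


# A few entries copied from the output above.
sample = """
:onnx_size,210585;
:opt_ort_bart_delta_node,-18;
:opt_ort_bart_n_nodes1,141;
:opt_ort_bart_n_nodes2,123;
:second_input_keys,inputs_prompt,inputs2,inputs_empty_cache,inputs_batch1;
:version_device,;
"""

stats = parse_summary(sample)
# In this output, delta_node matches n_nodes2 - n_nodes1.
delta = int(stats["opt_ort_bart_n_nodes2"]) - int(stats["opt_ort_bart_n_nodes1"])
print(delta, stats["second_input_keys"].split(","))
```

All values stay strings; converting them (`int`, `float`) is left to the caller because the summary mixes numbers, shapes and free text.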
SDPA or Eager implementation or Use a StaticCache¶
Add --mop cache_implementation=static --iop cls_cache=StaticCache to use a StaticCache instead of a DynamicCache (the default).
Add --mop attn_implementation=eager to explicitly select the eager implementation of attention.
python -m onnx_diagnostic validate \
-m google/gemma-2b \
--run \
-v 1 \
--export custom \
-o dump_test \
--dtype float16 \
--device cpu \
--patch \
--no-quiet \
--opt default \
--rewrite \
--mop attn_implementation=eager \
--mop cache_implementation=static \
--iop cls_cache=StaticCache
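When the same command is launched from a script with several option combinations, the argument list can be assembled programmatically. This is only a sketch: `build_validate_command` is a hypothetical helper, not a function of onnx_diagnostic, and it relies solely on flags shown in the command above.

```python
import shlex
import sys


def build_validate_command(model_id, mop=None, iop=None, extra=()):
    """Build the argument list for `python -m onnx_diagnostic validate`."""
    cmd = [sys.executable, "-m", "onnx_diagnostic", "validate", "-m", model_id]
    cmd.extend(extra)
    # One --mop / --iop flag per KEY=VALUE pair, as in the command above.
    for key, value in (mop or {}).items():
        cmd.extend(["--mop", f"{key}={value}"])
    for key, value in (iop or {}).items():
        cmd.extend(["--iop", f"{key}={value}"])
    return cmd


cmd = build_validate_command(
    "google/gemma-2b",
    mop={"attn_implementation": "eager", "cache_implementation": "static"},
    iop={"cls_cache": "StaticCache"},
    extra=["--run", "-v", "1", "--export", "custom", "-o", "dump_test"],
)
# Show the command without the interpreter path.
print(shlex.join(cmd[1:]))
# To actually run it: subprocess.run(cmd, check=True)
```

The call to `subprocess.run` is left as a comment because running it downloads and exports the model.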
Examples frequently used for testing¶
python -m onnx_diagnostic validate -m arnir0/Tiny-LLM --run -v 1 --device cuda --dtype float16 -o dump_models --patch --opt default+onnxruntime --export custom
About the exporter ‘custom’¶
It is used to investigate issues or scenarios. It is usually very strict
and fails whenever it runs into an unexpected situation.
It calls experimental_experiment.torch_interpreter.to_onnx().
Some useful environment variables can be set before running the command line:
DROPPATTERN=<pattern1,pattern2,...>: do not apply those patterns when optimizing a model
DUMPPATTERNS=<folder>: dumps all matched and applied nodes when a pattern is applied
PATTERN=<pattern1,pattern2,...>: increase verbosity for specific patterns to understand why one pattern was not applied
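These variables have to be present in the environment of the process that runs the command. The sketch below builds such an environment in Python. The helper `pattern_debug_env` and the pattern names `PatternA` and `PatternB` are hypothetical placeholders. Only the three variable names come from the text above.

```python
import os


def pattern_debug_env(drop=(), dump_folder=None, verbose=(), base=None):
    """Return a copy of the environment with the pattern variables set."""
    env = dict(os.environ if base is None else base)
    if drop:
        env["DROPPATTERN"] = ",".join(drop)  # patterns to skip
    if dump_folder:
        env["DUMPPATTERNS"] = dump_folder  # folder receiving matched nodes
    if verbose:
        env["PATTERN"] = ",".join(verbose)  # patterns to trace verbosely
    return env


# PatternA and PatternB are placeholders, not real pattern names.
env = pattern_debug_env(
    drop=["PatternA", "PatternB"],
    dump_folder="dump_patterns",
    verbose=["PatternA"],
    base={},
)
print(env)
# To use it: subprocess.run([...validate command...], env=env, check=True)
```

Passing `base=None` starts from the current environment, which is usually what is wanted when launching the command. `base={}` is used here only to keep the printed result short.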