onnx_diagnostic.export.api

class onnx_diagnostic.export.api.WrapperToExportMethodToOnnx(mod: Module, method_name: str = 'forward', input_names: Sequence[str] | None = None, target_opset: int | Dict[str, int] | None = None, verbose: int = 0, filename: str | None = None, output_names: List[str] | None = None, output_dynamic_shapes: Dict[str, Any] | Tuple[Any] | None = None, exporter: str = 'onnx-dynamo', exporter_kwargs: Dict[str, Any] | None = None, save_ep: str | None = None, optimize: bool = True, optimizer_for_ort: bool = True, use_control_flow_dispatcher: bool = False, onnx_plugs: List[EagerDirectReplacementWithOnnx] | None = None, inline: bool = True, convert_after_n_calls: int = 2, patch_kwargs: Dict[str, Any] | None = None, skip_kwargs_names: Set[str] | None = None, dynamic_shapes: Dict[str, Any] | Tuple[Any] | None = None, dynamic_batch_for: Sequence[int | str] | None = None, expand_batch_for: Sequence[int | str] | None = None)

Wraps an existing model in order to spy on its inputs. It is used by onnx_diagnostic.export.api.method_to_onnx(). See Export a LLM through method generate (with Tiny-LLM) for an example.

classmethod add_empty_cache_if_needed(inputs: List[Any]) → List[Any]

Adds an empty cache where one is missing, because onnxruntime needs an empty cache rather than a missing one. It only works if the inputs are defined as a dictionary.
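The idea can be illustrated with a minimal sketch. It is not the library's implementation: tensors are replaced by plain shape tuples, the helper names `empty_like` and `fill_missing_cache` are hypothetical, and an "empty" cache is assumed to be one whose past-length axis (axis 2, as in the `past_key_values` pattern listed on this page) has size 0.

```python
from typing import Any, Dict, List, Tuple

Shape = Tuple[int, ...]


def empty_like(shape: Shape, length_axis: int = 2) -> Shape:
    """Returns the same shape with the past-length axis set to 0 (assumption)."""
    return tuple(0 if axis == length_axis else dim for axis, dim in enumerate(shape))


def fill_missing_cache(
    inputs: List[Dict[str, Any]], cache_name: str = "past_key_values"
) -> List[Dict[str, Any]]:
    """Adds an empty cache to every set of inputs that has none.

    Hypothetical helper: every set of inputs is a dictionary and a cache is
    a list of shapes standing in for tensors.
    """
    with_cache = [kw for kw in inputs if kw.get(cache_name) is not None]
    if not with_cache:
        # No existing cache to imitate, nothing can be built.
        return inputs
    template = with_cache[0][cache_name]
    empty = [empty_like(shape) for shape in template]
    return [
        kw if kw.get(cache_name) is not None else {**kw, cache_name: list(empty)}
        for kw in inputs
    ]
```

The first call of an LLM usually has no cache while the following ones do, so the sketch borrows the layout of an existing cache to complete the first set of inputs.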

check_discrepancies(atol: float = 0.0001, rtol: float = 0.1, hist=(0.1, 0.01), verbose: int = 0) → List[Dict[str, str | int | float]]

Computes the discrepancies between the saved outputs and the outputs the saved ONNX model produces for the saved inputs.

Parameters:
  • atol – absolute tolerance, recommended values: 1e-4 for float, 1e-2 for float16

  • rtol – relative tolerance

  • hist – thresholds, the function counts the number of discrepancies above each threshold.

  • verbose – verbosity

Returns:

results, a list of dictionaries, ready to be consumed by a dataframe
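To make the role of atol, rtol and hist concrete, here is a minimal sketch working on flat lists of floats. It is an illustration only: the function name `discrepancy_row`, the keys of the returned dictionary and the acceptance rule `|expected - got| <= atol + rtol * |expected|` (the rule used by `numpy.allclose`) are assumptions, not the behaviour documented for check_discrepancies.

```python
from typing import Dict, Sequence, Union


def discrepancy_row(
    expected: Sequence[float],
    got: Sequence[float],
    atol: float = 1e-4,
    rtol: float = 0.1,
    hist: Sequence[float] = (0.1, 0.01),
) -> Dict[str, Union[str, int, float]]:
    """Summarizes the differences between two flat sequences of floats."""
    if len(expected) != len(got):
        raise ValueError(f"Length mismatch: {len(expected)} != {len(got)}")
    abs_err = [abs(e - g) for e, g in zip(expected, got)]
    # Assumed acceptance rule, same as numpy.allclose.
    ok = all(a <= atol + rtol * abs(e) for a, e in zip(abs_err, expected))
    row: Dict[str, Union[str, int, float]] = {
        "n": len(abs_err),
        "max_abs": max(abs_err, default=0.0),
        "status": "OK" if ok else "FAIL",
    }
    # One counter per threshold: number of elements whose error exceeds it.
    for threshold in hist:
        row[f">{threshold}"] = sum(1 for a in abs_err if a > threshold)
    return row
```

Each call returns one flat dictionary, so a list of such rows can be passed directly to a dataframe constructor.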

forward(*args, **kwargs)

Define the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

classmethod get_dynamic_shape_patterns() → Dict[str, Any]

Returns the known patterns for the dynamic shapes.

<<<

import pprint
from onnx_diagnostic.export.api import WrapperToExportMethodToOnnx

pprint.pprint(WrapperToExportMethodToOnnx.get_dynamic_shape_patterns())

>>>

    {'LLM.text': {'attention_mask': {0: 'batch', 1: 'totallength'},
                  'cache_position': {0: 'seqlength'},
                  'input_ids': {0: 'batch', 1: 'seqlength'},
                  'past_key_values': {0: 'batch', 2: 'pastlength'}}}

classmethod make_empty_cache_from_others(examples: List[Any]) → Any

Builds an empty cache based on an existing one.

classmethod rename_dynamic_shapes(ds: Dict[str, Any], verbose: int = 0) → Dict[str, Any]

Renames the dimensions of the dynamic shapes. Tries to rename any dynamic dimension before export. It is not very clever: it just tries to recognize a known configuration based on input names. Dimension names in dynamic shapes are renamed if ds has the same number of named arguments as one of the patterns returned by get_dynamic_shape_patterns.
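The renaming can be sketched as follows. This is a simplified illustration, not the library's code: the function `rename_dims` is hypothetical, dimensions are plain strings, and a pattern is considered a match only when ds has exactly the same input names, which is stricter than the rule described above. The pattern is copied from the output of get_dynamic_shape_patterns shown earlier.

```python
from typing import Any, Dict

PATTERNS: Dict[str, Dict[str, Dict[int, str]]] = {
    "LLM.text": {
        "attention_mask": {0: "batch", 1: "totallength"},
        "cache_position": {0: "seqlength"},
        "input_ids": {0: "batch", 1: "seqlength"},
        "past_key_values": {0: "batch", 2: "pastlength"},
    }
}


def _rename(value: Any, axis_names: Dict[int, str]) -> Any:
    """Renames axes in a possibly nested structure of {axis: name} dictionaries."""
    if isinstance(value, dict) and all(isinstance(k, int) for k in value):
        # Leaf: keep the current name when the pattern says nothing about the axis.
        return {axis: axis_names.get(axis, name) for axis, name in value.items()}
    if isinstance(value, (list, tuple)):
        return type(value)(_rename(v, axis_names) for v in value)
    return value


def rename_dims(
    ds: Dict[str, Any], patterns: Dict[str, Dict[str, Dict[int, str]]] = PATTERNS
) -> Dict[str, Any]:
    """Applies the first pattern whose input names match those of ds."""
    for pattern in patterns.values():
        if set(ds) == set(pattern):
            return {name: _rename(ds[name], pattern[name]) for name in ds}
    # Unknown configuration: the dynamic shapes are returned unchanged.
    return ds
```

A cache is usually a nested structure with one entry per tensor, which is why the sketch walks lists and tuples before renaming each leaf.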

onnx_diagnostic.export.api.get_main_dispatcher(use_control_flow_dispatcher: bool = False, onnx_plugs: List[EagerDirectReplacementWithOnnx] | None = None) → Any

Creates a custom dispatcher for the custom exporter.

onnx_diagnostic.export.api.method_to_onnx(mod: Module, method_name: str = 'forward', input_names: Sequence[str] | None = None, target_opset: int | Dict[str, int] | None = None, verbose: int = 0, filename: str | None = None, output_names: List[str] | None = None, output_dynamic_shapes: Dict[str, Any] | Tuple[Any] | None = None, exporter: str = 'onnx-dynamo', exporter_kwargs: Dict[str, Any] | None = None, save_ep: str | None = None, optimize: bool = True, optimizer_for_ort: bool = True, use_control_flow_dispatcher: bool = False, onnx_plugs: List[EagerDirectReplacementWithOnnx] | None = None, inline: bool = True, convert_after_n_calls: int = 2, patch_kwargs: Dict[str, Any] | None = None, skip_kwargs_names: Set[str] | None = None, dynamic_shapes: Dict[str, Any] | Tuple[Any] | None = None, dynamic_batch_for: Sequence[int | str] | None = None, expand_batch_for: Sequence[int | str] | None = None) → Callable

Exports one method of a module into ONNX. It returns a new method which the user must call at least twice, with different values for the dynamic dimensions, before the conversion into ONNX is triggered.

Parameters:
  • mod – module owning the method to export into ONNX

  • method_name – name of the method to export

  • input_names – input names for the onnx model (optional)

  • target_opset – opset to target, if not specified, each converter keeps its default value

  • verbose – verbosity level

  • filename – output filename, mandatory, the onnx model is saved on disk

  • output_names – to change the output of the onnx model

  • output_dynamic_shapes – to overwrite the dynamic shapes names

  • exporter – exporter to use (onnx-dynamo, modelbuilder, custom)

  • exporter_kwargs – additional parameters sent to the exporter

  • save_ep – saves the exported program

  • optimize – optimizes the model

  • optimizer_for_ort – optimizes the model for onnxruntime

  • use_control_flow_dispatcher – use the dispatcher created to support custom loops (see onnx_diagnostic.export.control_flow_onnx.loop_for_onnx())

  • onnx_plugs – plugs for the parts of the code that were modified to be replaced by their ONNX translation

  • inline – inline local functions

  • convert_after_n_calls – converts the model after this number of calls.

  • patch_kwargs – patch arguments

  • skip_kwargs_names – use default values for these parameters part of the signature of the method to export

  • dynamic_shapes – dynamic shapes to use if the guessed ones are not right

  • dynamic_batch_for – LLMs are usually called with a batch size equal to 1, but the export may benefit from having a dynamic batch size; this parameter forces the first dimension of the specified inputs to be dynamic

  • expand_batch_for – LLMs are usually called with a batch size equal to 1, but the export may benefit from having another value for the batch size; this parameter forces the specified inputs to be expanded to a batch size of 2 if the batch size is one

Returns:

the output of the selected exporter, usually a structure including an onnx model

See Export a LLM through method generate (with Tiny-LLM) for an example.
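The requirement to call the returned method at least twice with different dimensions can be illustrated with a small sketch. It does not reproduce the library's wrapper: the class `SpyThenConvert`, the helpers `shape_of` and `guess_dynamic_axes`, and the `convert` callback are hypothetical, nested lists stand in for tensors, and every call is assumed to use the same keyword arguments.

```python
from typing import Any, Callable, Dict, List, Tuple

Shape = Tuple[int, ...]


def shape_of(value: Any) -> Shape:
    """Shape of a nested list used as a stand-in for a tensor."""
    shape = []
    while isinstance(value, list):
        shape.append(len(value))
        value = value[0] if value else None
    return tuple(shape)


def guess_dynamic_axes(shapes: List[Shape]) -> Dict[int, str]:
    """An axis is considered dynamic if its size differs between calls."""
    rank = len(shapes[0])
    if any(len(s) != rank for s in shapes):
        raise ValueError(f"Ranks differ between calls: {shapes}")
    return {
        axis: f"dim_{axis}"
        for axis in range(rank)
        if len({s[axis] for s in shapes}) > 1
    }


class SpyThenConvert:
    """Records input shapes, then calls `convert` once after n calls."""

    def __init__(self, fn: Callable, convert: Callable, convert_after_n_calls: int = 2):
        self.fn = fn
        self.convert = convert
        self.n = convert_after_n_calls
        self.recorded: List[Dict[str, Shape]] = []
        self.done = False
        self.result: Any = None

    def __call__(self, **kwargs: Any) -> Any:
        if not self.done:
            self.recorded.append({k: shape_of(v) for k, v in kwargs.items()})
            if len(self.recorded) >= self.n:
                # Compare the recorded shapes input by input.
                dynamic = {
                    name: guess_dynamic_axes([r[name] for r in self.recorded])
                    for name in self.recorded[0]
                }
                self.result = self.convert(dynamic)
                self.done = True
        # The wrapped method always runs, the spying is transparent.
        return self.fn(**kwargs)
```

In this sketch an axis that keeps the same size in every recorded call, such as a batch size always equal to 1, is never detected as dynamic. That is the situation dynamic_batch_for and expand_batch_for are described as addressing.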

onnx_diagnostic.export.api.to_onnx(mod: Module | GraphModule, args: Sequence[Tensor] | None = None, kwargs: Dict[str, Tensor] | None = None, input_names: Sequence[str] | None = None, target_opset: int | Dict[str, int] | None = None, verbose: int = 0, dynamic_shapes: Dict[str, Any] | Tuple[Any] | None = None, filename: str | None = None, output_names: List[str] | None = None, output_dynamic_shapes: Dict[str, Any] | Tuple[Any] | None = None, exporter: str = 'onnx-dynamo', exporter_kwargs: Dict[str, Any] | None = None, save_ep: str | None = None, optimize: bool = True, optimizer_for_ort: bool = True, use_control_flow_dispatcher: bool = False, onnx_plugs: List[EagerDirectReplacementWithOnnx] | None = None, inline: bool = True) → Any

Exports one model into ONNX. Common API for exporters. By default, the models are optimized to use the most efficient kernels implemented in onnxruntime.

Parameters:
  • mod – torch model

  • args – unnamed arguments

  • kwargs – named arguments

  • input_names – input names for the onnx model (optional)

  • target_opset – opset to target, if not specified, each converter keeps its default value

  • verbose – verbosity level

  • dynamic_shapes – dynamic shapes, usually a nested structure including a dictionary for each tensor

  • filename – output filename

  • output_names – to change the output of the onnx model

  • output_dynamic_shapes – to overwrite the dynamic shapes names

  • exporter – exporter to use (onnx-dynamo, modelbuilder, custom)

  • exporter_kwargs – additional parameters sent to the exporter

  • save_ep – saves the exported program

  • optimize – optimizes the model

  • optimizer_for_ort – optimizes the model for onnxruntime

  • use_control_flow_dispatcher – use the dispatcher created to support custom loops (see onnx_diagnostic.export.control_flow_onnx.loop_for_onnx())

  • onnx_plugs – plugs for the parts of the code that were modified to be replaced by their ONNX translation

  • inline – inline local functions

Returns:

the output of the selected exporter, usually a structure including an onnx model

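A simple example, assuming model, inputs, ds, exporter and filename are already defined:

from onnx_diagnostic.export.api import to_onnx

to_onnx(
    model,
    kwargs=inputs,
    dynamic_shapes=ds,
    exporter=exporter,
    filename=filename,
)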

Some examples using control flows are available in onnx_diagnostic.export.control_flow_onnx.loop_for_onnx() and onnx_diagnostic.export.onnx_plug.EagerDirectReplacementWithOnnx.