Tutorial¶

This module was started to experiment around function torch.export.export() and see what kind of issues occur when leveraging that function to convert a torch.nn.Module into ONNX. The tutorial is a collection of examples or benchmark around that topic. Section Design explains how a converter works from torch model to the onnx graph. The official exporter is implemented in pytorch itself through the function torch.onnx.export(). Next sections show many examples, including how to deal with some possible issues.

torch.export.export: export to a Graph¶

All exporters rely on function torch.export.export() to convert a pytorch module into a torch.fx.Graph. Only then the conversion to ONNX starts. Most of the issues come from this first step and it is convenient to understand what it does. pytorch documentation already has many examples about it. Here are some corner cases.

Dynamic Shapes

Control Flow

Export a model with a loop (scan)

Custom Types as Inputs

Investigate, Export piece by piece

Guidelines to build a Benchmark

Check the exporter on a dummy from HuggingFace

strict = ?

The exporter relies on torch.export.export(). It exposes a parameter called strict: bool = True (true by default). The behaviour is different in some specific configuration. Page Summary goes through many kind of model and tells which one is supported and how it is converted.

decompositions

Function torch.export.export() produces an torch.export.ExportedProgram. This class has a method torch.export.ExportedProgram.run_decompositions() which converts the graph into another, usually longer but using a reduced set of functions or primitive. The converter to ONNX has less functions to support to convert this second graph.

Issues

See Issues with torch.export.export.

torch.onnx.export: export to ONNX¶

These examples relies on torch.onnx.export().

Simple Case

Linear Regression and export to ONNX

Dynamic Shapes

Dynamic shapes should be utilized to create a model capable of handling inputs with varying shapes while maintaining the same rank. Section torch.export.export: export to a Graph provides a couple of examples on how to define them, as their definition aligns with those used in torch.export.export().

Control Flow

torch.onnx.export and a model with a test

Custom Operators

Submodules

torch.onnx.export: Rename Dynamic Shapes

Models

torch.onnx.export and Phi-2

Optimization

It is recommended to optimize the obtained model by running method torch.onnx.ONNXProgram.optimize(). It removes many unncessary nodes (Identity, multiplication by 1) and other patterns. It tries to find patterns it knows how to optimize. See Pattern-based Rewrite Using Rules With onnxscript.

Issues

You can post issues in pytorch/issues and label it with module:onnx if you find an issue.

Frequent Exceptions or Errors with the Exporter¶

Unsupported functions or classes

If the converter to onnx fails, function torch_export_patches may help solving some of them. The ocumentation of this function gives the list of issues it can bypass.

from onnx_diagnostic.torch_export_patches import torch_export_patches

with torch_export_patches():
    # export to onnx with (model, inputs, ...)

If the input contains a cache class, you may need to patch the inputs.

from onnx_diagnostic.torch_export_patches import torch_export_patches

with torch_export_patches(patch_transformers=True) as modificator:
    inputs = modificator(inputs)
    # export to onnx with (model, inputs, ...)

This function is a work in progress as the exporter extends the list of supported models. A standaline copy of this function can be found at phi35.

torch._dynamo.exc.Unsupported: call_function BuiltinVariable(NotImplementedError) [ConstantVariable()] {}

This exception started to show up with transformers==4.38.2 but it does not seem related to it. Wrapping the code with the following fixes it.

with torch.no_grad():
    # ...

RuntimeError: Encountered autograd state manager op <built-in function _set_grad_enabled> trying to change global autograd state while exporting.

Wrapping the code around probably solves this issue.

with torch.no_grad():
    # ...

Play with onnx models and onnxruntime¶

onnxscript is one way to directly create model or function in ONNX. The onnxscript Tutorial explains how it works. Some other examples follow.

102: Examples with onnxscript

An exported model can be slow. It can be profiled on CUDA with the native profiling NVIDIA built. It can also be profiled with the tool implemented in onnxruntime. Next example shows that on CPU.

101: Profile an existing model with onnxruntime

Deeper into pytorch and onnx¶

101

102

102: Examples with onnxscript
l-plot-executorch-102
102: Convolution and Matrix Multiplication
102: Fuse kernels in a small Llama Model
102: Measure LLAMA speed

201

301

to_onnx: another export to investigate¶

to_onnx implements another exporter to ONNX. It does not support all the cases torch.onnx.export(). It fails rather trying different options to recover. It calls torch.export.export() but does not alter the graph (no rewriting, no decomposition) before converting this graph to onnx. It is used to investigate export issues raised by torch.export.export().

Simple Case

Export a linear regression

Dynamic Shapes

Control Flow

Custom Operators

Submodules

to_onnx and submodules from LLMs

Models

to_onnx and Phi-2

Optimization

101: Onnx Model Optimization based on Pattern Rewriting

Weird Errors and Dockers¶

Next sections mentions some weird errors and the way it was solved. Old work used to play with torch.compile() on a docker.