onnx_diagnostic.helpers.torch_helper

onnx_diagnostic.helpers.torch_helper.dummy_llm(cls_name: str | None = None, dynamic_shapes: bool = False) → Tuple[Module, Tuple[Tensor, ...]] | Tuple[Module, Tuple[Tensor, ...], Any][source]

Creates a dummy LLM for test purposes.

Parameters:
  • cls_name – None to build the whole model, or the name of one of its pieces to build only that part

  • dynamic_shapes – if True, the dynamic shapes are returned as well

<<<

from onnx_diagnostic.helpers.torch_helper import dummy_llm

print(dummy_llm())

>>>

    (LLM(
      (embedding): Embedding(
        (embedding): Embedding(1024, 16)
        (pe): Embedding(1024, 16)
      )
      (decoder): DecoderLayer(
        (attention): MultiAttentionBlock(
          (attention): ModuleList(
            (0-1): 2 x AttentionBlock(
              (query): Linear(in_features=16, out_features=16, bias=False)
              (key): Linear(in_features=16, out_features=16, bias=False)
              (value): Linear(in_features=16, out_features=16, bias=False)
            )
          )
          (linear): Linear(in_features=32, out_features=16, bias=True)
        )
        (feed_forward): FeedForward(
          (linear_1): Linear(in_features=16, out_features=128, bias=True)
          (relu): ReLU()
          (linear_2): Linear(in_features=128, out_features=16, bias=True)
        )
        (norm_1): LayerNorm((16,), eps=1e-05, elementwise_affine=True)
        (norm_2): LayerNorm((16,), eps=1e-05, elementwise_affine=True)
      )
    ), (tensor([[399, 985,  96, 639, 533, 821, 974, 820, 781, 290, 497, 259, 713, 429,
             159, 630, 942, 143, 748,  76, 380, 398, 239, 402, 668, 982, 517, 436,
             666, 741]]),))
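
With dynamic_shapes=True, the signature above shows that a third element is returned. A minimal sketch (the exact structure of the returned shapes object depends on the implementation):

<<<

from onnx_diagnostic.helpers.torch_helper import dummy_llm

# dynamic_shapes=True adds a third element describing
# the dynamic dimensions of the inputs
model, inputs, ds = dummy_llm(dynamic_shapes=True)
print(ds)

>>>
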
onnx_diagnostic.helpers.torch_helper.fake_torchdynamo_exporting()[source]

Sets torch.compiler._is_exporting_flag to True to trigger pieces of code only enabled during export.
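
A minimal sketch, assuming the function is used as a context manager:

<<<

from onnx_diagnostic.helpers.torch_helper import (
    fake_torchdynamo_exporting,
    is_torchdynamo_exporting,
)

# inside the context, the code behaves as if an export were running
with fake_torchdynamo_exporting():
    assert is_torchdynamo_exporting()
assert not is_torchdynamo_exporting()

>>>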

onnx_diagnostic.helpers.torch_helper.is_stealing() → bool[source]

Returns True if a steal_forward() context is currently active, False otherwise.

onnx_diagnostic.helpers.torch_helper.is_torchdynamo_exporting() → bool[source]

Tells if torch is exporting a model. Relies on torch.compiler.is_exporting().

onnx_diagnostic.helpers.torch_helper.model_statistics(model: Module)[source]

Returns statistics on a model in a dictionary.
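
A short example (the exact keys of the dictionary depend on the implementation):

<<<

import torch
from onnx_diagnostic.helpers.torch_helper import model_statistics

model = torch.nn.Sequential(torch.nn.Linear(4, 8), torch.nn.ReLU())
print(model_statistics(model))

>>>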

onnx_diagnostic.helpers.torch_helper.onnx_dtype_to_torch_dtype(itype: int) → dtype[source]

Converts an onnx type into a torch dtype.

Parameters:
  • itype – onnx dtype

Returns:

torch dtype
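
For example, TensorProto.FLOAT (the integer 1 in the ONNX type enumeration) maps to torch.float32:

<<<

import torch
from onnx import TensorProto
from onnx_diagnostic.helpers.torch_helper import onnx_dtype_to_torch_dtype

assert onnx_dtype_to_torch_dtype(TensorProto.FLOAT) == torch.float32

>>>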

onnx_diagnostic.helpers.torch_helper.proto_from_tensor(arr: Tensor, name: str | None = None, verbose: int = 0) → TensorProto[source]

Converts a torch Tensor into a TensorProto.

Parameters:
  • arr – tensor

  • name – name to give the tensor in the proto

  • verbose – if positive, displays the type and shape

Returns:

a TensorProto
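
A short example:

<<<

import torch
from onnx_diagnostic.helpers.torch_helper import proto_from_tensor

proto = proto_from_tensor(torch.rand(2, 3), name="weight")
print(proto.name, proto.dims)  # expected: weight [2, 3]

>>>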

onnx_diagnostic.helpers.torch_helper.replace_string_by_dynamic(dynamic_shapes: Any) → Any[source]

Replaces strings with torch.export.Dim.DYNAMIC.
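
A minimal sketch, assuming strings may appear anywhere in a nested dynamic-shapes structure:

<<<

from onnx_diagnostic.helpers.torch_helper import replace_string_by_dynamic

# strings such as "batch" act as placeholders for dynamic dimensions
dynamic_shapes = {"input_ids": {0: "batch", 1: "seq_length"}}
print(replace_string_by_dynamic(dynamic_shapes))

>>>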

onnx_diagnostic.helpers.torch_helper.steal_append(name: str, obj: Any)[source]

Even outside a forward method, it is still possible to register a Python object containing tensors so that it is dumped after the model has executed.

steal_append("quantize", [t1, t2])

If the same code is executed multiple times, the name is extended with a number.
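
A minimal sketch combining it with steal_forward() (the dump file name is arbitrary):

<<<

import torch
from onnx_diagnostic.helpers.torch_helper import steal_append, steal_forward

model = torch.nn.Linear(4, 4)
x = torch.rand(2, 4)
with steal_forward(model, dump_file="dump_steal_append.onnx"):
    y = model(x)
    # registers extra tensors computed outside any forward method,
    # dumped along with the stolen inputs and outputs
    steal_append("quantize", [y.min(), y.max()])

>>>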

onnx_diagnostic.helpers.torch_helper.steal_forward(model: Module | Tuple[str, Module] | List[Module | Tuple[str, Module]], fprint: Callable = string_type, dump_file: str | None = None, submodules: bool = False, verbose: int = 0, storage_limit: int = 134217728, **kwargs)[source]

Applies the necessary modifications to steal a forward method and prints out inputs and outputs using onnx_diagnostic.helpers.string_type(). See the example Steal forward method to guess inputs and dynamic shapes (with Tiny-LLM).

Parameters:
  • model – a model or a list of models to monitor; every model can also be a tuple (name, model), in which case the name is used in the display

  • fprint – function used to print out (or dump); by default onnx_diagnostic.helpers.string_type()

  • kwargs – additional parameters sent to onnx_diagnostic.helpers.string_type() or any other function given as fprint

  • dump_file – dumps the stolen inputs and outputs into an onnx model; they can be restored with create_input_tensors_from_onnx_model

  • submodules – if True and model is a module, the list is extended with all the submodules it contains

  • verbose – verbosity level

  • storage_limit – objects bigger than this size (in bytes) are not stored

The following example shows how to steal and dump all the inputs/outputs of a module and its submodules, then restore them.

<<<

import torch
from onnx_diagnostic.helpers.torch_helper import steal_forward
from onnx_diagnostic.helpers.mini_onnx_builder import (
    create_input_tensors_from_onnx_model,
)


class SubModel(torch.nn.Module):
    def forward(self, x):
        return x * x


class Model(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.s1 = SubModel()
        self.s2 = SubModel()

    def forward(self, x, y):
        return self.s1(x) + self.s2(y)


inputs = torch.rand(2, 1), torch.rand(2, 1)
model = Model()
dump_file = "dump_steal_forward_submodules.onnx"
with steal_forward(model, submodules=True, dump_file=dump_file):
    model(*inputs)

# Let's restore the stolen data.
restored = create_input_tensors_from_onnx_model(dump_file)
for k, v in sorted(restored.items()):
    if isinstance(v, tuple):
        args, kwargs = v
        print("input", k, args, kwargs)
    else:
        print("output", k, v)

>>>

        +-Model-0 -- stolen forward for class Model -- iteration 0
          <- args=(T1s2x1,T1s2x1) --- kwargs={}
        +s1-SubModel-0 -- stolen forward for class SubModel -- iteration 0
          <- args=(T1s2x1,) --- kwargs={}
          -> T1s2x1
        -s1-SubModel-0.
        +s2-SubModel-0 -- stolen forward for class SubModel -- iteration 0
          <- args=(T1s2x1,) --- kwargs={}
          -> T1s2x1
        -s2-SubModel-0.
          -> T1s2x1
        --Model-0.
    input ('-Model-0', 0, 'I') (tensor([[0.1181],
            [0.1342]]), tensor([[0.4020],
            [0.6322]])) {}
    output ('-Model-0', 0, 'O') tensor([[0.1756],
            [0.4176]])
    input ('s1-SubModel-0', 0, 'I') (tensor([[0.1181],
            [0.1342]]),) {}
    output ('s1-SubModel-0', 0, 'O') tensor([[0.0139],
            [0.0180]])
    input ('s2-SubModel-0', 0, 'I') (tensor([[0.4020],
            [0.6322]]),) {}
    output ('s2-SubModel-0', 0, 'O') tensor([[0.1616],
            [0.3996]])

Function steal_append() can be used to dump more tensors. When inside the context, is_stealing() returns True, False otherwise.
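
A minimal sketch of is_stealing():

<<<

import torch
from onnx_diagnostic.helpers.torch_helper import is_stealing, steal_forward

model = torch.nn.Linear(4, 4)
assert not is_stealing()
with steal_forward(model):
    assert is_stealing()

>>>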

onnx_diagnostic.helpers.torch_helper.to_any(value: Any, to_value: dtype | device) → Any[source]

Applies torch.Tensor.to when applicable. Works recursively on nested structures.
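
A short example with a nested structure:

<<<

import torch
from onnx_diagnostic.helpers.torch_helper import to_any

data = {"x": torch.rand(2, 2), "y": [torch.rand(3)]}
converted = to_any(data, torch.float16)
print(converted["x"].dtype)  # expected: torch.float16

>>>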

onnx_diagnostic.helpers.torch_helper.to_numpy(tensor: Tensor)[source]

Converts a torch.Tensor to numpy.ndarray.

onnx_diagnostic.helpers.torch_helper.to_tensor(tensor: TensorProto, base_dir: str = '') → Tensor[source]

Converts a TensorProto to a torch Tensor.

Parameters:
  • tensor – a TensorProto object.

  • base_dir – if the tensor data is stored externally, base_dir helps to find the path to it

Returns:

the converted tensor
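
A round trip with proto_from_tensor():

<<<

import torch
from onnx_diagnostic.helpers.torch_helper import proto_from_tensor, to_tensor

t = torch.rand(2, 3)
restored = to_tensor(proto_from_tensor(t, name="t"))
assert torch.equal(t, restored)

>>>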

onnx_diagnostic.helpers.torch_helper.torch_deepcopy(value: Any) → Any[source]

Makes a deepcopy.
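
A short sketch, assuming tensors inside nested containers are copied as well:

<<<

import torch
from onnx_diagnostic.helpers.torch_helper import torch_deepcopy

value = {"past": [torch.rand(2, 2)]}
copy = torch_deepcopy(value)
# same content, distinct storage
assert torch.equal(copy["past"][0], value["past"][0])
assert copy["past"][0] is not value["past"][0]

>>>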

onnx_diagnostic.helpers.torch_helper.torch_dtype_to_onnx_dtype(to: dtype) → int[source]

Converts a torch dtype into an onnx element type.

Parameters:
  • to – torch dtype

Returns:

onnx type
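
This is the inverse of onnx_dtype_to_torch_dtype():

<<<

import torch
from onnx import TensorProto
from onnx_diagnostic.helpers.torch_helper import torch_dtype_to_onnx_dtype

assert torch_dtype_to_onnx_dtype(torch.float32) == TensorProto.FLOAT

>>>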

onnx_diagnostic.helpers.torch_helper.torch_tensor_size(value: Any) → Any[source]

Returns the number of bytes stored in the tensors found in value.
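
A short example: a 2x3 float32 tensor stores 2 * 3 * 4 = 24 bytes.

<<<

import torch
from onnx_diagnostic.helpers.torch_helper import torch_tensor_size

print(torch_tensor_size(torch.zeros(2, 3)))  # expected: 24

>>>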