onnx_extended.tools.graph.onnx_graph_transformer

cast_constant

onnx_extended.tools.graph.cast_constant(graph: Graph, from_type: int = 1, to_type: int = 10, quiet: bool = False) → Graph | None

Converts all constants and initializers to the same type. It also modifies the input.

Parameters:
  • graph – Graph

  • from_type – type of the constants to convert

  • to_type – new type for the constants

  • quiet – catch exceptions and silently skip failing nodes

Returns:

Graph or None if not modified

Transformations are logged with the logger onnx-extended/transformer. The graph is modified in place. Enabling the logs gives a better idea of the progress.
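The logging and the default type codes described above can be sketched as follows. This is a minimal example, assuming the logger name is exactly onnx-extended/transformer as stated above. The helpers enable_transformer_logs and cast_float_to_float16 are hypothetical names introduced for illustration. The codes 1 and 10 are the values of TensorProto.FLOAT and TensorProto.FLOAT16 in onnx.

```python
import logging

# Element type codes from onnx.TensorProto (the defaults of cast_constant).
FLOAT = 1      # TensorProto.FLOAT
FLOAT16 = 10   # TensorProto.FLOAT16


def enable_transformer_logs(level=logging.INFO):
    """Makes the messages of the transformer logger visible."""
    # basicConfig installs a console handler if none is configured yet.
    logging.basicConfig(level=level)
    logger = logging.getLogger("onnx-extended/transformer")
    logger.setLevel(level)
    return logger


def cast_float_to_float16(graph, quiet=False):
    """Hypothetical helper: casts float32 constants and initializers to float16.

    `graph` is expected to be an onnx_extended Graph and is modified in place.
    Returns the graph, or None if nothing was modified.
    """
    # Imported lazily so that this sketch loads without onnx-extended installed.
    from onnx_extended.tools.graph import cast_constant

    return cast_constant(graph, from_type=FLOAT, to_type=FLOAT16, quiet=quiet)
```

Calling enable_transformer_logs() before cast_float_to_float16(graph) should print one message per transformation, provided the library emits them at the chosen level.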

QuantizeOptions

class onnx_extended.tools.graph.QuantizeOptions(value)

Quantization options.

  • NONE: no option

  • OPTIMIZE: assumes there are no NaN values and chooses less generic functions, such as the implementation of DynamicQuantizeLinear

quantize_float8

onnx_extended.tools.graph.quantize_float8(graph: Graph, elem_type: int = 17, output_type: int = 1, early_stop: int = -1, version: str = 'onnxruntime', quiet: bool = False, index_transpose: int = 2, domain_ops: Dict[str, str] | None = None, exceptions: List[Dict[str, str]] | None = None, quantize_options: QuantizeOptions = QuantizeOptions.NONE) → Graph | None

Transforms a graph to introduce quantized weights. This transformation requires opset 20. The graph is upgraded if its main opset is lower, but it is better to upgrade it before calling this function.

Parameters:
  • graph – Graph

  • elem_type – quantization type

  • output_type – output type

  • early_stop – -1 to go through all nodes or a value n > 0 to stop after n changes

  • version – ‘onnxruntime’ to use operators from onnx and onnxruntime, ‘onnx-extended’ to use experimental operators

  • quiet – catch exceptions and silently skip failing nodes

  • index_transpose – which input to transpose before calling gemm: 0 (none), 1 (first), 2 (second), 3 (both)

  • domain_ops – dictionary mapping an operator (key) to the domain to use for it (value)

  • exceptions – nodes to exclude from the quantization, for example [{“name”: “node_name1”}, {“name”: “node_name2”}] excludes the two nodes with these names

  • quantize_options – see QuantizeOptions

Returns:

Graph or None if not modified

Transformations are logged with the logger onnx-extended/transformer. The graph is modified in place. Enabling the logs gives a better idea of the progress.
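The shape of the exceptions argument and the default type codes can be illustrated with a short sketch. The helpers build_exceptions and quantize_to_float8 are hypothetical names, not part of the library. The code 17 is the value of TensorProto.FLOAT8E4M3FN in onnx and 1 is TensorProto.FLOAT.

```python
from typing import Dict, Iterable, List

# Element type codes from onnx.TensorProto (the defaults of quantize_float8).
FLOAT = 1           # TensorProto.FLOAT
FLOAT8E4M3FN = 17   # TensorProto.FLOAT8E4M3FN


def build_exceptions(node_names: Iterable[str]) -> List[Dict[str, str]]:
    """Builds the list of dictionaries expected by `exceptions`."""
    return [{"name": name} for name in node_names]


def quantize_to_float8(graph, skip_nodes=(), early_stop=-1):
    """Hypothetical helper: quantizes `graph` in place, skipping some nodes.

    Returns the graph, or None if nothing was modified.
    """
    # Imported lazily so that this sketch loads without onnx-extended installed.
    from onnx_extended.tools.graph import quantize_float8

    return quantize_float8(
        graph,
        elem_type=FLOAT8E4M3FN,
        output_type=FLOAT,
        early_stop=early_stop,
        exceptions=build_exceptions(skip_nodes),
    )


print(build_exceptions(["node_name1", "node_name2"]))
```

Passing early_stop=1 limits the transformation to a single change, which can help when checking one node at a time.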

TransformResults

class onnx_extended.tools.graph.onnx_graph_transformer.TransformResults(removed_nodes: List[Node], added_nodes: List[NodeProto], new_opsets: Dict[str, int] | None = None, local_functions: List[FunctionProto] | None = None)

Output of a function transforming a graph.

Parameters:
  • removed_nodes – nodes to remove from the graph

  • added_nodes – nodes to add to the graph

  • new_opsets – opsets to update

  • local_functions – necessary functions to add to the graph

QuantizationError

class onnx_extended.tools.graph.QuantizationError

Raised when a model or a node cannot be quantized.
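A caller may want to handle this exception explicitly. The sketch below assumes that QuantizationError propagates out of quantize_float8 when quiet is False, which the description of quiet suggests but does not state outright. The helper try_quantize is a hypothetical name.

```python
def try_quantize(graph, **kwargs):
    """Hypothetical helper: returns the quantized graph, or None on failure.

    None is also returned when quantize_float8 reports that nothing changed.
    """
    # Imported lazily so that this sketch loads without onnx-extended installed.
    from onnx_extended.tools.graph import QuantizationError, quantize_float8

    # Assumption: with quiet=False, failures are raised instead of skipped.
    kwargs.setdefault("quiet", False)
    try:
        return quantize_float8(graph, **kwargs)
    except QuantizationError as exc:
        print(f"quantization failed: {exc}")
        return None
```

Because the graph is modified in place, it may be partially transformed when the exception is raised; keeping a copy of the original model is a reasonable precaution.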