tools.graph.onnx_graph_transformer#

cast_constant#

onnx_extended.tools.graph.cast_constant(graph: Graph, from_type: int = 1, to_type: int = 10, quiet: bool = False) → Graph | None[source]#

Converts all constants and initializers of type from_type to type to_type. It also modifies the input.

Parameters:

graph – Graph
from_type – type of the constants to convert
to_type – new type for the constants
quiet – catch exception and silently skip failing nodes
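The integer defaults of from_type and to_type read as ONNX element type codes, as enumerated in `onnx.TensorProto` (1 is FLOAT, 10 is FLOAT16). The lookup below is a minimal sketch that spells out such a pair; the table is copied by hand so that it runs without onnx installed, and `ONNX_ELEM_TYPES` and `describe_cast` are hypothetical names, not part of onnx_extended.

```python
# Subset of the ONNX TensorProto.DataType codes, copied by hand so the
# snippet runs without onnx installed.
ONNX_ELEM_TYPES = {
    1: "FLOAT",
    10: "FLOAT16",
    11: "DOUBLE",
    16: "BFLOAT16",
    17: "FLOAT8E4M3FN",
    18: "FLOAT8E4M3FNUZ",
    19: "FLOAT8E5M2",
    20: "FLOAT8E5M2FNUZ",
}


def describe_cast(from_type: int = 1, to_type: int = 10) -> str:
    """Spell out a from_type/to_type pair (hypothetical helper)."""
    return f"{ONNX_ELEM_TYPES[from_type]} -> {ONNX_ELEM_TYPES[to_type]}"


# Same defaults as cast_constant.
print(describe_cast())  # FLOAT -> FLOAT16
```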

Returns:

Graph or None if not modified

Transformations are logged with the logger onnx-extended/transformer. The graph is modified in place. Enabling the logs gives a better idea of the progress.
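The logger named in the note above can be switched on with the standard `logging` module. This is a minimal sketch: the documentation does not say at which level the messages are emitted, so DEBUG is assumed here to capture everything, and the handler and format are arbitrary choices.

```python
import logging

# Logger name taken from the documentation above.
logger = logging.getLogger("onnx-extended/transformer")
# Assumption: DEBUG is low enough to see every message the library emits.
logger.setLevel(logging.DEBUG)

# Send records to stderr; the format is an arbitrary choice.
handler = logging.StreamHandler()
handler.setFormatter(logging.Formatter("%(name)s %(levelname)s %(message)s"))
logger.addHandler(handler)
```

Run this before calling the transformation so that the records are not dropped.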

QuantizeOptions#

class onnx_extended.tools.graph.QuantizeOptions(value)[source]#

Quantization options.

NONE: no option
OPTIMIZE: assumes there are no NaN values and chooses less generic functions, such as the implementation of DynamicQuantizeLinear

quantize_float8#

onnx_extended.tools.graph.quantize_float8(graph: Graph, elem_type: int = 17, output_type: int = 1, early_stop: int = -1, version: str = 'onnxruntime', quiet: bool = False, index_transpose: int = 2, domain_ops: Dict[str, str] | None = None, exceptions: List[Dict[str, str]] | None = None, quantize_options: QuantizeOptions = QuantizeOptions.NONE) → Graph | None[source]#

Transforms a graph to introduce quantized weights. This transformation requires opset 20. If the main opset is lower, the graph is upgraded; it is better to upgrade the model before calling this function.

Parameters:

graph – Graph
elem_type – quantization type
output_type – output type
early_stop – -1 to go through all nodes or a value n > 0 to stop after n changes
version – ‘onnxruntime’ to use operators from onnx and onnxruntime, ‘onnx-extended’ to use experimental operators
quiet – catch exception and silently skip failing nodes
index_transpose – which input to transpose before calling gemm: 0 (none), 1 (first), 2 (second), 3 (both)
domain_ops – dictionary mapping an operator (the key) to the domain to use for it
exceptions – nodes to exclude from the quantization; for example, [{"name": "node_name1"}, {"name": "node_name2"}] excludes these two nodes
quantize_options – see QuantizeOptions
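Two of the parameters above take small encoded values, so a sketch may help when preparing them. The values listed for index_transpose (0, 1, 2, 3) are consistent with a two-bit mask; reading them that way is an assumption, not something the documentation states. `decode_index_transpose` is a hypothetical helper, and the exceptions list only reproduces the format shown above.

```python
from typing import Tuple


def decode_index_transpose(index_transpose: int) -> Tuple[bool, bool]:
    """Return (transpose_first, transpose_second) for a value in 0..3."""
    if index_transpose not in (0, 1, 2, 3):
        raise ValueError(
            f"index_transpose must be 0, 1, 2 or 3, got {index_transpose}"
        )
    # Assumed encoding: bit 0 selects the first input, bit 1 the second.
    return bool(index_transpose & 1), bool(index_transpose & 2)


# Build the exceptions argument from a list of node names.
excluded_names = ["node_name1", "node_name2"]
exceptions = [{"name": name} for name in excluded_names]

for value in range(4):
    print(value, decode_index_transpose(value))
print(exceptions)
```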

Returns:

Graph or None if not modified
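Because the function returns None when nothing was changed, callers usually need a fallback to the original graph. The wrapper below is one possible way to handle that; `apply_transform` is a hypothetical helper, not part of onnx_extended, and it works with any callable that follows the same "Graph or None" convention.

```python
from typing import Any, Callable, Optional, Tuple


def apply_transform(
    transform: Callable[..., Optional[Any]], graph: Any, **kwargs: Any
) -> Tuple[Any, bool]:
    """Call transform(graph, **kwargs) and report whether it changed anything.

    Returns (graph_to_use, modified). When the transform returns None,
    the original graph is handed back unchanged.
    """
    result = transform(graph, **kwargs)
    if result is None:
        return graph, False
    return result, True


# Intended use, assuming onnx_extended is installed and `graph` is a Graph:
#   graph, modified = apply_transform(quantize_float8, graph, early_stop=1)
```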

Transformations are logged with the logger onnx-extended/transformer. The graph is modified in place. Enabling the logs gives a better idea of the progress.

TransformResults#

class onnx_extended.tools.graph.onnx_graph_transformer.TransformResults(removed_nodes: List[Node], added_nodes: List[NodeProto], new_opsets: Dict[str, int] | None = None, local_functions: List[FunctionProto] | None = None)[source]#

Output of a function transforming a graph.

Parameters:

removed_nodes – nodes to remove from the graph
added_nodes – nodes to add to the graph
new_opsets – opsets to update
local_functions – necessary functions to add to the graph

QuantizationError#

class onnx_extended.tools.graph.QuantizationError[source]#

Raised when a model or a node cannot be quantized.