yobx.sql.parsed_query_to_onnx

yobx.sql.parsed_query_to_onnx(pq: yobx.xtracing.parse.ParsedQuery | typing.List[yobx.xtracing.parse.ParsedQuery], target_opset: int = 21, custom_functions: typing.Dict[str, typing.Callable] | None = None, builder_cls: type | typing.Callable = <class 'yobx.xbuilder.graph_builder.GraphBuilder'>, filename: str | None = None, verbose: int = 0, large_model: bool = False, external_threshold: int = 1024, return_optimize_report: bool = False) → ExportArtifact

Convert an already-parsed ParsedQuery to ONNX.

The query must have been produced by trace_dataframe() so that all ColumnRef objects carry a non-zero dtype. Type information is read exclusively from those dtype fields; no input_dtypes argument is needed. For queries produced by parse_sql() (which do not carry dtype information) use sql_to_onnx() instead.

Parameters:

pq – a ParsedQuery produced by trace_dataframe(), or a list of such queries when the traced function returns multiple dataframes. All queries in the list are compiled into a single ONNX graph whose shared inputs are de-duplicated and whose outputs are the concatenation of the outputs from each individual query.
target_opset – ONNX opset version to target.
custom_functions – optional mapping from function name to Python callable. Each callable is traced via trace_numpy_function().
builder_cls – graph-builder class or factory callable.
filename – if set, the exported ONNX model is saved to this path and the ExportReport is written as a companion Excel file (same base name with .xlsx extension).
verbose – verbosity level (0 = silent).
large_model – if True, the returned ExportArtifact has its container attribute set to an ExtendedModelContainer.
external_threshold – if large_model is True, every tensor whose element count exceeds this threshold is stored as external data.
return_optimize_report – if True, the returned ExportArtifact has its report attribute populated with per-pattern optimization statistics.
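
The interaction between large_model and external_threshold can be pictured with a small standalone sketch. It does not call yobx: the helper name tensors_to_externalize and the sample shapes are hypothetical, and the strict "exceeds" comparison is taken from the parameter description above rather than from the implementation.

```python
from math import prod
from typing import Dict, List, Tuple


def tensors_to_externalize(
    shapes: Dict[str, Tuple[int, ...]],
    large_model: bool = False,
    external_threshold: int = 1024,
) -> List[str]:
    """Hypothetical helper: names of tensors whose element count exceeds the threshold."""
    if not large_model:
        # Per the parameter description, the threshold only applies when large_model is True.
        return []
    return [name for name, shape in shapes.items() if prod(shape) > external_threshold]


# Made-up initializer shapes, for illustration only.
shapes = {"weights": (64, 32), "bias": (32,), "table": (32, 32)}
selected = tensors_to_externalize(shapes, large_model=True)
print(selected)
```

Under these assumptions a tensor with exactly external_threshold elements stays inline, because the description says "exceeds" rather than "reaches".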

Returns:

ExportArtifact wrapping the exported ONNX model together with an ExportReport.
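
When filename is set, the ExportReport is written next to the model as an Excel file with the same base name. A minimal sketch of that naming rule follows. The helper companion_report_path and the path out/model.onnx are hypothetical, and how yobx derives the name internally is not shown here.

```python
from pathlib import PurePosixPath


def companion_report_path(filename: str) -> str:
    """Hypothetical helper: same base name as the model file, with an .xlsx extension."""
    return str(PurePosixPath(filename).with_suffix(".xlsx"))


report_path = companion_report_path("out/model.onnx")
print(report_path)
```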

Example — single output:

import numpy as np
from yobx.xtracing.dataframe_trace import trace_dataframe
from yobx.sql.sql_convert import parsed_query_to_onnx
from yobx.reference import ExtendedReferenceEvaluator

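def transform(df):
    # Build one output column, "total", as the element-wise sum of "a" and "b".
    return df.select([(df["a"] + df["b"]).alias("total")])

# Trace with float32 inputs so that every ColumnRef carries a dtype.
pq = trace_dataframe(transform, {"a": np.float32, "b": np.float32})
artifact = parsed_query_to_onnx(pq)

# Run the exported model with the reference evaluator.
ref = ExtendedReferenceEvaluator(artifact)
a = np.array([1.0, -2.0, 3.0], dtype=np.float32)
b = np.array([4.0, 5.0, 6.0], dtype=np.float32)
(total,) = ref.run(None, {"a": a, "b": b})
# total is expected to equal a + b, i.e. [5.0, 3.0, 9.0]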

Example — multiple outputs:

import numpy as np
from yobx.xtracing.dataframe_trace import trace_dataframe
from yobx.sql.sql_convert import parsed_query_to_onnx
from yobx.reference import ExtendedReferenceEvaluator

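def transform(df):
    # Two independent projections of the same input dataframe.
    out1 = df.select([(df["a"] + df["b"]).alias("sum_ab")])
    out2 = df.select([(df["a"] - df["b"]).alias("diff_ab")])
    return out1, out2

# The traced function returns two dataframes, so a list of queries is passed on.
pqs = trace_dataframe(transform, {"a": np.float32, "b": np.float32})
artifact = parsed_query_to_onnx(pqs)

# Both queries are compiled into a single graph with shared inputs "a" and "b".
ref = ExtendedReferenceEvaluator(artifact)
a = np.array([1.0, 2.0], dtype=np.float32)
b = np.array([3.0, 4.0], dtype=np.float32)
sum_ab, diff_ab = ref.run(None, {"a": a, "b": b})
# sum_ab is expected to be [4.0, 6.0] and diff_ab [-2.0, -2.0]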
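
The multi-output example relies on the behaviour described for pq: shared inputs are de-duplicated and the outputs of each query are concatenated. The standalone sketch below illustrates that bookkeeping on plain name lists. It does not use yobx, the Query alias and merge_signatures are hypothetical, and the first-seen ordering of inputs is an assumption.

```python
from typing import List, Tuple

# Hypothetical stand-in for a traced query: (input names, output names).
Query = Tuple[List[str], List[str]]


def merge_signatures(queries: List[Query]) -> Tuple[List[str], List[str]]:
    """De-duplicate inputs (first-seen order, assumed) and concatenate outputs."""
    inputs: List[str] = []
    outputs: List[str] = []
    for query_inputs, query_outputs in queries:
        for name in query_inputs:
            if name not in inputs:
                # A shared input appears only once in the merged signature.
                inputs.append(name)
        # Outputs keep the order of the queries they come from.
        outputs.extend(query_outputs)
    return inputs, outputs


queries = [(["a", "b"], ["sum_ab"]), (["a", "b"], ["diff_ab"])]
merged_inputs, merged_outputs = merge_signatures(queries)
print(merged_inputs, merged_outputs)
```

This matches the call above, where a single feed dictionary with "a" and "b" yields both sum_ab and diff_ab.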