yobx.sql.parsed_query_to_onnx
- yobx.sql.parsed_query_to_onnx(pq: yobx.xtracing.parse.ParsedQuery | typing.List[yobx.xtracing.parse.ParsedQuery], target_opset: int = 21, custom_functions: typing.Dict[str, typing.Callable] | None = None, builder_cls: type | typing.Callable = <class 'yobx.xbuilder.graph_builder.GraphBuilder'>, filename: str | None = None, verbose: int = 0, large_model: bool = False, external_threshold: int = 1024, return_optimize_report: bool = False) → ExportArtifact
Convert an already-parsed ParsedQuery to ONNX.

The query must have been produced by trace_dataframe() so that all ColumnRef objects carry a non-zero dtype. Type information is read exclusively from those dtype fields; no input_dtypes argument is needed. For queries produced by parse_sql() (which do not carry dtype information), use sql_to_onnx() instead.

- Parameters:
  - pq – a ParsedQuery produced by trace_dataframe(), or a list of such queries when the traced function returns multiple dataframes. All queries in the list are compiled into a single ONNX graph whose shared inputs are de-duplicated and whose outputs are the concatenation of the outputs from each individual query.
  - target_opset – ONNX opset version to target.
  - custom_functions – optional mapping from function name to Python callable. Each callable is traced via trace_numpy_function().
  - builder_cls – graph-builder class or factory callable.
  - filename – if set, the exported ONNX model is saved to this path and the ExportReport is written as a companion Excel file (same base name with .xlsx extension).
  - verbose – verbosity level (0 = silent).
  - large_model – if True, the returned ExportArtifact has its container attribute set to an ExtendedModelContainer.
  - external_threshold – if large_model is True, every tensor whose element count exceeds this threshold is stored as external data.
  - return_optimize_report – if True, the returned ExportArtifact has its report attribute populated with per-pattern optimization statistics.
- Returns:
  ExportArtifact wrapping the exported ONNX model together with an ExportReport.
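The description of pq says that a list of queries is compiled into one graph whose shared inputs are de-duplicated and whose outputs are concatenated. The sketch below illustrates that merging rule in plain Python. It is not the library's implementation: the Query alias, the merge_queries helper, and the dtype-conflict check are all hypothetical and exist only for illustration.

```python
from typing import Dict, List, Tuple

# Hypothetical stand-in for one compiled query:
# (input name -> dtype name, ordered list of output names).
Query = Tuple[Dict[str, str], List[str]]


def merge_queries(queries: List[Query]) -> Tuple[Dict[str, str], List[str]]:
    """Merge several queries into one set of graph inputs and outputs."""
    inputs: Dict[str, str] = {}
    outputs: List[str] = []
    for q_inputs, q_outputs in queries:
        for name, dtype in q_inputs.items():
            # A shared input is declared only once. Rejecting a dtype
            # mismatch is an assumption made by this sketch.
            if inputs.setdefault(name, dtype) != dtype:
                raise ValueError(f"conflicting dtypes for input {name!r}")
        # Outputs keep the order of the queries they came from.
        outputs.extend(q_outputs)
    return inputs, outputs


q1 = ({"a": "float32", "b": "float32"}, ["sum_ab"])
q2 = ({"a": "float32", "b": "float32"}, ["diff_ab"])
graph_inputs, graph_outputs = merge_queries([q1, q2])
print(graph_inputs)   # {'a': 'float32', 'b': 'float32'}
print(graph_outputs)  # ['sum_ab', 'diff_ab']
```

Both queries read a and b, so the merged graph declares each input once while keeping both outputs in query order.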
Example — single output:
    import numpy as np
    from yobx.xtracing.dataframe_trace import trace_dataframe
    from yobx.sql.sql_convert import parsed_query_to_onnx
    from yobx.reference import ExtendedReferenceEvaluator

    def transform(df):
        return df.select([(df["a"] + df["b"]).alias("total")])

    pq = trace_dataframe(transform, {"a": np.float32, "b": np.float32})
    artifact = parsed_query_to_onnx(pq)

    ref = ExtendedReferenceEvaluator(artifact)
    a = np.array([1.0, -2.0, 3.0], dtype=np.float32)
    b = np.array([4.0, 5.0, 6.0], dtype=np.float32)
    (total,) = ref.run(None, {"a": a, "b": b})
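As a sanity check on the single-output example, the expected value of total can be computed directly in NumPy. This assumes the exported graph reproduces element-wise float32 addition. The snippet does not call yobx, and the name expected_total is introduced here only for illustration.

```python
import numpy as np

a = np.array([1.0, -2.0, 3.0], dtype=np.float32)
b = np.array([4.0, 5.0, 6.0], dtype=np.float32)

# Reference value for the "total" column: element-wise a + b.
expected_total = a + b
print(expected_total)        # [5. 3. 9.]
print(expected_total.dtype)  # float32
```

With the example above, np.testing.assert_allclose(total, expected_total) is one way to compare the two results.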
Example — multiple outputs:
    import numpy as np
    from yobx.xtracing.dataframe_trace import trace_dataframe
    from yobx.sql.sql_convert import parsed_query_to_onnx
    from yobx.reference import ExtendedReferenceEvaluator

    def transform(df):
        out1 = df.select([(df["a"] + df["b"]).alias("sum_ab")])
        out2 = df.select([(df["a"] - df["b"]).alias("diff_ab")])
        return out1, out2

    pqs = trace_dataframe(transform, {"a": np.float32, "b": np.float32})
    artifact = parsed_query_to_onnx(pqs)

    ref = ExtendedReferenceEvaluator(artifact)
    a = np.array([1.0, 2.0], dtype=np.float32)
    b = np.array([3.0, 4.0], dtype=np.float32)
    sum_ab, diff_ab = ref.run(None, {"a": a, "b": b})
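For the multiple-output example, the two expected arrays can likewise be computed in plain NumPy. This is a sketch under two assumptions: the graph reproduces element-wise float32 arithmetic, and its outputs follow the order in which transform returns its dataframes (sum_ab first, then diff_ab). The names expected_sum and expected_diff are illustrative only.

```python
import numpy as np

a = np.array([1.0, 2.0], dtype=np.float32)
b = np.array([3.0, 4.0], dtype=np.float32)

# Reference values, in the assumed output order of the merged graph.
expected_sum = a + b    # corresponds to "sum_ab"
expected_diff = a - b   # corresponds to "diff_ab"
print(expected_sum)   # [4. 6.]
print(expected_diff)  # [-2. -2.]
```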