yobx.sql.lazyframe_to_onnx
- yobx.sql.lazyframe_to_onnx(lf: polars.LazyFrame, input_dtypes: Dict[str, Union[np.dtype, type, str]], target_opset: int = 21, builder_cls: Union[type, Callable] = <class 'yobx.xbuilder.graph_builder.GraphBuilder'>, filename: Optional[str] = None, verbose: int = 0, large_model: bool = False, external_threshold: int = 1024, return_optimize_report: bool = False) -> ExportArtifact
Convert a polars.LazyFrame into a self-contained ONNX model.

The function extracts the logical execution plan from the LazyFrame via polars.LazyFrame.explain(), translates it into a SQL query understood by sql_to_onnx(), and returns an ExportArtifact containing the ONNX model.

Each source column of the plan is represented as a separate 1-D ONNX input tensor. The ONNX model outputs correspond to the columns or expressions in the select (or agg) step of the plan.

Supported LazyFrame operations
- select: column pass-through and arithmetic expressions
- filter: row filtering with comparison and boolean predicates
- group_by + agg: aggregations (sum, mean, min, max, count)
- param lf:
a polars.LazyFrame. The execution plan returned by lf.explain() is parsed and converted.
- param input_dtypes:
a mapping from source column name to numpy dtype (e.g. {"a": np.float32, "b": np.float64}). Only the columns that actually appear in the plan need to be listed.
- param target_opset:
ONNX opset version to target (default: yobx.DEFAULT_TARGET_OPSET).
- param builder_cls:
the graph-builder class (or factory callable) to use. Defaults to GraphBuilder.
- param filename:
if set, the exported ONNX model is saved to this path and the ExportReport is written as a companion Excel file (same base name with a .xlsx extension).
- param verbose:
verbosity level (0 = silent).
- param large_model:
if True, the returned ExportArtifact has its container attribute set to an ExtendedModelContainer.
- param external_threshold:
if large_model is True, every tensor whose element count exceeds this threshold is stored as external data.
- param return_optimize_report:
if True, the returned ExportArtifact has its report attribute populated with per-pattern optimization statistics.
- return:
ExportArtifact wrapping the exported ONNX model together with an ExportReport.
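Two of the parameters above describe simple rules that are easy to misread: the companion report path derived from filename, and the element-count test applied by external_threshold. The sketch below illustrates both in plain Python. The helper names companion_report_path and exceeds_external_threshold are hypothetical and are not part of yobx; the sketch also assumes that "exceeds" means strictly greater than the threshold.

```python
import math
from pathlib import Path


def companion_report_path(filename: str) -> Path:
    """Hypothetical helper: same base name as the model, with an .xlsx extension."""
    return Path(filename).with_suffix(".xlsx")


def exceeds_external_threshold(shape: tuple, external_threshold: int = 1024) -> bool:
    """Hypothetical helper: True when the element count is strictly above the threshold."""
    return math.prod(shape) > external_threshold


print(companion_report_path("model.onnx"))    # model.xlsx
print(exceeds_external_threshold((32, 32)))   # False: 1024 elements is not above 1024
print(exceeds_external_threshold((32, 33)))   # True: 1056 elements
```

Under these assumptions, a 32x32 tensor would stay inside the model file with the default threshold, while anything larger would be written as external data when large_model is True.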
Example:

    import numpy as np
    import polars as pl
    from yobx.sql import lazyframe_to_onnx
    from yobx.reference import ExtendedReferenceEvaluator

    lf = pl.LazyFrame({"a": [1.0, 2.0, 3.0], "b": [4.0, 5.0, 6.0]})
    lf = lf.filter(pl.col("a") > 0).select(
        [(pl.col("a") + pl.col("b")).alias("total")]
    )
    dtypes = {"a": np.float64, "b": np.float64}
    artifact = lazyframe_to_onnx(lf, dtypes)

    ref = ExtendedReferenceEvaluator(artifact)
    a = np.array([1.0, -2.0, 3.0], dtype=np.float64)
    b = np.array([4.0, 5.0, 6.0], dtype=np.float64)
    (total,) = ref.run(None, {"a": a, "b": b})
    # total contains rows where a > 0: [5.0, 9.0]
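To check the expected result of the example independently of yobx, the same filter-then-select pipeline can be written as a plain-Python reference. This is only a sketch of the semantics described above (keep rows where a > 0, then output a + b); the function name reference_total is hypothetical and does not exist in the library.

```python
from typing import List, Sequence


def reference_total(a: Sequence[float], b: Sequence[float]) -> List[float]:
    """Plain-Python reference: keep rows where a > 0, then return a + b."""
    if len(a) != len(b):
        raise ValueError("columns a and b must have the same length")
    return [x + y for x, y in zip(a, b) if x > 0]


print(reference_total([1.0, -2.0, 3.0], [4.0, 5.0, 6.0]))  # [5.0, 9.0]
```

The row with a = -2.0 is dropped by the filter, so only two values remain in the output.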
Note
GROUP BY aggregations are computed over the whole filtered dataset (same limitation as sql_to_onnx()). True SQL group-by semantics (one output row per unique key) would require an ONNX Loop or custom kernel and are not yet supported.
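The difference described in the note can be illustrated in plain Python. The sketch below is not how yobx implements aggregation; it only contrasts one reading of "computed over the whole filtered dataset" (a single aggregate that ignores the key) with SQL-style grouping. Both function names are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Sequence


def sum_over_whole_dataset(values: Sequence[float]) -> float:
    """One aggregate over every row, ignoring the grouping key."""
    return sum(values)


def sum_per_key(keys: Sequence[str], values: Sequence[float]) -> Dict[str, float]:
    """SQL-style GROUP BY: one aggregate per unique key."""
    totals: Dict[str, float] = defaultdict(float)
    for key, value in zip(keys, values):
        totals[key] += value
    return dict(totals)


keys = ["x", "y", "x"]
values = [1.0, 2.0, 3.0]
print(sum_over_whole_dataset(values))  # 6.0
print(sum_per_key(keys, values))       # {'x': 4.0, 'y': 2.0}
```

If per-key results are needed, one option is to compute them outside the ONNX model, for example with Polars itself, until true group-by semantics are supported.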