Polars LazyFrame to ONNX#
Overview#
yobx.sql can convert a polars.LazyFrame execution plan directly
into a self-contained ONNX model. The conversion works by extracting the
logical plan from the LazyFrame via polars.LazyFrame.explain(),
translating that plan into an intermediate SQL query, and then delegating to
the SQL-to-ONNX pipeline (sql_to_onnx()).
Architecture#
polars.LazyFrame
│
▼
lf.explain() ─── execution plan string
│
▼
_parse_polars_plan() ─── _PolarsPlan (select / filter / group_by)
│
▼
_plan_to_sql() ─── SQL query string
│
▼
sql_to_onnx() ─── GraphBuilder ──► ExportArtifact
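The middle step of the diagram, turning a parsed plan into a SQL string, can be illustrated with a small sketch. The `PlanSketch` dataclass, the `plan_sketch_to_sql` function, and the table name `t` are hypothetical names invented for this illustration. They are not yobx's `_PolarsPlan` or `_plan_to_sql()`, whose fields and output format may differ.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class PlanSketch:
    """Hypothetical stand-in for a parsed plan (not yobx's _PolarsPlan)."""

    select: List[str] = field(default_factory=list)    # projected expressions
    filter: Optional[str] = None                       # single predicate, if any
    group_by: List[str] = field(default_factory=list)  # grouping keys


def plan_sketch_to_sql(plan: PlanSketch, table: str = "t") -> str:
    """Render the sketch as one SELECT statement (table name 't' is assumed)."""
    columns = ", ".join(plan.select) if plan.select else "*"
    sql = f"SELECT {columns} FROM {table}"
    if plan.filter:
        sql += f" WHERE {plan.filter}"
    if plan.group_by:
        sql += " GROUP BY " + ", ".join(plan.group_by)
    return sql


# A plan with one filter step and one select step.
plan = PlanSketch(select=["a + b AS total"], filter="a > 0")
query = plan_sketch_to_sql(plan)
print(query)
```

Keeping the plan as a small record with one slot per clause mirrors the single filter / single select / single group_by shape shown in the diagram, which is why the rendering function needs no recursion.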
Supported LazyFrame operations#
| Polars operation | SQL clause generated | ONNX nodes emitted |
|---|---|---|
| | | |
| | | |
| | | |
| Arithmetic ( | Inlined into | |
| Comparisons ( | | |
| Boolean compound ( | | |
| | | (rename only) |
| Aggregation methods ( | | |
Columnar input convention#
As with the SQL converter, each source column of the plan is represented as a
separate 1-D ONNX input tensor. The input_dtypes parameter maps source
column names to numpy dtypes and must include every column that appears in the
plan.
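As a rough illustration of this convention, the sketch below turns row-oriented data into one 1-D array per column and checks that the dtype mapping covers every column. The `rows` and `plan_columns` variables and the explicit check are assumptions made for this example; the converter's own validation may behave differently.

```python
import numpy as np

# Assumed dtype mapping for a plan that reads columns "a" and "b".
input_dtypes = {"a": np.float64, "b": np.float64}

# Row-oriented data as it might arrive from an application (illustrative).
rows = [(1.0, 4.0), (-2.0, 5.0), (3.0, 6.0)]
plan_columns = ["a", "b"]

# Every column that appears in the plan needs an entry in input_dtypes.
missing = [name for name in plan_columns if name not in input_dtypes]
if missing:
    raise KeyError(f"input_dtypes lacks columns: {missing}")

# One 1-D array per column, keyed by column name.
feeds = {
    name: np.array([row[i] for row in rows], dtype=input_dtypes[name])
    for i, name in enumerate(plan_columns)
}

for name, values in feeds.items():
    print(name, values.shape, values.dtype)
```

A dictionary shaped like `feeds` is what gets passed to the evaluator's `run` call in the example further down.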
Polars dtype mapping#
The following polars data types are mapped to numpy equivalents:
| Polars type | numpy dtype |
|---|---|
| | |
| | |
| | |
| | |
| | |
| | |
Example#
import numpy as np
import polars as pl
from yobx.sql import lazyframe_to_onnx
from yobx.reference import ExtendedReferenceEvaluator

# Build a lazy plan: keep rows where a > 0, then compute a + b as "total".
lf = pl.LazyFrame({"a": [1.0, 2.0, 3.0], "b": [4.0, 5.0, 6.0]})
lf = lf.filter(pl.col("a") > 0).select(
    [(pl.col("a") + pl.col("b")).alias("total")]
)

# Every source column of the plan needs a numpy dtype.
dtypes = {"a": np.float64, "b": np.float64}
artifact = lazyframe_to_onnx(lf, dtypes)

# Run the exported model on new data, one 1-D array per column.
ref = ExtendedReferenceEvaluator(artifact)
a = np.array([1.0, -2.0, 3.0], dtype=np.float64)
b = np.array([4.0, 5.0, 6.0], dtype=np.float64)
(total,) = ref.run(None, {"a": a, "b": b})
# total contains rows where a > 0: [5.0, 9.0]
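To cross-check the result of the example above without running the ONNX model, the same filter and projection can be written directly in numpy. This is only a reference computation for this particular plan, not part of yobx; the names `mask` and `expected` are introduced for the illustration.

```python
import numpy as np

a = np.array([1.0, -2.0, 3.0], dtype=np.float64)
b = np.array([4.0, 5.0, 6.0], dtype=np.float64)

mask = a > 0              # corresponds to filter(pl.col("a") > 0)
expected = (a + b)[mask]  # corresponds to (a + b).alias("total") on kept rows
print(expected)
```

The array `expected` holds 5.0 and 9.0, so a call such as `np.testing.assert_allclose(total, expected)` could serve as a regression test for the exported model.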
Limitations#
- GROUP BY on multiple columns casts the key columns to float64 before combining them, which causes precision loss for integer keys greater than 2**53.
- Only a single filter step, a single select step, and a single group_by/agg step are handled. Complex multi-step plans may not translate correctly.
- join, sort, limit, distinct, pivot, melt, and other advanced polars operations are not yet supported.
- The plan text produced by polars.LazyFrame.explain() may change between polars versions; the parser targets the format used by polars ≥ 0.19.
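The precision loss mentioned for multi-column GROUP BY keys can be seen with plain Python floats, which are float64 values. The sketch below only demonstrates the float64 rounding itself. It does not reproduce how the converter combines keys, and the variable names are chosen for this illustration.

```python
limit = 2**53  # float64 represents every integer exactly only up to this value

key_a = limit
key_b = limit + 1  # a distinct integer key just above the limit

# Both keys round to the same float64 value, so they become indistinguishable.
collide = float(key_a) == float(key_b)

# At or below the limit, the round trip through float64 is exact.
safe = int(float(limit - 1)) == limit - 1

print(collide, safe)
```

Two integer keys that collide like this would end up in the same group after the cast, so keys of that magnitude are best remapped to smaller identifiers before conversion.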