yobx.sql.sql_to_onnx

yobx.sql.sql_to_onnx(query: str, input_dtypes: Dict[str, numpy.dtype | type | str], right_input_dtypes: Dict[str, numpy.dtype | type | str] | None = None, target_opset: int = 21, n_rows: int | None = None, custom_functions: Dict[str, Callable] | None = None, builder_cls: type | Callable = yobx.xbuilder.graph_builder.GraphBuilder) → ExportArtifact

Convert a SQL query to a self-contained ONNX model.

Each column in the query is represented as a separate 1-D ONNX input tensor, allowing the caller to feed column vectors independently. The resulting model’s outputs correspond to the columns (or expressions) in the SELECT clause, in order.
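To make the column-per-input convention concrete, the plain-Python sketch below mimics what a query such as SELECT a + b AS total, a * 2 AS doubled FROM t WHERE a > 0 computes. It does not call yobx: the helper name run_query_sketch is hypothetical, and Python lists stand in for the 1-D input tensors.

```python
def run_query_sketch(a, b):
    """Emulate: SELECT a + b AS total, a * 2 AS doubled FROM t WHERE a > 0.

    Illustration only; `a` and `b` are column vectors passed independently.
    """
    if len(a) != len(b):
        raise ValueError("all column vectors must have the same number of rows")
    # WHERE a > 0: keep the indices of the rows that satisfy the predicate.
    keep = [i for i, value in enumerate(a) if value > 0]
    # One output per SELECT expression, returned in SELECT order.
    total = [a[i] + b[i] for i in keep]
    doubled = [a[i] * 2 for i in keep]
    return total, doubled


total, doubled = run_query_sketch([1.0, -2.0, 3.0], [4.0, 5.0, 6.0])
print(total)    # [5.0, 9.0]
print(doubled)  # [2.0, 6.0]
```

Each input column maps to one argument and each SELECT expression to one returned vector, which is the same correspondence the exported model keeps between its inputs and outputs.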

Internally this function creates a fresh GraphBuilder (or the class supplied via builder_cls), delegates to sql_to_onnx_graph() to populate it, and then calls to_onnx() to finalise the model. Use sql_to_onnx_graph() directly when you need to embed the SQL subgraph inside a larger ONNX model you are already building.
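The three steps described above (create a builder, populate it, finalise it) can be sketched with toy stand-ins. Nothing below is yobx code: ToyBuilder, populate_graph_sketch and convert_sketch are hypothetical names, and the real sql_to_onnx_graph() signature may differ.

```python
class ToyBuilder:
    """Hypothetical stand-in for a graph builder (not the yobx GraphBuilder)."""

    def __init__(self):
        self.nodes = []

    def add_node(self, description):
        self.nodes.append(description)

    def to_onnx(self):
        # A real builder would return an ONNX model; a dict is enough here.
        return {"nodes": list(self.nodes)}


def populate_graph_sketch(builder, query):
    """Hypothetical stand-in for the step that adds the SQL subgraph."""
    builder.add_node(f"subgraph for: {query}")


def convert_sketch(query, builder_cls=ToyBuilder):
    builder = builder_cls()                 # 1. create a fresh builder
    populate_graph_sketch(builder, query)   # 2. delegate graph construction
    return builder.to_onnx()                # 3. finalise the model


# Standalone use: one call yields a finished "model".
model = convert_sketch("SELECT a FROM t")

# Embedded use: run only the populating step on a builder you already own.
shared = ToyBuilder()
shared.add_node("existing preprocessing")
populate_graph_sketch(shared, "SELECT a FROM t")
combined = shared.to_onnx()
```

The split matters because the populating step leaves the builder open, so further nodes can be added before or after the SQL subgraph, whereas the all-in-one call always returns a finished model.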

Parameters:
  • query – a SQL string. Supported clauses: SELECT, FROM, [INNER|LEFT|RIGHT|FULL] JOIN ON, WHERE, GROUP BY. Custom Python functions can be called by name in the SELECT and WHERE clauses when registered via custom_functions.

  • input_dtypes – a mapping from left-table column name to numpy dtype (np.float32, np.int64, etc.). Only columns actually referenced in the query need to be listed.

  • right_input_dtypes – if the query contains a JOIN, a mapping from right-table column name to numpy dtype. Defaults to input_dtypes when None.

  • target_opset – ONNX opset version to target (default: yobx.DEFAULT_TARGET_OPSET, shown as 21 in the signature above).

  • n_rows – optional static number of rows; used to fix the first dimension of every input tensor. When None the first dimension is symbolic ("N").

  • custom_functions – an optional mapping from function name (as it appears in the SQL string) to a Python callable. Each callable must accept one or more numpy arrays and return a numpy array. The function body is traced with trace_numpy_function() so that numpy arithmetic is translated into ONNX nodes.

    Example:

    import numpy as np
    from yobx.sql import sql_to_onnx
    
    dtypes = {"a": np.float32}
    artifact = sql_to_onnx(
        "SELECT my_sqrt(a) AS r FROM t",
        dtypes,
        custom_functions={"my_sqrt": np.sqrt},
    )
    

  • builder_cls – the graph-builder class (or factory callable) to instantiate when creating the internal GraphBuilder. Defaults to GraphBuilder. Any class that implements the shape and type tracking interface can be supplied here, e.g. a custom subclass that adds extra optimisation passes.
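As a rough illustration of how input_dtypes, right_input_dtypes and n_rows interact, the sketch below lists the input description each column would get under the rules stated above. The helper describe_inputs_sketch is hypothetical and not part of yobx; dtypes are given as strings only to keep the snippet dependency-free.

```python
def describe_inputs_sketch(input_dtypes, right_input_dtypes=None, n_rows=None):
    """Hypothetical helper: (name, dtype, shape) for every input column."""
    # Documented default: right-table dtypes fall back to the left-table ones.
    if right_input_dtypes is None:
        right_input_dtypes = input_dtypes
    # The first dimension is static when n_rows is given, symbolic "N" otherwise.
    shape = (n_rows,) if n_rows is not None else ("N",)
    return {
        "left": [(name, dtype, shape) for name, dtype in input_dtypes.items()],
        "right": [(name, dtype, shape) for name, dtype in right_input_dtypes.items()],
    }


# Symbolic row count, right table defaulting to the left-table mapping.
specs = describe_inputs_sketch({"a": "float32", "b": "int64"})
print(specs["left"])   # [('a', 'float32', ('N',)), ('b', 'int64', ('N',))]

# Static row count with a distinct right-table mapping, as for a JOIN query.
fixed = describe_inputs_sketch({"a": "float32"}, {"k": "int64"}, n_rows=3)
print(fixed["right"])  # [('k', 'int64', (3,))]
```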

Returns:

ExportArtifact wrapping the exported ONNX proto together with an ExportReport.

Example:

import numpy as np
from yobx.sql import sql_to_onnx
from yobx.reference import ExtendedReferenceEvaluator

# Declare the dtype of every column referenced in the query.
dtypes = {"a": np.float32, "b": np.float32}
artifact = sql_to_onnx("SELECT a + b AS total FROM t WHERE a > 0", dtypes)

# Run the exported model, feeding each column as its own 1-D array.
ref = ExtendedReferenceEvaluator(artifact)
a = np.array([1.0, -2.0, 3.0], dtype=np.float32)
b = np.array([4.0,  5.0, 6.0], dtype=np.float32)
# The single output corresponds to the one SELECT expression, total.
(total,) = ref.run(None, {"a": a, "b": b})

Note

GROUP BY aggregations are computed over the whole filtered dataset. True SQL group-by semantics (one output row per unique key) would require an ONNX Loop or custom kernel and are not yet supported.
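The difference described in this note can be seen in a plain-Python sketch. It does not use yobx and makes no claim about the exact shape of the library's aggregated output. It only contrasts one aggregate over all filtered rows with the per-key result that standard SQL would return. The function names are hypothetical.

```python
def whole_dataset_sum(values, keep):
    """Aggregate over every row that passed the filter: a single result."""
    return sum(v for v, k in zip(values, keep) if k)


def per_key_sum(keys, values, keep):
    """Standard SQL GROUP BY semantics: one result per distinct key."""
    sums = {}
    for key, v, k in zip(keys, values, keep):
        if k:
            sums[key] = sums.get(key, 0) + v
    return sums


keys = ["x", "y", "x", "y"]
values = [1.0, 2.0, 3.0, 4.0]
keep = [v > 1.0 for v in values]  # plays the role of WHERE v > 1.0

print(whole_dataset_sum(values, keep))  # 9.0
print(per_key_sum(keys, values, keep))  # {'x': 3.0, 'y': 6.0}
```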