yobx.sklearn.lightgbm.lgbm
ONNX converters for lightgbm.LGBMClassifier,
lightgbm.LGBMRegressor, and lightgbm.LGBMRanker.
The trees are extracted from the fitted booster via
booster_.dump_model() and encoded using the ONNX ML
TreeEnsembleClassifier / TreeEnsembleRegressor operators (legacy
ai.onnx.ml opset ≤ 4) or the unified TreeEnsemble operator
(ai.onnx.ml opset ≥ 5).
- Binary classification: the raw per-sample margin is passed through a sigmoid function and assembled into a [N, 2] probability matrix.
- Multi-class classification: per-class margins are passed through softmax to produce a [N, n_classes] probability matrix.
- Regression: raw margin output with an objective-dependent output transform:
  - Identity objectives (regression, regression_l1, huber, quantile, mape, …): no transform; raw == prediction.
  - Exp objectives (poisson, tweedie): exp(margin); prediction is in positive-real space.
- Ranking: raw margin output (identity transform); output shape [N, 1]. Supported objectives: lambdarank, rank_xendcg.
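The post-processing steps listed above can be sketched in plain Python. This is an illustration only, not the converter's code: the helper names are hypothetical, and the binary case assumes LightGBM's default sigmoid scaling of 1.

```python
import math


def sigmoid(z):
    """Logistic function; assumes the default sigmoid scaling of 1."""
    return 1.0 / (1.0 + math.exp(-z))


def binary_proba(margins):
    """Turn N raw margins into an [N, 2] matrix [P(class 0), P(class 1)]."""
    rows = []
    for margin in margins:
        p1 = sigmoid(margin)
        rows.append([1.0 - p1, p1])
    return rows


def softmax_rows(margins):
    """Row-wise softmax over an [N, n_classes] nested list of margins."""
    result = []
    for row in margins:
        shift = max(row)  # subtract the row maximum for numerical stability
        exps = [math.exp(value - shift) for value in row]
        total = sum(exps)
        result.append([e / total for e in exps])
    return result


def exp_link(margins):
    """Exp transform used for objectives such as poisson and tweedie."""
    return [math.exp(margin) for margin in margins]
```

A margin of 0 maps to a probability of 0.5 in the binary case and to a prediction of 1 under the exp transform, which is a quick sanity check when comparing against predict_proba() or predict().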
Numerical splits: LightGBM’s condition “go to left child when
x ≤ split_condition” maps to BRANCH_LEQ for both ai.onnx.ml
opset ≤ 4 and opset ≥ 5 — exact match.
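To make the "less than or equal goes left" rule concrete, here is a minimal sketch that evaluates one tree on one sample. The tree is a hand-written toy dictionary that mimics the nested layout returned by dump_model(); the function name eval_tree is hypothetical, and missing-value handling is deliberately ignored.

```python
def eval_tree(node, x):
    """Walk a nested tree dict and return the leaf value reached by sample x."""
    while "leaf_value" not in node:
        if node["decision_type"] != "<=":
            raise NotImplementedError("this sketch only handles numerical splits")
        value = x[node["split_feature"]]
        # Same convention as BRANCH_LEQ: x <= threshold goes to the left child.
        if value <= node["threshold"]:
            node = node["left_child"]
        else:
            node = node["right_child"]
    return node["leaf_value"]


# Toy tree (assumption: hand-written, not produced by a real booster).
toy_tree = {
    "split_feature": 0,
    "threshold": 1.5,
    "decision_type": "<=",
    "left_child": {"leaf_value": -1.0},
    "right_child": {"leaf_value": 2.0},
}
```

A sample whose feature equals the threshold exactly (here 1.5) takes the left branch, which is the boundary case the exact mapping has to preserve.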
Categorical splits: LightGBM encodes categorical splits as
decision_type == '==' with a threshold like '0||1||2'. ONNX only
supports single-value BRANCH_EQ comparisons, so each multi-value
categorical node is expanded into a chain of single-value checks by
_expand_categorical_splits() before flattening. The memoised DFS in
_flatten_lgbm_tree() ensures shared subtree references (the left
branch of every chain node) are assigned exactly one flat node ID.
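The expansion and the memoised flattening described above can be sketched as follows. This is not the library's implementation of _expand_categorical_splits() or _flatten_lgbm_tree(); the names expand_categorical and flatten are hypothetical, and the node layout is a simplified assumption modelled on dump_model() output.

```python
def expand_categorical(node):
    """Rewrite multi-value '==' nodes as a chain of single-value '==' nodes."""
    if "leaf_value" in node:
        return node
    left = expand_categorical(node["left_child"])
    right = expand_categorical(node["right_child"])
    if node["decision_type"] != "==":
        return {**node, "left_child": left, "right_child": right}
    categories = [int(c) for c in str(node["threshold"]).split("||")]
    chain = right  # no category matched: fall through to the original right child
    for category in reversed(categories):
        chain = {
            "split_feature": node["split_feature"],
            "decision_type": "==",
            "threshold": category,
            "left_child": left,  # the same object is shared by every chain node
            "right_child": chain,
        }
    return chain


def flatten(root):
    """Depth-first flattening; a shared subtree receives exactly one flat ID."""
    ids, order = {}, []

    def visit(node):
        key = id(node)
        if key in ids:  # already numbered through another parent
            return ids[key]
        ids[key] = len(order)
        order.append(node)
        if "leaf_value" not in node:
            visit(node["left_child"])
            visit(node["right_child"])
        return ids[key]

    visit(root)
    return order


# Toy categorical node (assumption: hand-written example).
toy_node = {
    "split_feature": 0,
    "decision_type": "==",
    "threshold": "0||1||2",
    "left_child": {"leaf_value": 1.0},
    "right_child": {"leaf_value": -1.0},
}
expanded = expand_categorical(toy_node)
flat = flatten(expanded)
```

For the three-category toy node, the chain has three decision nodes and two distinct leaves, so the flat list holds five nodes. Without memoisation the shared left leaf would be visited three times and the list would grow to seven.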
- yobx.sklearn.lightgbm.lgbm.sklearn_lgbm_classifier(g: GraphBuilderExtendedProtocol, sts: Dict, outputs: List[str], estimator, X: str, name: str = 'lgbm_classifier') → Tuple[str, str]
Convert a lightgbm.LGBMClassifier to ONNX.
The converter supports:
- Binary classification (n_classes_ == 2): one tree per boosting round; sigmoid post-processing; output shape [N, 2].
- Multi-class classification (n_classes_ > 2): n_classes trees per round; softmax post-processing; output shape [N, n_classes].
Both the legacy (opset ≤ 4) and modern (opset ≥ 5) ai.onnx.ml encodings are supported; which one is emitted depends on the active opset in g.
- Parameters:
g – the graph builder to add nodes to
sts – shapes dict (passed through, not used internally)
outputs – desired output names [label, probabilities]
estimator – a fitted LGBMClassifier
X – input tensor name
name – prefix for node names added to the graph
- Returns:
tuple (label_result_name, proba_result_name)
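The "n_classes trees per round" layout can be illustrated with a small sketch that sums per-tree outputs for a single sample into per-class margins. It assumes the usual LightGBM ordering in which tree number t belongs to class t % n_classes; the function names are hypothetical and the numbers are made up.

```python
def per_class_margins(leaf_values, n_classes):
    """Sum the per-tree outputs of ONE sample into n_classes raw margins."""
    if len(leaf_values) % n_classes != 0:
        raise ValueError("expected n_classes trees per boosting round")
    margins = [0.0] * n_classes
    for tree_index, value in enumerate(leaf_values):
        # Assumption: trees are stored round by round, one tree per class.
        margins[tree_index % n_classes] += value
    return margins


def predicted_class(margins):
    """Index of the largest margin; softmax does not change the argmax."""
    return max(range(len(margins)), key=margins.__getitem__)


# Two boosting rounds, three classes: six leaf values for one sample.
margins = per_class_margins([0.5, 0.25, 0.125, 1.0, 2.0, 4.0], 3)
```

Because softmax is monotonic, the label can be read from the raw margins directly, while the probabilities still require the softmax step.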
- yobx.sklearn.lightgbm.lgbm.sklearn_lgbm_ranker(g: GraphBuilderExtendedProtocol, sts: Dict, outputs: List[str], estimator, X: str, name: str = 'lgbm_ranker') → str
Convert a lightgbm.LGBMRanker to ONNX.
The raw margin (sum of all tree leaf values) is computed via a TreeEnsembleRegressor / TreeEnsemble node. Ranking objectives always use the identity link, so no output transform is applied.
Supported objectives: lambdarank (default), rank_xendcg.
- Parameters:
g – the graph builder to add nodes to
sts – shapes dict (passed through, not used internally)
outputs – desired output names [scores]
estimator – a fitted LGBMRanker
X – input tensor name
name – prefix for node names added to the graph
- Returns:
output tensor name (shape [N, 1])
- Raises:
NotImplementedError – if the model’s objective is not supported
- yobx.sklearn.lightgbm.lgbm.sklearn_lgbm_regressor(g: GraphBuilderExtendedProtocol, sts: Dict, outputs: List[str], estimator, X: str, name: str = 'lgbm_regressor') → str
Convert a lightgbm.LGBMRegressor to ONNX.
The raw margin (sum of all tree leaf values) is computed via a TreeEnsembleRegressor / TreeEnsemble node, and then an objective-dependent output transform is applied to match predict():
- Identity (regression, regression_l1, huber, quantile, mape, …): no transform.
- poisson, tweedie: exp(margin).
Unsupported objectives raise NotImplementedError.
- Parameters:
g – the graph builder to add nodes to
sts – shapes dict (passed through, not used internally)
outputs – desired output names [predictions]
estimator – a fitted LGBMRegressor
X – input tensor name
name – prefix for node names added to the graph
- Returns:
output tensor name (shape [N, 1])
- Raises:
NotImplementedError – if the model’s objective is not supported
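The objective-dependent transform and the NotImplementedError behaviour can be sketched as a small dispatch table. This is a hedged illustration, not the converter's code: the function name output_transform is hypothetical, and the identity set contains only the objectives named explicitly in this page (the real list is longer, as the "…" indicates).

```python
import math

# Only the objectives spelled out in the text; the actual list may be longer.
IDENTITY_OBJECTIVES = {"regression", "regression_l1", "huber", "quantile", "mape"}
EXP_OBJECTIVES = {"poisson", "tweedie"}


def output_transform(objective):
    """Return a function mapping a raw margin to the final prediction."""
    if objective in IDENTITY_OBJECTIVES:
        return lambda margin: margin
    if objective in EXP_OBJECTIVES:
        return math.exp
    raise NotImplementedError(f"unsupported objective: {objective!r}")
```

Failing loudly on an unknown objective is safer than silently returning the raw margin, since a missing transform would produce predictions that differ from predict() without any error.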