yobx.sklearn.ensemble.hist_gradient_boosting
Converter for sklearn.ensemble.HistGradientBoostingClassifier and
sklearn.ensemble.HistGradientBoostingRegressor.
The ONNX graph mirrors the model’s prediction pipeline:
raw_prediction = sum(tree_values for all trees) + baseline_prediction
# regression: raw_prediction → output (N, 1)
# binary cls: Sigmoid(raw) → [1-p, p], ArgMax → label
# multiclass: Softmax(raw) → proba, ArgMax → label
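The post-processing branches for the two classification cases can be sketched in plain Python. This is an illustration only, not the converter's code: the helper names `sigmoid_proba`, `softmax_proba` and `argmax` are hypothetical, and the raw scores are made-up numbers standing in for the tree sum plus the baseline.

```python
import math


def sigmoid_proba(raw):
    """Binary case: one logit per sample -> [1 - p, p]."""
    p = 1.0 / (1.0 + math.exp(-raw))
    return [1.0 - p, p]


def softmax_proba(raw_row):
    """Multiclass case: one logit per class -> probabilities for one sample."""
    shift = max(raw_row)  # subtract the max for numerical stability
    exps = [math.exp(v - shift) for v in raw_row]
    total = sum(exps)
    return [e / total for e in exps]


def argmax(row):
    """Index of the largest entry (the first one wins on ties)."""
    return max(range(len(row)), key=row.__getitem__)


# Made-up raw predictions (sum of tree values + baseline), for illustration.
binary_proba = sigmoid_proba(0.0)
multi_proba = softmax_proba([2.0, 1.0, 0.1])

binary_label = argmax(binary_proba)
multi_label = argmax(multi_proba)
```

In the regression case no activation is applied, so the raw prediction is the output.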
Two encoding paths are supported:
- Legacy (ai.onnx.ml opset ≤ 4): TreeEnsembleRegressor with aggregate_function="SUM" and base_values.
- Modern (ai.onnx.ml opset 5): TreeEnsemble with aggregate_function=1 (SUM) and base_values_as_tensor.
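The choice between the two paths amounts to a check on the active ai.onnx.ml opset version. A minimal sketch, assuming a hypothetical helper `pick_tree_operator` (not part of yobx) and covering only the operator name and the way SUM aggregation is spelled in each path:

```python
def pick_tree_operator(ml_opset):
    """Return the tree operator and its SUM aggregation setting for a given
    ai.onnx.ml opset version (hypothetical helper, illustration only)."""
    if ml_opset >= 5:
        # Modern path: aggregation is given as an integer code.
        return {"op_type": "TreeEnsemble", "aggregate_function": 1}
    # Legacy path: aggregation is given as a string.
    return {"op_type": "TreeEnsembleRegressor", "aggregate_function": "SUM"}


legacy = pick_tree_operator(3)
modern = pick_tree_operator(5)
```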
Both paths raise NotImplementedError when the model contains
categorical splits (is_categorical == 1 in any tree node), as the
ONNX ML operator set does not support bitset-based categorical splits.
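A model can be screened for categorical splits before conversion is attempted. The sketch below is an assumption about how such a check might look, not the converter's own code. It relies on the private scikit-learn layout in which `estimator._predictors` is a list of boosting iterations, each holding tree predictors whose `nodes` record has an `is_categorical` field; that layout may change between scikit-learn versions. The helper names and the stand-in objects are hypothetical.

```python
from types import SimpleNamespace


def has_categorical_splits(estimator):
    """Return True if any node of any tree is flagged as a categorical split.

    Assumes the private layout described in the text: ``_predictors`` is a
    list (iterations) of lists (trees), each tree exposing ``nodes``.
    """
    for iteration in estimator._predictors:
        for tree in iteration:
            if any(flag == 1 for flag in tree.nodes["is_categorical"]):
                return True
    return False


def check_supported(estimator):
    """Raise early when the model cannot be expressed with the tree operators."""
    if has_categorical_splits(estimator):
        raise NotImplementedError(
            "categorical splits are not supported by the ONNX ML tree operators"
        )


# Stand-in objects so the sketch runs without a fitted model.
numeric_tree = SimpleNamespace(nodes={"is_categorical": [0, 0, 0]})
categorical_tree = SimpleNamespace(nodes={"is_categorical": [0, 1, 0]})
numeric_model = SimpleNamespace(_predictors=[[numeric_tree]])
mixed_model = SimpleNamespace(_predictors=[[numeric_tree], [categorical_tree]])
```

With a fitted estimator, `tree.nodes` would be a NumPy structured array; indexing it by field name and iterating over the result works the same way as with the stand-ins.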
- yobx.sklearn.ensemble.hist_gradient_boosting.sklearn_hgb_classifier(g: GraphBuilderExtendedProtocol, sts: Dict, outputs: List[str], estimator: HistGradientBoostingClassifier, X: str, name: str = 'hgb_classifier') → Tuple[str, str]
Converts a sklearn.ensemble.HistGradientBoostingClassifier to ONNX.
When ai.onnx.ml opset 5 (or later) is active, the unified TreeEnsemble operator is used; otherwise the legacy TreeEnsembleRegressor is emitted.
Binary classification: the raw sum (one logit per sample) passes through a Sigmoid; the resulting probability p for class 1 is concatenated as [1-p, p] to match predict_proba.
Multiclass: the raw sums (one logit per class) pass through a Softmax along axis 1.
In both cases the predicted label is derived via ArgMax and a Gather into the classes_ array.
- Parameters:
g – graph builder
sts – shapes provided by scikit-learn
outputs – desired output names (label, probabilities)
estimator – fitted HistGradientBoostingClassifier
X – input tensor name
name – node-name prefix
- Returns:
tuple (label_name, proba_name)
- Raises:
NotImplementedError – if the model contains categorical splits
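The label step matters because classes_ need not be the integers 0..K-1: ArgMax yields a column index, and Gather maps that index to the actual class value. A minimal plain-Python sketch of that mapping, with a hypothetical helper `gather_labels` and made-up probabilities:

```python
def gather_labels(proba_rows, classes):
    """ArgMax over each probability row, then look the index up in classes."""
    labels = []
    for row in proba_rows:
        best = max(range(len(row)), key=row.__getitem__)  # ArgMax along axis 1
        labels.append(classes[best])  # Gather into the class array
    return labels


# Multiclass example with string classes.
multi_labels = gather_labels(
    [[0.1, 0.7, 0.2], [0.6, 0.3, 0.1], [0.2, 0.2, 0.6]],
    ["bird", "cat", "dog"],
)

# Binary example where the classes are -1 and 1 rather than 0 and 1.
binary_labels = gather_labels([[0.2, 0.8], [0.9, 0.1]], [-1, 1])
```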
- yobx.sklearn.ensemble.hist_gradient_boosting.sklearn_hgb_regressor(g: GraphBuilderExtendedProtocol, sts: Dict, outputs: List[str], estimator: HistGradientBoostingRegressor, X: str, name: str = 'hgb_regressor') → str
Converts a sklearn.ensemble.HistGradientBoostingRegressor to ONNX.
When ai.onnx.ml opset 5 (or later) is active, the unified TreeEnsemble operator is used; otherwise the legacy TreeEnsembleRegressor is emitted.
The prediction formula is:
raw = sum(tree.predict(X) for tree in _predictors) + _baseline_prediction
output = raw # shape (N, 1)
When the input is float64, the output is cast back to float64 (both ONNX ML tree operators always output float32).
- Parameters:
g – graph builder
sts – shapes provided by scikit-learn
outputs – desired output names
estimator – fitted HistGradientBoostingRegressor
X – input tensor name
name – node-name prefix
- Returns:
output tensor name (shape [N, 1])
- Raises:
NotImplementedError – if the model contains categorical splits
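The formula and the float64 cast can be illustrated with a toy ensemble. Everything in this sketch is an assumption made for illustration: `tiny_tree` is a hypothetical one-split tree, the thresholds and leaf values are made up, and only the final rounding to float32 is modeled (a real runtime may also accumulate in float32). It suggests that, if the tree operator yields float32 as described above, casting back to float64 restores the dtype but not the lost precision.

```python
import struct


def to_float32(value):
    """Round a Python float (float64) to the nearest float32 value."""
    return struct.unpack("f", struct.pack("f", value))[0]


def tiny_tree(threshold, left_value, right_value):
    """Hypothetical one-split tree on feature 0: x <= threshold goes left."""
    def predict(x):
        return left_value if x[0] <= threshold else right_value
    return predict


# Made-up ensemble: two trees plus a baseline prediction.
trees = [tiny_tree(0.5, -0.1, 0.3), tiny_tree(1.5, 0.05, -0.2)]
baseline_prediction = 1.25


def predict_raw(x):
    """raw = sum of the tree values + baseline, for one sample."""
    return sum(tree(x) for tree in trees) + baseline_prediction


x = [1.0]
exact = predict_raw(x)            # float64 arithmetic: 0.3 + 0.05 + 1.25
onnx_like = to_float32(exact)     # float32 value, held again as float64
difference = abs(onnx_like - exact)
```

When comparing a converted model against scikit-learn on float64 inputs, a tolerance on the order of float32 precision is therefore more realistic than exact equality.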