yobx.sklearn.ensemble.gradient_boosting

Converter for sklearn.ensemble.GradientBoostingClassifier and sklearn.ensemble.GradientBoostingRegressor.

The ONNX graph mirrors the model’s prediction pipeline:

raw_prediction = baseline + learning_rate * sum(tree_values for all trees)
# regression:        raw_prediction  →  output (N, 1)
# binary cls:        Sigmoid(raw)    →  [1-p, p],  ArgMax → label
# multiclass:        Softmax(raw)    →  proba,     ArgMax → label

where baseline is the initial raw score from init_ (a constant for the default DummyRegressor / DummyClassifier init, or zero when init='zero').

Two encoding paths are supported:

  • Legacy (ai.onnx.ml opset ≤ 4): TreeEnsembleRegressor with aggregate_function="SUM" and base_values.

  • Modern (ai.onnx.ml opset ≥ 5): TreeEnsemble with aggregate_function=1 (SUM) and a constant Add for the baseline.
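The choice between the two paths depends only on the active ai.onnx.ml opset. The sketch below summarizes that decision; choose_tree_operator and the keys of the dictionary it returns are hypothetical names introduced for illustration and are not part of yobx.

```python
def choose_tree_operator(ml_opset: int) -> dict:
    """Hypothetical helper: summarize the encoding path for an ai.onnx.ml opset."""
    if ml_opset >= 5:
        # Modern path: unified operator, integer-coded aggregation (1 = SUM),
        # baseline added afterwards by a separate Add node.
        return {
            "op_type": "TreeEnsemble",
            "aggregate_function": 1,
            "baseline_via": "Add",
        }
    # Legacy path: string-coded aggregation, baseline stored in base_values.
    return {
        "op_type": "TreeEnsembleRegressor",
        "aggregate_function": "SUM",
        "baseline_via": "base_values",
    }


legacy = choose_tree_operator(4)
modern = choose_tree_operator(5)
```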

Custom init estimators (other than DummyRegressor / DummyClassifier or 'zero') are not supported and raise NotImplementedError.
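A guard of the following shape would reproduce the rule stated above. It is only a sketch built on scikit-learn's fitted init_ attribute: check_supported_init is a hypothetical name, and the converter's real check may be written differently.

```python
import numpy as np
from sklearn.dummy import DummyClassifier, DummyRegressor
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.linear_model import LinearRegression


def check_supported_init(estimator) -> None:
    """Hypothetical guard: accept only 'zero' or the default dummy estimators."""
    init = estimator.init_
    if isinstance(init, str) and init == "zero":
        return
    if isinstance(init, (DummyRegressor, DummyClassifier)):
        return
    raise NotImplementedError(f"Unsupported init estimator: {type(init).__name__}")


rng = np.random.default_rng(0)
X = rng.normal(size=(40, 3))
y = X[:, 0] - 2.0 * X[:, 1]

default_model = GradientBoostingRegressor(n_estimators=5, random_state=0).fit(X, y)
zero_model = GradientBoostingRegressor(n_estimators=5, init="zero", random_state=0).fit(X, y)
custom_model = GradientBoostingRegressor(
    n_estimators=5, init=LinearRegression(), random_state=0
).fit(X, y)

check_supported_init(default_model)  # default DummyRegressor: accepted
check_supported_init(zero_model)     # init='zero': accepted

try:
    check_supported_init(custom_model)
    custom_rejected = False
except NotImplementedError:
    custom_rejected = True
```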

yobx.sklearn.ensemble.gradient_boosting.sklearn_gradient_boosting_classifier(g: GraphBuilderExtendedProtocol, sts: Dict, outputs: List[str], estimator: GradientBoostingClassifier, X: str, name: str = 'gradient_boosting_classifier') → Tuple[str, str]

Converts a sklearn.ensemble.GradientBoostingClassifier into ONNX.

The raw score (logit) per class is:

raw[:,k] = baseline[k] + learning_rate * sum(trees_k.predict(X))
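The per-class formula can be checked against scikit-learn itself. In the sketch below, the synthetic dataset and hyperparameters are arbitrary choices for illustration. Column k of estimators_ holds the trees for class k, and the baseline is recovered by subtracting the scaled tree sum from decision_function, which should leave the same constant row for every sample.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier

X, y = make_classification(
    n_samples=120, n_features=6, n_informative=4, n_classes=3, random_state=0
)
clf = GradientBoostingClassifier(
    n_estimators=8, learning_rate=0.1, random_state=0
).fit(X, y)

raw = clf.decision_function(X)              # shape (N, K): one logit per class
n_trees, n_classes = clf.estimators_.shape  # one column of trees per class

# Sum the regression trees belonging to each class separately.
tree_sum = np.zeros_like(raw)
for k in range(n_classes):
    tree_sum[:, k] = sum(tree.predict(X) for tree in clf.estimators_[:, k])

# What remains after removing the scaled tree sum is the baseline.
baseline = raw - clf.learning_rate * tree_sum
```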

Binary classification: the raw score (one logit per sample) passes through a Sigmoid; the resulting probability p for class 1 is concatenated as [1-p, p] to match predict_proba.

Multiclass: the raw scores (one logit per class) pass through a Softmax along axis 1.

In both cases the predicted label is derived via ArgMax and a Gather into the classes_ array.
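The post-processing described above can be sketched in NumPy. The raw scores and class labels are made-up values for illustration, and the helper names are hypothetical; the ONNX graph expresses the same steps with Sigmoid, Softmax, ArgMax and Gather nodes.

```python
import numpy as np


def binary_proba(raw: np.ndarray) -> np.ndarray:
    """Sigmoid on one logit per sample, then concatenate [1 - p, p]."""
    p = 1.0 / (1.0 + np.exp(-raw))
    return np.stack([1.0 - p, p], axis=1)


def multiclass_proba(raw: np.ndarray) -> np.ndarray:
    """Softmax along axis 1 (max subtracted for numerical stability)."""
    shifted = raw - raw.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)


def labels_from_proba(proba: np.ndarray, classes: np.ndarray) -> np.ndarray:
    """ArgMax over classes, then Gather into the classes_ array."""
    return classes[np.argmax(proba, axis=1)]


# Binary case: one logit per sample.
binary_raw = np.array([-2.0, 0.5, 3.0])
binary_p = binary_proba(binary_raw)
binary_labels = labels_from_proba(binary_p, np.array(["neg", "pos"]))

# Multiclass case: one logit per class.
multi_raw = np.array([[1.0, 2.0, 0.5], [0.1, -1.0, 3.0]])
multi_p = multiclass_proba(multi_raw)
multi_labels = labels_from_proba(multi_p, np.array([10, 20, 30]))
```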

When ai.onnx.ml opset 5 (or later) is active, the unified TreeEnsemble operator is used; otherwise, the legacy TreeEnsembleRegressor is emitted.

Parameters:
  • g – graph builder.

  • sts – shapes provided by scikit-learn.

  • outputs – desired output names (label, probabilities).

  • estimator – fitted GradientBoostingClassifier.

  • X – input tensor name.

  • name – node-name prefix.

Returns:

tuple (label_name, proba_name).

Raises:

NotImplementedError – if a custom init estimator is used.

yobx.sklearn.ensemble.gradient_boosting.sklearn_gradient_boosting_regressor(g: GraphBuilderExtendedProtocol, sts: Dict, outputs: List[str], estimator: GradientBoostingRegressor, X: str, name: str = 'gradient_boosting_regressor') → str

Converts a sklearn.ensemble.GradientBoostingRegressor into ONNX.

The prediction is:

output = baseline + learning_rate * sum(tree.predict(X) for tree in estimators_)

where baseline is the constant initial raw score from the init_ estimator (DummyRegressor by default, or zero when init='zero').
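The formula can be reproduced with scikit-learn alone, which is a convenient reference when validating a converted model. The sketch below uses arbitrary synthetic data; manual_predict is a hypothetical helper, not part of the converter. It covers both the default init and init='zero' and reshapes the result to (N, 1) to mirror the ONNX output.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(0)
X = rng.normal(size=(60, 4))
y = 2.0 * X[:, 0] + np.sin(X[:, 1])


def manual_predict(model: GradientBoostingRegressor, X: np.ndarray) -> np.ndarray:
    """Recompute baseline + learning_rate * sum of tree predictions."""
    if isinstance(model.init_, str) and model.init_ == "zero":
        baseline = np.zeros(X.shape[0])
    else:
        # Default DummyRegressor: the same constant for every row.
        baseline = np.ravel(model.init_.predict(X))
    # Regressors keep a single column of trees in estimators_.
    tree_sum = sum(tree.predict(X) for tree in model.estimators_[:, 0])
    return (baseline + model.learning_rate * tree_sum).reshape(-1, 1)


default_model = GradientBoostingRegressor(n_estimators=10, random_state=0).fit(X, y)
zero_model = GradientBoostingRegressor(
    n_estimators=10, init="zero", random_state=0
).fit(X, y)

default_out = manual_predict(default_model, X)
zero_out = manual_predict(zero_model, X)
```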

When ai.onnx.ml opset 5 (or later) is active, the unified TreeEnsemble operator is used; otherwise, the legacy TreeEnsembleRegressor is emitted with aggregate_function="SUM" and base_values carrying the baseline.

When the input is float64, the output is cast back to float64 (ONNX ML tree operators always output float32).
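One consequence of this cast is that the float64 output keeps only float32 precision. The NumPy sketch below illustrates the effect on a few arbitrary values; it models the dtype round trip only and does not run an ONNX graph.

```python
import numpy as np

# Stand-ins for float64 predictions (arbitrary illustrative values).
values = np.array([0.1, 1.0 / 3.0, 1234.5678], dtype=np.float64)

# A float32 tree output, then the cast back to float64.
as_float32 = values.astype(np.float32)
round_trip = as_float32.astype(np.float64)

# The dtype is float64 again, but the digits lost in float32 are not restored.
max_rel_error = np.max(np.abs(round_trip - values) / np.abs(values))
```

When comparing a converted float64 model with scikit-learn, a tolerance suited to float32 is therefore more realistic than exact equality.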

Parameters:
  • g – graph builder.

  • sts – shapes provided by scikit-learn.

  • outputs – desired output names.

  • estimator – fitted GradientBoostingRegressor.

  • X – input tensor name.

  • name – node-name prefix.

Returns:

output tensor name (shape [N, 1]).

Raises:

NotImplementedError – if a custom init estimator is used.