yobx.sklearn.xgboost.xgb#

ONNX converters for xgboost.XGBClassifier, xgboost.XGBRFClassifier, xgboost.XGBRegressor, xgboost.XGBRFRegressor, and xgboost.XGBRanker.

The trees are extracted from the fitted booster via booster.get_dump(dump_format='json') and encoded using the ONNX ML TreeEnsembleClassifier / TreeEnsembleRegressor operators (legacy ai.onnx.ml opset ≤ 4) or the unified TreeEnsemble operator (ai.onnx.ml opset ≥ 5).
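To make the dump format concrete, here is a minimal sketch (not the library's actual code) of flattening one tree from `booster.get_dump(dump_format='json')` into flat per-node arrays, similar in spirit to the attribute layout of the ONNX tree-ensemble operators. The JSON structure shown is the one xgboost emits for a single tree; the `lstrip("f")` feature-index parsing is a simplification that assumes default `f<i>` feature names:

```python
import json

# One tree, hardcoded in the shape produced by get_dump(dump_format='json').
dump = json.loads("""
{"nodeid": 0, "split": "f0", "split_condition": 0.5,
 "yes": 1, "no": 2, "missing": 1, "children": [
   {"nodeid": 1, "leaf": -0.4},
   {"nodeid": 2, "leaf": 0.7}]}
""")

def flatten(node, out):
    """Recursively collect nodes into a flat dict keyed by node id."""
    if "leaf" in node:                       # leaf node: carries only a value
        out[node["nodeid"]] = {"leaf": node["leaf"]}
    else:                                    # internal node: split + children
        out[node["nodeid"]] = {
            # assumes default "f<i>" feature names; real feature names need a map
            "feature": int(node["split"].lstrip("f")),
            "threshold": node["split_condition"],
            "yes": node["yes"],              # child taken when x[feature] < threshold
            "no": node["no"],
            "missing": node["missing"],      # child taken for missing values
        }
        for child in node["children"]:
            flatten(child, out)
    return out

nodes = flatten(dump, {})
```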

  • Binary classification — the raw per-sample margin is passed through a sigmoid function and assembled into a [N, 2] probability matrix.

  • Multi-class classification — per-class margins are passed through softmax to produce a [N, n_classes] probability matrix.

  • Regression — raw margin output with the XGBoost base_score bias added, followed by an objective-dependent output transform:

    • Identity objectives (reg:squarederror, reg:absoluteerror, …): no transform; bias = base_score added directly.

    • Sigmoid objective (reg:logistic): sigmoid(margin); bias = logit(base_score).

    • Exp objectives (count:poisson, reg:gamma, reg:tweedie, survival:cox): exp(margin); bias = log(base_score).

    • Unknown objectives: raises NotImplementedError.

  • Multi-class objectives: no bias is added (the base score is zero for each class).
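The bias/transform rules above can be sketched in pure Python (illustrative only; this is not the converter's dispatch code). The bias lives in margin space, so it is the link function applied to the prediction-space base_score:

```python
import math

def regression_transform(objective, base_score):
    """Return (margin-space bias, output transform) for a regression objective."""
    identity_objs = {"reg:squarederror", "reg:absoluteerror"}
    exp_objs = {"count:poisson", "reg:gamma", "reg:tweedie", "survival:cox"}
    if objective in identity_objs:
        # no transform; base_score is added directly
        return base_score, lambda m: m
    if objective == "reg:logistic":
        # bias = logit(base_score), output = sigmoid(margin + bias)
        logit = math.log(base_score / (1.0 - base_score))
        return logit, lambda m: 1.0 / (1.0 + math.exp(-m))
    if objective in exp_objs:
        # bias = log(base_score), output = exp(margin + bias)
        return math.log(base_score), lambda m: math.exp(m)
    raise NotImplementedError(f"objective {objective!r} is not supported")

# With base_score = 0.5, logit(0.5) = 0, so a zero margin maps to 0.5.
bias, transform = regression_transform("reg:logistic", 0.5)
pred = transform(0.0 + bias)
```

Round-tripping the base_score through the link function is what makes the stored prediction-space value (see below) reproduce `predict()` exactly.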

The conversion supports XGBoost 2.x and 3.x and treats the stored base_score configuration value as the untransformed prediction-space value.

XGBoost’s tree-branching condition “go to yes-child when x < split_condition” maps to:

  • BRANCH_LT for both ai.onnx.ml opset ≤ 4 and opset ≥ 5 — exact match.
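A toy evaluator (illustrative only, not the emitted ONNX graph) makes the correspondence explicit: with BRANCH_LT semantics the "yes" child is taken exactly when `x[feature] < threshold`, matching XGBoost's split condition with no inversion of children or comparison:

```python
def walk(nodes, x):
    """Follow BRANCH_LT splits from the root until a leaf is reached."""
    nid = 0
    while "leaf" not in nodes[nid]:
        n = nodes[nid]
        # yes-child when x < split_condition, otherwise no-child
        nid = n["yes"] if x[n["feature"]] < n["threshold"] else n["no"]
    return nodes[nid]["leaf"]

# One split on feature 0 at threshold 0.5; node 1 is "yes", node 2 is "no".
nodes = {
    0: {"feature": 0, "threshold": 0.5, "yes": 1, "no": 2},
    1: {"leaf": -0.4},
    2: {"leaf": 0.7},
}
```

Note the strict inequality: an input exactly equal to the threshold follows the "no" branch, in both XGBoost and BRANCH_LT.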

yobx.sklearn.xgboost.xgb.sklearn_xgb_classifier(g: GraphBuilderExtendedProtocol, sts: Dict, outputs: List[str], estimator, X: str, name: str = 'xgb_classifier') → Tuple[str, str][source]#

Convert an xgboost.XGBClassifier or xgboost.XGBRFClassifier to ONNX.

The converter supports:

  • Binary classification (n_classes_ == 2) — one tree per boosting round; sigmoid post-processing; output shape [N, 2].

  • Multi-class classification (n_classes_ > 2) — n_classes trees per round; softmax post-processing; output shape [N, n_classes].

Both ai.onnx.ml legacy (opset ≤ 4) and modern (opset ≥ 5) encodings are emitted based on the active opset in g.
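The two post-processing paths can be sketched in pure Python (the actual converter emits ONNX nodes, not this code). Binary margins become `[1 - p, p]` rows via sigmoid; multi-class per-class margins go through a softmax:

```python
import math

def binary_proba(margins):
    """[N] raw margins -> [N, 2] probability rows [1 - p, p] via sigmoid."""
    out = []
    for m in margins:
        p = 1.0 / (1.0 + math.exp(-m))
        out.append([1.0 - p, p])
    return out

def softmax_proba(margins):
    """[N, C] per-class margins -> [N, C] probability rows via softmax."""
    out = []
    for row in margins:
        mx = max(row)                        # subtract max for numerical stability
        exps = [math.exp(m - mx) for m in row]
        total = sum(exps)
        out.append([e / total for e in exps])
    return out
```

In both cases the predicted label is the argmax over each probability row.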

XGBRFClassifier is a random-forest-style variant of XGBClassifier. It sets num_parallel_tree equal to n_estimators and performs a single boosting round, so the ONNX representation uses the same operators with the correct per-class tree grouping.

Parameters:
  • g – the graph builder to add nodes to

  • sts – shapes dict (passed through, not used internally)

  • outputs – desired output names [label, probabilities]

  • estimator – a fitted XGBClassifier or XGBRFClassifier

  • X – input tensor name

  • name – prefix for node names added to the graph

Returns:

tuple (label_result_name, proba_result_name)

yobx.sklearn.xgboost.xgb.sklearn_xgb_ranker(g: GraphBuilderExtendedProtocol, sts: Dict, outputs: List[str], estimator, X: str, name: str = 'xgb_ranker') → str[source]#

Convert an xgboost.XGBRanker to ONNX.

The raw margin (sum of all tree leaf values) is computed via a TreeEnsembleRegressor / TreeEnsemble node. Ranking objectives always use the identity link, so no output transform is applied.
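With the identity link, a ranking score is just the sum of each tree's selected leaf value. An illustrative pure-Python sketch (toy "trees" are stand-ins for traversed tree leaves, not the library's representation):

```python
def rank_score(trees, x):
    """Raw margin: sum of each tree's leaf value; no output transform."""
    return sum(tree(x) for tree in trees)

# Two toy trees as plain functions from a feature row to a leaf value.
trees = [
    lambda x: 0.3 if x[0] < 1.0 else -0.2,
    lambda x: 0.1,
]
```

Only the relative ordering of scores matters for ranking, which is why no link function is needed.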

Supported objectives: rank:pairwise (default), rank:ndcg, rank:map.

Parameters:
  • g – the graph builder to add nodes to

  • sts – shapes dict (passed through, not used internally)

  • outputs – desired output names [scores]

  • estimator – a fitted XGBRanker

  • X – input tensor name

  • name – prefix for node names added to the graph

Returns:

output tensor name (shape [N, 1])

Raises:

NotImplementedError – if the model’s objective is not supported

yobx.sklearn.xgboost.xgb.sklearn_xgb_regressor(g: GraphBuilderExtendedProtocol, sts: Dict, outputs: List[str], estimator, X: str, name: str = 'xgb_regressor') → str[source]#

Convert an xgboost.XGBRegressor or xgboost.XGBRFRegressor to ONNX.

The raw margin (sum of all tree leaf values) is computed via a TreeEnsembleRegressor / TreeEnsemble node, the XGBoost base_score bias is added, and then an objective-dependent output transform is applied to match the output of predict():

  • Identity (reg:squarederror, reg:absoluteerror, …): no transform.

  • reg:logistic: sigmoid(margin).

  • count:poisson, reg:gamma, reg:tweedie, survival:cox: exp(margin).

XGBRFRegressor is a random-forest-style variant of XGBRegressor. It sets num_parallel_tree equal to n_estimators and performs a single boosting round, so all trees contribute to the same output target and the ONNX representation is identical to the gradient-boosting case.

Unsupported objectives raise NotImplementedError.

Parameters:
  • g – the graph builder to add nodes to

  • sts – shapes dict (passed through, not used internally)

  • outputs – desired output names [predictions]

  • estimator – a fitted XGBRegressor or XGBRFRegressor

  • X – input tensor name

  • name – prefix for node names added to the graph

Returns:

output tensor name (shape [N, 1])

Raises:

NotImplementedError – if the model’s objective is not supported