yobx.sklearn.ensemble.adaboost

yobx.sklearn.ensemble.adaboost.sklearn_adaboost_classifier(g: GraphBuilderExtendedProtocol, sts: Dict, outputs: List[str], estimator: AdaBoostClassifier, X: str, name: str = 'adaboost_classifier') → str | Tuple[str, str]

Converts a sklearn.ensemble.AdaBoostClassifier into ONNX.

Implements the SAMME (Stagewise Additive Modeling using a Multi-class Exponential loss function) algorithm, which is the only boosting variant supported by recent versions of scikit-learn.
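The converter reads the fitted attributes `estimators_`, `estimator_weights_` and `classes_` from the model. A minimal scikit-learn sketch of producing such an estimator (synthetic data; `algorithm="SAMME"` is passed explicitly only so that older scikit-learn versions, whose default was SAMME.R, take the same path):

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier

X, y = make_classification(
    n_samples=200, n_classes=3, n_informative=6, random_state=0
)

# algorithm="SAMME" is the default in recent scikit-learn; passing it
# explicitly keeps older versions (default SAMME.R) on the same path.
clf = AdaBoostClassifier(n_estimators=5, algorithm="SAMME", random_state=0)
clf.fit(X, y)

print(clf.classes_)            # class values gathered into the ONNX graph
print(clf.estimator_weights_)  # the w_i used for the votes
print(len(clf.estimators_))    # one base estimator per boosting stage
```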

Algorithm overview — for each base estimator i with weight wᵢ:

pred_i = estimator_i.predict(X)           # (N,) class values
vote_i[j, k] =  w_i          if pred_i[j] == classes_[k]
             = -w_i/(C-1)    otherwise    # (N, C) float

decision = Σ_i vote_i / Σ_i w_i           # (N, C)

For binary classification (C = 2), the decision is folded to 1-D (decision[:, 0] *= -1, then decision.sum(axis=1)) before the probabilities are computed.

Graph structure (multiclass, two base estimators as an example):

X ──[base est 0]──► label_0 (N,)
X ──[base est 1]──► label_1 (N,)
    label_i == classes_[k] ? w_i : -w_i/(C-1) ──► vote_i (N, C)
    Add(vote_0, vote_1) ──► decision (N, C)
    ArgMax(decision, axis=1) ──► Cast ──► Gather(classes_) ──► label (N,)
    Softmax(decision/(C-1)) ──► probabilities (N, C)
Parameters:
  • g – the graph builder to add nodes to

  • sts – shapes and types defined by scikit-learn

  • outputs – desired output tensor names; two entries (label + probabilities) or one (label only)

  • estimator – a fitted AdaBoostClassifier

  • X – name of the input tensor

  • name – prefix used for names of nodes added by this converter

Returns:

label tensor name, or tuple (label, probabilities)

yobx.sklearn.ensemble.adaboost.sklearn_adaboost_regressor(g: GraphBuilderExtendedProtocol, sts: Dict, outputs: List[str], estimator: AdaBoostRegressor, X: str, name: str = 'adaboost_regressor') → str

Converts a sklearn.ensemble.AdaBoostRegressor into ONNX.

The prediction is the weighted median of the base estimators’ predictions, following the AdaBoost.R2 algorithm as implemented by scikit-learn.

Algorithm overview:

predictions = [est_i.predict(X) for i in range(E)]    # (E, N)
sorted_idx  = argsort(predictions, axis=0)            # per-sample sort over estimators
cumsum      = cumsum(weights[sorted_idx], axis=0)     # (E, N)
median_pos  = argmax(cumsum >= 0.5 * total_weight, axis=0)   # (N,)
output      = predictions[sorted_idx[median_pos]]            # (N,), taken per sample
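The weighted-median steps above can be checked in NumPy against scikit-learn's own predictions (a sketch on synthetic regression data, using the (E, N) layout from the overview):

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import AdaBoostRegressor

X, y = make_regression(n_samples=100, n_features=4, noise=1.0, random_state=0)
reg = AdaBoostRegressor(n_estimators=5, random_state=0).fit(X, y)

P = np.array([est.predict(X) for est in reg.estimators_])  # (E, N)
w = reg.estimator_weights_[: len(reg.estimators_)]

order = np.argsort(P, axis=0)            # per-sample sort over estimators
cdf = np.cumsum(w[order], axis=0)        # (E, N) cumulative sorted weights
median_pos = (cdf >= 0.5 * w.sum()).argmax(axis=0)  # (N,) first crossing

cols = np.arange(P.shape[1])
pred = P[order[median_pos, cols], cols]  # (N,) weighted median prediction
```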

Graph structure (three base estimators as an example):

X ──[base est 0]──► pred_0 (N,)
X ──[base est 1]──► pred_1 (N,)
X ──[base est 2]──► pred_2 (N,)
       Concat(axis=1) ──► all_preds (N, E)
     TopK(k=E, asc) ──► sorted_vals (N, E), sorted_idx (N, E)
  Gather(weights, idx) ──► weights_sorted (N, E)
      CumSum(axis=1) ──► cumsum (N, E)
cumsum >= 0.5*total ──► ArgMax(axis=1) ──► median_pos (N,)
 GatherElements ──► predictions (N,)
Parameters:
  • g – the graph builder to add nodes to

  • sts – shapes and types defined by scikit-learn

  • outputs – desired output tensor names (one entry: predictions)

  • estimator – a fitted AdaBoostRegressor

  • X – name of the input tensor

  • name – prefix used for names of nodes added by this converter

Returns:

name of the predictions output tensor