yobx.sklearn.preprocessing.kbins_discretizer
- yobx.sklearn.preprocessing.kbins_discretizer.sklearn_kbins_discretizer(g: GraphBuilderExtendedProtocol, sts: Dict, outputs: List[str], estimator: KBinsDiscretizer, X: str, name: str = 'kbins') -> str
Converts a sklearn.preprocessing.KBinsDiscretizer into ONNX.
Supported values of the encode hyperparameter:
- 'ordinal': each feature is replaced by its 0-based integer bin index, cast to the input floating-point dtype. Output shape: (N, F).
- 'onehot-dense' and 'onehot': each feature is one-hot encoded into n_bins_[j] columns. The one-hot blocks are concatenated along axis 1. Output shape: (N, sum(n_bins_)).
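The two output layouts can be illustrated with a small NumPy sketch. It is not the converter's code: the helper names `ordinal_output` and `onehot_output` are hypothetical, the bin indices are assumed to be already computed, and float32 is assumed as the input dtype.

```python
import numpy as np


def ordinal_output(bin_indices, dtype=np.float32):
    # 'ordinal': keep one column per feature, cast indices to the float dtype.
    return bin_indices.astype(dtype)


def onehot_output(bin_indices, n_bins, dtype=np.float32):
    # 'onehot' / 'onehot-dense': one block of n_bins[j] columns per feature,
    # concatenated along axis 1.
    blocks = [
        np.eye(int(n), dtype=dtype)[bin_indices[:, j]]
        for j, n in enumerate(n_bins)
    ]
    return np.concatenate(blocks, axis=1)


# Made-up data: N=3 samples, F=2 features with 3 and 2 bins.
bin_indices = np.array([[0, 1], [2, 0], [1, 1]], dtype=np.int64)
n_bins = np.array([3, 2])  # plays the role of estimator.n_bins_

print(ordinal_output(bin_indices).shape)         # (N, F)
print(onehot_output(bin_indices, n_bins).shape)  # (N, sum(n_bins))
```

Here the one-hot result has 3 + 2 = 5 columns, matching the (N, sum(n_bins_)) shape stated above.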
The bin index for feature j is computed by counting how many interior thresholds (bin_edges_[j][1:-1]) are less than or equal to the sample value:

    X (N, F)
      │
      └─Unsqueeze(axis=2)──► X_exp (N, F, 1)
                                  │
    thresholds (1, F, T) ─────────┤
                                  ▼
                        GreaterOrEqual ──► (N, F, T) bool
                                  │
                             Cast(int64)
                                  │
                        ReduceSum(axis=2) ──► bin_indices (N, F) int64
                                  │
                           Min / Max clip
                                  │
        [ordinal]         Cast(float) ──► output (N, F)
        [onehot(-dense)]  OneHot + Concat ──► output (N, sum(n_bins_))

Interior thresholds for features that have fewer bins than the maximum are padded with +inf so that the excess comparisons always yield False and contribute 0 to the sum.
- Parameters:
g – the graph builder to add nodes to
sts – shapes defined by scikit-learn
estimator – a fitted KBinsDiscretizer
outputs – desired output names
X – input tensor name
name – prefix name for the added nodes
- Returns:
output name
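The bin-index computation described above (broadcast comparison against +inf-padded thresholds, then a sum and a clip) can be mimicked in NumPy. This is a minimal sketch for intuition, not the ONNX graph the converter emits: the function name `bin_indices_by_threshold_count` is hypothetical, and `bin_edges` is assumed to have the same layout as the fitted estimator's `bin_edges_` attribute.

```python
import numpy as np


def bin_indices_by_threshold_count(X, bin_edges):
    """X: (N, F) float array; bin_edges: list of F 1-D edge arrays."""
    n_bins = np.array([len(e) - 1 for e in bin_edges], dtype=np.int64)
    # Interior thresholds drop the first and last edge of every feature.
    interior = [np.asarray(e, dtype=X.dtype)[1:-1] for e in bin_edges]
    T = max(len(t) for t in interior)

    # Pad to a common length T with +inf: x >= +inf is always False,
    # so padded slots add 0 to the count.
    thresholds = np.full((1, len(interior), T), np.inf, dtype=X.dtype)
    for j, t in enumerate(interior):
        thresholds[0, j, : len(t)] = t

    X_exp = X[:, :, None]                    # Unsqueeze(axis=2): (N, F, 1)
    ge = X_exp >= thresholds                 # GreaterOrEqual: (N, F, T) bool
    counts = ge.astype(np.int64).sum(axis=2) # Cast + ReduceSum: (N, F)
    return np.clip(counts, 0, n_bins - 1)    # Min / Max clip


# Made-up edges: feature 0 has 3 bins, feature 1 has 2 bins.
bin_edges = [np.array([0.0, 1.0, 2.0, 3.0]), np.array([0.0, 5.0, 10.0])]
X = np.array(
    [[0.5, 7.0], [1.0, 2.0], [2.9, 5.0], [-1.0, 12.0]], dtype=np.float32
)
print(bin_indices_by_threshold_count(X, bin_edges))
```

Values below the first edge or above the last edge fall into the first or last bin, because only interior thresholds take part in the count.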