yobx.sklearn.preprocessing.one_hot_encoder

yobx.sklearn.preprocessing.one_hot_encoder.sklearn_one_hot_encoder(g: GraphBuilderExtendedProtocol, sts: Dict, outputs: List[str], estimator: OneHotEncoder, X: str, name: str = 'one_hot_encoder') → str

Converts a sklearn.preprocessing.OneHotEncoder into ONNX.

The converter covers the following aspects of OneHotEncoder:

Multiple features – each feature column is processed independently and the resulting one-hot blocks are concatenated along the feature axis.
Drop – when drop is not None (e.g. 'first' or 'if_binary'), the converter uses the fitted drop_idx_ attribute to skip the dropped category column for each feature.
Unknown handling – because the encoding is implemented via element-wise comparisons (Equal), samples with categories that were not seen during training produce an all-zero row for that feature, matching handle_unknown='ignore' behaviour.
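
The unknown-handling behaviour can be illustrated outside ONNX. The snippet below is a minimal NumPy sketch, not the converter's code: the broadcast `==` stands in for the Equal node, and the category values and sample column are made up for illustration.

```python
import numpy as np

# Categories seen during fit for one feature, shaped (1, K) so they broadcast.
categories = np.array([[1, 2, 3]])

# One feature column, shaped (N, 1); the value 7 was never seen during fit.
col = np.array([[2], [7], [3]])

# Element-wise comparison, then cast to float: an (N, K) one-hot block.
one_hot = (col == categories).astype(np.float32)

print(one_hot)
# [[0. 1. 0.]
#  [0. 0. 0.]
#  [0. 0. 1.]]
```

The row for the unseen value 7 matches no category, so it is all zeros without any explicit branch for unknown values.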

The conversion for a single feature i with categories [c_0, c_1, …, c_{K-1}] is:

X ──Gather(col i)──► col_i (N×1)
                        │
                  Equal(col_i, [[c_0,…,c_{K-1}]])  ──► (N×K) bool
                        │
                     Cast(float)                    ──► (N×K) float
                        │
               [Gather(keep_idx)]   (only when drop≠None)
                        │
                     feature_i_out (N×K')

The per-feature outputs feature_i_out are concatenated along axis=1 to form the final output.
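
The diagram can be mirrored step by step in NumPy as a reference to compare against. This is a sketch under assumptions, not the converter's implementation: `one_hot_reference` is a hypothetical helper, `categories` and `drop_idx` are plain Python stand-ins for the fitted `categories_` and `drop_idx_` attributes, a `None` entry in `drop_idx` is taken to mean "keep every category of that feature", and the `ValueError` imitates the condition listed under Raises.

```python
import numpy as np


def one_hot_reference(X, categories, drop_idx=None):
    """NumPy emulation of the per-feature graph (hypothetical helper)."""
    blocks = []
    for i, cats in enumerate(categories):
        cats = np.asarray(cats).reshape(1, -1)       # (1, K)
        col_i = X[:, i:i + 1]                        # Gather(col i)  -> (N, 1)
        block = (col_i == cats).astype(np.float32)   # Equal + Cast   -> (N, K)
        if drop_idx is not None and drop_idx[i] is not None:
            # Assumption: keep every category index except the dropped one.
            keep_idx = np.array(
                [k for k in range(cats.shape[1]) if k != drop_idx[i]],
                dtype=np.int64,
            )
            block = np.take(block, keep_idx, axis=1)  # Gather(keep_idx) -> (N, K')
        if block.shape[1] > 0:
            blocks.append(block)
    if not blocks:
        # Assumption: same condition as the documented ValueError.
        raise ValueError("no output columns remain after applying drop")
    return np.concatenate(blocks, axis=1)             # concatenate along axis=1


X = np.array([[0, 10], [1, 20], [2, 10]])
out = one_hot_reference(X, categories=[[0, 1, 2], [10, 20]], drop_idx=[0, 0])
print(out)
# [[0. 0. 0.]
#  [1. 0. 1.]
#  [0. 1. 0.]]
```

With `drop_idx=[0, 0]` (what `drop='first'` would produce for these two features), the first feature contributes two columns and the second contributes one, so the result has shape (3, 3).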

Parameters:

g – the graph builder to add nodes to
sts – shapes defined by scikit-learn
outputs – desired output tensor names
estimator – a fitted OneHotEncoder
X – name of the input tensor
name – prefix used for names of nodes added by this converter

Returns:

name of the output tensor

Raises:

ValueError – when no output columns remain after applying drop