yobx.sklearn.category_encoders.binary_encoder
- yobx.sklearn.category_encoders.binary_encoder.category_encoders_binary_encoder(g: GraphBuilderExtendedProtocol, sts: Dict, outputs: List[str], estimator: BinaryEncoder, X: str, name: str = 'binary_encoder') → str
Converts a category_encoders.BinaryEncoder into ONNX.
Each categorical column is replaced by a block of binary indicator columns that encode the ordinal index of the category value in base 2 (MSB first). Non-categorical columns pass through unchanged.
X ──col_j (categorical, K cats)──► bit_0 (MSB)      (N, 1)
                                   bit_1            (N, 1)
                                   ...
                                   bit_(B-1) (LSB)  (N, 1)
                                   Concat(bit_0 ... bit_(B-1), axis=1)──► block (N, B)
X ──col_k (numerical)──► unchanged (N, 1)
Concat(all blocks and pass-through cols, axis=1)──► output (N, F_out)
where B is the number of bits required to represent the largest ordinal in binary (max_ordinal.bit_length(); e.g. 4 categories with ordinals 1–4 give B = 3) and F_out is the total number of output columns.
The conversion reads the fitted ordinal_encoder attribute to determine the known category values and their ordinal assignments.
Unknown categories (values not seen during training):
handle_unknown='value' (default): all binary columns for that row are 0.
handle_unknown='return_nan': all binary columns for that row are NaN.
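The block layout and the unknown-category behaviour described above can be illustrated with a small pure-Python sketch. It is not the converter's implementation and builds no ONNX nodes. The helper names `bit_width` and `encode_column`, and the `ordinals` dictionary standing in for the fitted category-to-ordinal mapping, are hypothetical. Known categories are assumed to carry ordinals 1..K.

```python
import math


def bit_width(max_ordinal: int) -> int:
    """Number of bits needed to write max_ordinal in base 2."""
    return max_ordinal.bit_length()


def encode_column(values, ordinals, handle_unknown="value"):
    """Return one row of B floats (MSB first) per input value.

    `ordinals` is a hypothetical stand-in for the fitted
    category -> ordinal mapping (assumed to start at 1).
    """
    n_bits = bit_width(max(ordinals.values()))
    rows = []
    for value in values:
        ordinal = ordinals.get(value)
        if ordinal is None:
            # Unknown category: all zeros or all NaN, depending on the option.
            fill = math.nan if handle_unknown == "return_nan" else 0.0
            rows.append([fill] * n_bits)
            continue
        # Extract bits from the most significant to the least significant.
        rows.append(
            [float((ordinal >> shift) & 1) for shift in range(n_bits - 1, -1, -1)]
        )
    return rows


# Four categories with ordinals 1-4 need B = 3 bits.
ordinals = {"red": 1, "green": 2, "blue": 3, "black": 4}
block = encode_column(["red", "black", "pink"], ordinals)
```

Here `"red"` (ordinal 1) becomes `[0, 0, 1]`, `"black"` (ordinal 4) becomes `[1, 0, 0]`, and the unseen value `"pink"` becomes a row of zeros, or a row of NaN when `handle_unknown="return_nan"` is passed.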
Missing values (NaN inputs):
handle_missing='value' (default): all binary columns for that row are 0.
handle_missing='return_nan': all binary columns for that row are NaN.
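As a rough illustration of how missing values and pass-through columns combine into the final output, the following pure-Python sketch assembles full rows of width F_out. It only mirrors the behaviour documented above and does not reproduce the ONNX graph the converter builds. The names `is_missing`, `encode_rows`, and `ordinals_by_col` are hypothetical, and ordinals are again assumed to start at 1.

```python
import math


def is_missing(value) -> bool:
    """True for None or a float NaN."""
    return value is None or (isinstance(value, float) and math.isnan(value))


def encode_rows(rows, ordinals_by_col, handle_missing="value"):
    """Replace each categorical column with its bit block, keep the others.

    `ordinals_by_col` maps a column index to a hypothetical
    category -> ordinal mapping (ordinals assumed to start at 1).
    """
    out = []
    for row in rows:
        encoded = []
        for j, value in enumerate(row):
            mapping = ordinals_by_col.get(j)
            if mapping is None:
                # Numerical column: passes through unchanged.
                encoded.append(float(value))
                continue
            n_bits = max(mapping.values()).bit_length()
            if is_missing(value):
                # Missing input: all zeros or all NaN, depending on the option.
                fill = math.nan if handle_missing == "return_nan" else 0.0
                encoded.extend([fill] * n_bits)
            else:
                # Simplification: unknown categories get ordinal 0 (all zeros).
                ordinal = mapping.get(value, 0)
                encoded.extend(
                    float((ordinal >> s) & 1) for s in range(n_bits - 1, -1, -1)
                )
        out.append(encoded)
    return out


# Column 0 is categorical (3 categories, so B = 2); column 1 is numerical.
rows = [["a", 1.5], ["c", 2.0], [math.nan, 3.0]]
ordinals_by_col = {0: {"a": 1, "b": 2, "c": 3}}
encoded = encode_rows(rows, ordinals_by_col)
```

In this example F_out is 3: two bit columns for the categorical input plus one pass-through column. The row with a NaN category yields `[0, 0, 3.0]` by default, and `[NaN, NaN, 3.0]` with `handle_missing="return_nan"`.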
- Parameters:
g – the graph builder to add nodes to
sts – shapes defined by scikit-learn
outputs – desired output tensor names
estimator – a fitted BinaryEncoder
X – name of the input tensor (shape (N, F))
name – prefix used for names of nodes added by this converter
- Returns:
name of the output tensor
- Raises:
AssertionError – if estimator is not fitted or type info is missing from the graph