yobx.sklearn.multiclass.output_code
- yobx.sklearn.multiclass.output_code.sklearn_output_code_classifier(g: GraphBuilderExtendedProtocol, sts: Dict, outputs: List[str], estimator: OutputCodeClassifier, X: str, name: str = 'output_code') -> str
Converts a sklearn.multiclass.OutputCodeClassifier into ONNX.

The converter iterates over the fitted binary sub-estimators and calls the registered converter for each one to obtain its positive-class probability (predict_proba[:, 1]). It stacks these columns into a score matrix Y of shape (N, M), where N is the number of samples and M is the number of sub-estimators, and then finds the nearest row in code_book_ using squared Euclidean distance.

Note

sklearn’s predict() uses _predict_binary(), which calls decision_function when available and falls back to predict_proba[:, 1]. This ONNX converter always uses predict_proba[:, 1] for all sub-estimators, so it matches sklearn exactly for classifiers that expose only predict_proba (such as DecisionTreeClassifier). For classifiers with a decision_function (e.g. LogisticRegression), the ONNX output may differ from sklearn’s prediction in borderline cases.

Two distance-computation paths:
With com.microsoft opset (CDist fast path):

    X --[sub-est 0]--> proba_0 (N,2) --Slice[:,1]--> pred_0 (N,1) --+
    X --[sub-est 1]--> proba_1 (N,2) --Slice[:,1]--> pred_1 (N,1) --| Concat
    ...                                                             | axis=1
    X --[sub-est M-1]-> proba_{M-1}  --Slice[:,1]--> pred_{M-1}   --+
                                                                    |
                                                                 Y (N,M)
                                                                    |
    code_book_ (C,M) --com.microsoft.CDist(sqeuclidean)---------> sq_dists (N,C)
                                                                    |
                                                   ArgMin(axis=1) --+-> label_idx (N,)
                                                                    |
                        Gather(classes_, label_idx) -------------> label (N,)

Without com.microsoft opset (standard ONNX path):

    X --[sub-est 0]--> proba_0 (N,2) --Slice[:,1]--> pred_0 (N,1) --+
    X --[sub-est 1]--> proba_1 (N,2) --Slice[:,1]--> pred_1 (N,1) --| Concat
    ...                                                             | axis=1
    X --[sub-est M-1]-> proba_{M-1}  --Slice[:,1]--> pred_{M-1}   --+
                                                                    |
                                                                 Y (N,M)
                                                                    |
    code_book_T (M,C) --MatMul(Y) ---------------------------> cross (N,C)
    y_sq (N,M) --ReduceSum(axis=1,keepdims=1) ----------------> y_norms (N,1)
    cb_sq (1,C) --------------------------------- Add(y_norms) -> y_plus_cb (N,C)
                          Sub(Mul(2, cross)) -------------> sq_dists (N,C)
                                                                    |
                                                   ArgMin(axis=1) --+-> label_idx (N,)
                                                                    |
                        Gather(classes_, label_idx) -------------> label (N,)

- Parameters:
g – the graph builder to add nodes to
sts – shapes and types defined by scikit-learn
outputs – desired output tensor names (label only; OutputCodeClassifier has no predict_proba)
estimator – a fitted OutputCodeClassifier
X – name of the input tensor
name – prefix used for names of nodes added by this converter
- Returns:
label tensor name
- Raises:
NotImplementedError – when a sub-estimator does not expose predict_proba()
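
The standard ONNX path relies on the identity ||y - c||^2 = ||y||^2 + ||c||^2 - 2 (y . c). The following is a minimal plain-Python sketch of that arithmetic, not the converter's actual code: the code book, class names, and score matrix are hypothetical values chosen for illustration, and the helper functions are named here only to mirror the ONNX nodes in the diagram.

```python
import math

# Hypothetical inputs, chosen only for illustration.
code_book = [[1.0, 0.0, 1.0],   # code word for class "a"
             [0.0, 1.0, 1.0],   # code word for class "b"
             [1.0, 1.0, 0.0]]   # code word for class "c"
classes = ["a", "b", "c"]
Y = [[0.9, 0.2, 0.8],           # stacked predict_proba[:, 1] scores, shape (N, M)
     [0.1, 0.7, 0.6]]


def sq_dists_direct(Y, code_book):
    """Squared Euclidean distance from each row of Y to each code word."""
    return [[sum((y - c) ** 2 for y, c in zip(row, code)) for code in code_book]
            for row in Y]


def sq_dists_expanded(Y, code_book):
    """Same distances via ||y||^2 + ||c||^2 - 2 * (y . c)."""
    # ReduceSum over the squared scores: one norm per sample.
    y_norms = [sum(y * y for y in row) for row in Y]
    # Squared norm of each code word (constant for a fitted model).
    cb_sq = [sum(c * c for c in code) for code in code_book]
    # MatMul of Y with the transposed code book.
    cross = [[sum(y * c for y, c in zip(row, code)) for code in code_book]
             for row in Y]
    return [[y_norms[i] + cb_sq[j] - 2.0 * cross[i][j]
             for j in range(len(code_book))]
            for i in range(len(Y))]


def nearest_labels(sq_dists, classes):
    """ArgMin over each row, then Gather from the class list."""
    return [classes[min(range(len(row)), key=row.__getitem__)]
            for row in sq_dists]


direct = sq_dists_direct(Y, code_book)
expanded = sq_dists_expanded(Y, code_book)
labels = nearest_labels(expanded, classes)
print(labels)  # ['a', 'b']
```

Both functions produce the same (N, C) distance matrix up to floating-point rounding. The expanded form needs only MatMul, ReduceSum, Add, Mul, and Sub, which is presumably why it can serve as the fallback when the com.microsoft CDist operator is unavailable.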