yobx.sklearn.cluster.mini_batch_kmeans

yobx.sklearn.cluster.mini_batch_kmeans.sklearn_mini_batch_kmeans(g: GraphBuilderExtendedProtocol, sts: Dict, outputs: List[str], estimator: MiniBatchKMeans, X: str, name: str = 'mini_batch_kmeans') → str | Tuple[str, str]

Converts a sklearn.cluster.MiniBatchKMeans into ONNX.

The converter produces up to two outputs: the predicted cluster labels (equivalent to predict()) and, when a second output is requested, the Euclidean distances from each sample to every cluster centre (equivalent to transform()).

One of two computation paths is used, depending on the available opsets:

With the com.microsoft opset (CDist path):

X (N,F)  centers (K,F)
      │       │
 com.microsoft.CDist(metric="euclidean") ──► distances (N,K)
                                                  │
                                       ArgMin(axis=1) ──► labels (N,)
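
The CDist path above can be mimicked in NumPy to see what the two outputs contain. This is only a sketch of the arithmetic, not the converter's code: `cdist_path` is a hypothetical helper, the sample data is made up, and the pairwise distances are computed by broadcasting rather than by the com.microsoft.CDist operator.

```python
import numpy as np


def cdist_path(X, centers):
    """Hypothetical NumPy stand-in for CDist(metric="euclidean") + ArgMin(axis=1)."""
    # Broadcast (N, 1, F) against (1, K, F) to get every sample/centre difference.
    diff = X[:, None, :] - centers[None, :, :]        # (N, K, F)
    distances = np.sqrt((diff ** 2).sum(axis=2))      # (N, K)
    labels = distances.argmin(axis=1)                 # (N,)
    return labels, distances


# Made-up data: N=3 samples, F=2 features, K=2 centres.
X = np.array([[0.0, 0.0], [3.0, 4.0], [10.0, 0.0]], dtype=np.float32)
centers = np.array([[0.0, 0.0], [10.0, 0.0]], dtype=np.float32)

labels, distances = cdist_path(X, centers)
print(labels)
print(distances)
```

Each row of `distances` holds one sample's distance to every centre, and `labels` is the column index of the smallest entry in that row.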

Without the com.microsoft opset (standard ONNX path):

X (N,F)
  │
  ├──Mul──ReduceSum(axis=1, keepdims=1)──────────────────────────────► x_sq (N,1)
  │                                                                         │
  └──MatMul(centersᵀ)────────────────────────────────────────────────► cross (N,K)
                                                                            │
c_sq (1,K) ─────────────────────── Add(x_sq) ─── Sub(Mul(2,cross)) ──► sq_dists (N,K)
                                                                            │
                                           Sqrt ──────────────────────► distances (N,K)
                                                                            │
                                       ArgMin(axis=1) ─────────────────► labels (N,)
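
The standard ONNX path relies on the expansion ‖x − c‖² = ‖x‖² + ‖c‖² − 2·x·c, which avoids materialising an (N, K, F) tensor. The NumPy sketch below follows the diagram node by node and checks the result against a direct distance computation. `standard_path` is a hypothetical helper and the random data is illustrative; `c_sq` is computed inside the helper here, whereas the diagram shows it as a ready-made (1, K) input.

```python
import numpy as np


def standard_path(X, centers):
    """Hypothetical NumPy mirror of the Mul/ReduceSum/MatMul/Add/Sub/Sqrt/ArgMin chain."""
    x_sq = (X * X).sum(axis=1, keepdims=True)            # Mul + ReduceSum -> (N, 1)
    c_sq = (centers * centers).sum(axis=1)[None, :]      # squared centre norms -> (1, K)
    cross = X @ centers.T                                # MatMul(centers^T) -> (N, K)
    sq_dists = c_sq + x_sq - 2.0 * cross                 # Add, then Sub(Mul(2, cross))
    distances = np.sqrt(sq_dists)                        # Sqrt -> (N, K)
    labels = distances.argmin(axis=1)                    # ArgMin(axis=1) -> (N,)
    return labels, distances


# Illustrative data: N=6 samples, F=3 features, K=4 centres.
rng = np.random.default_rng(0)
X = rng.normal(size=(6, 3))
centers = rng.normal(size=(4, 3))

labels, distances = standard_path(X, centers)

# Reference: direct pairwise Euclidean distances via broadcasting.
direct = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
print(np.allclose(distances, direct))
```

One caveat of the expanded form: when a sample coincides (almost) exactly with a centre, rounding can make `sq_dists` slightly negative, so Sqrt would yield NaN. The diagram shows no clamping step, and this page does not say whether the converter guards against that case.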

Parameters:
  • g – the graph builder to add nodes to

  • sts – shapes defined by scikit-learn

  • estimator – a fitted MiniBatchKMeans

  • outputs – desired output names; outputs[0] receives the cluster labels and outputs[1] (if present) receives the distances matrix

  • X – input tensor name

  • name – name prefix for the added nodes

Returns:

A tuple of output names (labels, distances) when two outputs are requested; otherwise only the labels output name.