yobx.sklearn.cluster.kmeans

yobx.sklearn.cluster.kmeans.sklearn_kmeans(g: GraphBuilderExtendedProtocol, sts: Dict, outputs: List[str], estimator: KMeans, X: str, name: str = 'kmeans') → str | Tuple[str, str]

Converts a sklearn.cluster.KMeans into ONNX.

The converter produces up to two outputs: the predicted cluster labels (equivalent to predict()) and, when a second output is requested, the Euclidean distances from each sample to every cluster centre (equivalent to transform()).
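
The correspondence between the two outputs can be checked in scikit-learn itself, independently of the converter. The sketch below is illustrative only: the data, the number of clusters, and the variable names are assumptions, not part of the converter.

```python
import numpy as np
from sklearn.cluster import KMeans

# Illustrative data: 60 samples with 4 features (arbitrary choice).
rng = np.random.default_rng(0)
X = rng.normal(size=(60, 4))

km = KMeans(n_clusters=3, n_init=10, random_state=0).fit(X)

labels = km.predict(X)        # cluster index per sample, shape (N,)
distances = km.transform(X)   # Euclidean distance to each centre, shape (N, K)

# Each label is the index of the nearest centre.
print(np.array_equal(labels, distances.argmin(axis=1)))
print(distances.shape)
```

The ONNX graph is expected to reproduce these two arrays for the same input.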

The squared Euclidean distance between a sample x and a centre c is computed without materialising the (N, K, F) tensor of pairwise differences, using the identity:

||x - c||² = ||x||² - 2·x·cᵀ + ||c||²
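
The identity follows from expanding the squared norm as an inner product. In the derivation below, X denotes the (N, F) input and C the (K, F) matrix of centres; the symbols C and D are introduced here for illustration only.

```latex
\begin{aligned}
\lVert x - c \rVert^2 &= (x - c)\cdot(x - c) \\
                      &= x\cdot x \;-\; 2\,x\cdot c \;+\; c\cdot c \\
                      &= \lVert x \rVert^2 - 2\,x c^{\top} + \lVert c \rVert^2 ,
\end{aligned}
\qquad
D^2_{ik} = \lVert X_i \rVert^2 - 2\,(X C^{\top})_{ik} + \lVert C_k \rVert^2 .
```

Applied to all rows at once, the three terms have shapes (N, 1), (N, K) and (1, K), so the sum is an (N, K) matrix of squared distances.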

Full graph structure:

X (N,F)
  │
  ├──Mul──ReduceSum(axis=1, keepdims=1)──────────────────────────────► x_sq (N,1)
  │                                                                         │
  └──MatMul(centersᵀ)────────────────────────────────────────────────► cross (N,K)
                                                                            │
c_sq (1,K) ─────────────────────── Add(x_sq) ─── Sub(Mul(2,cross)) ──► sq_dists (N,K)
                                                                            │
                                           Sqrt ──────────────────────► distances (N,K)
                                                                            │
                                       ArgMin(axis=1) ─────────────────► labels (N,)
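
The diagram can be mirrored step by step in NumPy to see what each node computes. This is a minimal sketch rather than the converter's code: the function name kmeans_graph_reference and the test data are hypothetical, and the clamp before the square root is an added assumption that the diagram does not show.

```python
import numpy as np

def kmeans_graph_reference(X, centers):
    """NumPy mirror of the diagram; returns (labels, distances)."""
    x_sq = np.sum(X * X, axis=1, keepdims=True)         # Mul + ReduceSum -> (N, 1)
    cross = X @ centers.T                               # MatMul -> (N, K)
    c_sq = np.sum(centers * centers, axis=1)[None, :]   # precomputed constant -> (1, K)
    sq_dists = c_sq + x_sq - 2.0 * cross                # Add, then Sub -> (N, K)
    # Assumption (not in the diagram): clamp round-off negatives before Sqrt.
    distances = np.sqrt(np.maximum(sq_dists, 0.0))      # Sqrt -> (N, K)
    labels = np.argmin(distances, axis=1)               # ArgMin -> (N,)
    return labels, distances

# Illustrative data: N=8 samples, F=3 features, K=4 centres.
rng = np.random.default_rng(0)
X = rng.normal(size=(8, 3))
centers = rng.normal(size=(4, 3))
labels, distances = kmeans_graph_reference(X, centers)

# Compare against the naive computation that forms all (N, K, F) differences.
naive = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
print(np.allclose(distances, naive))
```

Because c_sq depends only on the fitted centres, it can be stored as a constant, leaving one MatMul as the dominant cost at inference time.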

Parameters:
  • g – the graph builder to add nodes to

  • sts – shapes defined by scikit-learn

  • estimator – a fitted KMeans

  • outputs – desired output names; outputs[0] receives the cluster labels and outputs[1] (if present) receives the distances matrix

  • X – input tensor name

  • name – prefix for the names of the added nodes

Returns:

The tuple of output names (labels, distances) when two outputs are requested; otherwise only the labels name.