yobx.sklearn.cluster.feature_agglomeration

yobx.sklearn.cluster.feature_agglomeration.sklearn_feature_agglomeration(g: GraphBuilderExtendedProtocol, sts: Dict, outputs: List[str], estimator: FeatureAgglomeration, X: str, name: str = 'feature_agglomeration') → str

Converts a sklearn.cluster.FeatureAgglomeration into ONNX.

The converter replicates transform(), which pools features that belong to the same cluster using pooling_func (default: numpy.mean). The fitted labels_ attribute assigns each input feature to a cluster index in [0, n_clusters).

Supported pooling functions:

  • numpy.mean — implemented as a single MatMul with a precomputed weight matrix W of shape (n_features, n_clusters) where W[i, c] = 1 / count_c when labels_[i] == c and 0 otherwise. This replicates the fast bincount-based path in scikit-learn.

  • numpy.max — implemented as per-cluster Gather + ReduceMax(axis=1) followed by a Concat.

  • numpy.min — implemented as per-cluster Gather + ReduceMin(axis=1) followed by a Concat.

**numpy.mean path**

X (N, F)
  │
  └──MatMul(W)──► transform_output (N, n_clusters)

where W (F, n_clusters): W[i, c] = 1/count_c if labels_[i]==c else 0,
and count_c is the number of input features assigned to cluster c
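
The weight matrix described above can be sketched in plain NumPy. This is only an illustration of the arithmetic, not the converter's actual code: the helper name `mean_pooling_weights`, the hand-written `labels` array (standing in for a fitted `labels_`), and the sample `X` are all assumptions made for the example.

```python
import numpy as np


def mean_pooling_weights(labels, n_clusters):
    """Build W (n_features, n_clusters) with W[i, c] = 1 / count_c if labels[i] == c.

    Hypothetical helper for illustration; `labels` plays the role of labels_.
    """
    labels = np.asarray(labels)
    # count_c: number of input features assigned to each cluster
    counts = np.bincount(labels, minlength=n_clusters)
    W = np.zeros((labels.shape[0], n_clusters), dtype=np.float32)
    # One non-zero entry per row i, located in column labels[i]
    W[np.arange(labels.shape[0]), labels] = 1.0 / counts[labels]
    return W


# Five features grouped into three clusters (assumed labels, not a fitted model)
labels = np.array([0, 1, 0, 2, 1])
X = np.array([[1, 2, 3, 4, 5], [10, 20, 30, 40, 50]], dtype=np.float32)

W = mean_pooling_weights(labels, n_clusters=3)
pooled = X @ W  # what the single MatMul node computes, shape (N, n_clusters)

# Reference: explicit per-cluster mean over the selected columns
reference = np.stack([X[:, labels == c].mean(axis=1) for c in range(3)], axis=1)
assert np.allclose(pooled, reference)
```

Because each row of W has exactly one non-zero entry, the matrix product sums the columns of each cluster and divides by the cluster size in one step, which is why a single MatMul is enough for the mean.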

**numpy.max / numpy.min path**

X (N, F)
  │
  ├──Gather(cols_0, axis=1)──ReduceMax/Min(axis=1)──► cluster_0 (N,1)
  ├──Gather(cols_1, axis=1)──ReduceMax/Min(axis=1)──► cluster_1 (N,1)
  │  …
  └──Concat(axis=1)─────────────────────────────────► transform_output (N, C)
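
The Gather / Reduce / Concat pattern in the diagram can be mimicked with NumPy to check the expected shapes and values. This is a sketch under assumptions: `pool_per_cluster` is a hypothetical helper, `labels` is a hand-written stand-in for `labels_`, and the reduction is assumed to keep the reduced axis so that each cluster yields an (N, 1) column as the diagram shows.

```python
import numpy as np


def pool_per_cluster(X, labels, n_clusters, reduce_fn=np.max):
    """Emulate per-cluster Gather(axis=1) -> Reduce(axis=1) -> Concat(axis=1).

    Hypothetical helper for illustration only.
    """
    labels = np.asarray(labels)
    parts = []
    for c in range(n_clusters):
        cols = np.flatnonzero(labels == c)      # column indices of cluster c
        gathered = np.take(X, cols, axis=1)     # Gather(cols_c, axis=1)
        # ReduceMax/ReduceMin over axis 1, keeping the axis: shape (N, 1)
        parts.append(reduce_fn(gathered, axis=1, keepdims=True))
    return np.concatenate(parts, axis=1)        # Concat(axis=1): shape (N, C)


labels = np.array([0, 1, 0, 2, 1])
X = np.array([[1, 2, 3, 4, 5], [10, 20, 30, 40, 50]], dtype=np.float32)

pooled_max = pool_per_cluster(X, labels, n_clusters=3, reduce_fn=np.max)
pooled_min = pool_per_cluster(X, labels, n_clusters=3, reduce_fn=np.min)
```

Unlike the mean, max and min are not linear in X, so they cannot be folded into one weight matrix; the cost is one Gather and one Reduce node per cluster.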

Parameters:
  • g – the graph builder to add nodes to

  • sts – shapes defined by scikit-learn

  • outputs – desired output names; outputs[0] receives the transformed feature matrix

  • estimator – a fitted FeatureAgglomeration

  • X – input tensor name

  • name – prefix for added node names

Returns:

name of the output tensor of shape (N, n_clusters)