yobx.sklearn.ensemble.random_trees_embedding

yobx.sklearn.ensemble.random_trees_embedding.sklearn_random_trees_embedding(g: GraphBuilderExtendedProtocol, sts: Dict, outputs: List[str], estimator: RandomTreesEmbedding, X: str, name: str = 'random_trees_embedding') → str

Converts a sklearn.ensemble.RandomTreesEmbedding into ONNX.

RandomTreesEmbedding maps inputs through a forest of totally random trees. Each tree returns the leaf node id for a given sample; the leaf ids from all trees are then one-hot encoded and concatenated, yielding a high-dimensional binary embedding.
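To make the shape of this embedding concrete, here is a minimal scikit-learn sketch (not part of the converter). The data, the hyperparameters, and the variable names are assumptions chosen for illustration only. It checks that the embedding is binary and that every sample activates exactly one leaf per tree.

```python
import numpy as np
from sklearn.ensemble import RandomTreesEmbedding

# Illustrative data: 20 samples, 4 features (arbitrary choice).
rng = np.random.RandomState(0)
X = rng.rand(20, 4).astype(np.float32)

# sparse_output=False returns a dense array, which is easier to inspect.
rte = RandomTreesEmbedding(
    n_estimators=5, max_depth=3, sparse_output=False, random_state=0
).fit(X)
dense = rte.transform(X)

# One output column per leaf category known to the fitted encoder.
total_leaves = sum(len(c) for c in rte.one_hot_encoder_.categories_)

# Each tree contributes exactly one active leaf per sample.
ones_per_row = dense.sum(axis=1)

print(dense.shape, total_leaves)
```

Each row sums to n_estimators because the one-hot encoding of a tree's leaf id has a single non-zero entry.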

The conversion mirrors the scikit-learn transform implementation:

  1. For each fitted ExtraTreeRegressor in estimators_, emit a _make_leaf_id_node sub-graph that outputs the (float) leaf node id for every sample – shape (N, 1).

  2. Concatenate the per-tree leaf-id columns along axis=1 to produce a matrix of shape (N, n_estimators).

  3. Apply the fitted one_hot_encoder_ via the registered OneHotEncoder converter to obtain the final (N, total_leaves) embedding.
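The three steps above can be reproduced on the scikit-learn side as a rough reference for what the ONNX graph is expected to compute. This is a sketch under assumptions, not the converter's code: it uses the public apply method of each tree in place of the _make_leaf_id_node sub-graph, keeps leaf ids as integers (the ONNX graph uses floats), and the data and variable names are illustrative.

```python
import numpy as np
from sklearn.ensemble import RandomTreesEmbedding

# Illustrative data and model (arbitrary sizes).
rng = np.random.RandomState(0)
X = rng.rand(30, 3).astype(np.float32)
model = RandomTreesEmbedding(
    n_estimators=4, max_depth=2, sparse_output=False, random_state=0
).fit(X)

# Step 1: one (N, 1) column of leaf ids per fitted tree.
leaf_columns = [tree.apply(X).reshape(-1, 1) for tree in model.estimators_]

# Step 2: concatenate along axis=1 to get an (N, n_estimators) matrix.
leaf_ids = np.concatenate(leaf_columns, axis=1)

# Step 3: apply the fitted one-hot encoder to get (N, total_leaves).
manual = model.one_hot_encoder_.transform(leaf_ids)

# Reference result from scikit-learn's own transform.
expected = model.transform(X)
print(leaf_ids.shape, manual.shape, np.array_equal(manual, expected))
```

Comparing a converted model's output against model.transform(X) in this way is a simple sanity check for the conversion.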

Both float32 and float64 inputs are supported: the leaf-id tensors inherit the input dtype (ai.onnx.ml opset 5 path), and the one-hot indicator is cast to that same dtype, so the output dtype matches the input dtype.

Parameters:
  • g – the graph builder to add nodes to

  • sts – shapes defined by scikit-learn

  • outputs – desired output tensor names

  • estimator – a fitted RandomTreesEmbedding

  • X – name of the input tensor

  • name – prefix used for names of nodes added by this converter

Returns:

name of the output tensor