yobx.sklearn.ensemble.random_trees_embedding
- yobx.sklearn.ensemble.random_trees_embedding.sklearn_random_trees_embedding(g: GraphBuilderExtendedProtocol, sts: Dict, outputs: List[str], estimator: RandomTreesEmbedding, X: str, name: str = 'random_trees_embedding') → str
Converts a sklearn.ensemble.RandomTreesEmbedding into ONNX.

RandomTreesEmbedding maps inputs through a forest of totally random trees. Each tree returns the leaf node id for a given sample; the leaf ids from all trees are then one-hot encoded and concatenated, yielding a high-dimensional binary embedding.

The conversion mirrors the scikit-learn transform implementation:

1. For each fitted ExtraTreeRegressor in estimators_, emit a _make_leaf_id_node sub-graph that outputs the (float) leaf node id for every sample, with shape (N, 1).
2. Concatenate the per-tree leaf-id columns along axis=1 to produce a matrix of shape (N, n_estimators).
3. Apply the fitted one_hot_encoder_ via the registered OneHotEncoder converter to obtain the final (N, total_leaves) embedding.
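
The scikit-learn side of these three steps can be reproduced with plain NumPy, which is useful as a reference when checking a converted model. The sketch below is illustrative only: it uses public scikit-learn calls (apply, estimators_, one_hot_encoder_) and does not call the converter or any yobx API. The data, the hyperparameters, and the variable names (leaf_ids, manual, total_leaves) are assumptions chosen for the example.

```python
import numpy as np
from sklearn.ensemble import RandomTreesEmbedding

# Toy data and a small forest; the values are arbitrary, for illustration.
rng = np.random.RandomState(0)
X = rng.randn(20, 4).astype(np.float32)
model = RandomTreesEmbedding(
    n_estimators=3, max_depth=2, sparse_output=False, random_state=0
).fit(X)

# Steps 1 and 2: one leaf-id column per tree, stacked along axis=1,
# giving a matrix of shape (N, n_estimators).
leaf_ids = np.column_stack([tree.apply(X) for tree in model.estimators_])

# Step 3: the fitted encoder turns leaf ids into the binary embedding
# of shape (N, total_leaves).
manual = model.one_hot_encoder_.transform(leaf_ids)
total_leaves = sum(len(c) for c in model.one_hot_encoder_.categories_)

# Reference result from scikit-learn's own transform.
expected = model.transform(X)
```

Here manual and expected should agree element for element, and every row of the embedding should contain exactly n_estimators ones, one per tree.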
Both float32 and float64 inputs are handled correctly: the leaf-id tensors inherit the input dtype (ai.onnx.ml opset 5 path), and the one-hot indicator cast respects that dtype too, so the output dtype matches the input dtype.
- Parameters:
g – the graph builder to add nodes to
sts – shapes defined by scikit-learn
outputs – desired output tensor names
estimator – a fitted RandomTreesEmbedding
X – name of the input tensor
name – prefix used for names of nodes added by this converter
- Returns:
name of the output tensor