yobx.sklearn.feature_extraction.tfidf_transformer

yobx.sklearn.feature_extraction.tfidf_transformer.sklearn_tfidf_transformer(g: GraphBuilderExtendedProtocol, sts: Dict, outputs: List[str], estimator: TfidfTransformer, X: str, name: str = 'tfidf_transformer') → str

Converts a fitted sklearn.feature_extraction.text.TfidfTransformer into an equivalent sequence of ONNX nodes.

The transformer applies the following steps in order:

  1. Term-frequency scaling (sublinear_tf): if True, replace each non-zero count with 1 + log(count); zero counts stay zero.

  2. IDF weighting (use_idf): if True, multiply each term-frequency value element-wise by the fitted idf_ vector.

  3. Row normalisation (norm): scale each row to unit 'l2' or 'l1' norm; None skips this step.
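The three steps above can be sketched as a hand-rolled NumPy reference for dense input (an illustration of the semantics, not the converter's code; a precomputed `idf` vector is assumed):

```python
import numpy as np

def tfidf_reference(X, idf, sublinear_tf=True, use_idf=True, norm="l2"):
    """NumPy sketch of TfidfTransformer.transform for a dense count matrix."""
    tf = np.asarray(X, dtype=np.float64).copy()
    if sublinear_tf:
        # step 1: replace each non-zero count with 1 + log(count); zeros stay zero
        nz = tf > 0
        tf[nz] = 1.0 + np.log(tf[nz])
    if use_idf:
        # step 2: element-wise multiply each row by the fitted idf_ vector
        tf = tf * idf
    if norm == "l2":
        # step 3: scale each row to unit Euclidean norm
        tf = tf / np.linalg.norm(tf, axis=1, keepdims=True)
    elif norm == "l1":
        tf = tf / np.abs(tf).sum(axis=1, keepdims=True)
    return tf

counts = np.array([[3.0, 0.0, 1.0],
                   [0.0, 2.0, 2.0]])
idf = np.array([1.0, 1.5, 2.0])
out = tfidf_reference(counts, idf)
```

Rows that are entirely zero would divide by a zero norm here; the sketch leaves that edge case unhandled for brevity.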

Graph layout (all three options active):

X ──Greater(0)───────────────┐
X ──Log──Add(1)──────────────┴──Where──Mul(idf_)──┬─────────────Div── output
                                                  └──ReduceL2──┘

Where selects 1 + log(X) wherever the Greater mask is true and 0 elsewhere; Div divides the weighted rows by their per-row ReduceL2 norms.
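The node chain can be emulated operator by operator in NumPy (a sketch of the graph's dataflow under the assumptions above, not the builder calls themselves). Note how Where keeps the -inf that Log produces at zero entries from ever reaching the output:

```python
import numpy as np

X = np.array([[3.0, 0.0, 1.0],
              [0.0, 2.0, 2.0]])
idf = np.array([1.0, 1.5, 2.0])

mask = X > 0                          # Greater(X, 0)
with np.errstate(divide="ignore"):    # Log(0) -> -inf, discarded by Where below
    logx = np.log(X)                  # Log
tf = np.where(mask, logx + 1.0, 0.0)  # Add(1), then Where(mask, ., 0)
weighted = tf * idf                   # Mul(idf_)
norms = np.sqrt((weighted ** 2).sum(axis=1, keepdims=True))  # ReduceL2 per row
out = weighted / norms                # Div -> output
```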
Parameters:
  • g – the graph builder to add nodes to

  • sts – shapes defined by scikit-learn

  • outputs – desired output names

  • estimator – a fitted TfidfTransformer

  • X – input tensor name (shape (N, F), dtype float32 or float64)

  • name – prefix for added node names

Returns:

the name of the final output tensor added to the graph
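For reference, the idf_ vector consumed in step 2 is derived at fit time from document frequencies; with scikit-learn's default smooth_idf=True it is ln((1 + n) / (1 + df)) + 1, where n is the number of documents and df the per-term document frequency. A NumPy sketch of that fit (an illustration, not the library's implementation):

```python
import numpy as np

def fit_idf(counts, smooth_idf=True):
    """Sketch of how TfidfTransformer.fit derives idf_ from a count matrix."""
    counts = np.asarray(counts)
    n = counts.shape[0]            # number of documents
    df = (counts > 0).sum(axis=0)  # document frequency per term
    if smooth_idf:
        # smoothing acts like one extra document containing every term once
        return np.log((1 + n) / (1 + df)) + 1.0
    return np.log(n / df) + 1.0

counts = np.array([[1, 0, 2],
                   [0, 1, 1],
                   [1, 1, 0]])
idf = fit_idf(counts)
```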