yobx.sklearn.feature_extraction.tfidf_transformer

yobx.sklearn.feature_extraction.tfidf_transformer.sklearn_tfidf_transformer(g: GraphBuilderExtendedProtocol, sts: Dict, outputs: List[str], estimator: TfidfTransformer, X: str, name: str = 'tfidf_transformer') → str

Converts a fitted sklearn.feature_extraction.text.TfidfTransformer into an equivalent sequence of ONNX nodes.

The transformer applies the following steps in order:

  1. Term-frequency scaling (sublinear_tf): if True, replace each non-zero count with 1 + log(count); zero counts stay zero.

  2. IDF weighting (use_idf): if True, multiply each term-frequency value element-wise by the fitted idf_ vector.

  3. Row normalisation (norm): scale each row to unit 'l2' or 'l1' norm; None skips this step.
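The three steps above can be sketched as a hand-rolled NumPy reference for dense input (an illustration of the semantics, not the converter's code; a precomputed `idf` vector is assumed):

```python
import numpy as np

def tfidf_reference(X, idf, sublinear_tf=True, use_idf=True, norm="l2"):
    """NumPy sketch of TfidfTransformer.transform for a dense count matrix."""
    tf = np.asarray(X, dtype=np.float64).copy()
    if sublinear_tf:
        # step 1: replace each non-zero count with 1 + log(count); zeros stay zero
        nz = tf > 0
        tf[nz] = 1.0 + np.log(tf[nz])
    if use_idf:
        # step 2: element-wise multiply each row by the fitted idf_ vector
        tf = tf * idf
    if norm == "l2":
        # step 3: scale each row to unit Euclidean norm
        tf = tf / np.linalg.norm(tf, axis=1, keepdims=True)
    elif norm == "l1":
        tf = tf / np.abs(tf).sum(axis=1, keepdims=True)
    return tf

counts = np.array([[3.0, 0.0, 1.0],
                   [0.0, 2.0, 2.0]])
idf = np.array([1.0, 1.5, 2.0])
out = tfidf_reference(counts, idf)
```

Rows that are entirely zero would divide by a zero norm here; the sketch leaves that edge case unhandled for brevity.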

Graph layout (all three options active):

X ──Greater(0)───────────────┐
X ──Log──Add(1)──────────────┴──Where──Mul(idf_)──┬─────────────Div── output
                                                  └──ReduceL2──┘

Where selects 1 + log(X) wherever the Greater mask is true and 0 elsewhere; Div divides the weighted rows by their per-row ReduceL2 norms.
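The node chain can be emulated operator by operator in NumPy (a sketch of the graph's dataflow under the assumptions above, not the builder calls themselves). Note how Where keeps the -inf that Log produces at zero entries from ever reaching the output:

```python
import numpy as np

X = np.array([[3.0, 0.0, 1.0],
              [0.0, 2.0, 2.0]])
idf = np.array([1.0, 1.5, 2.0])

mask = X > 0                          # Greater(X, 0)
with np.errstate(divide="ignore"):    # Log(0) -> -inf, discarded by Where below
    logx = np.log(X)                  # Log
tf = np.where(mask, logx + 1.0, 0.0)  # Add(1), then Where(mask, ., 0)
weighted = tf * idf                   # Mul(idf_)
norms = np.sqrt((weighted ** 2).sum(axis=1, keepdims=True))  # ReduceL2 per row
out = weighted / norms                # Div -> output
```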
Parameters:
  • g – the graph builder to add nodes to

  • sts – shapes defined by scikit-learn

  • outputs – desired output names

  • estimator – a fitted TfidfTransformer

  • X – input tensor name (shape (N, F), dtype float32 or float64)

  • name – prefix for added node names

Returns:

the name of the final output tensor added to the graph
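For reference, the idf_ vector consumed in step 2 is derived at fit time from document frequencies; with scikit-learn's default smooth_idf=True it is ln((1 + n) / (1 + df)) + 1, where n is the number of documents and df the per-term document frequency. A NumPy sketch of that fit (an illustration, not the library's implementation):

```python
import numpy as np

def fit_idf(counts, smooth_idf=True):
    """Sketch of how TfidfTransformer.fit derives idf_ from a count matrix."""
    counts = np.asarray(counts)
    n = counts.shape[0]            # number of documents
    df = (counts > 0).sum(axis=0)  # document frequency per term
    if smooth_idf:
        # smoothing acts like one extra document containing every term once
        return np.log((1 + n) / (1 + df)) + 1.0
    return np.log(n / df) + 1.0

counts = np.array([[1, 0, 2],
                   [0, 1, 1],
                   [1, 1, 0]])
idf = fit_idf(counts)
```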