yobx.helpers.stats_helper#

ModelStatistics#

class yobx.helpers.ModelStatistics(model: ModelProto | GraphBuilderExtendedProtocol, verbose: int = 0)[source]#

Computes statistics on an ONNX model, including node counts per op_type and estimated FLOPs.

model may be either an onnx.ModelProto or a GraphBuilderExtendedProtocol instance. When a graph builder is provided its already-computed shape information is used directly (no second shape-inference pass is run) and the ONNX model is obtained via to_onnx().

Parameters:
  • model – ONNX model or graph builder

  • verbose – verbosity level passed to BasicShapeBuilder (ignored when model is a graph builder)

Usage:

stats = ModelStatistics(model).compute()

compute() Dict[str, Any][source]#

Runs the full analysis and returns a statistics dictionary.

Returns:

dictionary with the following keys:

  • "n_nodes" – total number of nodes

  • "node_count_per_op_type" – dict mapping op_type → count

  • "total_estimated_flops" – total estimated FLOPs (int or None)

  • "flops_per_op_type" – dict mapping op_type → estimated FLOPs (None means the cost could not be estimated for some nodes)

  • "node_stats" – list of per-node dicts with keys op_type, name, inputs, outputs, estimated_flops
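The returned dictionary can be post-processed with plain Python. The sketch below is only an illustration of how the documented keys might be consumed: the `stats` dictionary is hand-written with made-up values (only the keys used are filled in), and the helper names `rank_op_types` and `unestimated_op_types` are hypothetical, not part of yobx.

```python
from typing import Any, Dict, List, Tuple

# Hand-written stand-in for the dictionary returned by compute().
# The numbers are invented for illustration; only the keys used below appear.
stats: Dict[str, Any] = {
    "node_count_per_op_type": {"MatMul": 2, "Add": 1, "Custom": 1},
    "flops_per_op_type": {"MatMul": 4096, "Add": 64, "Custom": None},
}


def rank_op_types(stats: Dict[str, Any]) -> List[Tuple[str, int, int]]:
    """Returns (op_type, node_count, flops) rows sorted by decreasing FLOPs.

    Op types whose cost is None (not estimated) are left out.
    """
    counts = stats["node_count_per_op_type"]
    rows = [
        (op, counts[op], flops)
        for op, flops in stats["flops_per_op_type"].items()
        if flops is not None
    ]
    return sorted(rows, key=lambda row: row[2], reverse=True)


def unestimated_op_types(stats: Dict[str, Any]) -> List[str]:
    """Lists the op types for which no FLOPs estimate is available."""
    return sorted(
        op for op, flops in stats["flops_per_op_type"].items() if flops is None
    )


ranking = rank_op_types(stats)
missing = unestimated_op_types(stats)
```

Separating the op types with a `None` cost keeps the ranking honest: a missing estimate is reported as missing rather than treated as zero.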

literal_fn(name: str) Tuple[int, ...] | None[source]#

Returns the integer values stored in a 1-D integer constant tensor (e.g. the shape input of a Reshape node), or None when name is not a known constant.

shape_fn(name: str) Tuple | None[source]#

Returns the inferred shape of name, or None if unknown.

model_statistics#

yobx.helpers.model_statistics(model: ModelProto | GraphBuilderExtendedProtocol, verbose: int = 0) Dict[str, Any][source]#

Computes statistics on an ONNX model.

This is a convenience wrapper around ModelStatistics.

Parameters:
  • model – ONNX model or graph builder

  • verbose – verbosity level

Returns:

statistics dictionary — see ModelStatistics.compute() for details

NodeStatistics#

class yobx.helpers.NodeStatistics(parent: GraphProto | FunctionProto, node: NodeProto)[source]#

Stores per-node statistics for an onnx.NodeProto.

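Parameters:
  • parent – the GraphProto or FunctionProto that contains node

  • node – the NodeProto the statistics describe

property dict_values: Dict[str, Any]#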

Returns the statistics as a flat dictionary for DataFrame construction.

TreeStatistics#

class yobx.helpers.TreeStatistics(node: NodeProto, tree_id: int)[source]#

Stores per-tree statistics extracted from TreeEnsemble* operators.

Parameters:
  • node – the TreeEnsembleClassifier or TreeEnsembleRegressor node

  • tree_id – zero-based index of this tree within the ensemble

property dict_values: Dict[str, Any]#

Returns the statistics as a flat dictionary for DataFrame construction.

HistTreeStatistics#

class yobx.helpers.HistTreeStatistics(node: NodeProto, featureid: int, values: ndarray, bins: int = 20)[source]#

Stores threshold-distribution statistics for a single feature across all trees in a TreeEnsemble* node.

Parameters:
  • node – the TreeEnsembleClassifier or TreeEnsembleRegressor node

  • featureid – zero-based feature index

  • values – array of threshold values for featureid

  • bins – number of histogram bins (default 20)

property dict_values: Dict[str, Any]#

Returns the statistics as a flat dictionary for DataFrame construction.
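To make the role of `bins` concrete, here is a minimal sketch of an equal-width histogram over threshold values. It is an assumption about what a threshold distribution could look like, not the class's actual computation, and the function name `threshold_histogram` is hypothetical. Which statistics `HistTreeStatistics` really stores is not listed in this page.

```python
from typing import List, Sequence, Tuple


def threshold_histogram(
    values: Sequence[float], bins: int = 20
) -> Tuple[List[int], List[float]]:
    """Equal-width histogram: returns (counts, bin_edges)."""
    if len(values) == 0:
        raise ValueError("values must not be empty")
    lo, hi = min(values), max(values)
    width = (hi - lo) / bins
    if width == 0:
        # All thresholds are identical: fall back to a unit-width bin.
        width = 1.0
    edges = [lo + i * width for i in range(bins + 1)]
    counts = [0] * bins
    for v in values:
        # The maximum value is clamped into the last bin.
        idx = min(int((v - lo) / width), bins - 1)
        counts[idx] += 1
    return counts, edges


counts, edges = threshold_histogram([0.0, 0.5, 1.0, 1.0, 2.0], bins=4)
```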

HistStatistics#

class yobx.helpers.HistStatistics(parent: GraphProto | FunctionProto, node: NodeProto | TensorProto | SparseTensorProto, bins: int = 20)[source]#

Stores distribution statistics for a constant tensor (initializer or Constant node).

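Parameters:
  • parent – the GraphProto or FunctionProto that contains node

  • node – the initializer (TensorProto or SparseTensorProto) or Constant node (NodeProto) holding the tensor

  • bins – number of histogram bins (default 20)

property dict_values: Dict[str, Any]#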

Returns the statistics as a flat dictionary for DataFrame construction.

property name: str#

Returns the tensor name.

extract_attributes#

yobx.helpers.extract_attributes(node: NodeProto) Dict[str, Any][source]#

Extracts all attributes of a node into a plain Python/NumPy dictionary.

Delegates to attr_proto_to_python() for scalar and tensor attribute types. List-typed attributes (INTS, FLOATS, STRINGS) are returned as NumPy arrays so that callers can use boolean-mask indexing directly. GRAPH and ref-attribute entries are stored as None.

Parameters:

node – node to inspect

Returns:

dictionary mapping attribute name to a Python/NumPy value, or None for graph and ref-attribute entries.

stats_tree_ensemble#

yobx.helpers.stats_tree_ensemble(parent: GraphProto | FunctionProto, node: NodeProto) NodeStatistics[source]#

Computes statistics on every tree of a TreeEnsembleClassifier, TreeEnsembleRegressor, or TreeEnsemble (ai.onnx.ml opset 5) node.

The returned NodeStatistics instance contains the following entries:

  • "kind" – "Classifier", "Regressor", or "TreeEnsemble"

  • "n_trees" – total number of trees

  • "n_outputs" – number of outputs / classes

  • "max_featureid" – maximum feature index used across all nodes

  • "n_features" – number of distinct features used across all nodes

  • "n_rules" – number of distinct node modes (split types) used

  • "rules" – set of node mode strings (e.g. {"BRANCH_LEQ", "LEAF"})

  • "hist_rules" – collections.Counter of node mode frequencies

  • "features" – list of HistTreeStatistics, one per feature

  • "trees" – list of TreeStatistics, one per tree

Each TreeStatistics in "trees" contains:

  • "n_nodes" – total nodes in the tree

  • "n_leaves" – leaf nodes

  • "max_featureid" – maximum feature index

  • "n_features" – distinct feature count

  • "n_rules" – distinct split-mode count

  • "rules" – set of mode strings

  • "hist_rules" – collections.Counter of mode frequencies

For TreeEnsembleClassifier / TreeEnsembleRegressor (ai.onnx.ml opset ≤ 4) the legacy flat nodes_treeids / nodes_values / string nodes_modes attributes are used. For the unified TreeEnsemble operator (ai.onnx.ml opset ≥ 5) the tree_roots / nodes_splits / nodes_modes (UINT8 tensor) attributes are used instead.

Parameters:
  • parent – the GraphProto or FunctionProto that contains node

  • node – a TreeEnsembleClassifier, TreeEnsembleRegressor, or TreeEnsemble node

Returns:

NodeStatistics populated with the statistics listed above

Raises:

KeyError – if required tree-structure attributes are missing from node
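The per-tree entries listed above can be pictured as a group-by over the legacy flat attributes. The following sketch works on plain Python lists rather than on a NodeProto and is not the library's implementation. It assumes that leaf nodes are ignored when counting features, that a tree without split nodes reports -1 as `max_featureid`, and it uses `nodes_featureids`, an attribute of the legacy operators that this page does not mention. The function name `per_tree_stats` is hypothetical.

```python
from collections import Counter
from typing import Any, Dict, List, Sequence


def per_tree_stats(
    nodes_treeids: Sequence[int],
    nodes_featureids: Sequence[int],
    nodes_modes: Sequence[str],
) -> List[Dict[str, Any]]:
    """Groups the flat per-node attributes by tree id and summarises each tree."""
    trees: Dict[int, Dict[str, list]] = {}
    for tid, fid, mode in zip(nodes_treeids, nodes_featureids, nodes_modes):
        entry = trees.setdefault(tid, {"features": [], "modes": []})
        entry["modes"].append(mode)
        if mode != "LEAF":
            # Assumption: leaves carry a placeholder feature id, so skip them.
            entry["features"].append(fid)

    result = []
    for tid in sorted(trees):
        modes = trees[tid]["modes"]
        feats = trees[tid]["features"]
        hist = Counter(modes)
        result.append(
            {
                "tree_id": tid,
                "n_nodes": len(modes),
                "n_leaves": hist["LEAF"],
                # Assumption: -1 when the tree has no split node.
                "max_featureid": max(feats) if feats else -1,
                "n_features": len(set(feats)),
                "n_rules": len(hist),
                "rules": set(hist),
                "hist_rules": hist,
            }
        )
    return result


# Two small trees: a stump on feature 0 and a deeper tree on features 2 and 1.
trees = per_tree_stats(
    nodes_treeids=[0, 0, 0, 1, 1, 1, 1, 1],
    nodes_featureids=[0, 0, 0, 2, 1, 0, 0, 0],
    nodes_modes=[
        "BRANCH_LEQ", "LEAF", "LEAF",
        "BRANCH_LEQ", "BRANCH_LT", "LEAF", "LEAF", "LEAF",
    ],
)
```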

enumerate_nodes#

yobx.helpers.enumerate_nodes(onx: FunctionProto | GraphProto | ModelProto, recursive: bool = True) Iterable[Tuple[Tuple[str, ...], GraphProto | FunctionProto, NodeProto | TensorProto | SparseTensorProto]][source]#

Enumerates all nodes in a model.

Parameters:
  • onx – the model, graph, or function to traverse

  • recursive – if True, recurse into sub-graphs (e.g. inside If / Loop / Scan)

Returns:

yields tuples (path, parent, node) where path is a tuple of name strings identifying the location of node in the model, parent is the containing GraphProto or FunctionProto, and node is a NodeProto, TensorProto, or SparseTensorProto.
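The effect of `recursive` can be illustrated with a small stand-alone traversal. The `Node` and `Graph` dataclasses below are hypothetical stand-ins for the ONNX protos, and the way `path` is assembled (graph name, then node name, then sub-graph name) is an assumption made for the example. The actual path layout used by enumerate_nodes may differ.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterator, List, Tuple


@dataclass
class Node:  # hypothetical stand-in for onnx.NodeProto
    name: str
    op_type: str
    subgraphs: Dict[str, "Graph"] = field(default_factory=dict)


@dataclass
class Graph:  # hypothetical stand-in for onnx.GraphProto
    name: str
    nodes: List[Node] = field(default_factory=list)


def walk(
    graph: Graph, recursive: bool = True, path: Tuple[str, ...] = ()
) -> Iterator[Tuple[Tuple[str, ...], Graph, Node]]:
    """Yields (path, parent, node), descending into sub-graphs if requested."""
    here = path + (graph.name,)
    for node in graph.nodes:
        yield here, graph, node
        if recursive:
            for sub in node.subgraphs.values():
                yield from walk(sub, True, here + (node.name,))


then_branch = Graph("then", [Node("m", "Mul")])
main = Graph("main", [Node("a", "Add"), Node("cond", "If", {"then_branch": then_branch})])

deep = [(path, node.op_type) for path, _, node in walk(main)]
flat = [(path, node.op_type) for path, _, node in walk(main, recursive=False)]
```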

enumerate_stats_nodes#

yobx.helpers.enumerate_stats_nodes(onx: FunctionProto | GraphProto | ModelProto, recursive: bool = True, stats_fcts: Dict[Tuple[str, str], Callable[[GraphProto | FunctionProto, NodeProto | TensorProto | SparseTensorProto], NodeStatistics | HistStatistics]] | None = None) Iterable[Tuple[Tuple[str, ...], GraphProto | FunctionProto, NodeStatistics | HistStatistics]][source]#

Iterates over nodes in onx, yielding statistics for those that match entries in stats_fcts.

By default the function handles both TreeEnsembleClassifier and TreeEnsembleRegressor nodes in the "ai.onnx.ml" domain via stats_tree_ensemble().

Parameters:
  • onx – the model, graph, or function to traverse

  • recursive – if True, recurse into sub-graphs

  • stats_fcts – mapping of (domain, op_type) to a callable that accepts (parent, node) and returns a statistics object. When None the default handlers for tree-ensemble operators are used.

Returns:

yields tuples (path, parent, statistics) for every matched node
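The `stats_fcts` mapping amounts to a lookup keyed by `(domain, op_type)`. The sketch below shows that dispatch pattern on a hypothetical `Node` tuple. It is not the library's code, and the handler `count_name_length` is a placeholder that stands in for a real statistics function such as stats_tree_ensemble().

```python
from typing import Any, Callable, Dict, Iterable, Iterator, NamedTuple, Tuple


class Node(NamedTuple):  # hypothetical stand-in for onnx.NodeProto
    domain: str
    op_type: str
    name: str


Handler = Callable[[Any, Node], Any]


def dispatch(
    parent: Any,
    nodes: Iterable[Node],
    stats_fcts: Dict[Tuple[str, str], Handler],
) -> Iterator[Tuple[Node, Any]]:
    """Yields (node, statistics) for nodes whose (domain, op_type) has a handler."""
    for node in nodes:
        fct = stats_fcts.get((node.domain, node.op_type))
        if fct is not None:
            yield node, fct(parent, node)


def count_name_length(parent: Any, node: Node) -> int:
    # Placeholder statistic, used only to show that the handler is called.
    return len(node.name)


nodes = [
    Node("", "Add", "add_0"),
    Node("ai.onnx.ml", "TreeEnsembleRegressor", "trees"),
    Node("ai.onnx.ml", "Scaler", "scale"),
]
handlers = {("ai.onnx.ml", "TreeEnsembleRegressor"): count_name_length}
matched = [(n.name, s) for n, s in dispatch(None, nodes, handlers)]
```

Keying on the pair rather than on `op_type` alone avoids collisions between operators that share a name across domains.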

Module#

Functions to compute statistics on an ONNX model such as number of nodes per op_type and estimation of computational cost. Also provides classes and helpers for computing per-tree statistics on TreeEnsemble* operators.
