yobx.helpers.stats_helper#

ModelStatistics#

class yobx.helpers.ModelStatistics(model: ModelProto | GraphBuilderExtendedProtocol, verbose: int = 0)[source]#

Computes statistics on an ONNX model, including node counts per op_type and estimated FLOPs.

model may be either an onnx.ModelProto or a GraphBuilderExtendedProtocol instance. When a graph builder is provided, its already-computed shape information is used directly (no second shape-inference pass is run), and the ONNX model is obtained via to_onnx().

Parameters:

model – ONNX model or graph builder
verbose – verbosity level passed to BasicShapeBuilder (ignored when model is a graph builder)

Usage:

stats = ModelStatistics(model).compute()

compute() → Dict[str, Any][source]#

Runs the full analysis and returns a statistics dictionary.

Returns:

dictionary with the following keys:

"n_nodes" – total number of nodes
"node_count_per_op_type" – dict mapping op_type → count
"total_estimated_flops" – total estimated FLOPs (int or None)
"flops_per_op_type" – dict mapping op_type → estimated FLOPs (None means the cost could not be estimated for some nodes)
"node_stats" – list of per-node dicts with keys op_type, name, inputs, outputs, estimated_flops

literal_fn(name: str) → Tuple[int, ...] | None[source]#: Returns the integer values stored in a 1-D integer constant tensor (e.g. the shape input of a Reshape node), or None when name is not a known constant.

shape_fn(name: str) → Tuple | None[source]#: Returns the inferred shape of name, or None if unknown.

model_statistics#

yobx.helpers.model_statistics(model: ModelProto | GraphBuilderExtendedProtocol, verbose: int = 0) → Dict[str, Any][source]#

Computes statistics on an ONNX model.

This is a convenience wrapper around ModelStatistics.

Parameters:

model – ONNX model or graph builder
verbose – verbosity level

Returns:

statistics dictionary — see ModelStatistics.compute() for details
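
To make the relationship between the documented keys concrete, the sketch below rebuilds the aggregate entries from a hand-written node_stats list. It does not call yobx: the sample nodes, the helper name aggregate_node_stats, and the rule that a single unknown cost turns the per-op and overall totals into None are assumptions read off the key descriptions of ModelStatistics.compute(), not the library's implementation.

```python
from collections import Counter
from typing import Any, Dict, List, Optional


def aggregate_node_stats(node_stats: List[Dict[str, Any]]) -> Dict[str, Any]:
    """Hypothetical helper: rebuilds the aggregate keys from per-node dicts."""
    counts = Counter(n["op_type"] for n in node_stats)
    flops: Dict[str, Optional[int]] = {}
    for n in node_stats:
        op, cost = n["op_type"], n["estimated_flops"]
        if op not in flops:
            flops[op] = cost
        elif flops[op] is None or cost is None:
            # Assumption: one node without an estimate makes the sum unknown.
            flops[op] = None
        else:
            flops[op] += cost
    values = list(flops.values())
    total = None if any(v is None for v in values) else sum(values)
    return {
        "n_nodes": len(node_stats),
        "node_count_per_op_type": dict(counts),
        "total_estimated_flops": total,
        "flops_per_op_type": flops,
        "node_stats": node_stats,
    }


# Hand-written sample in the documented per-node layout (values are made up).
sample_nodes = [
    {"op_type": "MatMul", "name": "mm0", "inputs": ["X", "W0"],
     "outputs": ["H"], "estimated_flops": 1024},
    {"op_type": "Relu", "name": "relu0", "inputs": ["H"],
     "outputs": ["A"], "estimated_flops": 64},
    {"op_type": "MatMul", "name": "mm1", "inputs": ["A", "W1"],
     "outputs": ["Y"], "estimated_flops": None},
]
summary = aggregate_node_stats(sample_nodes)
print(summary["node_count_per_op_type"], summary["flops_per_op_type"])
```

Keeping None distinct from 0 lets a caller tell "no cost" apart from "cost unknown" when ranking operators by FLOPs.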

NodeStatistics#

class yobx.helpers.NodeStatistics(parent: GraphProto | FunctionProto, node: NodeProto)[source]#

Stores per-node statistics for an onnx.NodeProto.

Parameters:

parent – the GraphProto or FunctionProto that contains node
node – the ONNX node being described

property dict_values: Dict[str, Any]#: Returns the statistics as a flat dictionary for DataFrame construction.

TreeStatistics#

class yobx.helpers.TreeStatistics(node: NodeProto, tree_id: int)[source]#

Stores per-tree statistics extracted from TreeEnsemble* operators.

Parameters:

node – the TreeEnsembleClassifier or TreeEnsembleRegressor node
tree_id – zero-based index of this tree within the ensemble

property dict_values: Dict[str, Any]#: Returns the statistics as a flat dictionary for DataFrame construction.

HistTreeStatistics#

class yobx.helpers.HistTreeStatistics(node: NodeProto, featureid: int, values: ndarray, bins: int = 20)[source]#

Stores threshold-distribution statistics for a single feature across all trees in a TreeEnsemble* node.

Parameters:

node – the TreeEnsembleClassifier or TreeEnsembleRegressor node
featureid – zero-based feature index
values – array of threshold values for featureid
bins – number of histogram bins (default 20)

property dict_values: Dict[str, Any]#: Returns the statistics as a flat dictionary for DataFrame construction.

HistStatistics#

class yobx.helpers.HistStatistics(parent: GraphProto | FunctionProto, node: NodeProto | TensorProto | SparseTensorProto, bins: int = 20)[source]#

Stores distribution statistics for a constant tensor (initializer or Constant node).

Parameters:

parent – the GraphProto or FunctionProto that contains node
node – a NodeProto (Constant op), TensorProto, or SparseTensorProto
bins – number of histogram bins (default 20)

property dict_values: Dict[str, Any]#: Returns the statistics as a flat dictionary for DataFrame construction.

property name: str#: Returns the tensor name.

extract_attributes#

yobx.helpers.extract_attributes(node: NodeProto) → Dict[str, Any][source]#

Extracts all attributes of a node into a plain Python/NumPy dictionary.

Delegates to attr_proto_to_python() for scalar and tensor attribute types. List-typed attributes (INTS, FLOATS, STRINGS) are returned as NumPy arrays so that callers can use boolean-mask indexing directly. GRAPH and ref-attribute entries are stored as None.

Parameters:

node – node to inspect

Returns:

dictionary mapping attribute name to a Python/NumPy value, or None for graph and ref-attribute entries.

stats_tree_ensemble#

yobx.helpers.stats_tree_ensemble(parent: GraphProto | FunctionProto, node: NodeProto) → NodeStatistics[source]#

Computes statistics on every tree of a TreeEnsembleClassifier, TreeEnsembleRegressor, or TreeEnsemble (ai.onnx.ml opset 5) node.

The returned NodeStatistics instance contains the following entries:

"kind" – "Classifier", "Regressor", or "TreeEnsemble"
"n_trees" – total number of trees
"n_outputs" – number of outputs / classes
"max_featureid" – maximum feature index used across all nodes
"n_features" – number of distinct features used across all nodes
"n_rules" – number of distinct node modes (split types) used
"rules" – set of node mode strings (e.g. {"BRANCH_LEQ", "LEAF"})
"hist_rules" – collections.Counter of node mode frequencies
"features" – list of HistTreeStatistics, one per feature
"trees" – list of TreeStatistics, one per tree

Each TreeStatistics in "trees" contains:

"n_nodes" – total nodes in the tree
"n_leaves" – leaf nodes
"max_featureid" – maximum feature index
"n_features" – distinct feature count
"n_rules" – distinct split-mode count
"rules" – set of mode strings
"hist_rules" – collections.Counter of mode frequencies

For TreeEnsembleClassifier / TreeEnsembleRegressor (ai.onnx.ml opset ≤ 4) the legacy flat nodes_treeids / nodes_values / string nodes_modes attributes are used. For the unified TreeEnsemble operator (ai.onnx.ml opset ≥ 5) the tree_roots / nodes_splits / nodes_modes (UINT8 tensor) attributes are used instead.

Parameters:

parent – the GraphProto or FunctionProto that contains node
node – a TreeEnsembleClassifier, TreeEnsembleRegressor, or TreeEnsemble node

Returns:

NodeStatistics populated with the statistics listed above

Raises:

KeyError – if required tree-structure attributes are missing from node
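
The per-tree entries listed above can be illustrated with plain Python lists laid out like the legacy flat attributes. This is a minimal sketch, not the code behind stats_tree_ensemble(): the helper name per_tree_stats is hypothetical, nodes_featureids is the standard ONNX attribute assumed to hold the feature indices, and ignoring leaves when counting features is an assumption that may differ from the library.

```python
from collections import Counter
from typing import Any, Dict, List


def per_tree_stats(
    treeids: List[int], featureids: List[int], modes: List[str]
) -> List[Dict[str, Any]]:
    """Hypothetical helper: groups flat node attributes by tree id."""
    stats: List[Dict[str, Any]] = []
    for tree_id in sorted(set(treeids)):
        idx = [i for i, t in enumerate(treeids) if t == tree_id]
        hist = Counter(modes[i] for i in idx)
        # Assumption: leaves carry a placeholder feature id, so only
        # branch nodes contribute to the feature statistics.
        feats = [featureids[i] for i in idx if modes[i] != "LEAF"]
        stats.append({
            "n_nodes": len(idx),
            "n_leaves": hist["LEAF"],
            "max_featureid": max(feats) if feats else -1,
            "n_features": len(set(feats)),
            "n_rules": len(hist),
            "rules": set(hist),
            "hist_rules": hist,
        })
    return stats


# Two small trees in the flat legacy layout (values are made up).
nodes_treeids = [0, 0, 0, 1, 1, 1, 1, 1]
nodes_featureids = [2, 0, 0, 0, 0, 3, 0, 0]
nodes_modes = ["BRANCH_LEQ", "LEAF", "LEAF",
               "BRANCH_LEQ", "LEAF", "BRANCH_LT", "LEAF", "LEAF"]
trees = per_tree_stats(nodes_treeids, nodes_featureids, nodes_modes)
for tree in trees:
    print(tree["n_nodes"], tree["n_leaves"], sorted(tree["rules"]))
```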

enumerate_nodes#

Enumerates all nodes in a model.

Parameters:

onx – the model, graph, or function to traverse
recursive – if True, recurse into sub-graphs (e.g. inside If / Loop / Scan)

Returns:

yields tuples (path, parent, node) where path is a tuple of name strings identifying the location of node in the model, parent is the containing GraphProto or FunctionProto, and node is a NodeProto, TensorProto, or SparseTensorProto.
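
The traversal pattern described above can be sketched on toy containers without depending on onnx. ToyNode, ToyGraph, and walk are hypothetical stand-ins, and the exact layout of path (graph name followed by node name) is an assumption; the real function also yields initializers, which this sketch leaves out.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Tuple


@dataclass
class ToyNode:
    """Stand-in for a NodeProto; subgraphs mimics If / Loop / Scan bodies."""
    name: str
    op_type: str
    subgraphs: List["ToyGraph"] = field(default_factory=list)


@dataclass
class ToyGraph:
    """Stand-in for a GraphProto."""
    name: str
    nodes: List[ToyNode] = field(default_factory=list)


def walk(
    graph: ToyGraph, recursive: bool = True, path: Tuple[str, ...] = ()
) -> Iterator[Tuple[Tuple[str, ...], ToyGraph, ToyNode]]:
    """Yields (path, parent, node) for every node, depth first."""
    here = path + (graph.name,)
    for node in graph.nodes:
        yield here + (node.name,), graph, node
        if recursive:
            for sub in node.subgraphs:
                yield from walk(sub, recursive, here + (node.name,))


then_branch = ToyGraph("then", [ToyNode("add0", "Add")])
main = ToyGraph("main", [ToyNode("if0", "If", [then_branch]),
                         ToyNode("relu0", "Relu")])
deep_paths = [p for p, _, _ in walk(main)]
shallow_paths = [p for p, _, _ in walk(main, recursive=False)]
print(deep_paths)
print(shallow_paths)
```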

enumerate_stats_nodes#

Iterates over nodes in onx, yielding statistics for those that match entries in stats_fcts.

By default the function handles both TreeEnsembleClassifier and TreeEnsembleRegressor nodes in the "ai.onnx.ml" domain via stats_tree_ensemble().

Parameters:

onx – the model, graph, or function to traverse
recursive – if True, recurse into sub-graphs
stats_fcts – mapping of (domain, op_type) to a callable that accepts (parent, node) and returns a statistics object. When None the default handlers for tree-ensemble operators are used.

Returns:

yields tuples (path, parent, statistics) for every matched node
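
The role of stats_fcts can be illustrated with a small dispatch loop. This is a sketch under stated assumptions rather than the library's code: nodes are plain dicts instead of NodeProto objects, and dispatch_stats and count_inputs are hypothetical names introduced for the example.

```python
from typing import Any, Callable, Dict, Iterable, Iterator, Tuple

Key = Tuple[str, str]                # (domain, op_type)
Handler = Callable[[Any, Any], Any]  # called as handler(parent, node)
Item = Tuple[Tuple[str, ...], Any, Dict[str, Any]]


def dispatch_stats(
    items: Iterable[Item], stats_fcts: Dict[Key, Handler]
) -> Iterator[Tuple[Tuple[str, ...], Any, Any]]:
    """Yields (path, parent, statistics) for nodes that have a handler."""
    for path, parent, node in items:
        handler = stats_fcts.get((node["domain"], node["op_type"]))
        if handler is not None:
            yield path, parent, handler(parent, node)


def count_inputs(parent: Any, node: Dict[str, Any]) -> Dict[str, int]:
    """Toy statistics function: reports how many inputs the node has."""
    return {"n_inputs": len(node["inputs"])}


items = [
    (("main", "tree0"), "main",
     {"domain": "ai.onnx.ml", "op_type": "TreeEnsembleRegressor", "inputs": ["X"]}),
    (("main", "relu0"), "main",
     {"domain": "", "op_type": "Relu", "inputs": ["Y"]}),
]
handlers = {("ai.onnx.ml", "TreeEnsembleRegressor"): count_inputs}
matched = list(dispatch_stats(items, handlers))
print(matched)
```

Keying the mapping on the (domain, op_type) pair keeps operators with the same name in different domains from colliding.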

Module#

Functions to compute statistics on an ONNX model such as number of nodes per op_type and estimation of computational cost. Also provides classes and helpers for computing per-tree statistics on TreeEnsemble* operators.
