onnx_diagnostic.helpers.log_helper
- class onnx_diagnostic.helpers.log_helper.CubeLogs(data: Any, time: str = 'date', keys: Sequence[str] = ('version_.*', 'model_.*'), values: Sequence[str] = ('time_.*', 'disc_.*'), ignored: Sequence[str] = (), recent: bool = False, formulas: Dict[str, Callable[[DataFrame], Series]] | None = None)
Processes logs coming from experiments.
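The default values of keys and values look like patterns matched against column names. The sketch below assumes they are regular expressions that must match a whole column name; this is an assumption drawn from the signature, not confirmed behavior. The helper select_columns and the column names are hypothetical.

```python
import re


def select_columns(columns, patterns):
    """Returns the columns fully matching at least one pattern (hypothetical helper)."""
    return [c for c in columns if any(re.fullmatch(p, c) for p in patterns)]


# Hypothetical column names found in an experiment log.
columns = ["date", "version_torch", "model_name", "time_latency", "disc_abs", "comment"]

# Same patterns as the defaults shown in the signature above.
keys = select_columns(columns, ("version_.*", "model_.*"))
values = select_columns(columns, ("time_.*", "disc_.*"))

print(keys)
print(values)
```

Under that assumption, date and comment match neither family of patterns, so they are treated neither as keys nor as values.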
- to_excel(output: str, views: Dict[str, CubeViewDef], main: str | None = 'main', raw: str | None = 'raw', verbose: int = 0)
Creates an Excel file from a list of views.
- Parameters:
output – output file to create
views – dictionary of views to append
main – add a page with statistics on all variables
raw – add a page with the raw data
verbose – verbosity
- view(view_def: CubeViewDef) → DataFrame
Returns a dataframe, a pivot view. key_index determines the index; the other key columns determine the columns. If ignore_unique is True, every column with a unique value is removed.
- Parameters:
view_def – view definition
- Returns:
dataframe
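The pivot described above can be illustrated with plain pandas. This is a minimal sketch under stated assumptions, not the library's implementation: the function sketch_view, the sample data and the choice of "sum" as aggregation are all hypothetical.

```python
import pandas as pd

# Hypothetical raw log: three key columns and one value column.
raw = pd.DataFrame(
    {
        "model_name": ["A", "A", "B", "B"],
        "version_torch": ["2.6", "2.7", "2.6", "2.7"],
        "model_device": ["cpu", "cpu", "cpu", "cpu"],  # a single unique value
        "time_latency": [1.0, 0.9, 2.0, 1.8],
    }
)


def sketch_view(df, key_index, values, ignore_unique=True):
    """Pivots df: key_index goes to the rows, the other keys to the columns."""
    keys = [c for c in df.columns if c not in values]
    if ignore_unique:
        # Drop key columns holding a single unique value.
        keys = [c for c in keys if df[c].nunique() > 1]
    index = [k for k in key_index if k in keys]
    columns = [k for k in keys if k not in index]
    return df.pivot_table(
        index=index, columns=columns, values=list(values), aggfunc="sum"
    )


view = sketch_view(raw, key_index=["model_name"], values=["time_latency"])
print(view)
```

Here model_device is dropped because it only takes one value, model_name indexes the rows, and version_torch ends up in the column index next to the value name.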
- class onnx_diagnostic.helpers.log_helper.CubeViewDef(key_index: Sequence[str], values: Sequence[str], ignore_unique: bool = True, order: Sequence[str] | None = None, key_agg: Sequence[str] | None = None, agg_args: Sequence[Any] = ('sum',), agg_kwargs: Dict[str, Any] | None = None)
Defines how to compute a view.
- Parameters:
key_index – keys to put in the row index
values – values to show
ignore_unique – ignore keys with a unique value
order – reorders the keys in the column index
key_agg – aggregate according to these columns before creating the view
agg_args – see pandas.core.groupby.DataFrameGroupBy.agg()
agg_kwargs – see pandas.core.groupby.DataFrameGroupBy.agg()
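The following sketch shows how agg_args and agg_kwargs could be forwarded to DataFrameGroupBy.agg() before a view is built. It relies on one plausible reading of key_agg, namely that the listed columns are collapsed while the remaining keys are grouped on; that reading is an assumption, and the data and variable names are hypothetical.

```python
import pandas as pd

# Hypothetical raw log with two runs per model.
raw = pd.DataFrame(
    {
        "model_name": ["A", "A", "B", "B"],
        "model_run": [1, 2, 1, 2],
        "time_latency": [1.0, 3.0, 2.0, 4.0],
    }
)

values = ["time_latency"]
key_agg = ["model_run"]  # assumption: these columns are aggregated away
agg_args = ("sum",)      # same default as in the signature above
agg_kwargs = {}

# Group on every key which is not aggregated away, then forward the arguments.
remaining = [c for c in raw.columns if c not in key_agg and c not in values]
aggregated = (
    raw.groupby(remaining)[values].agg(*agg_args, **agg_kwargs).reset_index()
)
print(aggregated)
```

With the default ('sum',), the two runs of each model are summed into a single row per model.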
- onnx_diagnostic.helpers.log_helper.enumerate_csv_files(data: DataFrame | List[str | Tuple[str, str]] | str | Tuple[str, str, str, str], verbose: int = 0) → Iterator[DataFrame | str | Tuple[str, str, str, str]]
Enumerates files considered for the aggregation. Only csv files are considered. If a zip file is given, the function digs into the zip files and loops over csv candidates.
- Parameters:
data – dataframe with the raw data or a file or list of files
data can contain:
* a dataframe
* a string for a filename, zip or csv
* a list of strings
* a tuple
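The way csv candidates can be found inside a zip file is sketched below with the standard library only. This is not the function's actual implementation: the helper iter_csv_candidates and the pairs it yields are hypothetical, and the layout of the tuples handled by enumerate_csv_files is not reproduced here.

```python
import os
import tempfile
import zipfile


def iter_csv_candidates(path):
    """Yields (container, member) pairs for csv candidates (hypothetical helper)."""
    if path.endswith(".csv"):
        # A plain csv file is its own candidate.
        yield (path, None)
    elif path.endswith(".zip"):
        with zipfile.ZipFile(path) as zf:
            for info in zf.infolist():
                # Only csv members are considered, directories are skipped.
                if not info.is_dir() and info.filename.endswith(".csv"):
                    yield (path, info.filename)


with tempfile.TemporaryDirectory() as tmp:
    zip_path = os.path.join(tmp, "logs.zip")
    with zipfile.ZipFile(zip_path, "w") as zf:
        zf.writestr("run1.csv", "date,time_latency\n2025-01-01,1.0\n")
        zf.writestr("notes.txt", "ignored")
    found = [member for _, member in iter_csv_candidates(zip_path)]

print(found)
```

Files with any other extension, inside or outside the archive, produce no candidate.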