onnx_diagnostic.helpers._log_helper
- onnx_diagnostic.helpers._log_helper.align_dataframe_with(df: DataFrame, baseline: DataFrame, fill_value: float = 0) → DataFrame | None
Modifies the first dataframe df so that it has exactly the same rows and columns as baseline. Both dataframes must share the same levels on both axes. Empty cells are filled with fill_value (0 by default). Only the numerical columns are kept. The function returns None if the output is empty.
- onnx_diagnostic.helpers._log_helper.apply_excel_style(filename_or_writer: Any, f_highlights: Dict[str, Callable[[Any], CubeViewDef.HighLightKind]] | None = None, time_mask_view: Dict[str, DataFrame] | None = None, verbose: int = 0)
Applies styles on all sheets in a file unless the sheet is too big.
- Parameters:
filename_or_writer – filename or writer, modified in place
f_highlights – color functions to apply, one per sheet
time_mask_view – if specified, it contains one dataframe per sheet, with the same shape as the sheet and values in {-1, 0, +1}, which indicate whether a value is unexpectedly lower (-1) or higher (+1); the background color of the corresponding cell changes accordingly.
verbose – verbosity, shows a progress loop
- onnx_diagnostic.helpers._log_helper.breaking_last_point(series: Sequence[float], threshold: float = 1.2)
Assuming a timeseries is constant, we check the last value is not an outlier.
- Parameters:
series – series
- Returns:
significant change (-1, 0, +1), test value
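The check described above can be illustrated with a minimal sketch. It assumes the test value is a z-score of the last point against the mean and standard deviation of the earlier points; the source does not state the exact statistic, so the hypothetical function `last_point_outlier` below is an illustration, not the library's implementation.

```python
from statistics import mean, pstdev
from typing import Sequence, Tuple


def last_point_outlier(series: Sequence[float], threshold: float = 1.2) -> Tuple[int, float]:
    """Returns (change, test value) where change is -1, 0 or +1.

    Assumption: the test value is the z-score of the last point
    computed against all the previous points.
    """
    if len(series) <= 2:
        # Too short to estimate a level and a spread.
        return 0, 0.0
    history, last = series[:-1], series[-1]
    m, s = mean(history), pstdev(history)
    if s == 0:
        # Perfectly constant history: any different value is a break.
        if last == m:
            test = 0.0
        else:
            test = float("inf") if last > m else float("-inf")
    else:
        test = (last - m) / s
    if test >= threshold:
        return 1, test
    if test <= -threshold:
        return -1, test
    return 0, test
```

With this sketch, a series whose last value sits far above the previous ones returns +1, far below returns -1, and a value within the usual spread returns 0.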
- onnx_diagnostic.helpers._log_helper.enumerate_csv_files(data: DataFrame | List[str | Tuple[str, str]] | str | Tuple[str, str, str, str], verbose: int = 0, filtering: Callable[[str], bool] | None = None) → Iterator[DataFrame | str | Tuple[str, str, str, str]]
Enumerates files considered for the aggregation. Only csv files are considered. If a zip file is given, the function digs into the zip files and loops over csv candidates.
- Parameters:
data – dataframe with the raw data or a file or list of files
verbose – verbosity
filtering – function to filter in or out files in zip files, must return true to keep the file, false to skip it.
- Returns:
a generator yielding tuples with the filename, date, full path and zip file
data can contain:
* a dataframe
* a string for a filename, zip or csv
* a list of strings
* a tuple
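The zip handling described above can be sketched with the standard library alone. The hypothetical helper `iter_csv_members` below is an assumption made for illustration: it yields only the member name and the zip path, whereas the documented function yields richer tuples (filename, date, full path and zip file).

```python
import zipfile
from typing import Callable, Iterator, Optional, Tuple


def iter_csv_members(
    zip_path: str, filtering: Optional[Callable[[str], bool]] = None
) -> Iterator[Tuple[str, str]]:
    """Yields (member name, zip path) for every csv member of a zip file."""
    with zipfile.ZipFile(zip_path) as zf:
        for info in zf.infolist():
            # Skip directories and anything that is not a csv file.
            if info.is_dir() or not info.filename.lower().endswith(".csv"):
                continue
            # The callback returns True to keep a file, False to skip it.
            if filtering is not None and not filtering(info.filename):
                continue
            yield info.filename, zip_path
```

The `filtering` callback follows the convention stated in the parameter list: it receives a name and must return true to keep the file.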
- onnx_diagnostic.helpers._log_helper.filter_data(df: DataFrame, filter_in: str | None = None, filter_out: str | None = None, verbose: int = 0) → DataFrame
The arguments filter_in and filter_out follow the syntax <column1>:<fmt1>//<column2>:<fmt2>. Each format <fmt> is a value or a set of values separated by ;.
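To make the filter syntax concrete, here is a minimal parsing sketch. The helpers `parse_filter` and `matches` are hypothetical names introduced for illustration, and the matching rule (a row must satisfy every clause) is an assumption, not a description of how the library applies filter_in or filter_out.

```python
from typing import Dict, Set


def parse_filter(spec: str) -> Dict[str, Set[str]]:
    """Parses '<column1>:<fmt1>//<column2>:<fmt2>' into {column: allowed values}."""
    result: Dict[str, Set[str]] = {}
    if not spec:
        return result
    for clause in spec.split("//"):
        # Split on the first ':' only, so values may contain ':' themselves.
        column, sep, fmt = clause.partition(":")
        if not sep:
            raise ValueError(f"Missing ':' in filter clause {clause!r}")
        # A format is one value or several values separated by ';'.
        result[column] = set(fmt.split(";"))
    return result


def matches(row: Dict[str, object], spec: str) -> bool:
    """Assumed rule: the row matches if every listed column holds a listed value."""
    return all(str(row.get(col)) in values for col, values in parse_filter(spec).items())
```

For example, the specification `exporter:onnx-dynamo;custom//opt:ir` (made-up column names and values) parses into two clauses: the column `exporter` accepts two values and the column `opt` accepts one.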
- onnx_diagnostic.helpers._log_helper.mann_kendall(series: Sequence[float], threshold: float = 0.5)
Computes the test of Mann-Kendall.
- Parameters:
series – series
threshold – 1.96 is the usual value, 0.5 means a short timeseries (0, 1, 2, 3, 4) has a significant trend
- Returns:
trend (-1, 0, +1), test value
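The statistic is:
$$S = \sum_{i=1}^{n-1} \sum_{j=i+1}^{n} \operatorname{sign}(x_j - x_i)$$
where the function sign is:
$$\operatorname{sign}(x) = \begin{cases} -1 & \text{if } x < 0 \\ 0 & \text{if } x = 0 \\ +1 & \text{if } x > 0 \end{cases}$$
And, in the textbook formulation without ties:
$$\operatorname{Var}(S) = \frac{n(n-1)(2n+5)}{18}$$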
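The Mann-Kendall test can be sketched in a few lines of standard Python. This is the textbook form (no correction for ties, continuity correction of one unit), written as the hypothetical function `mann_kendall_sketch`. The library may scale its test value differently, so the sketch uses the usual threshold of 1.96 rather than the library default, and its test values should not be compared with the library's.

```python
import math
from typing import Sequence, Tuple


def sign(x: float) -> int:
    """Returns -1, 0 or +1 depending on the sign of x."""
    return (x > 0) - (x < 0)


def mann_kendall_sketch(series: Sequence[float], threshold: float = 1.96) -> Tuple[int, float]:
    """Textbook Mann-Kendall test: returns (trend, test value)."""
    n = len(series)
    if n < 2:
        return 0, 0.0
    # S counts increasing pairs minus decreasing pairs.
    s = sum(sign(series[j] - series[i]) for i in range(n - 1) for j in range(i + 1, n))
    # Variance of S under the null hypothesis, assuming no ties.
    var = n * (n - 1) * (2 * n + 5) / 18
    # Continuity correction: move S one unit toward zero.
    z = (s - sign(s)) / math.sqrt(var)
    trend = sign(z) if abs(z) > threshold else 0
    return trend, z
```

A strictly increasing series yields a positive trend, a strictly decreasing one a negative trend, and a constant series yields no trend with a test value of zero.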
- onnx_diagnostic.helpers._log_helper.open_dataframe(data: str | Tuple[str, str, str, str] | DataFrame) → DataFrame
Opens a filename defined by function onnx_diagnostic.helpers.log_helper.enumerate_csv_files().
- Parameters:
data – a dataframe, a filename, or a tuple indicating that the file comes from a zip file
- Returns:
a dataframe