experimental_experiment.torch_bench._bash_bench_benchmark_runner_agg
- experimental_experiment.torch_bench._bash_bench_benchmark_runner_agg.enumerate_csv_files(data: DataFrame | List[str | Tuple[str, str]] | str | Tuple[str, str, str, str], verbose: int = 0) → Iterator[DataFrame | str | Tuple[str, str, str, str]]
Enumerates the files considered for the aggregation. Only csv files are considered. If a zip file is given, the function looks inside the archive and loops over the csv candidates it contains.
- Parameters:
data – dataframe with the raw data, or a file, or a list of files
data can contain:
* a dataframe
* a string for a filename, zip or csv
* a list of strings
* a tuple
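As a rough illustration of the behavior described above, the sketch below walks a list of paths and lists the csv candidates, looking inside zip archives along the way. It relies only on the standard library. `iter_csv_candidates` is a hypothetical helper and the layout of the yielded tuples is an assumption; it is not the actual implementation of `enumerate_csv_files`.

```python
import os
import zipfile
from typing import Iterator, List, Tuple


def iter_csv_candidates(paths: List[str]) -> Iterator[Tuple[str, str]]:
    """Yield (container, csv_name) pairs; container is '' for a plain csv file.

    Hypothetical helper: it only mimics the documented behavior
    (csv files are kept, zip files are searched for csv members).
    """
    for path in paths:
        ext = os.path.splitext(path)[1].lower()
        if ext == ".csv":
            # a plain csv file is a candidate on its own
            yield "", path
        elif ext == ".zip":
            # look inside the archive and keep only its csv members
            with zipfile.ZipFile(path) as zf:
                for name in zf.namelist():
                    if name.lower().endswith(".csv"):
                        yield path, name
        # any other extension is ignored
```

Yielding the archive name together with the member name keeps enough information to reopen the right file later without extracting the whole archive.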
- experimental_experiment.torch_bench._bash_bench_benchmark_runner_agg.merge_benchmark_reports(data: DataFrame | List[str] | str, model=('suite', 'model_name'), keys=('architecture', 'exporter', 'opt_patterns', 'rtopt', 'device', 'device_name', 'dtype', 'dynamic', 'flag_fake_tensor', 'flag_no_grad', 'flag_training', 'machine', 'processor', 'processor_name', 'version_python', 'version_onnx', 'version_onnxruntime', 'version_onnxscript', 'version_tag', 'version_torch', 'version_transformers', 'version_monai', 'version_timm', 'strategy'), column_keys=('stat', 'exporter', 'opt_patterns', 'dynamic', 'rtopt'), report_on=('speedup', 'speedup_increase', 'speedup_med', 'discrepancies_*', 'TIME_ITER', 'time_*', 'ERR_*', 'onnx_*', 'op_*', 'memory_*', 'mem_*', 'config_*', 'torch_*'), formulas=('export', 'memory_peak', 'buckets', 'status', 'memory_delta', 'control_flow', 'pass_rate', 'accuracy_rate', 'date', 'correction', 'error'), timestamp_column: str = 'timestamp', excel_output: str | None = None, exc: bool = True, filter_in: str | None = None, filter_out: str | None = None, verbose: int = 0, output_clean_raw_data: str | None = None, baseline: DataFrame | None = None, export_simple: str | None = None, export_correlations: str | None = None, broken: bool = False, disc: float | None = None, slow: float | None = None, fast: float | None = None, slow_script: float | None = None, fast_script: float | None = None, exclude: List[int] | None = None, keep_more_recent: bool = False) → Dict[str, DataFrame]
Merges multiple files produced by bash_benchmark…
_index,DATE,ERR_export,ITER,TIME_ITER,capability,cpu,date_start,device,device_name,...
101Dummy-custom,2024-07-08,,0,7.119158490095288,7.0,40,2024-07-08,cuda,...
101Dummy-script,2024-07-08,,1,6.705480073112994,7.0,40,2024-07-08,cuda,...
101Dummy16-custom,2024-07-08,,2,6.970448340754956,7.0,40,2024-07-08,cuda,...
- Parameters:
data – dataframe with the raw data or a file or list of files
model – columns defining a unique model
keys – columns defining a unique experiment
report_on – report on those metrics, <prefix>* means all columns starting with this prefix
formulas – add computed metrics
timestamp_column – column holding the day of the run, used to tell the user when the benchmark was run
excel_output – output the computed dataframe into an Excel document
exc – raise an exception by default
filter_in – filter in some data to make the report smaller (see below)
filter_out – filter out some data to make the report smaller (see below)
verbose – verbosity
output_clean_raw_data – output the concatenated raw data so that it can be used later to make a comparison
baseline – to compute difference
export_simple – if not None, export simple in this file.
export_correlations – if not None, export correlations between exporters
broken – produce a document for the broken models per exporter
slow – produce a document for the slow models per exporter
fast – produce a document for the fast models per exporter
slow_script – produce a document for the slow models per exporter compared to torch_script
fast_script – produce a document for the fast models per exporter compared to torch_script
exclude – files to exclude from the list
keep_more_recent – in case of duplicates, keep the most recent value
- Returns:
dictionary of dataframes
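To make the interplay between keys, timestamp_column and keep_more_recent more concrete, here is a minimal sketch of one way duplicates could be resolved: rows sharing the same key values are treated as duplicates and only the one with the latest timestamp is kept. `keep_most_recent` is a hypothetical helper working on plain dictionaries, the rows in the usage example are made up, and the real function operates on dataframes and may behave differently.

```python
from typing import Dict, Iterable, List, Tuple


def keep_most_recent(
    rows: Iterable[Dict[str, str]],
    keys: Tuple[str, ...],
    timestamp_column: str = "timestamp",
) -> List[Dict[str, str]]:
    """Keep one row per combination of key values, the most recent one.

    Hypothetical helper illustrating the idea behind keep_more_recent.
    """
    latest: Dict[Tuple[str, ...], Dict[str, str]] = {}
    for row in rows:
        ident = tuple(row[k] for k in keys)
        kept = latest.get(ident)
        # ISO dates such as 2024-07-08 sort correctly as plain strings
        if kept is None or row[timestamp_column] >= kept[timestamp_column]:
            latest[ident] = row
    return list(latest.values())


# made-up rows: the first two describe the same experiment on two different days
rows = [
    {"model_name": "101Dummy", "exporter": "custom", "timestamp": "2024-07-07"},
    {"model_name": "101Dummy", "exporter": "custom", "timestamp": "2024-07-08"},
    {"model_name": "101Dummy", "exporter": "script", "timestamp": "2024-07-08"},
]
deduplicated = keep_most_recent(rows, keys=("model_name", "exporter"))
```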
Every key with a unique value is removed. Every column with a unique value is displayed on main. List of known columns:
DATE ERR_export ERR_warmup ITER TIME_ITER capability cpu date_start device device_name discrepancies_abs discrepancies_rel dtype dump_folder dynamic executable exporter ...
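The `<prefix>*` convention used by report_on can be tried out on the column names listed above. The sketch below uses `fnmatch.fnmatchcase` from the standard library to select the columns matching a few patterns taken from the default value of report_on. `select_columns` is a hypothetical helper, and whether the library itself relies on `fnmatch` is not stated here; the sketch only reproduces the documented prefix semantics.

```python
from fnmatch import fnmatchcase
from typing import Iterable, List


def select_columns(columns: Iterable[str], report_on: Iterable[str]) -> List[str]:
    """Return the columns matching at least one pattern, in their original order.

    Hypothetical helper: 'ERR_*' matches every column starting with 'ERR_',
    a pattern without '*' must match the column name exactly.
    """
    patterns = list(report_on)
    return [c for c in columns if any(fnmatchcase(c, p) for p in patterns)]


# a subset of the known columns listed above
known = [
    "DATE", "ERR_export", "ERR_warmup", "ITER", "TIME_ITER", "capability",
    "cpu", "date_start", "device", "device_name", "discrepancies_abs",
    "discrepancies_rel", "dtype",
]
selected = select_columns(known, ("TIME_ITER", "discrepancies_*", "ERR_*"))
print(selected)
# ['ERR_export', 'ERR_warmup', 'TIME_ITER', 'discrepancies_abs', 'discrepancies_rel']
```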
Arguments filter_in and filter_out follow the syntax <column1>:<fmt1>/<column2>:<fmt2>. The format is the following:
* a value or a set of values separated by ;
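The filter syntax can be illustrated with a small parser. The sketch below splits a specification on `/` to get one entry per column, then on `:` and `;` to get the accepted values. `parse_filter` and `keep_row` are hypothetical helpers, and combining several columns with a logical AND is an assumption; the library's own parsing may differ.

```python
from typing import Dict, Set


def parse_filter(spec: str) -> Dict[str, Set[str]]:
    """Parse '<column1>:<fmt1>/<column2>:<fmt2>' into {column: accepted values}.

    Hypothetical helper: each <fmt> is a value or several values separated by ';'.
    """
    result: Dict[str, Set[str]] = {}
    for part in spec.split("/"):
        if not part:
            continue
        column, _, fmt = part.partition(":")
        result[column] = set(fmt.split(";"))
    return result


def keep_row(row: Dict[str, str], filter_in: str) -> bool:
    """Return True if the row satisfies every column constraint (assumed AND)."""
    wanted = parse_filter(filter_in)
    return all(row.get(column) in values for column, values in wanted.items())


spec = "device:cuda/exporter:custom;script"
parsed = parse_filter(spec)
# parsed == {'device': {'cuda'}, 'exporter': {'custom', 'script'}}
```

With this reading, a filter_out specification would drop the rows for which `keep_row` returns True instead of keeping them.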