experimental_experiment.torch_bench.dort_bench
Run llama model with DORT
The script runs a few iterations of a dummy llama model.
python -m experimental_experiment.torch_bench.dort_bench --help
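For context, DORT here means torch.compile with an ONNX Runtime backed backend. The short sketch below is not the benchmark itself: it uses a tiny MLP as a stand-in for the dummy llama model and assumes torch and onnxruntime are installed, but it illustrates the warmup-then-measure loop the script automates.

import time
import torch

# Stand-in for the dummy llama model used by the benchmark.
model = torch.nn.Sequential(
    torch.nn.Linear(64, 64), torch.nn.ReLU(), torch.nn.Linear(64, 8)
)
# "onnxrt" is the torch.compile backend backed by ONNX Runtime.
compiled = torch.compile(model, backend="onnxrt")

x = torch.randn(16, 64)
compiled(x)  # warmup: triggers export and session creation
begin = time.perf_counter()
for _ in range(5):
    compiled(x)
print(f"average iteration: {(time.perf_counter() - begin) / 5:.6f} s")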
For example, run the llama model with the onnxrt backend on CUDA:
python -m experimental_experiment.torch_bench.dort_bench \
--backend ort --device cuda --config medium
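Before running on cuda, it can help to verify that both torch and ONNX Runtime see the GPU. A quick check, assuming onnxruntime-gpu is installed:

import onnxruntime
import torch

print("torch CUDA available:", torch.cuda.is_available())
print("onnxruntime providers:", onnxruntime.get_available_providers())
# "CUDAExecutionProvider" should appear in the list for --device cuda to work
# with an ONNX Runtime based backend.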
To export the models:
python -m experimental_experiment.torch_bench.dort_bench \
--backend custom --device cuda --export a -w 3
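The exported ONNX files can then be inspected offline. A minimal sketch, assuming the export above produced a file named a.onnx in the working directory (the actual file names depend on the --export value):

import onnx

model = onnx.load("a.onnx")  # hypothetical name derived from --export a
print("opsets:", [(d.domain, d.version) for d in model.opset_import])
print("nodes:", len(model.graph.node))
onnx.checker.check_model(model)  # basic structural validation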
Profiling:
nsys profile python -m experimental_experiment.torch_bench.dort_bench \
--device cuda -w 3 -r 5 --mixed 1 --config large \
--backend eager --enable_pattern=default+onnxruntime
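Custom NVTX ranges make the resulting nsys timeline easier to navigate. The sketch below is independent of the benchmark script and uses PyTorch's standard NVTX helpers:

import torch

x = torch.randn(1024, 1024, device="cuda")
with torch.autograd.profiler.emit_nvtx():      # annotate autograd ops for nsys
    torch.cuda.nvtx.range_push("matmul-step")  # custom range visible in the timeline
    y = x @ x
    torch.cuda.nvtx.range_pop()
torch.cuda.synchronize()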
With the experimental optimization patterns enabled:
python -m experimental_experiment.torch_bench.dort_bench --backend custom \
--device cuda --mixed=1 --export model -w 3 \
--enable_pattern=default+onnxruntime+experimental
Or with the ort+ backend:
python -m experimental_experiment.torch_bench.dort_bench --backend ort+ \
--device cuda --mixed=1 --export model -w 3 \
--enable_pattern=default+onnxruntime+experimental
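To see what the additional patterns change, the exported graphs can be compared with the onnx package alone. A sketch, with hypothetical file names model_default.onnx and model_experimental.onnx standing for two exports made with different --enable_pattern values:

import collections
import onnx

def op_histogram(path):
    # count the node types of an exported graph
    model = onnx.load(path)
    return collections.Counter(node.op_type for node in model.graph.node)

base = op_histogram("model_default.onnx")             # hypothetical export
optimized = op_histogram("model_experimental.onnx")   # hypothetical export
for op in sorted(set(base) | set(optimized)):
    print(f"{op:<25} {base.get(op, 0):>4} -> {optimized.get(op, 0):>4}")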