experimental_experiment.torch_bench.dort_bench

Run llama model with DORT

The script runs a few iterations of a dummy llama model.

python -m experimental_experiment.torch_bench.dort_bench --help

Example: run the llama model with the onnxruntime (ort) backend on CUDA:

python -m experimental_experiment.torch_bench.dort_bench \
       --backend ort --device cuda --config medium
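
The same run can be scripted from Python, for instance to capture the benchmark output. A minimal sketch using subprocess; only the flags from the command above are assumed:

import subprocess
import sys

# Run the benchmark as a subprocess, exactly like the command line above.
cmd = [
    sys.executable, "-m", "experimental_experiment.torch_bench.dort_bench",
    "--backend", "ort", "--device", "cuda", "--config", "medium",
]
result = subprocess.run(cmd, capture_output=True, text=True, check=True)
print(result.stdout)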

To export the models:

python -m experimental_experiment.torch_bench.dort_bench \
       --backend custom --device cuda --export a -w 3
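
The exported files can then be inspected. A minimal sketch, assuming the export writes .onnx files into the current directory (the exact file names depend on the --export value and are not documented here):

import glob
import onnx

# List the ONNX files produced by --export and show their graph signatures.
for path in sorted(glob.glob("*.onnx")):
    model = onnx.load(path)
    inputs = [i.name for i in model.graph.input]
    outputs = [o.name for o in model.graph.output]
    print(path, inputs, "->", outputs)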

Profiling:

nsys profile python -m experimental_experiment.torch_bench.dort_bench \
                    --device cuda -w 3 -r 5 --mixed 1 --config large \
                    --backend eager --enable_pattern=default+onnxruntime

With the experimental optimization patterns:

python -m experimental_experiment.torch_bench.dort_bench --backend custom \
       --device cuda --mixed=1 --export model -w 3 \
       --enable_pattern=default+onnxruntime+experimental

Or with the ort+ backend:

python -m experimental_experiment.torch_bench.dort_bench --backend ort+ \
      --device cuda --mixed=1 --export model -w 3 \
      --enable_pattern=default+onnxruntime+experimental
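
Both variants can be compared by looping over the backends from Python; a minimal sketch reusing only the flags shown above:

import subprocess
import sys

base = [
    sys.executable, "-m", "experimental_experiment.torch_bench.dort_bench",
    "--device", "cuda", "--mixed=1", "-w", "3",
    "--enable_pattern=default+onnxruntime+experimental",
]

# Run the custom and ort+ backends with the same settings and print both outputs.
for backend in ("custom", "ort+"):
    result = subprocess.run(base + ["--backend", backend],
                            capture_output=True, text=True)
    print("--- backend", backend, "---")
    print(result.stdout)
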
experimental_experiment.torch_bench.dort_bench.main(args=None)

Main function for the command line python -m experimental_experiment.torch_bench.dort_bench.
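
The benchmark can also be launched in-process, e.g. from a notebook or a test. A minimal sketch that mirrors python -m with runpy; the flag values are the ones from the examples above:

import runpy
import sys

# Equivalent to: python -m experimental_experiment.torch_bench.dort_bench --backend ort --device cuda --config medium
sys.argv = ["dort_bench", "--backend", "ort", "--device", "cuda", "--config", "medium"]
runpy.run_module("experimental_experiment.torch_bench.dort_bench", run_name="__main__")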