-m yobx validate … validate a HuggingFace model exported to ONNX

The command validates an ONNX export of a HuggingFace model. It captures real inputs by running the model on a text prompt with InputObserver, exports the model to ONNX, and measures the numerical discrepancies between the outputs of the original PyTorch model and those of the exported ONNX model.
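
As an illustration of what a discrepancy check computes, here is a minimal sketch that reports the largest absolute element-wise difference between two flat output sequences. The function name `max_abs_diff` and the sample values are hypothetical; the metric yobx actually reports is not described on this page.

```python
def max_abs_diff(expected, got):
    """Return the largest absolute element-wise difference between two flat sequences."""
    if len(expected) != len(got):
        raise ValueError(f"length mismatch: {len(expected)} != {len(got)}")
    # default=0.0 covers the case of two empty sequences
    return max((abs(e - g) for e, g in zip(expected, got)), default=0.0)


# Hypothetical values standing in for PyTorch outputs and exported-model outputs.
torch_logits = [0.125, -1.5, 3.0]
onnx_logits = [0.125, -1.25, 3.0]
print(max_abs_diff(torch_logits, onnx_logits))
```

A small value suggests the export preserves the model's behaviour on the captured inputs; what counts as acceptable depends on the dtype and the model.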

Description

See yobx.torch.validate.validate_model().

    usage: validate [-h] -m MID [-p PROMPT] [-e EXPORT] [--opt OPT] [-r | --run | --no-run] [--patch | --no-patch] [-q | --quiet | --no-quiet]
                    [--opset OPSET] [--dtype DTYPE] [--device DEVICE] [--max-new-tokens MAX_NEW_TOKENS] [-o DUMP_FOLDER] [-v VERBOSE]
                    [--random-weights] [--config-override KEY=VALUE]
    
    Validates an ONNX export for a HuggingFace model.
    This command captures real inputs by running the model on a default
    text prompt with InputObserver, then exports to ONNX and checks
    discrepancies.
    
    options:
      -h, --help            show this help message and exit
      -m MID, --mid MID     Model id, usually <author>/<name> on HuggingFace Hub.
      -p PROMPT, --prompt PROMPT
                            Text prompt used to drive model.generate() during input capture. Default: 'Continue: it rains, what should I do?'
      -e EXPORT, --export EXPORT
                            ONNX exporter to use (default: 'yobx').
      --opt OPT             Optimisation level applied after export (default: 'default').
      -r, --run, --no-run   Check discrepancies after export (default: True).
      --patch, --no-patch   Apply apply_patches_for_model and register_flattening_functions during input capture and export (default: True).
      -q, --quiet, --no-quiet
                            Catch exceptions and report them in the summary instead of re-raising.
      --opset OPSET         ONNX opset version to target (default: 22).
      --dtype DTYPE         Cast the model and inputs to this dtype before exporting, e.g. 'float16'.
      --device DEVICE       Device to run on, e.g. 'cpu' or 'cuda' (default: 'cpu').
      --max-new-tokens MAX_NEW_TOKENS
                            Number of tokens generated by model.generate() during input capture (default: 10).
      -o DUMP_FOLDER, --dump-folder DUMP_FOLDER
                            Save ONNX artefacts under this folder.
      -v VERBOSE, --verbose VERBOSE
                            Verbosity level (default: 0).
      --random-weights      Instantiate the model from config with random weights instead of downloading pretrained weights (useful for fast CI tests).
      --config-override KEY=VALUE
                            Override a config attribute before creating the model, e.g. --config-override num_hidden_layers=2. Can be repeated for multiple overrides.
    
    Examples:
    
        python -m yobx validate -m arnir0/Tiny-LLM -v 1
        python -m yobx validate -m arnir0/Tiny-LLM -v 1 -o dump_validate
        python -m yobx validate -m arnir0/Tiny-LLM --no-patch --no-run
    
    With mode arguments:
    
        python -m yobx validate -m arnir0/Tiny-LLM \
               -e yobx --opt default --opset 22 --device cuda --dtype float32 \
               --patch -r -o dump_test -v 1
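
The paired flags (`--patch | --no-patch`, `-r | --run | --no-run`) and the repeatable `--config-override KEY=VALUE` option follow common argparse patterns. The sketch below reproduces only those patterns for a handful of options; it is not the parser yobx ships, and the helper names `build_parser` and `parse_overrides` are hypothetical. Override values are kept as strings here, since the page does not say how the command converts them.

```python
import argparse


def build_parser():
    """Build a reduced parser mirroring a few options from the help text above."""
    parser = argparse.ArgumentParser(prog="validate")
    parser.add_argument("-m", "--mid", required=True)
    # BooleanOptionalAction (Python 3.9+) generates the matching --no-... flag.
    parser.add_argument("--patch", action=argparse.BooleanOptionalAction, default=True)
    parser.add_argument("-r", "--run", action=argparse.BooleanOptionalAction, default=True)
    # action="append" collects every occurrence of the option into a list.
    parser.add_argument(
        "--config-override", action="append", default=[], metavar="KEY=VALUE"
    )
    return parser


def parse_overrides(pairs):
    """Turn ['a=1', 'b=2'] into {'a': '1', 'b': '2'} (values stay strings)."""
    overrides = {}
    for pair in pairs:
        key, sep, value = pair.partition("=")
        if not sep or not key:
            raise ValueError(f"expected KEY=VALUE, got {pair!r}")
        overrides[key] = value
    return overrides


args = build_parser().parse_args(
    ["-m", "arnir0/Tiny-LLM", "--no-patch",
     "--config-override", "num_hidden_layers=2",
     "--config-override", "hidden_size=64"]
)
print(args.patch, args.run, parse_overrides(args.config_override))
```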

Examples

Basic validation with default settings:

    python -m yobx validate -m arnir0/Tiny-LLM -v 1

Save ONNX artifacts to a folder for further inspection:

    python -m yobx validate -m arnir0/Tiny-LLM -v 1 -o dump_validate

Export without applying patches and without running discrepancy checks:

    python -m yobx validate -m arnir0/Tiny-LLM --no-patch --no-run

A more complete invocation that sets the exporter, optimisation level, opset, device, and dtype explicitly, targeting CUDA with float32 at opset 22:

    python -m yobx validate -m arnir0/Tiny-LLM \
           -e yobx --opt default --opset 22 --device cuda --dtype float32 \
           --patch -r -o dump_test -v 1

Fast validation with random weights (useful for CI):

    python -m yobx validate -m arnir0/Tiny-LLM --random-weights \
           --config-override num_hidden_layers=2

Override multiple config attributes at once:

    python -m yobx validate -m arnir0/Tiny-LLM --random-weights \
           --config-override num_hidden_layers=2 \
           --config-override hidden_size=64
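
When the command is driven from a CI script, the argument list can be assembled programmatically. The sketch below uses only flags listed in the help text; the helper `validate_command` is hypothetical and not part of yobx. Whether the command signals failures through its exit code is not stated on this page, so the `subprocess` call is left as a comment.

```python
import shlex
import sys


def validate_command(model_id, overrides=None, random_weights=False, verbose=0):
    """Build the argv list for `python -m yobx validate` from documented flags."""
    cmd = [sys.executable, "-m", "yobx", "validate", "-m", model_id, "-v", str(verbose)]
    if random_weights:
        cmd.append("--random-weights")
    # One --config-override per attribute, as the help text describes.
    for key, value in (overrides or {}).items():
        cmd += ["--config-override", f"{key}={value}"]
    return cmd


cmd = validate_command(
    "arnir0/Tiny-LLM",
    overrides={"num_hidden_layers": 2, "hidden_size": 64},
    random_weights=True,
)
print(shlex.join(cmd))
# To execute it (requires yobx to be installed):
#   import subprocess; subprocess.run(cmd, check=True)
```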