-m yobx validate … validate a HuggingFace model exported to ONNX
This command validates the ONNX export of a HuggingFace model.
It captures real inputs by running the model on a text prompt under
InputObserver, exports the model to ONNX, and measures the
numerical discrepancies between the outputs of the original PyTorch
model and those of the ONNX runtime.
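To make "numerical discrepancies" concrete, here is a minimal sketch of the kind of metric such a check might report: the maximum absolute and maximum relative difference between two output vectors. The function name `discrepancy`, the sample values, and the `1e-12` guard are illustrative assumptions; the metrics yobx actually computes may differ.

```python
def discrepancy(expected, actual):
    """Return (max_abs, max_rel) error between two equal-length float sequences.

    Hypothetical helper for illustration only, not part of yobx.
    """
    if len(expected) != len(actual):
        raise ValueError("outputs must have the same number of elements")
    max_abs = 0.0
    max_rel = 0.0
    for e, a in zip(expected, actual):
        diff = abs(e - a)
        max_abs = max(max_abs, diff)
        # Guard the denominator so reference values near zero do not blow up the ratio.
        max_rel = max(max_rel, diff / max(abs(e), 1e-12))
    return max_abs, max_rel


# Made-up values standing in for flattened PyTorch and ONNX outputs.
torch_out = [0.5, -1.25, 2.0]
onnx_out = [0.5, -1.25, 2.5]
max_abs, max_rel = discrepancy(torch_out, onnx_out)
print(f"max abs error: {max_abs}, max rel error: {max_rel}")
```

A small absolute error alongside a large relative error usually points at values close to zero, which is why both figures are worth reporting together.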
Description
See yobx.torch.validate.validate_model().
usage: validate [-h] -m MID [-p PROMPT] [-e EXPORT] [--opt OPT] [-r | --run | --no-run] [--patch | --no-patch] [-q | --quiet | --no-quiet]
[--opset OPSET] [--dtype DTYPE] [--device DEVICE] [--max-new-tokens MAX_NEW_TOKENS] [-o DUMP_FOLDER] [-v VERBOSE]
[--random-weights] [--config-override KEY=VALUE]
Validates an ONNX export for a HuggingFace model.
This command captures real inputs by running the model on a default
text prompt with InputObserver, then exports to ONNX and checks
discrepancies.
options:
  -h, --help            show this help message and exit
  -m MID, --mid MID     Model id, usually <author>/<name> on HuggingFace Hub.
  -p PROMPT, --prompt PROMPT
                        Text prompt used to drive model.generate() during input capture. Default: 'Continue: it rains, what should I do?'
  -e EXPORT, --export EXPORT
                        ONNX exporter to use (default: 'yobx').
  --opt OPT             Optimisation level applied after export (default: 'default').
  -r, --run, --no-run   Check discrepancies after export (default: True).
  --patch, --no-patch   Apply apply_patches_for_model and register_flattening_functions during input capture and export (default: True).
  -q, --quiet, --no-quiet
                        Catch exceptions and report them in the summary instead of re-raising.
  --opset OPSET         ONNX opset version to target (default: 22).
  --dtype DTYPE         Cast the model and inputs to this dtype before exporting, e.g. 'float16'.
  --device DEVICE       Device to run on, e.g. 'cpu' or 'cuda' (default: 'cpu').
  --max-new-tokens MAX_NEW_TOKENS
                        Number of tokens generated by model.generate() during input capture (default: 10).
  -o DUMP_FOLDER, --dump-folder DUMP_FOLDER
                        Save ONNX artefacts under this folder.
  -v VERBOSE, --verbose VERBOSE
                        Verbosity level (default: 0).
  --random-weights      Instantiate the model from config with random weights instead of downloading pretrained weights (useful for fast CI tests).
  --config-override KEY=VALUE
                        Override a config attribute before creating the model, e.g. --config-override num_hidden_layers=2. Can be repeated for multiple overrides.
Examples:
python -m yobx validate -m arnir0/Tiny-LLM -v 1
python -m yobx validate -m arnir0/Tiny-LLM -v 1 -o dump_validate
python -m yobx validate -m arnir0/Tiny-LLM --no-patch --no-run
With mode arguments:
python -m yobx validate -m arnir0/Tiny-LLM \
-e yobx --opt default --opset 22 --device cuda --dtype float32 \
--patch -r -o dump_test -v 1
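The paired flags in the usage line (`-r | --run | --no-run`, `--patch | --no-patch`, `-q | --quiet | --no-quiet`) have the shape that Python's `argparse.BooleanOptionalAction` produces (Python 3.9+). The sketch below reproduces that behaviour in a standalone parser. It is an assumption about how such flags can be declared, not a copy of the yobx parser, and the `False` default for `--quiet` is a guess, since the help text states no default for it.

```python
import argparse

# Standalone parser that mimics the three on/off switches of the help text.
parser = argparse.ArgumentParser(prog="validate")
parser.add_argument("-r", "--run", action=argparse.BooleanOptionalAction, default=True)
parser.add_argument("--patch", action=argparse.BooleanOptionalAction, default=True)
# Assumed default: the help text does not say what --quiet defaults to.
parser.add_argument("-q", "--quiet", action=argparse.BooleanOptionalAction, default=False)

# With no flags, the defaults apply.
defaults = parser.parse_args([])

# The --no-* variants switch a feature off explicitly.
disabled = parser.parse_args(["--no-patch", "--no-run"])

# The short form behaves like the positive long form.
quiet = parser.parse_args(["-q"])

print(defaults.run, defaults.patch, defaults.quiet)
print(disabled.run, disabled.patch)
print(quiet.quiet)
```

Under this reading, passing `--patch -r` as in the last example above only restates the defaults; the `--no-*` forms are the ones that change behaviour.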
Examples
Basic validation with default settings:
python -m yobx validate -m arnir0/Tiny-LLM -v 1
Save ONNX artifacts to a folder for further inspection:
python -m yobx validate -m arnir0/Tiny-LLM -v 1 -o dump_validate
Export without applying patches and without running discrepancy checks:
python -m yobx validate -m arnir0/Tiny-LLM --no-patch --no-run
A more explicit invocation that sets the exporter, optimisation level, opset, device, and dtype (here CUDA, float32, opset 22):
python -m yobx validate -m arnir0/Tiny-LLM \
-e yobx --opt default --opset 22 --device cuda --dtype float32 \
--patch -r -o dump_test -v 1
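When the command is driven from a script, for example to sweep over devices or dtypes, it can help to assemble the argument list programmatically. The sketch below uses only the long flags listed in the help text. The helper `build_validate_command` and the `NEGATABLE` set are hypothetical names introduced for illustration and are not part of yobx.

```python
import sys

# Flags that the help text documents with a --no-* counterpart.
NEGATABLE = {"run", "patch", "quiet"}


def build_validate_command(mid, **options):
    """Build the argv list for `python -m yobx validate` (hypothetical helper)."""
    cmd = [sys.executable, "-m", "yobx", "validate", "-m", mid]
    for name, value in options.items():
        dashed = name.replace("_", "-")
        if value is True:
            cmd.append(f"--{dashed}")
        elif value is False:
            # Only emit --no-* for flags that actually have a negative form.
            if name in NEGATABLE:
                cmd.append(f"--no-{dashed}")
        elif value is not None:
            cmd.extend([f"--{dashed}", str(value)])
    return cmd


cmd = build_validate_command(
    "arnir0/Tiny-LLM",
    export="yobx",
    opt="default",
    opset=22,
    device="cuda",
    dtype="float32",
    patch=True,
    run=True,
    dump_folder="dump_test",
    verbose=1,
)
print(" ".join(cmd[1:]))

# To execute it where yobx is installed:
#   import subprocess
#   subprocess.run(cmd, check=True)
```

Building a list rather than a single shell string avoids quoting problems when a prompt passed through `--prompt` contains spaces or quotes.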
Fast validation with random weights (useful for CI):
python -m yobx validate -m arnir0/Tiny-LLM --random-weights \
--config-override num_hidden_layers=2
Override multiple config attributes at once:
python -m yobx validate -m arnir0/Tiny-LLM --random-weights \
--config-override num_hidden_layers=2 \
--config-override hidden_size=64
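A repeatable `KEY=VALUE` option like `--config-override` is commonly implemented with `argparse` using `action="append"` and a small conversion function. The sketch below shows one way the two overrides above could end up as a dictionary. It is an assumption for illustration: `parse_override` is a hypothetical helper, and yobx may parse and type-convert values differently.

```python
import argparse
import ast


def parse_override(text):
    """Split 'KEY=VALUE' and convert VALUE to a Python literal when possible."""
    key, sep, raw = text.partition("=")
    if not sep or not key:
        raise argparse.ArgumentTypeError(f"expected KEY=VALUE, got {text!r}")
    try:
        # '2' becomes the int 2, '0.1' a float, 'True' a bool.
        value = ast.literal_eval(raw)
    except (ValueError, SyntaxError):
        # Anything that is not a literal (e.g. 'gelu') stays a string.
        value = raw
    return key, value


parser = argparse.ArgumentParser(prog="validate")
parser.add_argument(
    "--config-override",
    metavar="KEY=VALUE",
    type=parse_override,
    action="append",  # each repetition adds one (key, value) pair
    default=[],
)

args = parser.parse_args(
    ["--config-override", "num_hidden_layers=2", "--config-override", "hidden_size=64"]
)
overrides = dict(args.config_override)
print(overrides)
```

The resulting mapping could then be applied to a model config attribute by attribute before the model is instantiated, which is what makes a two-layer, narrow model cheap enough for CI.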