onnx_diagnostic.torch_models.hghub

onnx_diagnostic.torch_models.hghub.get_untrained_model_with_inputs(model_id: str, config: Any | None = None, task: str | None = '', inputs_kwargs: Dict[str, Any] | None = None, model_kwargs: Dict[str, Any] | None = None, verbose: int = 0, dynamic_rope: bool | None = None, same_as_pretrained: bool = False) → Dict[str, Any]

Gets an uninitialized model similar to the original one, based on the model id given to the function. The model size is reduced compared to the original model. No weights are downloaded; at most, the configuration file is fetched.

Parameters:
  • model_id – model id, e.g. arnir0/Tiny-LLM

  • config – to overwrite the configuration

  • task – model task; it can be overwritten, otherwise it is determined automatically

  • inputs_kwargs – parameters sent to input generation

  • model_kwargs – parameters used to change how the model is created

  • verbose – verbosity level; prints out the information found

  • dynamic_rope – use dynamic rope (see transformers.LlamaConfig)

  • same_as_pretrained – if True, do not change the default values to get a smaller model

Returns:

dictionary with a model, inputs, dynamic shapes, and the configuration
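
The returned dictionary can then be fed to torch.export.export. Below is a minimal sketch, assuming the keys "model", "inputs", and "dynamic_shapes" (consistent with the description above); note that exporting inputs containing a DynamicCache may additionally require the patches provided elsewhere in onnx_diagnostic:

    import torch
    from onnx_diagnostic.torch_models.hghub import get_untrained_model_with_inputs

    data = get_untrained_model_with_inputs("arnir0/Tiny-LLM")

    # "model", "inputs", "dynamic_shapes" are assumed key names,
    # consistent with the keys printed in the example below.
    ep = torch.export.export(
        data["model"],
        (),  # all inputs are passed as keyword arguments
        kwargs=data["inputs"],
        dynamic_shapes=data["dynamic_shapes"],
    )
    print(ep)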

Example:

<<<

import pprint
from onnx_diagnostic.helpers import string_type
from onnx_diagnostic.torch_models.hghub import get_untrained_model_with_inputs

data = get_untrained_model_with_inputs("arnir0/Tiny-LLM", verbose=1)

print("-- model size:", data["size"])
print("-- number of parameters:", data["n_weights"])
print("-- inputs:", string_type(data["inputs"], with_shape=True))
print("-- dynamic shapes:", pprint.pformat(data["dynamic_shapes"]))
print("-- configuration:", pprint.pformat(data["configuration"]))

>>>

    [get_untrained_model_with_inputs] model_id='arnir0/Tiny-LLM'
    [get_untrained_model_with_inputs] architecture='LlamaForCausalLM'
    [get_untrained_model_with_inputs] cls='LlamaConfig'
    [get_untrained_model_with_inputs] task='text-generation'
    -- model size: 51955968
    -- number of parameters: 12988992
    -- inputs: dict(input_ids:T7s2x3,attention_mask:T7s2x33,position_ids:T7s2x3,past_key_values:DynamicCache(key_cache=#1[T1s2x1x30x96], value_cache=#1[T1s2x1x30x96]))
    -- dynamic shapes: {'attention_mask': {0: <class 'onnx_diagnostic.torch_models.hghub.model_inputs.batch'>,
                        1: _DimHint(type=<_DimHintType.DYNAMIC: 3>)},
     'input_ids': {0: <class 'onnx_diagnostic.torch_models.hghub.model_inputs.batch'>,
                   1: <class 'onnx_diagnostic.torch_models.hghub.model_inputs.seq_length'>},
     'past_key_values': [[{0: <class 'onnx_diagnostic.torch_models.hghub.model_inputs.batch'>,
                           2: <class 'onnx_diagnostic.torch_models.hghub.model_inputs.cache_length'>}],
                         [{0: <class 'onnx_diagnostic.torch_models.hghub.model_inputs.batch'>,
                           2: <class 'onnx_diagnostic.torch_models.hghub.model_inputs.cache_length'>}]],
     'position_ids': {0: <class 'onnx_diagnostic.torch_models.hghub.model_inputs.batch'>,
                      1: _DimHint(type=<_DimHintType.DYNAMIC: 3>)}}
    -- configuration: LlamaConfig {
      "_attn_implementation_autoset": true,
      "architectures": [
        "LlamaForCausalLM"
      ],
      "attention_bias": false,
      "attention_dropout": 0.0,
      "bos_token_id": 1,
      "eos_token_id": 2,
      "head_dim": 96,
      "hidden_act": "silu",
      "hidden_size": 192,
      "initializer_range": 0.02,
      "intermediate_size": 1024,
      "max_position_embeddings": 1024,
      "mlp_bias": false,
      "model_type": "llama",
      "num_attention_heads": 2,
      "num_hidden_layers": 1,
      "num_key_value_heads": 1,
      "pretraining_tp": 1,
      "rms_norm_eps": 1e-05,
      "rope_scaling": null,
      "rope_theta": 10000.0,
      "tie_word_embeddings": false,
      "torch_dtype": "float32",
      "transformers_version": "4.51.0.dev0",
      "use_cache": true,
      "vocab_size": 32000
    }
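
A short variation of the example above relies on the documented parameters same_as_pretrained and dynamic_rope. It keeps the pretrained hyperparameters instead of shrinking them, so the untrained model matches the original size; the weights remain random and nothing but the configuration is downloaded:

    from onnx_diagnostic.helpers import string_type
    from onnx_diagnostic.torch_models.hghub import get_untrained_model_with_inputs

    data = get_untrained_model_with_inputs(
        "arnir0/Tiny-LLM",
        same_as_pretrained=True,  # keep the original default values
        dynamic_rope=True,        # use dynamic rope, see transformers.LlamaConfig
    )
    print("-- model size:", data["size"])
    print("-- inputs:", string_type(data["inputs"], with_shape=True))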