onnx_diagnostic.torch_models.hghub
- onnx_diagnostic.torch_models.hghub.get_untrained_model_with_inputs(model_id: str, config: Any | None = None, task: str | None = '', inputs_kwargs: Dict[str, Any] | None = None, model_kwargs: Dict[str, Any] | None = None, verbose: int = 0, dynamic_rope: bool | None = None, use_pretrained: bool = False, same_as_pretrained: bool = False, use_preinstalled: bool = True, add_second_input: int = 1, subfolder: str | None = None, use_only_preinstalled: bool = False) → Dict[str, Any][source]
- Gets a non-initialized model similar to the original one, based on the model id given to the function. The model size is reduced compared to the original model. No weights are downloaded; at most the configuration file is.
- Parameters:
- model_id – model id, e.g. arnir0/Tiny-LLM
- config – overwrites the configuration
- task – model task; can be overwritten, otherwise it is automatically determined
- inputs_kwargs – parameters sent to the input generation
- model_kwargs – parameters used to change the model generation (see the sketch below)
- verbose – displays the information that was found
- dynamic_rope – use dynamic rope (see transformers.LlamaConfig)
- use_pretrained – downloads the pretrained weights as well
- same_as_pretrained – if True, does not change the default values, so the model keeps its original size
- use_preinstalled – uses the preinstalled configurations
- add_second_input – provides a second set of inputs to check that the model supports different shapes
- subfolder – subfolder to use for this model id
- use_only_preinstalled – uses only the preinstalled version
 
- Returns:
- a dictionary with the model, the inputs, the dynamic shapes, the configuration, and any necessary rewriting
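Before the full example, a minimal sketch of a customized call. The model_kwargs override below is an illustrative assumption: which keys are honored depends on the configuration class of the chosen model id.

import pprint
from onnx_diagnostic.torch_models.hghub import get_untrained_model_with_inputs

# Sketch of a customized call (keyword values are illustrative assumptions).
data = get_untrained_model_with_inputs(
    "arnir0/Tiny-LLM",
    model_kwargs={"num_hidden_layers": 1},  # assumed to override the configuration
    dynamic_rope=False,    # see transformers.LlamaConfig
    add_second_input=1,    # also generate a second set of inputs
    verbose=1,             # print what was found along the way
)
model, inputs = data["model"], data["inputs"]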
- Example:

<<<

import pprint
from onnx_diagnostic.helpers import string_type
from onnx_diagnostic.torch_models.hghub import get_untrained_model_with_inputs

data = get_untrained_model_with_inputs("arnir0/Tiny-LLM", verbose=1)

print("-- model size:", data["size"])
print("-- number of parameters:", data["n_weights"])
print("-- inputs:", string_type(data["inputs"], with_shape=True))
print("-- dynamic shapes:", pprint.pformat(data["dynamic_shapes"]))
print("-- configuration:", pprint.pformat(data["configuration"]))

>>>

[get_untrained_model_with_inputs] model_id='arnir0/Tiny-LLM'
[get_untrained_model_with_inputs] use preinstalled 'arnir0/Tiny-LLM'
[get_untrained_model_with_inputs] architectures=['LlamaForCausalLM']
[get_untrained_model_with_inputs] cls='LlamaConfig'
[get_untrained_model_with_inputs] task='text-generation'
[get_untrained_model_with_inputs] default config._attn_implementation=None
[get_untrained_model_with_inputs] use fct=<function get_inputs at 0x7ac202bc68e0>
-- model size: 51955968
-- number of parameters: 12988992
-- inputs: dict(input_ids:T7s2x3,attention_mask:T7s2x33,position_ids:T7s2x3,past_key_values:DynamicCache(key_cache=#1[T1s2x1x30x96], value_cache=#1[T1s2x1x30x96]))
-- dynamic shapes: {'attention_mask': {0: Dim('batch', min=1, max=1024), 1: 'cache+seq'},
 'input_ids': {0: Dim('batch', min=1, max=1024), 1: 'seq_length'},
 'past_key_values': [[{0: Dim('batch', min=1, max=1024), 2: 'cache_length'}],
                     [{0: Dim('batch', min=1, max=1024), 2: 'cache_length'}]],
 'position_ids': {0: Dim('batch', min=1, max=1024), 1: 'cache+seq'}}
-- configuration: LlamaConfig {
  "architectures": [
    "LlamaForCausalLM"
  ],
  "attention_bias": false,
  "attention_dropout": 0.0,
  "bos_token_id": 1,
  "eos_token_id": 2,
  "head_dim": 96,
  "hidden_act": "silu",
  "hidden_size": 192,
  "initializer_range": 0.02,
  "intermediate_size": 1024,
  "max_position_embeddings": 1024,
  "mlp_bias": false,
  "model_type": "llama",
  "num_attention_heads": 2,
  "num_hidden_layers": 1,
  "num_key_value_heads": 1,
  "pretraining_tp": 1,
  "rms_norm_eps": 1e-05,
  "rope_scaling": null,
  "rope_theta": 10000.0,
  "subfolder": null,
  "tie_word_embeddings": false,
  "torch_dtype": "float32",
  "transformers_version": "4.54.0.dev0",
  "use_cache": true,
  "vocab_size": 32000
}
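In the printed inputs, string_type uses a compact notation: the integer after T is the onnx element type (7 = int64, 1 = float32) and sNxM is the shape, so T7s2x3 is an int64 tensor of shape 2x3. Since the returned dictionary bundles a model with matching inputs and dynamic shapes, a natural follow-up is to hand them to torch.export. The sketch below assumes the export succeeds as-is; models whose inputs contain a transformers DynamicCache may additionally require the export patches shipped with onnx_diagnostic.

import torch
from onnx_diagnostic.torch_models.hghub import get_untrained_model_with_inputs

data = get_untrained_model_with_inputs("arnir0/Tiny-LLM")

# Sketch: export the reduced model with the generated inputs and shapes.
# DynamicCache inputs may need onnx_diagnostic's export patches first.
ep = torch.export.export(
    data["model"],
    args=(),
    kwargs=data["inputs"],
    dynamic_shapes=data["dynamic_shapes"],
)
print(ep.graph)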