onnx_diagnostic.tasks.mixture_of_expert
onnx_diagnostic.tasks.mixture_of_expert.get_inputs(model: Module, config: Any | None, dummy_max_token_id: int, num_key_value_heads: int, num_hidden_layers: int, head_dim: int, width: int, height: int, num_channels: int, batch_size: int = 2, sequence_length: int = 30, sequence_length2: int = 3, n_images: int = 2, dynamic_rope: bool = False, add_second_input: int = 1, **kwargs)[source]
Generates dummy inputs for the MoE (mixture-of-experts) task.

Parameters:
- model – model used to retrieve any missing information
- config – configuration used to generate the model
- head_dim – last dimension of the cache
- dummy_max_token_id – dummy max token id
- num_key_value_heads – number of key/value heads
- num_hidden_layers – number of hidden layers
- batch_size – batch size
- sequence_length – sequence length
- sequence_length2 – new sequence length
- n_images – number of images
- width – width of the image
- height – height of the image
- num_channels – number of channels
- dynamic_rope – use dynamic rope (see transformers.LlamaConfig)
 
Returns:
- dictionary of generated inputs
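A minimal usage sketch follows. It assumes a small transformers.MixtralConfig can stand in for a MoE model; the tiny configuration and all concrete values (image size, batch size, etc.) are illustrative assumptions, not values taken from the library's own tests, and whether a text-only MoE model satisfies every expectation of this function (the image-related parameters suggest multimodal models) is likewise an assumption.

```python
# Illustrative sketch only: the tiny Mixtral configuration and the
# concrete values below are assumptions, not library-provided defaults.
from transformers import MixtralConfig, MixtralForCausalLM

from onnx_diagnostic.tasks.mixture_of_expert import get_inputs

# A deliberately small MoE configuration so the model builds quickly.
config = MixtralConfig(
    vocab_size=1024,
    hidden_size=64,
    intermediate_size=128,
    num_hidden_layers=2,
    num_attention_heads=4,
    num_key_value_heads=2,
    num_local_experts=4,
    num_experts_per_tok=2,
)
model = MixtralForCausalLM(config)

inputs = get_inputs(
    model,
    config,
    dummy_max_token_id=config.vocab_size - 1,
    num_key_value_heads=config.num_key_value_heads,
    num_hidden_layers=config.num_hidden_layers,
    head_dim=config.hidden_size // config.num_attention_heads,
    width=224,       # image width (assumed value)
    height=224,      # image height (assumed value)
    num_channels=3,  # RGB
    batch_size=2,
    sequence_length=30,
    sequence_length2=3,
    n_images=2,
)
print(sorted(inputs))  # keys of the returned dictionary
```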