onnx_diagnostic.tasks.summarization
- onnx_diagnostic.tasks.summarization.get_inputs(model: Module, config: Any | None, dummy_max_token_id: int, num_key_value_heads_encoder: int, num_key_value_heads_decoder: int, num_hidden_layers: int, head_dim_encoder: int, head_dim_decoder: int, batch_size: int = 2, sequence_length: int = 30, sequence_length2: int = 3, add_second_input: int = 1, **kwargs)
- Generates inputs for the task summarization.
- Parameters:
- model – model used to get any missing information
- config – configuration used to generate the model
- dummy_max_token_id – maximum token id to use when generating dummy input ids
- num_key_value_heads_encoder – number of key/value heads for the encoder
- num_key_value_heads_decoder – number of key/value heads for the decoder
- num_hidden_layers – number of hidden layers (number of entries in the cache)
- head_dim_encoder – last dimension of the cache for the encoder
- head_dim_decoder – last dimension of the cache for the decoder
- batch_size – batch size
- sequence_length – sequence length
- sequence_length2 – new sequence length for the second set of inputs
- add_second_input – if not null, a second set of inputs with different shapes is added to exercise dynamic shapes
 
- Returns:
- dictionary with the dummy inputs
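
Taken on its own, the signature suggests a call pattern like the minimal sketch below. Everything in it is illustrative rather than taken from this page: the tiny BartConfig, the chosen dimensions, and the assumption that the returned dictionary exposes at least an "inputs" entry.

    # A minimal sketch (not from this page): builds a tiny encoder-decoder
    # model with transformers and asks get_inputs for matching dummy inputs.
    # The returned keys (e.g. "inputs") are an assumption here.
    from transformers import BartConfig, BartForConditionalGeneration
    from onnx_diagnostic.tasks.summarization import get_inputs

    config = BartConfig(
        vocab_size=1024,
        d_model=64,
        encoder_layers=2,
        decoder_layers=2,
        encoder_attention_heads=4,
        decoder_attention_heads=4,
    )
    model = BartForConditionalGeneration(config)

    res = get_inputs(
        model,
        config,
        dummy_max_token_id=config.vocab_size - 1,
        num_key_value_heads_encoder=4,
        num_key_value_heads_decoder=4,
        num_hidden_layers=2,
        head_dim_encoder=16,   # d_model / encoder_attention_heads
        head_dim_decoder=16,   # d_model / decoder_attention_heads
        batch_size=2,
        sequence_length=30,
        sequence_length2=3,
    )
    print(sorted(res))  # expected to contain at least "inputs"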
- Stolen inputs for one model:

      cache_position: T7s1
      past_key_values: EncoderDecoderCache(
          self_attention_cache=DynamicCache(
              key_cache=#6[T1s1x8x1x64,...],
              value_cache=#6[T1s1x8x1x64,...]),
          cross_attention_cache=DynamicCache(
              key_cache=#6[T1s1x8x16x64,...],
              value_cache=#6[T1s1x8x16x64,...]))
      decoder_input_ids: T7s1x1
      encoder_outputs: dict(last_hidden_state: T1s1x16x512)
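
The dump above uses onnx_diagnostic's compact tensor notation. Read as an assumption here: T7s1 would denote an int64 tensor of shape (1,) (ONNX element type 7), T1s1x16x512 a float32 tensor of shape (1, 16, 512) (ONNX type 1), and #6[...] a list of six per-layer tensors. A hedged sketch rebuilding that structure with the public transformers cache API, shapes copied from the dump:

    # Reconstructs the printed inputs under the notation reading stated above.
    import torch
    from transformers.cache_utils import DynamicCache, EncoderDecoderCache

    num_layers = 6  # "#6[...]" reads as a list of six per-layer tensors

    self_cache = DynamicCache()
    cross_cache = DynamicCache()
    for layer in range(num_layers):
        # self-attention cache: (batch=1, heads=8, past_len=1, head_dim=64)
        self_cache.update(torch.randn(1, 8, 1, 64), torch.randn(1, 8, 1, 64), layer)
        # cross-attention cache: (batch=1, heads=8, encoder_len=16, head_dim=64)
        cross_cache.update(torch.randn(1, 8, 16, 64), torch.randn(1, 8, 16, 64), layer)

    inputs = dict(
        cache_position=torch.arange(1, dtype=torch.int64),           # T7s1
        past_key_values=EncoderDecoderCache(self_cache, cross_cache),
        decoder_input_ids=torch.randint(0, 1024, (1, 1)),            # T7s1x1
        encoder_outputs=dict(
            last_hidden_state=torch.randn(1, 16, 512)),              # T1s1x16x512
    )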