onnx_diagnostic.tasks.image_text_to_text

onnx_diagnostic.tasks.image_text_to_text.get_inputs(model: Module, config: Any | None, dummy_max_token_id: int, num_key_value_heads: int, num_hidden_layers: int, pad_token_id: int, image_token_index: int, head_dim: int, width: int, height: int, num_channels: int, batch_size: int | None = None, sequence_length: int | None = None, n_images: int | None = None, max_sequence_length: int | None = None, total_sequence_length: int | None = None, add_second_input: int = 0, **kwargs)

Generates inputs for the image-text-to-text task.

Parameters:

model – model from which missing information is retrieved
config – configuration used to generate the model
head_dim – last dimension of the cache
dummy_max_token_id – maximum token id used for the dummy inputs
pad_token_id – id of the padding token
image_token_index – id of the token used as an image placeholder
batch_size – batch size
sequence_length – sequence length
max_sequence_length – maximum sequence length, used for the cache
total_sequence_length – total sequence length, used for the mask
n_images – number of images
width – width of the image
height – height of the image
num_channels – number of channels

Returns:

dictionary

Note

The content of input_ids and its shape are correlated to the images. The function uses predefined values and raises an exception if the dimensions are not the expected ones.
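
The sketch below illustrates one way token ids can be tied to the number of images. It is not the library's implementation: the helper make_dummy_input_ids and its tokens_per_image argument are hypothetical, and it assumes text token ids are drawn from the range [0, dummy_max_token_id).

```python
import random
from typing import List


def make_dummy_input_ids(
    batch_size: int,
    sequence_length: int,
    n_images: int,
    tokens_per_image: int,
    image_token_index: int,
    dummy_max_token_id: int,
    seed: int = 0,
) -> List[List[int]]:
    """Hypothetical helper: builds token ids that contain image placeholders."""
    n_image_tokens = n_images * tokens_per_image
    # The sequence must be long enough to hold every image placeholder.
    if sequence_length < n_image_tokens:
        raise ValueError(
            f"sequence_length={sequence_length} is too short for "
            f"{n_image_tokens} image tokens"
        )
    rng = random.Random(seed)
    # Assumption: text tokens never reuse the image placeholder id.
    text_ids = [i for i in range(dummy_max_token_id) if i != image_token_index]
    rows = []
    for _ in range(batch_size):
        row = [image_token_index] * n_image_tokens
        row += [rng.choice(text_ids) for _ in range(sequence_length - n_image_tokens)]
        rows.append(row)
    return rows


input_ids = make_dummy_input_ids(
    batch_size=2,
    sequence_length=12,
    n_images=1,
    tokens_per_image=4,
    image_token_index=7,
    dummy_max_token_id=50,
)
```

Each row holds exactly n_images * tokens_per_image placeholder tokens, so changing the number of images changes both the content and the minimum length of input_ids.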

onnx_diagnostic.tasks.image_text_to_text.random_input_kwargs(config: Any) → Tuple[Dict[str, Any], Callable]

Returns the input kwargs together with a callable.

If the configuration is None, the function selects typical dimensions.
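
The fallback behaviour can be pictured with the stand-in below. The names sketch_random_input_kwargs and sketch_get_inputs are hypothetical, the default values are illustrative rather than the ones the library uses, and reading attributes from the configuration with getattr is an assumption.

```python
from typing import Any, Callable, Dict, Optional, Tuple


def sketch_get_inputs(**kwargs: Any) -> Dict[str, Any]:
    """Hypothetical stand-in for an input generator; it only echoes its arguments."""
    return dict(kwargs)


def sketch_random_input_kwargs(
    config: Optional[Any],
) -> Tuple[Dict[str, Any], Callable]:
    # Illustrative defaults only; the library's real values may differ.
    defaults = {
        "batch_size": 2,
        "sequence_length": 30,
        "width": 224,
        "height": 224,
        "num_channels": 3,
    }
    if config is None:
        # No configuration: fall back to typical dimensions.
        kwargs = dict(defaults)
    else:
        # Assumption: take each value from the configuration when it defines it.
        kwargs = {name: getattr(config, name, value) for name, value in defaults.items()}
    return kwargs, sketch_get_inputs


kwargs, fn = sketch_random_input_kwargs(None)
inputs = fn(**kwargs)
```

Returning the callable alongside the kwargs lets the caller adjust individual dimensions before generating the inputs.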

onnx_diagnostic.tasks.image_text_to_text.reduce_model_config(config: Any) → Dict[str, Any]

Reduces the size of a model.
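
As a rough illustration of what reducing a configuration can mean, the sketch below caps the number of hidden layers. The function sketch_reduce_model_config, its max_layers argument, and the choice of num_hidden_layers as the only reduced field are all assumptions, not the library's behaviour.

```python
from types import SimpleNamespace
from typing import Any, Dict


def sketch_reduce_model_config(config: Any, max_layers: int = 2) -> Dict[str, Any]:
    """Hypothetical helper: returns the reduced values without modifying config."""
    reduced: Dict[str, Any] = {}
    if hasattr(config, "num_hidden_layers"):
        # Fewer layers means a smaller model that is faster to build and export.
        reduced["num_hidden_layers"] = min(config.num_hidden_layers, max_layers)
    return reduced


config = SimpleNamespace(num_hidden_layers=32, hidden_size=4096)
reduced = sketch_reduce_model_config(config)
```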