.torch_interpreter.patches.patch_transformers¶
- class experimental_experiment.torch_interpreter.patches.patch_transformers.patched_AttentionMaskConverter[source]¶
Patches transformers.modeling_attn_mask_utils.AttentionMaskConverter._make_causal_mask.
- class experimental_experiment.torch_interpreter.patches.patch_transformers.patched_DynamicCache(num_hidden_layers: int | None = None)[source]¶
Removes the dependency on torch.nn.Module from transformers.cache_utils.DynamicCache.
- batch_split(full_batch_size: int, split_size: int, num_hidden_layers: int | None = None) → List[transformers.cache_utils.DynamicCache][source]¶
- classmethod from_batch_splits(splits: List[transformers.cache_utils.DynamicCache], num_hidden_layers: int | None = None) → transformers.cache_utils.DynamicCache[source]¶
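To illustrate what batch_split and from_batch_splits do (slicing a cache along the batch dimension and merging the slices back), here is a minimal, framework-free sketch. The SimpleCache class and its fields are hypothetical stand-ins for transformers.cache_utils.DynamicCache, not the actual implementation.

```python
from typing import List


class SimpleCache:
    """Hypothetical stand-in for DynamicCache: one list per layer,
    each holding the rows cached for that layer's batch entries."""

    def __init__(self, layers: List[List[int]]):
        self.layers = layers  # layers[i] holds the rows cached for layer i

    def batch_split(self, full_batch_size: int, split_size: int) -> List["SimpleCache"]:
        # Slice every layer along the batch dimension into chunks of split_size.
        return [
            SimpleCache([layer[start:start + split_size] for layer in self.layers])
            for start in range(0, full_batch_size, split_size)
        ]

    @classmethod
    def from_batch_splits(cls, splits: List["SimpleCache"]) -> "SimpleCache":
        # Re-concatenate the per-split rows, layer by layer.
        n_layers = len(splits[0].layers)
        return cls(
            [
                [row for split in splits for row in split.layers[i]]
                for i in range(n_layers)
            ]
        )


cache = SimpleCache([[1, 2, 3, 4], [5, 6, 7, 8]])  # 2 layers, batch size 4
splits = cache.batch_split(full_batch_size=4, split_size=2)  # two splits of 2
merged = SimpleCache.from_batch_splits(splits)
assert merged.layers == cache.layers  # round-trip preserves the cache
```

The real methods operate on per-layer key/value tensors and accept num_hidden_layers, but the split/merge round-trip shown here is the same idea.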