experimental_experiment.torch_interpreter.patches.patch_transformers¶
- class experimental_experiment.torch_interpreter.patches.patch_transformers.patched_AttentionMaskConverter[source]¶
Patches transformers.modeling_attn_mask_utils.AttentionMaskConverter._make_causal_mask.
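The patched method concerns causal-mask construction. As background, a minimal pure-Python sketch of what a causal attention mask looks like (this is an illustration only; the actual `_make_causal_mask` in transformers operates on torch tensors and handles dtype, device, and past-key-value offsets):

```python
def make_causal_mask(seq_len: int, mask_value: float = float("-inf")):
    # Position i may attend to positions j <= i; future positions
    # receive mask_value so they are ignored by softmax attention.
    return [
        [0.0 if j <= i else mask_value for j in range(seq_len)]
        for i in range(seq_len)
    ]

mask = make_causal_mask(3)
# Row 0 can only see position 0; row 2 can see positions 0, 1, 2.
```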
- class experimental_experiment.torch_interpreter.patches.patch_transformers.patched_DynamicCache(num_hidden_layers: int | None = None)[source]¶
Removes the dependency on torch.nn.Module from transformers.cache_utils.DynamicCache.
- batch_split(full_batch_size: int, split_size: int, num_hidden_layers: int | None = None) → List[transformers.cache_utils.DynamicCache] [source]¶
- classmethod from_batch_splits(splits: List[transformers.cache_utils.DynamicCache], num_hidden_layers: int | None = None) → transformers.cache_utils.DynamicCache [source]¶
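The two methods above are inverses: `batch_split` slices a cache along the batch dimension and `from_batch_splits` re-assembles the slices. A minimal sketch of that split/merge pattern using plain Python lists in place of cache tensors (the real methods return `DynamicCache` objects holding per-layer key/value tensors; this only illustrates the batching semantics):

```python
from typing import List


def batch_split(cache: List[int], full_batch_size: int,
                split_size: int) -> List[List[int]]:
    # Slice the batch dimension into chunks of split_size;
    # the last chunk may be smaller than split_size.
    return [cache[i:i + split_size]
            for i in range(0, full_batch_size, split_size)]


def from_batch_splits(splits: List[List[int]]) -> List[int]:
    # Concatenate the chunks back along the batch dimension.
    merged: List[int] = []
    for split in splits:
        merged.extend(split)
    return merged


batch = list(range(5))
splits = batch_split(batch, full_batch_size=5, split_size=2)
# splits → [[0, 1], [2, 3], [4]]; merging restores the original batch.
```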