experimental_experiment.torch_models.llama_helper

Code modified from different sources.

experimental_experiment.torch_models.llama_helper.get_llama_attention(input_dims: Sequence[Tuple[int, int]] = ((2, 8), (4, 7), (9, 15)), hidden_size=16, num_hidden_layers=1, vocab_size=1024, intermediate_size=16, max_position_embeddings=1024, num_attention_heads=2, _attn_implementation='eager')

Returns the attention part. See experimental_experiment.torch_models.llama_helper.get_llama_model().
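A minimal usage sketch. It assumes, based on the input_dims default (one (batch, sequence) pair per example), that the helper returns a (model, example_inputs) pair; check the source for the exact return value:

    import torch
    from experimental_experiment.torch_models.llama_helper import get_llama_attention

    # Assumption: the helper returns (model, example_input_sets).
    model, input_sets = get_llama_attention(hidden_size=16, num_attention_heads=2)
    with torch.no_grad():
        out = model(*input_sets[0])  # forward pass on the first example input set
    print(type(out))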

experimental_experiment.torch_models.llama_helper.get_llama_decoder(input_dims: Sequence[Tuple[int, int]] = ((2, 8), (4, 7), (9, 15)), hidden_size=16, num_hidden_layers=1, vocab_size=1024, intermediate_size=16, max_position_embeddings=1024, num_attention_heads=2, _attn_implementation='eager')

Returns the decoder part. See experimental_experiment.torch_models.llama_helper.get_llama_model().
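Under the same assumed (model, example_inputs) return convention, a sketch that overrides the example input dimensions:

    import torch
    from experimental_experiment.torch_models.llama_helper import get_llama_decoder

    # Assumption: input_dims lists the (batch, sequence) pairs used to build
    # one example input set each, mirroring the default ((2, 8), (4, 7), (9, 15)).
    model, input_sets = get_llama_decoder(input_dims=[(2, 8), (4, 7)])
    with torch.no_grad():
        for inputs in input_sets:
            out = model(*inputs)  # one forward pass per example input set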

experimental_experiment.torch_models.llama_helper.get_llama_model(input_dims: Sequence[Tuple[int, int]] = ((2, 8), (4, 7), (9, 15)), hidden_size: int = 16, num_hidden_layers: int = 1, vocab_size: int = 1024, intermediate_size: int = 16, max_position_embeddings: int = 1024, num_attention_heads: int = 2, _attn_implementation: str = 'eager', with_mask: bool = True, dynamic_shapes: bool = False)

Returns a LLaMA model built from a LlamaConfig. The default parameters are deliberately small so the model is suitable for a unit-test configuration.
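A sketch combining the helper with torch.export, again assuming the helper returns a (model, example_inputs) pair (verify against the source); with_mask and dynamic_shapes are taken from the signature above:

    import torch
    from experimental_experiment.torch_models.llama_helper import get_llama_model

    # Assumption: the helper returns (model, example_input_sets).
    model, input_sets = get_llama_model(with_mask=True, dynamic_shapes=False)
    ep = torch.export.export(model, input_sets[0])  # trace with the first input set
    print(ep.graph)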