llms
- class mfai.pytorch.models.llms.FreezeMLMMixin
Bases: object
A mixin for (un)freezing the LLM and vision stages of a multimodal model.
- backend: GPT2 | Llama2 | CrossAttentionGPT2 | Llama3
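As an illustration of what (un)freezing a stage usually means in PyTorch, here is a minimal sketch that toggles `requires_grad` on a stage's parameters. It is not mfai's actual implementation: the class `FreezeSketchMixin`, the method names `freeze_llm`, `unfreeze_llm`, `freeze_vision`, `unfreeze_vision`, the attribute name `vision_encoder`, and the toy `TinyMultimodal` model are all hypothetical. Only the `backend` attribute name comes from the documentation above.

```python
from torch import nn


class FreezeSketchMixin:
    """Hypothetical mixin; a sketch, not mfai's FreezeMLMMixin."""

    backend: nn.Module         # language model stage (name taken from the docs)
    vision_encoder: nn.Module  # vision stage (attribute name is an assumption)

    @staticmethod
    def _set_trainable(module: nn.Module, trainable: bool) -> None:
        # Frozen parameters receive no gradients, so the optimizer never updates them.
        for param in module.parameters():
            param.requires_grad = trainable

    def freeze_llm(self) -> None:
        self._set_trainable(self.backend, False)

    def unfreeze_llm(self) -> None:
        self._set_trainable(self.backend, True)

    def freeze_vision(self) -> None:
        self._set_trainable(self.vision_encoder, False)

    def unfreeze_vision(self) -> None:
        self._set_trainable(self.vision_encoder, True)


class TinyMultimodal(FreezeSketchMixin, nn.Module):
    """Toy stand-in for a multimodal model, with linear layers as stages."""

    def __init__(self) -> None:
        super().__init__()
        self.backend = nn.Linear(8, 8)
        self.vision_encoder = nn.Linear(4, 8)


model = TinyMultimodal()
model.freeze_llm()  # train only the vision stage
trainable = [name for name, p in model.named_parameters() if p.requires_grad]
print(trainable)
```

A typical use of such a mixin would be to freeze the language model while training the vision stage, then unfreeze it for joint fine-tuning.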
- PyTorch implementation of GPT-2.
- PyTorch implementation of Llama2.
- Standalone Llama3 implementation inspired by https://github.com/rasbt/LLMs-from-scratch/blob/main/ch05/07_gpt_to_llama/standalone-llama32.ipynb. For an explanation of grouped query attention, see https://www.ibm.com/think/topics/grouped-query-attention.