Meta has released a new large language model called LLaMA (Large Language Model Meta AI) to support AI researchers. The model is intended to give more people in the research community easier access to the study of large language models.
LLaMA is available in several sizes (7B, 13B, 33B, and 65B parameters). By training smaller foundation models like LLaMA, researchers can test new approaches and explore new use cases with less computing power and fewer resources. Meta wrote that LLaMA is well suited for fine-tuning on a variety of tasks because it is trained on a large set of unlabeled data. Meta says it is committed to responsible AI practices and has shared a LLaMA model card that details how the model was built.
To develop LLaMA, Meta selected texts from the 20 most widely spoken languages, focusing on languages that use Latin and Cyrillic alphabets. This large language model generates text by taking a sequence of words as input and predicting the next word recursively, similar to other models in this category.
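For illustration, here is a minimal sketch of that autoregressive loop in Python using the Hugging Face transformers library. The checkpoint name is a stand-in for any causal language model, since LLaMA's own weights are gated behind Meta's access application; the recursive predict-and-append step is the same regardless of model.

```python
# A minimal sketch of autoregressive (next-word) generation, the decoding
# scheme described above. The checkpoint name is illustrative only.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # stand-in causal LM; LLaMA would work the same way
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)
model.eval()

prompt = "Large language models are"
input_ids = tokenizer(prompt, return_tensors="pt").input_ids

# Greedy decoding: repeatedly predict the most likely next token and
# append it to the input sequence, then feed the result back in.
with torch.no_grad():
    for _ in range(20):
        logits = model(input_ids).logits
        next_token = logits[:, -1, :].argmax(dim=-1, keepdim=True)
        input_ids = torch.cat([input_ids, next_token], dim=-1)

print(tokenizer.decode(input_ids[0]))
```

This sketch uses greedy decoding for simplicity; in practice, generation usually samples from the predicted distribution to produce more varied text.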
“As a foundation model, LLaMA is designed to be versatile and can be applied to many different use cases, versus a fine-tuned model that is designed for a specific task. By sharing the code for LLaMA, other researchers can more easily test new approaches to limiting or eliminating these problems in large language models,” Meta wrote.
The company admits it still needs to address the risks of bias, toxic comments, and hallucinations in large language models, including LLaMA.
Meta is releasing the model under a noncommercial license focused on research use cases “to maintain integrity and prevent misuse.” Individuals and organizations seeking access to the model will be evaluated on a case-by-case basis. Eligible parties include academic researchers, government and civil society organizations, and industry research laboratories worldwide.
You can read the full LLaMA paper here. If eligible, you can also apply for access to test the model.