LLaMA: Open and Efficient Foundation Language Models

February 24, 2023


We introduce LLaMA, a collection of foundation language models ranging from 7B to 65B parameters. We train our models on trillions of tokens, and show that it is possible to train state-of-the-art models using publicly available datasets exclusively, without resorting to proprietary and inaccessible datasets. In particular, LLaMA-13B outperforms GPT-3 (175B) on most benchmarks, and LLaMA-65B is competitive with the best models, Chinchilla-70B and PaLM-540B. We release all our models to the research community.


Written by

Faisal Azhar

Hugo Touvron

Armand Joulin

Aurelien Rodriguez

Baptiste Rozière

Eric Hambro

Gautier Izacard

Guillaume Lample

Marie-Anne Lachaux

Naman Goyal

Thibaut Lavril

Timothee Lacroix

Xavier Martinet

Edouard Grave



Related Publications

May 22, 2023


Scaling Speech Technology to 1,000+ Languages

Vineel Pratap, Andros Tjandra, Bowen Shi, Paden Tomasello, Arun Babu, Sayani Kundu, Ali Elkahky, Apoorv Vyas, Maryam Fazel-Zarandi, Alexei Baevski, Wei-Ning Hsu, Alexis Conneau, Michael Auli

February 20, 2023



UNIREX: A Unified Learning Framework for Language Model Rationale Extraction

Maziar Sanjabi, Aaron Chan, Hamed Firooz, Lambert Mathias, Liang Tan, Shaoliang Nie, Xiaochang Peng, Xiang Ren

December 31, 2022


Textless Speech Emotion Conversion using Discrete & Decomposed Representations

Yossef Mordechay Adi, Abdelrahman Mohamed, Adam Polyak, Emmanuel Dupoux, Evgeny Kharitonov, Jade Copet, Morgane Rivière, Tu Anh Nguyen, Wei-Ning Hsu, Felix Kreuk

December 29, 2022


Staircase Attention for Recurrent Processing of Sequences

Dexter Ju, Jason Weston, Sainbayar Sukhbaatar, Stephen Roller

