Research

NLP

BART: Denoising Sequence-to-Sequence Pre-training for Natural Language Generation, Translation, and Comprehension

July 8, 2020

Abstract

We present BART, a denoising autoencoder for pretraining sequence-to-sequence models. BART is trained by (1) corrupting text with an arbitrary noising function, and (2) learning a model to reconstruct the original text. It uses a standard Tranformer-based neural machine translation architecture which, despite its simplicity, can be seen as generalizing BERT (due to the bidirectional encoder), GPT (with the left-to-right decoder), and other recent pre- training schemes. We evaluate a number of noising approaches, finding the best perfor- mance by both randomly shuffling the order of sentences and using a novel in-filling scheme,where spans of text are replaced with a single mask token. BART is particularly effective when fine tuned for text generation but also works well for comprehension tasks. It matches the performance of RoBERTa on GLUE and SQuAD, and achieves new state-of-the-art results on a range of abstractive di-alogue, question answering, and summarization tasks, with gains of up to 3.5 ROUGE. BART also provides a 1.1 BLEU increase over a back-translation system for machine translation, with only target language pretraining. We also replicate other pretraining schemes within the BART framework, to understand their effect on end-task performance.

Download the Paper

AUTHORS

Written by

Mike Lewis

Yinhan Liu

Naman Goyal

Marjan Ghazvininejad

Abdelrahman Mohamed

Omer Levy

Ves Stoyanov

Luke Zettlemoyer

Publisher

Association for Computational Linguistics (ACL)

Related Publications

November 16, 2022

NLP

Memorization Without Overfitting: Analyzing the Training Dynamics of Large Language Models

Kushal Tirumala, Aram H. Markosyan, Armen Aghajanyan, Luke Zettlemoyer

November 16, 2022

October 31, 2022

NLP

Autoregressive Search Engines: Generating Substrings as Document Identifiers

Fabio Petroni, Giuseppe Ottaviano, Michele Bevilacqua, Patrick Lewis, Scott Yih, Sebastian Riedel

October 31, 2022

December 06, 2020

NLP

Pre-training via Paraphrasing

Michael Lewis, Armen Aghajanyan, Gargi Ghosh, Luke Zettlemoyer, Marjan Ghazvininejad, Sida Wang

December 06, 2020

November 30, 2020

NLP

Where Are You? Localization from Embodied Dialog

Dhruv Batra, Devi Parikh, Meera Hahn, Jacob Krantz, James Rehg, Peter Anderson, Stefan Lee

November 30, 2020

April 30, 2018

NLP

Speech & Audio

Identifying Analogies Across Domains | Facebook AI Research

Yedid Hoshen, Lior Wolf

April 30, 2018

November 01, 2018

NLP

Computer Vision

Non-Adversarial Unsupervised Word Translation | Facebook AI Research

Yedid Hoshen, Lior Wolf

November 01, 2018

December 02, 2018

NLP

Computer Vision

One-Shot Unsupervised Cross Domain Translation | Facebook AI Research

Sagie Benaim, Lior Wolf

December 02, 2018

June 30, 2019

NLP

Variational Training for Large-Scale Noisy-OR Bayesian Networks | Facebook AI Research

Geng Ji, Dehua Cheng, Huazhong Ning, Changhe Yuan, Hanning Zhou, Liang Xiong, Erik B. Sudderth

June 30, 2019

Help Us Pioneer The Future of AI

We share our open source frameworks, tools, libraries, and models for everything from research exploration to large-scale production deployment.