NLP

Beyond English-Centric Multilingual Machine Translation

October 18, 2020

Abstract

Existing work in translation has demonstrated the potential of massively multilingual machine translation by training a single model able to translate between any pair of languages. However, much of this work is English-Centric: it trains only on data that was translated from or to English. While this is supported by large sources of training data, it does not reflect translation needs worldwide. In this work, we create a true Many-to-Many multilingual translation model that can translate directly between any pair of 100 languages. We build and open source a training dataset that covers thousands of language directions with supervised data, created through large-scale mining. Then, we explore how to effectively increase model capacity through a combination of dense scaling and language-specific sparse parameters to create high quality models. Our focus on non-English-Centric models brings gains of more than 10 BLEU when directly translating between non-English directions while performing competitively with the best single systems of WMT. We open-source our scripts so that others may reproduce the data, evaluation, and final M2M-100 model.
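To give a sense of scale for the gap between English-Centric and Many-to-Many training, the short sketch below counts ordered translation directions for a set of languages that includes English. The function name `count_directions` and its return keys are hypothetical, chosen only for illustration. The figures are plain combinatorial upper bounds, not numbers reported in the paper; the abstract states only that the mined dataset covers thousands of directions.

```python
def count_directions(num_languages: int) -> dict:
    """Count ordered (source, target) translation directions.

    Assumes exactly one of the languages is English. These are upper
    bounds on what a model could be trained on, not dataset statistics.
    """
    if num_languages < 2:
        raise ValueError("need at least two languages to translate between")

    # Every ordered pair of distinct languages is one direction.
    many_to_many = num_languages * (num_languages - 1)

    # English-Centric data only has English as the source or the target.
    english_centric = 2 * (num_languages - 1)

    # Directions that never involve English.
    non_english = many_to_many - english_centric

    return {
        "many_to_many": many_to_many,
        "english_centric": english_centric,
        "non_english": non_english,
    }


if __name__ == "__main__":
    counts = count_directions(100)
    for name, value in counts.items():
        print(f"{name}: {value}")
```

Under these assumptions, 100 languages give 9,900 possible directions, of which only 198 involve English. The remaining 9,702 are the non-English directions that an English-Centric model never sees supervised data for.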

AUTHORS

Written by

Angela Fan

Shruti Bhosale

Holger Schwenk

Zhiyi Ma

Ahmed El-Kishky

Siddharth Goyal

Mandeep Baines

Onur Celebi

Guillaume Wenzek

Vishrav Chaudhary

Naman Goyal

Tom Birch

Vitaliy Liptchinsky

Sergey Edunov

Edouard Grave

Michael Auli

Armand Joulin

Publisher

arXiv

Related Publications

August 01, 2019

NLP

Simple and Effective Curriculum Pointer-Generator Networks for Reading Comprehension over Long Narratives

Yi Tay, Shuohang Wang, Luu Anh Tuan, Jie Fu, Minh C. Phan, Xingdi Yuan, Jinfeng Rao, Siu Cheung Hui, Aston Zhang

July 29, 2019

NLP

Improved Zero-shot Neural Machine Translation via Ignoring Spurious Correlations

Jiatao Gu, Yong Wang, Kyunghyun Cho, Victor O.K. Li

June 11, 2019

NLP

COMPUTER VISION

Adversarial Inference for Multi-Sentence Video Description

Jae Sung Park, Marcus Rohrbach, Trevor Darrell, Anna Rohrbach

June 10, 2019

NLP

COMPUTER VISION

Mixture Models for Diverse Machine Translation: Tricks of the Trade

Tianxiao Shen, Myle Ott, Michael Auli, Marc'Aurelio Ranzato
