NLP

Scaling Speech Technology to 1,000+ Languages

May 22, 2023

Abstract

Expanding the language coverage of speech technology has the potential to improve access to information for many more people. However, current speech technology is restricted to about one hundred languages which is a small fraction of the over 7,000 languages spoken around the world. The Massively Multilingual Speech (MMS) project increases the number of supported languages by 10-40x, depending on the task. The main ingredients are a new dataset based on readings of publicly available religious texts and effectively leveraging self-supervised learning. We built pre-trained wav2vec 2.0 models covering 1,406 languages, a single multilingual automatic speech recognition model for 1,107 languages, speech synthesis models for the same number of languages, as well as a language identification model for 4,017 languages. Experiments show that our multilingual speech recognition model more than halves the word error rate of Whisper on 54 languages of the FLEURS benchmark while being trained on a small fraction of the labeled data.

Download the Paper

AUTHORS

Written by

Vineel Pratap

Andros Tjandra

Bowen Shi

Paden Tomasello

Arun Babu

Sayani Kundu

Ali Elkahky

Apoorv Vyas

Maryam Fazel-Zarandi

Alexei Baevski

Wei-Ning Hsu

Alexis Conneau

Michael Auli

Publisher

NeurIPS

Related Publications

February 24, 2023

NLP

LLaMA: Open and Efficient Foundation Language Models

Faisal Azhar, Hugo Touvron, Armand Joulin, Aurelien Rodriguez, Baptiste Rozière, Eric Hambro, Gautier Izacard, Guillaume Lample, Marie-Anne Lachaux, Naman Goyal, Thibaut Lavril, Timothee Lacroix, Xavier Martinet, Edouard Grave

February 24, 2023

February 20, 2023

INTEGRITY

NLP

UNIREX: A Unified Learning Framework for Language Model Rationale Extraction

Maziar Sanjabi, Aaron Chan, Hamed Firooz, Lambert Mathias, Liang Tan, Shaoliang Nie, Xiaochang Peng, Xiang Ren

February 20, 2023

December 31, 2022

NLP

Textless Speech Emotion Conversion using Discrete & Decomposed Representations

Yossef Mordechay Adi, Abdelrahman Mohamed, Adam Polyak, Emmanuel Dupoux, Evgeny Kharitonov, Jade Copet, Morgane Rivière, Tu Anh Nguyen, Wei-Ning Hsu, Felix Kreuk

December 31, 2022

December 29, 2022

NLP

Staircase Attention for Recurrent Processing of Sequences

Dexter Ju, Jason Weston, Sainbayar Sukhbaatar, Stephen Roller

December 29, 2022

Help Us Pioneer The Future of AI

We share our open source frameworks, tools, libraries, and models for everything from research exploration to large-scale production deployment.