NLP

EXPRESSO: A Benchmark and Analysis of Discrete Expressive Speech Resynthesis

August 19, 2023

Abstract

Recent work has shown that it is possible to resynthesize high-quality speech based, not on text, but on low bitrate discrete units that have been learned in a self-supervised fashion and can therefore capture expressive aspects of speech that are hard to transcribe (prosody, voice styles, non-verbal vocalization). The adoption of these methods is still limited by the fact that most speech synthesis datasets are read, severely limiting spontaneity and expressivity. Here, we introduce \expresso, a high-quality expressive speech dataset for textless speech synthesis that includes both read speech and improvised dialogues rendered in 26 spontaneous expressive styles. We illustrate the challenges and potentials of this dataset with an \textit{expressive resynthesis benchmark} where the task is to encode the input in low-bitrate units and resynthesize it in a target voice while preserving content and style. We evaluate resynthesis quality with automatic metrics for different self-supervised discrete encoders, and explore tradeoffs between quality, bitrate and invariance to speaker and style. The dataset, evaluation metrics and baseline models will be open sourced.

Download the Paper

AUTHORS

Written by

Tu Anh Nguyen

Wei-Ning Hsu

Antony D'Avirro

Bowen Shi

Itai Gat

Maryam Fazel-Zarandi

Tal Remez

Jade Copet

Gabriel Synnaeve

Michael Hassid

Felix Kreuk

Yossef Mordechay Adi

Emmanuel Dupoux

Publisher

Interspeech

Related Publications

April 22, 2024

NLP

Text Quality-Based Pruning for Efficient Training of Language Models

Vasu Sharma *, Karthik Padthe *, Newsha Ardalani, Kushal Tirumala, Russ Howes, Hu Xu, Bernie Huang, Daniel Li (FAIR), Armen Aghajanyan, Gargi Ghosh, Luke Zettlemoyer

April 22, 2024

April 14, 2024

SPEECH & AUDIO

NLP

CoLLD: Contrastive Layer-to-Layer Distillation for Compressing Multilingual Pre-Trained Speech Encoders

Heng-Jui Chang, Ning Dong (AI), Ruslan Mavlyutov, Sravya Popuri, Andy Chung

April 14, 2024

April 05, 2024

CONVERSATIONAL AI

NLP

MART: Improving LLM Safety with Multi-round Automatic Red-Teaming

Suyu Ge, Chunting Zhou, Rui Hou, Madian Khabsa, Yi-Chia Wang, Qifan Wang, Jiawei Han, Yuning Mao

April 05, 2024

February 21, 2024

INTEGRITY

NLP

Watermarking Makes Language Models Radioactive

Tom Sander, Pierre Fernandez, Alain Durmus, Matthijs Douze, Teddy Furon

February 21, 2024

Help Us Pioneer The Future of AI

We share our open source frameworks, tools, libraries, and models for everything from research exploration to large-scale production deployment.