SPEECH & AUDIO

NLP

Real Time Speech Enhancement in the Waveform Domain

October 25, 2020

Abstract

We present a causal speech enhancement model working on the raw waveform that runs in real-time on a laptop CPU. The proposed model is based on an encoder-decoder architecture with skip-connections. It is optimized on both time and frequency domains, using multiple loss functions. Empirical evidence shows that it is capable of removing various kinds of background noise including stationary and non-stationary noises, as well as room reverb. Additionally, we suggest a set of data augmentation techniques applied directly on the raw waveform which further improve model performance and its generalization abilities. We perform evaluations on several standard benchmarks, both using objective metrics and human judgements. The proposed model matches state-of-the-art performance of both causal and non causal methods while working directly on the raw waveform.

Download the Paper

AUTHORS

Written by

Alexandre Defossez

Gabriel Synnaeve

Yossef (Yossi) Adi

Publisher

InterSpeech

Related Publications

April 14, 2024

SPEECH & AUDIO

NLP

CoLLD: Contrastive Layer-to-Layer Distillation for Compressing Multilingual Pre-Trained Speech Encoders

Heng-Jui Chang, Ning Dong (AI), Ruslan Mavlyutov, Sravya Popuri, Andy Chung

April 14, 2024

February 21, 2024

INTEGRITY

NLP

Watermarking Makes Language Models Radioactive

Tom Sander, Pierre Fernandez, Alain Durmus, Matthijs Douze, Teddy Furon

February 21, 2024

December 11, 2023

SPEECH & AUDIO

Audiobox: Unified Audio Generation with Natural Language Prompts

Wei-Ning Hsu, Akinniyi Akinyemi, Alice Rakotoarison, Andros Tjandra, Apoorv Vyas, Baishan Guo, Bapi Akula, Bowen Shi, Brian Ellis, Ivan Cruz, Jeff Wang, Jiemin Zhang, Mary Williamson, Matt Le, Rashel Moritz, Robbie Adkins, William Ngan, Xinyue Zhang, Yael Yungster, Yi-Chiao Wu

December 11, 2023

December 07, 2023

CONVERSATIONAL AI

NLP

Llama Guard: LLM-based Input-Output Safeguard for Human-AI Conversations

Hakan Inan, Kartikeya Upasani, Jianfeng Chi, Rashi Rungta, Krithika Iyer, Yuning Mao, Davide Testuggine, Madian Khabsa

December 07, 2023

Help Us Pioneer The Future of AI

We share our open source frameworks, tools, libraries, and models for everything from research exploration to large-scale production deployment.