RESEARCH

Voice Separation with an Unknown Number of Multiple Speakers

October 01, 2020

Abstract

We present a new method for separating a mixed audio sequence in which multiple voices speak simultaneously. The method employs gated neural networks that are trained to separate the voices at multiple processing steps while keeping each output channel assigned to the same speaker. A separate model is trained for each possible number of speakers, and the model trained for the largest number of speakers is used to select the actual number of speakers in a given sample. Our method greatly outperforms the current state of the art, which, as we show, is not competitive for more than two speakers.
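The model-selection step described in the abstract can be illustrated with a minimal sketch. The abstract states only that the model trained for the largest number of speakers is used to select the actual speaker count; it does not say how. The sketch below assumes one plausible mechanism, counting output channels whose mean energy exceeds a threshold, and this may differ from the paper's procedure. All names here (`mean_energy`, `count_active_channels`, `separate_unknown_count`, the `threshold` parameter, and the `models` mapping from speaker count to a separator callable) are hypothetical and are not taken from the paper or its code.

```python
from typing import Callable, Dict, List, Sequence, Tuple

Signal = List[float]
# Hypothetical interface: a separator maps a mixture to one signal per speaker.
Separator = Callable[[Sequence[float]], List[Signal]]


def mean_energy(signal: Sequence[float]) -> float:
    """Return the mean squared amplitude, or 0.0 for an empty signal."""
    if not signal:
        return 0.0
    return sum(x * x for x in signal) / len(signal)


def count_active_channels(
    channels: Sequence[Sequence[float]], threshold: float = 1e-3
) -> int:
    """Count channels whose mean energy exceeds the (assumed) threshold."""
    return sum(1 for ch in channels if mean_energy(ch) > threshold)


def separate_unknown_count(
    mixture: Sequence[float],
    models: Dict[int, Separator],
    threshold: float = 1e-3,
) -> Tuple[int, List[Signal]]:
    """Estimate the speaker count with the largest model, then separate.

    Assumption: near-silent output channels of the largest model indicate
    that fewer speakers are present than that model was trained for.
    """
    largest = max(models)
    candidate = models[largest](mixture)
    n_active = count_active_channels(candidate, threshold)
    # Fall back to the closest available model if no model matches exactly.
    chosen = min(models, key=lambda k: abs(k - n_active))
    return chosen, models[chosen](mixture)
```

In this sketch, a three-speaker separator that returns one near-silent channel for a given mixture would cause the two-speaker separator to be selected for the final output. The threshold value is arbitrary and would need tuning on real audio.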


AUTHORS

Written by

Eliya Nachmani

Lior Wolf

Yossef (Yossi) Adi

Publisher

ICML
