A Universal Music Translation Network

May 5, 2019

Abstract

We present a method for translating music across musical instruments and styles. This method is based on unsupervised training of a multi-domain WaveNet autoencoder, with a shared encoder and a domain-independent latent space that is trained end-to-end on waveforms. Because it is trained on a diverse dataset with large network capacity, the single encoder also allows us to translate from musical domains that were not seen during training. We evaluate our method on a dataset collected from professional musicians, and achieve convincing translations. We also study the properties of the obtained translation and demonstrate translating even from a whistle, potentially enabling the creation of instrumental music by untrained humans.
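The structure described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the WaveNet components are replaced by toy averaging and scaling operations, and every class name, the `frame` parameter, and the per-domain `gain` values are hypothetical. The sketch also assumes one decoder per output domain, which the abstract implies but does not state. The only point it shows is that a single encoder maps any input to a shared latent sequence, and the choice of decoder determines the output domain.

```python
class SharedEncoder:
    """Toy stand-in for the shared encoder (assumption: NOT a WaveNet).

    Maps a waveform to a shorter latent sequence by averaging frames.
    It takes no domain label, so any input (even an unseen one) can be encoded.
    """

    def __init__(self, frame=4):
        self.frame = frame

    def encode(self, waveform):
        f = self.frame
        return [sum(waveform[i:i + f]) / f
                for i in range(0, len(waveform) - f + 1, f)]


class DomainDecoder:
    """Toy stand-in for a per-domain decoder (hypothetical).

    Upsamples the latent sequence back to waveform length and applies a
    domain-specific gain in place of a learned, domain-specific model.
    """

    def __init__(self, gain, frame=4):
        self.gain = gain
        self.frame = frame

    def decode(self, latent):
        return [self.gain * z for z in latent for _ in range(self.frame)]


class MultiDomainAutoencoder:
    """One shared encoder, one decoder per output domain."""

    def __init__(self, domain_gains):
        self.encoder = SharedEncoder()
        self.decoders = {name: DomainDecoder(gain)
                         for name, gain in domain_gains.items()}

    def translate(self, waveform, target_domain):
        latent = self.encoder.encode(waveform)   # domain-independent code
        return self.decoders[target_domain].decode(latent)


# Usage: "translate" a whistle-like input into two hypothetical domains.
model = MultiDomainAutoencoder({"piano": 0.5, "violin": 2.0})
whistle = [1.0] * 8 + [3.0] * 8
as_piano = model.translate(whistle, "piano")
as_violin = model.translate(whistle, "violin")
```

In this sketch the input domain never appears in `translate`, which mirrors the abstract's claim that the single encoder can handle domains not seen during training; only the target decoder is selected by name.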
