SPEECH & AUDIO

A Universal Music Translation Network

May 5, 2019

Abstract

We present a method for translating music across musical instruments and styles. This method is based on unsupervised training of a multi-domain WaveNet autoencoder, with a shared encoder and a domain-independent latent space that is trained end-to-end on waveforms. By employing a diverse training dataset and large network capacity, the single encoder also allows us to translate from musical domains that were not seen during training. We evaluate our method on a dataset collected from professional musicians, and achieve convincing translations. We also study the properties of the obtained translation and demonstrate translating even from a whistle, potentially enabling the creation of instrumental music by untrained humans.
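To make the shared-encoder idea concrete, here is a minimal toy sketch of the data flow only. It is not the paper's model: the WaveNet layers and the training procedure are omitted, and the names `SharedEncoder`, `DomainDecoder`, and `translate`, the frame-averaging encoder, and the per-domain gain are all hypothetical stand-ins. The sketch assumes one decoder per output domain, which the abstract does not spell out.

```python
# Toy illustration of "one shared encoder, several domain decoders".
# All classes here are hypothetical stand-ins, not the paper's networks.

class SharedEncoder:
    """Maps a waveform from ANY domain into a common latent sequence.

    Assumption: the "encoder" is just frame averaging, chosen only so the
    example runs without dependencies.
    """

    def __init__(self, hop=4):
        self.hop = hop

    def encode(self, waveform):
        hop = self.hop
        # One latent value per non-overlapping frame of `hop` samples.
        return [
            sum(waveform[i:i + hop]) / hop
            for i in range(0, len(waveform) - hop + 1, hop)
        ]


class DomainDecoder:
    """Maps the shared latent sequence back to a waveform in ONE domain.

    Assumption: the domain-specific behaviour is reduced to a single gain.
    """

    def __init__(self, gain, hop=4):
        self.gain = gain
        self.hop = hop

    def decode(self, latent):
        # Upsample each latent value back to `hop` samples.
        return [self.gain * z for z in latent for _ in range(self.hop)]


def translate(waveform, encoder, decoders, target_domain):
    """Encode with the shared encoder, decode with the target domain's decoder."""
    latent = encoder.encode(waveform)
    return decoders[target_domain].decode(latent)


encoder = SharedEncoder(hop=4)
decoders = {
    "piano": DomainDecoder(gain=1.0),
    "violin": DomainDecoder(gain=0.5),
}
# The source domain needs no decoder of its own, so the input may come
# from a domain that has none (for example, a whistle).
whistle = [float(i % 4) for i in range(16)]
as_piano = translate(whistle, encoder, decoders, "piano")
as_violin = translate(whistle, encoder, decoders, "violin")
```

Because translation only requires the shared encoder on the input side, nothing in `translate` depends on the source domain. This is the structural property that the abstract credits for translating from domains not seen during training.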

Related Publications

September 15, 2019

COMPUTER VISION

NLP

Sequence-to-Sequence Speech Recognition with Time-Depth Separable Convolutions

We propose a fully convolutional sequence-to-sequence encoder architecture with a simple and efficient decoder. Our model improves WER on LibriSpeech while being an order of magnitude more efficient than a strong RNN baseline. Key to our…

Awni Hannun, Ann Lee, Qiantong Xu, Ronan Collobert

September 15, 2019

SPEECH & AUDIO

Who Needs Words? Lexicon-Free Speech Recognition

Lexicon-free speech recognition naturally deals with the problem of out-of-vocabulary (OOV) words. In this paper, we show that character-based language models (LM) can perform as well as word-based LMs for speech recognition, in word error…

Tatiana Likhomanenko, Gabriel Synnaeve, Ronan Collobert

June 14, 2019

SPEECH & AUDIO

COMPUTER VISION

2.5D Visual Sound

Binaural audio provides a listener with 3D sound sensation, allowing a rich perceptual experience of the scene. However, binaural recordings are scarcely available and require nontrivial expertise and equipment to obtain. We propose to convert…

Ruohan Gao, Kristen Grauman

June 13, 2019

COMPUTER VISION

SPEECH & AUDIO

ChamNet: Towards Efficient Network Design through Platform-Aware Model Adaptation

This paper proposes an efficient neural network (NN) architecture design methodology called Chameleon that honors given resource constraints. Instead of developing new building blocks or using computationally-intensive reinforcement learning…

Xiaoliang Dai, Peizhao Zhang, Bichen Wu, Hongxu Yin, Fei Sun, Yanghan Wang, Marat Dukhan, Yunqing Hu, Yiming Wu, Yangqing Jia, Peter Vajda, Matt Uyttendaele, Niraj K. Jha
