RESEARCH

SPEECH & AUDIO

Pay Less Attention with Lightweight and Dynamic Convolutions

April 19, 2019

Abstract

Self-attention is a useful mechanism to build generative models for language and images. It determines the importance of context elements by comparing each element to the current time step. In this paper, we show that a very lightweight convolution can perform competitively with the best reported self-attention results. Next, we introduce dynamic convolutions, which are simpler and more efficient than self-attention. We predict separate convolution kernels based solely on the current time step in order to determine the importance of context elements. The number of operations required by this approach scales linearly in the input length, whereas self-attention is quadratic. Experiments on large-scale machine translation, language modeling and abstractive summarization show that dynamic convolutions improve over strong self-attention models. On the WMT'14 English-German test set, dynamic convolutions achieve a new state of the art of 29.7 BLEU.
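To make the mechanism in the abstract concrete, the following is a minimal PyTorch-style sketch of a dynamic convolution layer: a small kernel is predicted from the current time step alone, softmax-normalized over its width, and applied as a per-head (depthwise-style) convolution over the local context, so cost grows linearly with sequence length. This is an illustrative assumption-laden sketch, not the authors' fairseq implementation; the class name, parameters, and causal padding choice are hypothetical.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class DynamicConv1d(nn.Module):
    """Sketch of a dynamic convolution (hypothetical, simplified)."""

    def __init__(self, channels: int, kernel_size: int = 3, heads: int = 4):
        super().__init__()
        assert channels % heads == 0
        self.kernel_size = kernel_size
        self.heads = heads
        # Predicts one kernel per head from the current time step only.
        self.kernel_proj = nn.Linear(channels, heads * kernel_size)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, time, channels)
        B, T, C = x.shape
        H, K = self.heads, self.kernel_size

        # Per-position kernels, normalized over the kernel width.
        kernels = F.softmax(self.kernel_proj(x).view(B, T, H, K), dim=-1)

        # Left-pad so each output sees only current and past inputs (causal).
        x_pad = F.pad(x, (0, 0, K - 1, 0))            # (B, T+K-1, C)

        # Collect the K context frames for every position and split into heads.
        context = x_pad.unfold(1, K, 1)               # (B, T, C, K)
        context = context.reshape(B, T, H, C // H, K)

        # Weighted sum of the context with the predicted per-head kernels.
        out = torch.einsum("bthdk,bthk->bthd", context, kernels)
        return out.reshape(B, T, C)


# Toy usage: 2 sequences, 10 steps, 16 channels.
if __name__ == "__main__":
    layer = DynamicConv1d(channels=16, kernel_size=3, heads=4)
    y = layer(torch.randn(2, 10, 16))
    print(y.shape)  # torch.Size([2, 10, 16])
```

A lightweight convolution follows the same pattern, except the softmax-normalized kernel is a fixed learned parameter shared across all positions rather than predicted from each time step.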


Related Publications

December 15, 2021

RESEARCH

Sample-and-threshold differential privacy: Histograms and applications

Akash Bharadwaj, Graham Cormode


August 30, 2021

SPEECH & AUDIO

NLP

A Two-stage Approach to Speech Bandwidth Extension

Yun Wang, Christian Fuegen, Didi Zhang, Gil Keren, Kaustubh Kalgaonkar, Ju Lin


January 09, 2021

RESEARCH

COMPUTER VISION

Tarsier: Evolving Noise Injection in Super-Resolution GANs

Baptiste Rozière, Camille Couprie, Olivier Teytaud, Andry Rasoanaivo, Hanhe Lin, Nathanaël Carraz Rakotonirina, Vlad Hosu


January 09, 2021

RESEARCH

Improved Sample Complexity for Incremental Autonomous Exploration in MDPs

Jean Tarbouriech, Alessandro Lazaric, Matteo Pirotta, Michal Valko
