Learning Invariant Representations for Reinforcement Learning without Reconstruction

May 4, 2021


We study how representation learning can accelerate reinforcement learning from rich observations, such as images, without relying on either domain knowledge or pixel reconstruction. Our goal is to learn representations that both support effective downstream control and remain invariant to task-irrelevant details. Bisimulation metrics quantify behavioral similarity between states in continuous MDPs; we propose using them to learn robust latent representations that encode only the task-relevant information in observations. Our method trains encoders such that distances in latent space equal bisimulation distances in state space. We demonstrate that our method disregards task-irrelevant information on modified visual MuJoCo tasks, where the background is replaced with moving distractors and natural videos, while achieving state-of-the-art performance. We also test on a first-person highway driving task, where our method learns invariance to clouds, weather, and time of day. Finally, we provide generalization results that draw on properties of bisimulation metrics, and links to causal inference.
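The training objective described above, where L1 distances between encoded states are regressed onto a bisimulation target of reward differences plus a discounted Wasserstein distance between latent transition distributions, can be sketched in PyTorch as follows. This is a minimal illustration, not the authors' released code: the function name, tensor shapes, and the use of diagonal Gaussians for the latent dynamics (which admit a closed-form 2-Wasserstein distance) are assumptions.

```python
import torch
import torch.nn.functional as F

def bisimulation_loss(z, reward, mean_next, std_next, gamma=0.99):
    """Sketch of a bisimulation-metric encoder loss (hypothetical API).

    z:          (B, D) latent states phi(s) from the encoder
    reward:     (B,)   rewards for each sampled state
    mean_next:  (B, D) mean of the Gaussian latent dynamics model
    std_next:   (B, D) std of the Gaussian latent dynamics model
    """
    # Form pairs (i, j) by permuting the batch.
    perm = torch.randperm(z.size(0))
    z_j, r_j = z[perm], reward[perm]
    mu_j, sigma_j = mean_next[perm], std_next[perm]

    # Bisimulation target: |r_i - r_j| + gamma * W2(P_i, P_j).
    # For diagonal Gaussians, W2^2 = ||mu_i - mu_j||^2 + ||sigma_i - sigma_j||^2.
    w2 = torch.sqrt(((mean_next - mu_j) ** 2).sum(-1)
                    + ((std_next - sigma_j) ** 2).sum(-1) + 1e-8)
    target = (reward - r_j).abs() + gamma * w2

    # Train the encoder so the L1 latent distance matches the target.
    dist = (z - z_j).abs().sum(-1)
    return F.mse_loss(dist, target.detach())
```

In practice this loss would be minimized jointly with a reward model, a latent dynamics model, and the RL objective; the `detach()` keeps the bisimulation target fixed while the encoder is updated.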



Written by

Amy Zhang

Rowan McAllister

Roberto Calandra

Yarin Gal

Sergey Levine


ICLR 2021

Research Topics

Reinforcement Learning

Core Machine Learning

Related Publications

December 07, 2020


Joint Policy Search for Collaborative Multi-agent Imperfect Information Games

Learning good joint policies for multi-agent collaboration with imperfect information remains a fundamental challenge. While for two-player zero-sum games, coordinate-ascent approaches…

Stéphane d’Ascoli, Levent Sagun, Giulio Biroli


December 18, 2020



Reinforcement Learning-based Product Delivery Frequency Control

Frequency control is an important problem in modern recommender systems. It dictates the delivery frequency of recommendations to maintain product quality and efficiency…

Yang Liu, Zhengxing Chen, Kittipat Virochsiri, Juan Wang, Jiahao Wu, Feng Liang


December 05, 2020


An Asymptotically Optimal Primal-Dual Incremental Algorithm for Contextual Linear Bandits

In the contextual linear bandit setting, algorithms built on the optimism principle fail to exploit the structure of the problem and have been shown to be asymptotically suboptimal. …

Andrea Tirinzoni, Matteo Pirotta, Marcello Restelli, Alessandro Lazaric


October 10, 2020



Active MR k-space Sampling with Reinforcement Learning

Deep learning approaches have recently shown great promise in accelerating magnetic resonance imaging (MRI) acquisition. The majority of existing work has focused on designing better reconstruction models…

Luis Pineda, Sumana Basu, Adriana Romero, Roberto Calandra, Michal Drozdzal

