RESEARCH

SPEECH & AUDIO

Learning Dynamics Model in Reinforcement Learning by Incorporating the Long Term Future

May 6, 2019

Abstract

In model-based reinforcement learning, the agent interleaves between model learning and planning. These two components are inextricably intertwined. If the model is not able to provide sensible long-term prediction, the executed planner would exploit model flaws, which can yield catastrophic failures. This paper focuses on building a model that reasons about the long-term future and demonstrates how to use this for efficient planning and exploration. To this end, we build a latent-variable autoregressive model by leveraging recent ideas in variational inference. We argue that forcing latent variables to carry future information through an auxiliary task substantially improves long-term predictions. Moreover, by planning in the latent space, the planner’s solution is ensured to be within regions where the model is valid. An exploration strategy can be devised by searching for unlikely trajectories under the model. Our method achieves higher reward faster compared to baselines on a variety of tasks and environments in both the imitation learning and model-based reinforcement learning settings.

See our code on GitHub

Download the Paper

Related Publications

July 28, 2019

SPEECH & AUDIO

COMPUTER VISION

Learning to Optimize Halide with Tree Search and Random Programs | Facebook AI Research

We present a new algorithm to automatically schedule Halide programs for high-performance image processing and deep learning. We significantly improve upon the performance of previous methods, which considered a limited subset of schedules. We…

Andrew Adams, Karima Ma, Luke Anderson, Riyadh Baghdadi, Tzu-Mao Li, Michaël Gharbi, Benoit Steiner, Steven Johnson, Kayvon Fatahalian, Frédo Durand, Jonathan Ragan-Kelley

July 28, 2019

September 15, 2019

SPEECH & AUDIO

Who Needs Words? Lexicon-Free Speech Recognition | Facebook AI Research

Lexicon-free speech recognition naturally deals with the problem of out-of-vocabulary (OOV) words. In this paper, we show that character-based language models (LM) can perform as well as word-based LMs for speech recognition, in word error…

Tatiana Likhomanenko, Gabriel Synnaeve, Ronan Collobert

September 15, 2019

December 04, 2018

SPEECH & AUDIO

Non-Adversarial Mapping with VAEs | Facebook AI Research

The study of cross-domain mapping without supervision has recently attracted much attention. Much of the recent progress was enabled by the use of adversarial training as well as cycle constraints. The practical difficulty of adversarial…

Yedid Hoshen

December 04, 2018

May 01, 2019

SPEECH & AUDIO

Learning graphs from data: A signal representation perspective | Facebook AI Research

The construction of a meaningful graph topology plays a crucial role in the effective representation, processing, analysis and visualization of structured data. When a natural choice of the graph is not readily available from the datasets, it…

Xiaowen Dong, Dorina Thanou, Michael Rabbat, Pascal Frossard

May 01, 2019

Related Work

Help Us Pioneer The Future of AI

We share our open source frameworks, tools, libraries, and models for everything from research exploration to large-scale production deployment.