April 25, 2020
In this paper we consider self-supervised representation learning to improve sample efficiency in reinforcement learning (RL). We propose a forward prediction objective for simultaneously learning embeddings of states and action sequences. These embeddings capture the structure of the environment’s dynamics, enabling efficient policy learning. We demonstrate that our action embeddings alone improve the sample efficiency and peak performance of model-free RL on control from low-dimensional states. By combining state and action embeddings, we achieve efficient learning of high-quality policies on goal-conditioned continuous control from pixel observations in only 1-2 million environment steps.
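To make the abstract's objective concrete: it describes jointly training a state encoder, an action-sequence encoder, and a forward model that predicts the future state embedding from the current state embedding and the action-sequence embedding. The sketch below is a rough, hypothetical illustration of one way such a forward-prediction loss could be set up; all module names, network sizes, and the stop-gradient on the target are illustrative assumptions, not the paper's actual architecture.

```python
# Minimal sketch of a forward-prediction objective for joint state/action-sequence
# embeddings. Hypothetical architecture; the paper's actual design may differ.
import torch
import torch.nn as nn

class StateEncoder(nn.Module):
    def __init__(self, state_dim, embed_dim):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(state_dim, 128), nn.ReLU(),
                                 nn.Linear(128, embed_dim))
    def forward(self, s):
        return self.net(s)

class ActionSequenceEncoder(nn.Module):
    """Embeds a length-k sequence of actions into a single vector."""
    def __init__(self, action_dim, k, embed_dim):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(action_dim * k, 128), nn.ReLU(),
                                 nn.Linear(128, embed_dim))
    def forward(self, a_seq):                     # a_seq: (batch, k, action_dim)
        return self.net(a_seq.flatten(1))

class ForwardModel(nn.Module):
    """Predicts the embedding of the state k steps ahead."""
    def __init__(self, embed_dim):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(2 * embed_dim, 128), nn.ReLU(),
                                 nn.Linear(128, embed_dim))
    def forward(self, z_s, z_a):
        return self.net(torch.cat([z_s, z_a], dim=-1))

state_dim, action_dim, k, embed_dim = 17, 6, 4, 32   # illustrative sizes
enc_s = StateEncoder(state_dim, embed_dim)
enc_a = ActionSequenceEncoder(action_dim, k, embed_dim)
fwd   = ForwardModel(embed_dim)
opt = torch.optim.Adam([*enc_s.parameters(), *enc_a.parameters(),
                        *fwd.parameters()], lr=3e-4)

# One gradient step on a dummy batch of (s_t, a_{t:t+k}, s_{t+k}) tuples.
s_t      = torch.randn(64, state_dim)
a_seq    = torch.randn(64, k, action_dim)
s_future = torch.randn(64, state_dim)

pred = fwd(enc_s(s_t), enc_a(a_seq))
loss = (pred - enc_s(s_future).detach()).pow(2).mean()  # stop-grad on the target
opt.zero_grad(); loss.backward(); opt.step()
```

Once trained, the action-sequence embeddings can serve as a compact action space for a model-free policy, which is the mechanism the abstract credits for the sample-efficiency gains.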
Publisher: International Conference on Learning Representations (ICLR)