Reinforcement Learning

Prioritized Level Replay

July 17, 2021

Abstract

Environments with procedurally generated content serve as important benchmarks for testing systematic generalization in deep reinforcement learning. In this setting, each level is an algorithmically created environment instance with a unique configuration of its factors of variation. Training on a prespecified subset of levels allows for testing generalization to unseen levels. What can be learned from a level depends on the current policy, yet prior work defaults to uniform sampling of training levels independently of the policy. We introduce Prioritized Level Replay (PLR), a general framework for selectively sampling the next training level by prioritizing those with higher estimated learning potential when revisited in the future. We show TD-errors effectively estimate a level’s future learning potential and, when used to guide the sampling procedure, induce an emergent curriculum of increasingly difficult levels. By adapting the sampling of training levels, PLR significantly improves sample efficiency and generalization on Procgen Benchmark—matching the previous state-of-the-art in test return—and readily combines with other methods. Combined with the previous leading method, PLR raises the state-of-the-art to over 76% improvement in test return relative to standard RL baselines.
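The sampling idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. The class name `LevelSampler`, the rank-based weighting, the `temperature` parameter, and the staleness mixing term are all assumptions introduced here for illustration. The abstract itself states only that TD-errors are used to estimate a level's learning potential and to guide sampling.

```python
import random


class LevelSampler:
    """Toy level sampler (hypothetical): favors levels with large recent TD-errors."""

    def __init__(self, level_ids, temperature=0.1, staleness_coef=0.1, seed=0):
        self.level_ids = list(level_ids)
        self.temperature = temperature        # assumption: sharpens rank-based weights
        self.staleness_coef = staleness_coef  # assumption: weight on stale levels
        self.scores = {lvl: 0.0 for lvl in self.level_ids}
        self.last_seen = {lvl: 0 for lvl in self.level_ids}
        self.step = 0
        self.rng = random.Random(seed)

    def update(self, level_id, td_errors):
        """Score a level by the mean absolute TD-error of its latest episode."""
        if not td_errors:
            return
        self.scores[level_id] = sum(abs(e) for e in td_errors) / len(td_errors)
        self.step += 1
        self.last_seen[level_id] = self.step

    def probabilities(self):
        """Mix a rank-based score distribution with a staleness distribution."""
        ranked = sorted(self.level_ids, key=lambda l: self.scores[l], reverse=True)
        weights = {
            lvl: (1.0 / (rank + 1)) ** (1.0 / self.temperature)
            for rank, lvl in enumerate(ranked)
        }
        total_w = sum(weights.values())
        stale = {lvl: self.step - self.last_seen[lvl] for lvl in self.level_ids}
        total_s = sum(stale.values())
        uniform = 1.0 / len(self.level_ids)
        probs = {}
        for lvl in self.level_ids:
            p_score = weights[lvl] / total_w
            p_stale = stale[lvl] / total_s if total_s > 0 else uniform
            probs[lvl] = (1 - self.staleness_coef) * p_score + self.staleness_coef * p_stale
        return probs

    def sample(self):
        """Draw the next training level according to the mixed distribution."""
        probs = self.probabilities()
        return self.rng.choices(
            self.level_ids, weights=[probs[l] for l in self.level_ids], k=1
        )[0]


sampler = LevelSampler(level_ids=[0, 1, 2])
sampler.update(0, [0.1, -0.1])   # small TD-errors: little left to learn
sampler.update(1, [2.0, -3.0])   # large TD-errors: high estimated learning potential
sampler.update(2, [0.5])
probs = sampler.probabilities()
next_level = sampler.sample()
```

In this sketch, level 1 receives the highest sampling probability because its TD-errors are largest, while the staleness term keeps levels that have not been visited recently from being starved.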

Publisher

ICML 2021

Research Topics

Reinforcement Learning

Related Publications

December 05, 2020

Robotics

Reinforcement Learning

Neural Dynamic Policies for End-to-End Sensorimotor Learning

Deepak Pathak, Abhinav Gupta, Mustafa Mukadam, Shikhar Bahl

December 07, 2020

Reinforcement Learning

Joint Policy Search for Collaborative Multi-agent Imperfect Information Games

Yuandong Tian, Qucheng Gong, Tina Jiang

March 13, 2021

Reinforcement Learning

On the Importance of Hyperparameter Optimization for Model-based Reinforcement Learning

Baohe Zhang, Raghu Rajan, Luis Pineda, Nathan Lambert, Andre Biedenkapp, Kurtland Chua, Frank Hutter, Roberto Calandra

October 10, 2020

Computer Vision

Reinforcement Learning

Active MR k-space Sampling with Reinforcement Learning

Luis Pineda, Sumana Basu, Adriana Romero, Roberto Calandra, Michal Drozdzal

December 05, 2020

Reinforcement Learning

An Asymptotically Optimal Primal-Dual Incremental Algorithm for Contextual Linear Bandits

Andrea Tirinzoni, Matteo Pirotta, Marcello Restelli, Alessandro Lazaric

