RESEARCH AREA

Reinforcement Learning

Reinforcement Learning (RL) researchers at Facebook develop AI agents that can learn to solve tasks in an unknown environment by interacting with it over time. RL agents can enable significant improvements in a broad range of applications, from personal assistants that naturally interact with people and adapt to their needs, to autonomous robots that can readily adjust to changing environments.

At Facebook, our research spans several aspects of RL, including sample efficiency of deep RL algorithms, theoretical aspects of RL algorithms, RL algorithms integrating inputs from multiple sources (e.g., language), RL agents integrating real-world constraints (e.g., fairness, privacy, and security), RL agents for human interaction, multi-agent RL, and self-supervised RL.

Latest Publications

March 13, 2021

REINFORCEMENT LEARNING

On the Importance of Hyperparameter Optimization for Model-based Reinforcement Learning

Baohe Zhang, Raghu Rajan, Luis Pineda, Nathan Lambert, Andre Biedenkapp, Kurtland Chua, Frank Hutter, Roberto Calandra

December 05, 2020

REINFORCEMENT LEARNING

An Asymptotically Optimal Primal-Dual Incremental Algorithm for Contextual Linear Bandits

In the contextual linear bandit setting, algorithms built on the optimism principle fail to exploit the structure of the problem and have been shown to be asymptotically suboptimal. …

Andrea Tirinzoni, Matteo Pirotta, Marcello Restelli, Alessandro Lazaric

December 07, 2020

REINFORCEMENT LEARNING

Joint Policy Search for Collaborative Multi-agent Imperfect Information Games

Learning good joint policies for multi-agent collaboration with imperfect information remains a fundamental challenge. While for two-player zero-sum games, coordinate-ascent approaches (optimizing one agent's policy at a time, e.g., self-play) work with guarantees, in multi-agent cooperative settings they often converge to a sub-optimal Nash equilibrium.

Yuandong Tian, Qucheng Gong, Tina Jiang
