ML APPLICATIONS

The Differentiable Cross-Entropy Method

July 12, 2020

Abstract

We study the Cross-Entropy Method (CEM) for the non-convex optimization of a continuous, parameterized objective function and introduce a differentiable variant that lets us differentiate the output of CEM with respect to the objective function’s parameters. In the machine learning setting, this brings CEM inside the end-to-end learning pipeline, which has otherwise been impossible. We show applications in a synthetic energy-based structured prediction task and in non-convex continuous control. In the control setting, we show how to embed optimal action sequences into a lower-dimensional space. This enables us to use policy optimization to fine-tune modeling components by differentiating through the CEM-based controller.
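For readers unfamiliar with CEM, the following is a minimal sketch of the standard (non-differentiable) method applied to a parameterized objective. It is an illustration, not the paper's implementation: the objective `objective`, the function name `cem`, the scalar search space, and all hyperparameter values are assumptions chosen for this example.

```python
import math
import random


def objective(x, theta):
    """Hypothetical non-convex objective f_theta(x) (an assumption for this sketch).

    The global minimum is at x == theta; the sine term adds local minima.
    """
    u = x - theta
    return u ** 2 + 0.5 * math.sin(5.0 * u) ** 2


def cem(f, theta, n_iter=30, n_samples=200, n_elite=20, mu=0.0, sigma=3.0, seed=0):
    """Vanilla CEM: approximately minimise f(x, theta) over a scalar x."""
    rng = random.Random(seed)
    for _ in range(n_iter):
        # 1. Sample candidates from the current Gaussian sampling distribution.
        samples = [rng.gauss(mu, sigma) for _ in range(n_samples)]
        # 2. Hard top-k: keep the n_elite candidates with the lowest objective value.
        elite = sorted(samples, key=lambda x: f(x, theta))[:n_elite]
        # 3. Refit the Gaussian to the elite set.
        mu = sum(elite) / n_elite
        var = sum((x - mu) ** 2 for x in elite) / n_elite
        sigma = math.sqrt(var) + 1e-6  # small floor keeps sigma strictly positive
    return mu


if __name__ == "__main__":
    for theta in (-2.0, 1.5):
        print(theta, cem(objective, theta))
```

In this sketch the hard top-k selection in step 2 makes the returned value piecewise constant in `theta` for a fixed seed, so no useful gradient with respect to `theta` flows through it. That is the kind of obstacle a differentiable variant has to remove.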

Publisher

International Conference on Machine Learning (ICML)

Related Publications

May 06, 2019

HUMAN & MACHINE INTELLIGENCE

Hierarchical RL Using an Ensemble of Proprioceptive Periodic Policies

In this work we introduce a simple, robust approach to hierarchically training an agent in the setting of sparse reward tasks. The agent is split into a low-level and a high-level policy. The low-level policy only accesses internal,…

Kenneth Marino, Abhinav Gupta, Rob Fergus, Arthur Szlam

April 24, 2017

HUMAN & MACHINE INTELLIGENCE

COMPUTER VISION

Episodic Exploration for Deep Deterministic Policies for StarCraft Micro-Management

We consider scenarios from the real-time strategy game StarCraft as benchmarks for reinforcement learning algorithms. We focus on micromanagement, that is, the short-term, low-level control of team members during a battle. We propose several…

Nicolas Usunier, Gabriel Synnaeve, Zeming Lin, Soumith Chintala

December 03, 2018

HUMAN & MACHINE INTELLIGENCE

SPEECH & AUDIO

Forward Modeling for Partial Observation Strategy Games

We formulate the problem of defogging as state estimation and future state prediction from previous, partial observations in the context of real-time strategy games. We propose to employ encoder-decoder neural networks for this task, and…

Gabriel Synnaeve, Zeming Lin, Jonas Gehring, Dan Gant, Vegard Mella, Vasil Khalidov, Nicolas Carion, Nicolas Usunier

July 09, 2018

HUMAN & MACHINE INTELLIGENCE

Continuous Reasoning: Scaling the Impact of Formal Methods

This paper describes work in continuous reasoning, where formal reasoning about a (changing) codebase is done in a fashion which mirrors the iterative, continuous model of software development that is increasingly practiced in industry. We…

Peter O'Hearn
