On the Convergence of Nesterov’s Accelerated Gradient Method in Stochastic Settings

August 14, 2020

Abstract

We study Nesterov’s accelerated gradient method with constant step-size and momentum parameters in the stochastic approximation setting (unbiased gradients with bounded variance) and the finite-sum setting (where randomness is due to sampling mini-batches). To build better insight into the behavior of Nesterov’s method in stochastic settings, we focus throughout on objectives that are smooth, strongly-convex, and twice continuously differentiable. In the stochastic approximation setting, Nesterov’s method converges to a neighborhood of the optimal point at the same accelerated rate as in the deterministic setting. Perhaps surprisingly, in the finite-sum setting, we prove that Nesterov’s method may diverge with the usual choice of step-size and momentum, unless additional conditions on the problem related to conditioning and data coherence are satisfied. Our results shed light on why Nesterov’s method may fail to converge or achieve acceleration in the finite-sum setting.
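For intuition, here is a minimal NumPy sketch (not code from the paper) of Nesterov’s method with the usual constant parameters for an L-smooth, μ-strongly-convex objective: step-size 1/L and momentum (√κ − 1)/(√κ + 1) with κ = L/μ. It is applied to a synthetic finite-sum least-squares problem with mini-batch gradients; the problem instance, dimensions, and variable names are all illustrative assumptions.

```python
import numpy as np

# Sketch of Nesterov's accelerated gradient method with constant parameters:
#   x_{k+1} = y_k - (1/L) * g(y_k)
#   y_{k+1} = x_{k+1} + beta * (x_{k+1} - x_k),   beta = (sqrt(kappa)-1)/(sqrt(kappa)+1)
# where g(.) is a mini-batch (stochastic) gradient. The finite-sum objective
# here is f(x) = (1/2n) * ||A x - b||^2 -- a hypothetical example, not the paper's.

rng = np.random.default_rng(0)
n, d, batch = 200, 20, 10
A = rng.standard_normal((n, d))
b = A @ rng.standard_normal(d) + 0.1 * rng.standard_normal(n)

# Smoothness and strong-convexity constants of f: extreme eigenvalues of A^T A / n.
eigs = np.linalg.eigvalsh(A.T @ A / n)
L, mu = eigs[-1], eigs[0]
kappa = L / mu

alpha = 1.0 / L                                      # usual constant step-size
beta = (np.sqrt(kappa) - 1) / (np.sqrt(kappa) + 1)   # usual constant momentum

x = y = np.zeros(d)
x_star = np.linalg.lstsq(A, b, rcond=None)[0]        # reference minimizer

for k in range(500):
    idx = rng.choice(n, size=batch, replace=False)   # sample a mini-batch
    g = A[idx].T @ (A[idx] @ y - b[idx]) / batch     # stochastic gradient at y_k
    x_next = y - alpha * g                           # gradient step
    y = x_next + beta * (x_next - x)                 # momentum extrapolation
    x = x_next
    if k % 100 == 0:
        print(f"iter {k:4d}  ||x - x*|| = {np.linalg.norm(x - x_star):.3e}")
```

With these choices the deterministic recursion contracts at the accelerated rate; under mini-batch sampling the iterates settle into a noise-dominated neighborhood of the minimizer. The abstract’s finite-sum result cautions that, absent extra conditions on conditioning and data coherence, these same parameter choices can instead lead to divergence.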

AUTHORS

Mido Assran

Michael Rabbat

Publisher

International Conference on Machine Learning (ICML)

Research Topics

Machine Learning

Related Publications

June 03, 2019

NLP

FAIRSEQ: A Fast, Extensible Toolkit for Sequence Modeling

FAIRSEQ is an open-source sequence modeling toolkit that allows researchers and developers to train custom models for translation, summarization, language modeling, and other text generation tasks. The toolkit is based on PyTorch and supports…

Myle Ott, Sergey Edunov, Alexei Baevski, Angela Fan, Sam Gross, Nathan Ng, David Grangier, Michael Auli

June 02, 2019

NLP

Cooperative Learning of Disjoint Syntax and Semantics

Considerable attention has been devoted to models that learn to jointly infer an expression’s syntactic structure and its semantics. Yet, Nangia and Bowman (2018) have recently shown that the current best systems fail to learn the correct…

Serhii Havrylov, Germán Kruszewski, Armand Joulin

June 15, 2019

COMPUTER VISION

FBNet: Hardware-Aware Efficient ConvNet Design via Differentiable Neural Architecture Search

Designing accurate and efficient ConvNets for mobile devices is challenging because the design space is combinatorially large. Due to this, previous neural architecture search (NAS) methods are computationally expensive. ConvNet architecture…

Bichen Wu, Xiaoliang Dai, Peizhao Zhang, Yanghan Wang, Fei Sun, Yiming Wu, Yuandong Tian, Peter Vajda, Yangqing Jia, Kurt Keutzer

April 28, 2019

COMPUTER VISION

Inverse Path Tracing for Joint Material and Lighting Estimation

Modern computer vision algorithms have brought significant advancement to 3D geometry reconstruction. However, illumination and material reconstruction remain less studied, with current approaches assuming very simplified models for materials…

Dejan Azinović, Tzu-Mao Li, Anton Kaplanyan, Matthias Nießner
