CORE MACHINE LEARNING

Learning-Rate-Free Learning by D-Adaptation

June 13, 2023

Abstract

D-Adaptation is an approach to automatically setting the learning rate which asymptotically achieves the optimal rate of convergence for minimizing convex Lipschitz functions, with no back-tracking or line searches, and no additional function value or gradient evaluations per step. Our approach is the first hyper-parameter free method for this class without additional multiplicative log factors in the convergence rate. We present extensive experiments for SGD and Adam variants of our method, where the method automatically matches hand-tuned learning rates across more than a dozen diverse machine learning problems, including large-scale vision and language problems. An open-source implementation is available.

Download the Paper

AUTHORS

Written by

Aaron Defazio

Konstantin Mishchenko

Publisher

ICML

Research Topics

Core Machine Learning

Related Publications

February 15, 2024

RANKING AND RECOMMENDATIONS

CORE MACHINE LEARNING

TASER: Temporal Adaptive Sampling for Fast and Accurate Dynamic Graph Representation Learning

Danny Deng, Hongkuan Zhou, Hanqing Zeng, Yinglong Xia, Chris Leung (AI), Jianbo Li, Rajgopal Kannan, Viktor Prasanna

February 15, 2024

February 15, 2024

CORE MACHINE LEARNING

Revisiting Feature Prediction for Learning Visual Representations from Video

Adrien Bardes, Quentin Garrido, Xinlei Chen, Michael Rabbat, Yann LeCun, Mido Assran, Nicolas Ballas, Jean Ponce

February 15, 2024

January 09, 2024

CORE MACHINE LEARNING

Accelerating a Triton Fused Kernel for W4A16 Quantized Inference with SplitK Work Decomposition

Less Wright, Adnan Hoque

January 09, 2024

January 06, 2024

RANKING AND RECOMMENDATIONS

REINFORCEMENT LEARNING

Learning to bid and rank together in recommendation systems

Geng Ji, Wentao Jiang, Jiang Li, Fahmid Morshed Fahid, Zhengxing Chen, Yinghua Li, Jun Xiao, Chongxi Bao, Zheqing (Bill) Zhu

January 06, 2024

Help Us Pioneer The Future of AI

We share our open source frameworks, tools, libraries, and models for everything from research exploration to large-scale production deployment.