RESEARCH AREA

Theory

Artificial intelligence has enjoyed immense practical success in recent years, largely due to advances in machine learning, especially deep learning via optimization. A rich mathematical theory that explains these empirical results can help drive further advances, with engineering practice in turn feeding back into the theory.

The latest results connect with celebrated techniques in learning theory, optimization, signal processing, and statistics. The interplay between rigorous theory and engineering advances pushes forward the frontiers of AI.

Latest Publications

May 08, 2019

THEORY

Fluctuation-dissipation relations for stochastic gradient descent

Here, we derive stationary fluctuation-dissipation relations that link measurable quantities and hyperparameters in the stochastic gradient descent algorithm.

Sho Yaida
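
For plain SGD without momentum, one such relation follows directly from stationarity: expanding the update θ ← θ − ηg and requiring that ⟨θ·θ⟩ stop drifting gives ⟨θ·g⟩ = (η/2)⟨‖g‖²⟩, which links two measurable time averages to the learning rate η. The sketch below checks this numerically; the quadratic loss, the Gaussian noise model standing in for minibatch sampling, and all constants are illustrative, not taken from the paper.

```python
# Minimal sketch (not from the paper): for plain SGD without momentum,
# stationarity of <theta . theta> implies  <theta . g> = (eta / 2) <||g||^2>,
# where g is the stochastic gradient. We check this numerically on a noisy
# quadratic loss; the noise model and constants are illustrative.
import numpy as np

rng = np.random.default_rng(0)
dim, eta, noise, steps, burn_in = 10, 0.05, 0.5, 200_000, 20_000

theta = rng.normal(size=dim)
lhs_samples, rhs_samples = [], []

for t in range(steps):
    # Stochastic gradient of L(theta) = 0.5 ||theta||^2, with additive
    # Gaussian noise standing in for minibatch sampling.
    g = theta + noise * rng.normal(size=dim)
    if t >= burn_in:                       # average only over the stationary phase
        lhs_samples.append(theta @ g)      # estimates <theta . grad L>
        rhs_samples.append(g @ g)          # estimates <||g||^2>
    theta = theta - eta * g                # SGD update

print("<theta . g>       =", np.mean(lhs_samples))
print("(eta/2) <||g||^2> =", 0.5 * eta * np.mean(rhs_samples))
```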

June 09, 2019

THEORY

Manifold mixup: better representations by interpolating hidden states

Deep neural networks excel at learning the training data, but often provide incorrect and confident predictions when evaluated on slightly different test examples. This includes distribution shifts, outliers, and adversarial examples. To address these issues, we propose Manifold Mixup, a simple regularizer that encourages neural networks to predict less confidently on interpolations of hidden representations.

Vikas Verma, Alex Lamb, Christopher Beckham, Amir Najafi, Ioannis Mitliagkas, David Lopez-Paz, Yoshua Bengio
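
Concretely, Manifold Mixup trains on convex combinations of the hidden states of random pairs of examples, and of their labels, taken at a randomly chosen layer. A minimal numpy sketch of one such mixed training step is below; the two-layer network, the choice of layer, and the value of alpha are illustrative, not the paper's experimental setup.

```python
# Minimal sketch of the Manifold Mixup idea on a toy two-layer classifier;
# the network, layer choice, and alpha are illustrative.
import numpy as np

rng = np.random.default_rng(0)
batch, d_in, d_hid, n_cls, alpha = 32, 20, 64, 5, 2.0

# Toy parameters and a toy batch with one-hot labels.
W1 = rng.normal(size=(d_in, d_hid)) * 0.1
W2 = rng.normal(size=(d_hid, n_cls)) * 0.1
x = rng.normal(size=(batch, d_in))
y = np.eye(n_cls)[rng.integers(n_cls, size=batch)]

# 1) Forward pass up to a randomly chosen hidden layer (here: the only one).
h = np.maximum(x @ W1, 0.0)                     # ReLU hidden states

# 2) Interpolate hidden states and labels of random pairs of examples
#    (the batch and a shuffled copy) with lam ~ Beta(alpha, alpha).
lam = rng.beta(alpha, alpha)
perm = rng.permutation(batch)
h_mix = lam * h + (1.0 - lam) * h[perm]
y_mix = lam * y + (1.0 - lam) * y[perm]

# 3) Continue the forward pass from the mixed hidden states and train on the
#    mixed targets (softmax cross-entropy shown; backprop omitted).
logits = h_mix @ W2
probs = np.exp(logits - logits.max(axis=1, keepdims=True))
probs /= probs.sum(axis=1, keepdims=True)
loss = -np.mean(np.sum(y_mix * np.log(probs + 1e-12), axis=1))
print("mixed-batch loss:", loss)
```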

March 12, 2018

THEORY

Geometrical insights for implicit generative modeling

Learning algorithms for implicit generative models can optimize a variety of criteria that measure how the data distribution differs from the implicit model distribution, including the Wasserstein distance, the Energy distance, and the Maximum Mean Discrepancy criterion.

Leon Bottou, Martin Arjovsky, David Lopez-Paz, Maxime Oquab
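
Two of the criteria named above can be estimated directly from samples. The sketch below computes plain V-statistic estimates of the Energy distance and of the squared Maximum Mean Discrepancy with a Gaussian kernel; the toy distributions, sample sizes, and kernel bandwidth are illustrative choices, not the paper's.

```python
# Minimal sketch (not from the paper) of sample-based estimates of two of the
# criteria above; data, sample sizes, and bandwidth are illustrative.
import numpy as np

def pairwise_dists(a, b):
    """Euclidean distance matrix between rows of a and rows of b."""
    return np.linalg.norm(a[:, None, :] - b[None, :, :], axis=-1)

def energy_distance(x, y):
    """2 E||X-Y|| - E||X-X'|| - E||Y-Y'|| estimated as a plain V-statistic."""
    return (2 * pairwise_dists(x, y).mean()
            - pairwise_dists(x, x).mean()
            - pairwise_dists(y, y).mean())

def mmd_sq(x, y, bandwidth=1.0):
    """Squared MMD with Gaussian kernel k(a,b) = exp(-||a-b||^2 / (2 h^2))."""
    k = lambda a, b: np.exp(-pairwise_dists(a, b) ** 2 / (2 * bandwidth ** 2))
    return k(x, x).mean() + k(y, y).mean() - 2 * k(x, y).mean()

rng = np.random.default_rng(0)
data = rng.normal(size=(500, 2))                # stand-in for the data distribution
model = rng.normal(loc=0.5, size=(500, 2))      # stand-in for the model distribution

print("energy distance:", energy_distance(data, model))
print("squared MMD    :", mmd_sq(data, model))
```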

April 30, 2018

THEORY

Mixup: beyond empirical risk minimization

Our experiments on the ImageNet-2012, CIFAR-10, CIFAR-100, Google commands and UCI datasets show that mixup improves the generalization of state-of-the-art neural network architectures.

Hongyi Zhang, Moustapha Cisse, Yann Dauphin, David Lopez-Paz
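
Mixup itself is the input-space counterpart of the hidden-state interpolation sketched above: training pairs are convex combinations of two raw examples and of their labels, with the mixing weight drawn from Beta(alpha, alpha). A minimal sketch, with an illustrative alpha and a toy batch:

```python
# Minimal sketch of building one mixup training batch; alpha and the toy
# data are illustrative, not the paper's experimental setup.
import numpy as np

rng = np.random.default_rng(0)
alpha, batch, n_cls = 0.2, 32, 10

x = rng.normal(size=(batch, 3, 32, 32))                 # toy images
y = np.eye(n_cls)[rng.integers(n_cls, size=batch)]      # one-hot labels

# Convex combinations of random pairs of examples and of their labels,
# with the weight drawn from Beta(alpha, alpha).
lam = rng.beta(alpha, alpha)
perm = rng.permutation(batch)
x_mix = lam * x + (1.0 - lam) * x[perm]
y_mix = lam * y + (1.0 - lam) * y[perm]
# The network is then trained on (x_mix, y_mix) in place of the raw batch.
print("lam =", round(lam, 3), "mixed batch shape:", x_mix.shape)
```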
