June 9, 2019
Over the past few years, neural networks have been shown to be vulnerable to adversarial images: targeted but imperceptible image perturbations lead to drastically different predictions. We show that adversarial vulnerability increases with the gradients of the training objective when viewed as a function of the inputs. Surprisingly, vulnerability does not depend on network topology: for many standard network architectures, we prove that at initialization, the l1-norm of these gradients grows as the square root of the input dimension, leaving the networks increasingly vulnerable with growing image size. We empirically show that this dimension dependence persists after both standard and robust training, but is attenuated by stronger regularization.
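As a rough illustration of the quantity this abstract refers to, the sketch below (not from the paper; the fully connected architecture, hidden width, cross-entropy loss, and random labels are arbitrary illustrative choices) estimates the average l1-norm of the gradient of the loss with respect to the input for a freshly initialized network, at several input dimensions, so one can observe how it scales with dimension.

```python
# Illustrative sketch only: measure the l1-norm of input gradients at
# initialization for several input dimensions (architecture and loss are
# arbitrary choices, not the paper's experimental setup).
import torch
import torch.nn as nn

def mean_grad_l1_norm(input_dim, hidden_dim=512, n_samples=64, n_classes=10):
    """Average ||d loss / d x||_1 over random inputs, at initialization."""
    model = nn.Sequential(
        nn.Linear(input_dim, hidden_dim), nn.ReLU(),
        nn.Linear(hidden_dim, hidden_dim), nn.ReLU(),
        nn.Linear(hidden_dim, n_classes),
    )
    loss_fn = nn.CrossEntropyLoss()
    x = torch.randn(n_samples, input_dim, requires_grad=True)
    y = torch.randint(0, n_classes, (n_samples,))
    loss = loss_fn(model(x), y)
    loss.backward()
    # l1-norm of the input gradient, averaged over the batch
    return x.grad.abs().sum(dim=1).mean().item()

if __name__ == "__main__":
    for d in [64, 256, 1024, 4096]:
        print(f"input dim {d:5d}: mean ||grad||_1 = {mean_grad_l1_norm(d):.4f}")
```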
Many visual scenes contain text that carries crucial information, and it is thus essential to understand text in images for downstream reasoning tasks. For example, a "deep water" label on a warning sign warns people about the danger in the…
Ronghang Hu, Amanpreet Singh, Trevor Darrell, Marcus Rohrbach
The long-tail distribution of the visual world poses great challenges for deep learning-based classification models in handling the class imbalance problem.…
Bingyi Kang, Saining Xie, Marcus Rohrbach, Zhicheng Yan, Albert Gordo, Jiashi Feng, Yannis Kalantidis
The evolution of clothing styles and their migration across the world are intriguing, yet difficult to describe quantitatively.
Ziad Al-Halah, Kristen Grauman
Humans understand novel sentences by composing meanings and roles of core language components. In contrast, neural network models for natural language modeling fail when such compositional generalization is required. The main contribution of…
Jonathan Gordon, David Lopez-Paz, Marco Baroni, Diane Bouchacourt