Using platform-aware AI to design compact and efficient neural networks

6/17/2019

What the research is:

An adaptive approach to neural network design that uses a novel optimization algorithm to increase the computational efficiency of a network’s architecture. Previous efforts to build neural networks compact enough to run on resource-constrained platforms have relied either on compute-heavy reinforcement learning (RL) algorithms or on custom architectures that are time-consuming to design by hand. This method, developed by researchers at Facebook, Princeton University, and the University of California, Berkeley, provides an automated alternative: the AI model proposes an architecture based on searches that take into account the specific hardware, latency targets, and energy constraints of the platform that the network will run on.

How it works:

Called Chameleon, this approach employs accuracy, latency, and energy predictors to determine the most efficient use of a platform’s existing building blocks. These predictors are more efficient than RL-based algorithms, and incorporate Gaussian process regressors augmented with Bayesian optimization and imbalanced quasi-Monte Carlo sampling. Chameleon consistently outperformed state-of-the-art handcrafted designs and automated models in tests involving a range of hardware platforms and resource constraints. These results included accuracy gains of 5.8 percent over ResNet-152 on an Nvidia GTX 1060 GPU and 8.3 percent over MobileNetV2 on a mobile CPU. These wide-ranging improvements come with a significant reduction in computational overhead and development time compared with RL-based search and manual architecture customization.
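The idea of a predictor-guided, platform-constrained search can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not Chameleon's implementation: the two-feature architecture encoding (width multiplier, depth), the training accuracies, the latency and energy formulas, and every function and class name are hypothetical. The sketch fits a small Gaussian process to a handful of accuracy measurements and then picks the candidate with the highest predicted accuracy among those that meet the latency and energy budgets. It enumerates candidates exhaustively and omits the Bayesian optimization loop and quasi-Monte Carlo sampling mentioned above.

```python
import math

def rbf_kernel(a, b, length_scale=1.0):
    """Squared-exponential similarity between two feature vectors."""
    sq_dist = sum((x - y) ** 2 for x, y in zip(a, b))
    return math.exp(-0.5 * sq_dist / length_scale ** 2)

def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x

class GPAccuracyPredictor:
    """Gaussian process regressor (posterior mean only) over architecture features."""

    def __init__(self, noise=1e-6):
        self.noise = noise

    def fit(self, X, y):
        self.X = list(X)
        self.mean = sum(y) / len(y)
        # Kernel matrix with a small noise term on the diagonal for stability.
        K = [[rbf_kernel(a, b) + (self.noise if i == j else 0.0)
              for j, b in enumerate(self.X)]
             for i, a in enumerate(self.X)]
        self.alpha = solve_linear(K, [v - self.mean for v in y])
        return self

    def predict(self, x):
        return self.mean + sum(
            a * rbf_kernel(x, xi) for a, xi in zip(self.alpha, self.X))

def predict_latency_ms(arch):
    """Hypothetical latency model: fixed overhead plus cost per unit of compute."""
    width, depth = arch
    return 1.0 + 4.0 * width * depth

def predict_energy_mj(arch):
    """Hypothetical energy model, proportional to compute."""
    width, depth = arch
    return 2.0 * width * depth

def search(candidates, acc_predictor, latency_budget_ms, energy_budget_mj):
    """Return the feasible candidate with the highest predicted accuracy, or None."""
    feasible = [a for a in candidates
                if predict_latency_ms(a) <= latency_budget_ms
                and predict_energy_mj(a) <= energy_budget_mj]
    if not feasible:
        return None
    return max(feasible, key=acc_predictor.predict)

# Made-up (width, depth) -> accuracy samples, for illustration only.
TRAIN_X = [(0.5, 1.0), (1.0, 1.0), (0.5, 2.0), (1.0, 2.0), (1.5, 3.0)]
TRAIN_Y = [0.60, 0.68, 0.66, 0.73, 0.78]

CANDIDATES = [(w, d) for w in (0.5, 0.75, 1.0, 1.25, 1.5)
              for d in (1.0, 2.0, 3.0)]

predictor = GPAccuracyPredictor().fit(TRAIN_X, TRAIN_Y)
best = search(CANDIDATES, predictor, latency_budget_ms=8.0, energy_budget_mj=4.0)
print("selected (width, depth):", best)
```

Because each candidate is scored by cheap predictors rather than by training and measuring a network, tightening a budget or swapping in a different platform's latency model only requires rerunning the search.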

Why it matters:

Chameleon’s results show that predictive models can boost the efficiency of AI-based neural network design without increasing latency or energy use. The approach is also fast: each adaptation search takes just minutes, which further reduces the expense and time normally associated with optimizing a neural network to fit on leaner platforms. Because Chameleon adapts to the constraints and hardware it’s presented with, this method could expand access to neural networks for research organizations that don’t currently have the resources to take advantage of this technology.