RESEARCH

Machine Learning at Facebook: Understanding Inference at the Edge

February 16, 2019

Abstract

At Facebook, machine learning provides a wide range of capabilities that drive many aspects of user experience including ranking posts, content understanding, object detection and tracking for augmented and virtual reality, speech and text translations. While machine learning models are currently trained on customized datacenter infrastructure, Facebook is working to bring machine learning inference to the edge. By doing so, user experience is improved with reduced latency (inference time) and becomes less dependent on network connectivity. Furthermore, this also enables many more applications of deep learning with important features only made available at the edge. This paper takes a data-driven approach to present the opportunities and design challenges faced by Facebook in order to enable machine learning inference locally on smartphones and other edge platforms.

Download the Paper

Related Publications

June 16, 2019

COMPUTER VISION

3D human pose estimation in video with temporal convolutions and semi-supervised training | Facebook AI Research

In this work, we demonstrate that 3D poses in video can be effectively estimated with a fully convolutional model based on dilated temporal convolutions over 2D keypoints. We also introduce back-projection, a simple and effective…

Dario Pavllo, Christoph Feichtenhofer, David Grangier, Michael Auli

June 16, 2019

June 03, 2019

NLP

FAIRSEQ: A Fast, Extensible Toolkit for Sequence Modeling | Facebook AI Research

FAIRSEQ is an open-source sequence modeling toolkit that allows researchers and developers to train custom models for translation, summarization, language modeling, and other text generation tasks. The toolkit is based on PyTorch and supports…

Myle Ott, Sergey Edunov, Alexei Baevski, Angela Fan, Sam Gross, Nathan Ng, David Grangier, Michael Auli

June 03, 2019

June 02, 2019

NLP

Cooperative Learning of Disjoint Syntax and Semantics | Facebook AI Research

There has been considerable attention devoted to models that learn to jointly infer an expression’s syntactic structure and its semantics. Yet, Nangia and Bowman (2018) has recently shown that the current best systems fail to learn the correct…

Serhii Havrylov, Germán Kruszewski, Armand Joulin

June 02, 2019

June 15, 2019

COMPUTER VISION

FBNet: Hardware-Aware Efficient ConvNet Design via Differentiable Neural Architecture Search | Facebook AI Research

Designing accurate and efficient ConvNets for mobile devices is challenging because the design space is combinatorially large. Due to this, previous neural architecture search (NAS) methods are computationally expensive. ConvNet architecture…

Bichen Wu, Xiaoliang Dai, Peizhao Zhang, Yanghan Wang, Fei Sun, Yiming Wu, Yuandong Tian, Peter Vajda, Yangqing Jia, Kurt Keutzer

June 15, 2019

Help Us Pioneer The Future of AI

We share our open source frameworks, tools, libraries, and models for everything from research exploration to large-scale production deployment.