The Architectural Implications of Facebook’s DNN-based Personalized Recommendation

February 17, 2020

Abstract

The widespread application of deep learning has changed the landscape of computation in data centers. In particular, personalized recommendation for content ranking is now largely accomplished using deep neural networks. However, despite their importance and the number of compute cycles they consume, relatively little research attention has been devoted to recommendation systems. To facilitate research and advance the understanding of these workloads, this paper presents a set of real-world, production-scale DNNs for personalized recommendation, coupled with relevant performance metrics for evaluation. In addition to releasing a set of open-source workloads, we conduct an in-depth analysis that underpins future system design and optimization for at-scale recommendation. Three findings stand out: inference latency varies by 60% across three Intel server generations; batching and co-location of inference jobs can drastically improve latency-bounded throughput; and diversity across recommendation models calls for different optimization strategies.
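As a rough illustration of what "latency-bounded throughput" means, the sketch below picks the batch size that maximizes items ranked per second while staying under a latency target. The linear latency model, its constants, and the function names are all hypothetical assumptions chosen for illustration; they are not measurements or code from the paper.

```python
def latency_ms(batch_size, fixed_ms=2.0, per_item_ms=0.05):
    """Hypothetical latency model: a fixed overhead plus a per-item cost.

    The constants are illustrative assumptions, not measured values.
    """
    return fixed_ms + per_item_ms * batch_size


def latency_bounded_throughput(sla_ms, batch_sizes, latency_fn=latency_ms):
    """Return (best_batch, items_per_second) among batches meeting the target.

    Returns (None, 0.0) when no candidate batch size meets the latency target.
    """
    best_batch, best_tput = None, 0.0
    for b in batch_sizes:
        lat = latency_fn(b)
        if lat > sla_ms:
            continue  # this batch size violates the latency target
        tput = b / (lat / 1000.0)  # items per second at this batch size
        if tput > best_tput:
            best_batch, best_tput = b, tput
    return best_batch, best_tput


if __name__ == "__main__":
    batch, tput = latency_bounded_throughput(10.0, [1, 16, 64, 256])
    print(f"best batch size: {batch}, throughput: {tput:.0f} items/s")
```

Under this assumed model, larger batches amortize the fixed overhead, so throughput rises with batch size until the latency target is exceeded. Real models may behave differently, which is consistent with the abstract's point that model diversity leads to different optimization strategies.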


