Research

We're advancing the state-of-the-art in artificial intelligence through fundamental and applied research in open collaboration with the community.

Notable Papers

RESEARCH

ML APPLICATIONS

GrokNet: Unified Computer Vision Model Trunk and Embeddings For Commerce

Sean Bell

Yiqun Liu

Sami Alsheikh

Yina Tang...

KDD

COMPUTER VISION

Live Face De-Identification in Video

Oran Gafni

Lior Wolf

Yaniv Taigman

International Conference on Computer Vision (ICCV)

RESEARCH

Single-Network Whole-Body Pose Estimation

Gines Hidalgo

Yaadhav Raaj

Haroon Idrees

Donglai Xiang...

International Conference on Computer Vision (ICCV)

SPEECH & AUDIO

A Universal Music Translation Network

Noam Mor

Lior Wolf

Adam Polyak

Yaniv Taigman

International Conference on Learning Representations (ICLR)

Latest Publications

October 14, 2021

GRAPHICS

COMPUTER VISION

Ego4D: Around the World in 3,000 Hours of Egocentric Video

We introduce Ego4D, a massive-scale egocentric video dataset and benchmark suite. It offers 3,025 hours of daily-life activity video spanning hundreds of scenarios (household, outdoor, workplace, leisure, etc.) captured by 855 unique camera wearers from 74 worldwide locations and 9 different countries. …

Kristen Grauman, Andrew Westbury, Eugene Byrne, Zachary Chavis, Antonino Furnari, Rohit Girdhar, Jackson Hamburger, Hao Jiang, Miao Liu, Xingyu Liu, Miguel Martin, Tushar Nagarajan, Ilija Radosavovic, Santhosh Ramakrishnan, Fiona Ryan, Jayant Sharma, Michael Wray, Mengmeng Xu, Eric Zhongcong Xu, Chen Zhao, …

October 14, 2021

COMPUTER VISION

LiRA: Learning Visual Speech Representations from Audio through Self-supervision

The large amount of audiovisual content being shared online today has drawn substantial attention to the prospect of audio-visual self-supervised learning. Recent works have focused on each of these modalities separately, while others have attempted to model both simultaneously in a cross-modal fashion. However, comparatively little attention has been given to leveraging one modality as a training objective to learn from the other. In this work, we propose Learning visual speech Representations from Audio via self-supervision (LiRA). Specifically, we train a ResNet+Conformer model to predict acoustic features from unlabelled visual speech. We find that this pre-trained model can be leveraged towards word-level and sentence-level lip-reading through feature extraction and fine-tuning experiments. We show that our approach significantly outperforms other self-supervised methods on the Lip Reading in the Wild (LRW) dataset and achieves state-of-the-art performance on Lip Reading Sentences 2 (LRS2) using only a fraction of the total labelled data.
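The core of this objective is simple: the audio track itself supervises the visual encoder, so no human labels are needed. Below is a minimal NumPy sketch of that regression setup; the shapes, the random features, and the linear predictor are all hypothetical stand-ins for the real lip-region inputs and the ResNet+Conformer encoder described in the abstract.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy shapes: T video frames of lip-region features,
# with matching acoustic feature frames (here just random arrays).
T, D_VIS, D_AUD = 25, 512, 256
visual_feats = rng.standard_normal((T, D_VIS))
acoustic_targets = rng.standard_normal((T, D_AUD))

# A linear map stands in for the ResNet+Conformer encoder:
# each visual frame is mapped to a predicted acoustic frame.
W = rng.standard_normal((D_VIS, D_AUD)) * 0.01
predicted = visual_feats @ W

# Self-supervised regression loss: mean absolute error between
# predicted and actual acoustic features -- no labels involved.
l1_loss = np.abs(predicted - acoustic_targets).mean()
print(l1_loss)
```

In the paper's pipeline this loss would drive encoder training; the pre-trained visual encoder is then reused for lip-reading via feature extraction or fine-tuning.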

Maja Pantic, Stavros Petridis, Bjoern Schuller, Pingchuan Ma, Rodrigo Schonburg Carrillo De Mira

October 01, 2021

COMPUTER VISION

CONVERSATIONAL AI

DVD: A Diagnostic Dataset for Multi-step Reasoning in Video Grounded Dialogue

A video-grounded dialogue system is required to understand both dialogue, which contains semantic dependencies from turn to turn …

Hung Le, Chinnadhurai Sankar, Seungwhan Moon, Ahmad Beirami, Alborz Geramifard, Satwik Kottur


September 27, 2021

RANKING & RECOMMENDATIONS

Transformers4Rec: Bridging the Gap between NLP and Sequential / Session-Based Recommendation

Much of the recent progress in sequential and session-based recommendation has been driven by improvements in model architecture and pretraining techniques originating in the field of Natural Language Processing. …
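The bridge described above can be made concrete: a user session is a sequence of item IDs, treated exactly like a token sequence in language modelling, so a causal next-token objective becomes next-item prediction. A minimal sketch, with a hypothetical session of made-up item IDs:

```python
# Hypothetical toy session: item IDs a user interacted with, in order.
session = [42, 7, 19, 7, 3]

# Causal language-modelling objective transplanted to recommendation:
# at each step, the model must predict the next item from the prefix.
training_pairs = [
    (session[:i], session[i]) for i in range(1, len(session))
]

for prefix, target in training_pairs:
    print(prefix, "->", target)
```

This framing is what lets sequential recommenders reuse Transformer architectures and pretraining recipes from NLP largely unchanged, with the item vocabulary playing the role of the token vocabulary.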

Gabriel de Souza Pereira Moreira, Sara Rabhi, Jeong Min Lee, Ronay Ak, Even Oldridge


Fundamental & Applied Research

At Facebook AI, we conduct both fundamental and applied research to advance our understanding and improve product experiences. We publish our discoveries in peer-reviewed academic journals and conferences, and build AI technologies used by billions of people around the world.

Fundamental Research

FAIR seeks to further our fundamental understanding in both new and existing domains, covering the full spectrum of topics related to AI, with the mission of advancing the state-of-the-art of AI through open research for the benefit of all.

Along with the key principles of Facebook AI - openness, collaboration, excellence, and scale - we believe FAIR researchers also need to have the freedom and autonomy to design and follow their own research agendas so they can take on the most impactful work and develop the most disruptive projects, all while sharing their results with the community.

Applied Research

Facebook AI Applied Research engages in cutting-edge research that can improve and power new product experiences at huge scale for our community. Building on Facebook AI's key principles of openness, collaboration, excellence, and scale, we make big, bold research investments focused on building social value and bringing the world closer together.

Our Values

We align our fundamental and applied research efforts and applications around a few key principles:

Openness

We believe the latest advancements in AI should be published and open-sourced for the community to learn about and build upon.

Collaboration

We collaborate openly with both internal and external partners to share knowledge and cultivate diverse perspectives and needs.

Excellence

There is no shortage of new areas to explore in AI - our researchers focus on the projects that we believe will have the most positive impact on people and society.

Scale

To bring the benefits of AI to more people and improve accessibility, our research must account for both large scale data and computation needs.

Request for Proposals

Facebook AI is pleased to invite university faculty to submit proposals that will help accelerate research on interpretable personalized recommendations using machine learning on graph data.

Help Us Pioneer the Future of AI

We share our open source frameworks, tools, libraries, and models for everything from research exploration to large-scale production deployment.