Bringing the world closer together by advancing AI

DensePose

AI for Commerce

Deepfake Detection

Open-Source AI Tools

We share our open-source frameworks, tools, libraries, and models for everything from research exploration to large-scale production deployment.

Open-Source AI Research

We're advancing the state of the art in artificial intelligence through fundamental and applied research in open collaboration with the community.

Notable Papers

RESEARCH

ML APPLICATIONS

GrokNet: Unified Computer Vision Model Trunk and Embeddings For Commerce

Sean Bell

Yiqun Liu

Sami Alsheikh

Yina Tang...

KDD

COMPUTER VISION

Live Face De-Identification in Video

Oran Gafni

Lior Wolf

Yaniv Taigman

International Conference on Computer Vision (ICCV)

RESEARCH

Single-Network Whole-Body Pose Estimation

Gines Hidalgo

Yaadhav Raaj

Haroon Idrees

Donglai Xiang...

International Conference on Computer Vision (ICCV)

SPEECH & AUDIO

A Universal Music Translation Network

Noam Mor

Lior Wolf

Adam Polyak

Yaniv Taigman

International Conference on Learning Representations (ICLR)

Latest Publications

October 14, 2021

GRAPHICS

COMPUTER VISION

Ego4D: Around the World in 3,000 Hours of Egocentric Video

We introduce Ego4D, a massive-scale egocentric video dataset and benchmark suite. It offers 3,025 hours of daily-life activity video spanning hundreds of scenarios (household, outdoor, workplace, leisure, etc.) captured by 855 unique camera wearers from 74 worldwide locations and 9 different countries. …

Kristen Grauman, Andrew Westbury, Eugene Byrne, Zachary Chavis, Antonino Furnari, Rohit Girdhar, Jackson Hamburger, Hao Jiang, Miao Liu, Xingyu Liu, Miguel Martin, Tushar Nagarajan, Ilija Radosavovic, Santhosh Ramakrishnan, Fiona Ryan, Jayant Sharma, Michael Wray, Mengmeng Xu, Eric Zhongcong Xu, Chen Zhao, …

October 14, 2021

COMPUTER VISION

LiRA: Learning Visual Speech Representations from Audio through Self-supervision

The large amount of audiovisual content being shared online today has drawn substantial attention to the prospect of audio-visual self-supervised learning. Recent works have focused on each of these modalities separately, while others have attempted to model both simultaneously in a cross-modal fashion. However, comparatively little attention has been given to leveraging one modality as a training objective to learn from the other. In this work, we propose Learning visual speech Representations from Audio via self-supervision (LiRA). Specifically, we train a ResNet+Conformer model to predict acoustic features from unlabelled visual speech. We find that this pre-trained model can be leveraged towards word-level and sentence-level lip-reading through feature extraction and fine-tuning experiments. We show that our approach significantly outperforms other self-supervised methods on the Lip Reading in the Wild (LRW) dataset and achieves state-of-the-art performance on Lip Reading Sentences 2 (LRS2) using only a fraction of the total labelled data.

Maja Pantic, Stavros Petridis, Bjoern Schuller, Pingchuan Ma, Rodrigo Schonburg Carrillo De Mira

October 01, 2021

COMPUTER VISION

CONVERSATIONAL AI

DVD: A Diagnostic Dataset for Multi-step Reasoning in Video Grounded Dialogue

A video-grounded dialogue system is required to understand both dialogue, which contains semantic dependencies from turn to turn …

Hung Le, Chinnadhurai Sankar, Seungwhan Moon, Ahmad Beirami, Alborz Geramifard, Satwik Kottur

September 27, 2021

RANKING & RECOMMENDATIONS

Transformers4Rec: Bridging the Gap between NLP and Sequential / Session-Based Recommendation

Much of the recent progress in sequential and session-based recommendation has been driven by improvements in model architecture and pretraining techniques originating in the field of Natural Language Processing. …

Gabriel de Souza Pereira Moreira, Sara Rabhi, Jeong Min Lee, Ronay Ak, Even Oldridge

Help Us Pioneer the Future of AI

We share our open-source frameworks, tools, libraries, and models for everything from research exploration to large-scale production deployment.