Large-scale datasets and benchmarks for training, evaluating, and testing models to measure and advance AI progress.

Featured dataset

Segment Anything 1 Billion (SA-1B) Dataset

Designed for training general-purpose object segmentation models from open world images.


Casual Conversations V2

For evaluating computer vision, audio and speech models for accuracy across a diverse set of ages, genders, languages/dialects, geographies, disabilities, and more

Casual Conversations

For evaluating computer vision and audio models for accuracy across a diverse set of ages, genders, apparent skin tones and ambient lighting conditions

Common Objects in 3D (CO3D)

For learning category-specific 3D reconstruction and new-view synthesis using multi-view images of common object categories

Deepfake Detection Challenge

Measures progress on deepfake detection technology

DISC21 Dataset

Helps researchers evaluate their image copy detection models for accuracy

EgoObjects Dataset

A project that seeks to advance the fundamental AI research needed for multimodal machine perception for first-person video understanding

FLoRes Benchmarking Dataset

Used for machine translation between English and low-resource languages.


Ego4D is a collaborative project that seeks to advance the fundamental AI research needed for multimodal machine perception for first-person video understanding.

Frameworks and tools

Sharing our ML frameworks and tools with the community to collaborate and accelerate AI advancement

Models and Libraries

Our open-sourced libraries and models for those taking our AI learnings further through software and app development


Our demos for anyone wanting to experience our latest research breakthroughs first hand

System cards

Documentation of how multiple machine learning (ML) models work together, helping people understand the intention, impact and limitations of our AI systems


Our library of published papers to learn about our latest AI breakthroughs and innovations