
Datasets

Large-scale datasets and benchmarks for training, evaluating, and testing models to measure and advance AI progress.

Featured Dataset

FACET Dataset

FACET is a comprehensive benchmark dataset for evaluating the robustness and algorithmic fairness of AI and machine-learning vision models across protected groups.


Datasets

MMCSG Dataset

The MMCSG (Multi-Modal Conversations in Smart Glasses) dataset comprises two-sided conversations recorded using Aria glasses, featuring multi-modal data such as multi-channel audio, video, accelerometer, and gyroscope measurements.


Speech Fairness Dataset

For evaluating the fairness of speech recognition models across diverse demographic groups.


Casual Conversations V2

For evaluating computer vision, audio, and speech models for accuracy across a diverse set of ages, genders, languages/dialects, geographies, disabilities, and more.


Casual Conversations

For evaluating computer vision and audio models for accuracy across a diverse set of ages, genders, apparent skin tones, and ambient lighting conditions.


Common Objects in 3D (CO3D)

For learning category-specific 3D reconstruction and new-view synthesis using multi-view images of common object categories.


Segment Anything

Designed for training general-purpose object segmentation models from open-world images.


DISC21 Dataset

Helps researchers evaluate their image copy detection models for accuracy.


EgoObjects Dataset

A project that seeks to advance the fundamental AI research needed for multi-modal machine perception for first-person video understanding.


FLoRes Benchmarking Dataset

Used for machine translation between English and low-resource languages.


Ego4D

Ego4D is a collaborative project that seeks to advance the fundamental AI research needed for multimodal machine perception for first-person video understanding.
