JUNE 25, 2020

Deepfake Detection Challenge Dataset

The Deepfake Detection Challenge Dataset is designed to measure progress on deepfake detection technology.

Overview

We partnered with other industry leaders and academic experts in September 2019 to create the Deepfake Detection Challenge (DFDC) in order to accelerate development of new ways to detect deepfake videos. In doing so, we created and shared a unique new dataset for the challenge consisting of more than 100,000 videos. The DFDC has enabled experts from around the world to come together, benchmark their deepfake detection models, try new approaches, and learn from each other’s work.

The DFDC dataset consists of two versions:

  • Preview dataset

    • 5k videos

    • Featuring two facial modification algorithms

    • Associated research paper

  • Full dataset

    • 124k videos

    • Featuring eight facial modification algorithms

    • Associated research paper

This full dataset was used by participants during a Kaggle competition to create new and better models to detect manipulated media. The dataset was created by Facebook with paid actors who entered into an agreement permitting the use and manipulation of their likenesses in our creation of the dataset.

We hope that by making this dataset available outside the challenge, the research community will continue to accelerate progress on detecting harmful manipulated media.

More about Facebook AI’s work in this space can be found in this blog post.

If using this dataset, please cite the paper associated with the relevant dataset (preview/full):


@misc{DFDC2019Preview,
  title={The Deepfake Detection Challenge (DFDC) Preview Dataset},
  author={Brian Dolhansky and Russ Howes and Ben Pflaum and Nicole Baram and Cristian Canton Ferrer},
  year={2019},
  eprint={1910.08854},
  archivePrefix={arXiv},
  primaryClass={cs.CV}
}


@misc{DFDC2020,
  title={The DeepFake Detection Challenge Dataset},
  author={Brian Dolhansky and Joanna Bitton and Ben Pflaum and Jikuo Lu and Russ Howes and Menglin Wang and Cristian Canton Ferrer},
  year={2020},
  eprint={2006.07397},
  archivePrefix={arXiv},
  primaryClass={cs.CV}
}


Download Prerequisites

In order to access the datasets, each user must have an AWS account with an IAM user and access keys set up. Make a note of the AWS account ID, as it is required to sign up for and access the datasets.
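As one way to satisfy these prerequisites, the AWS CLI can be used to store the IAM user’s access keys and look up the account ID. This is a sketch of the general AWS setup only; the dataset’s own sign-up flow is a separate step.

```shell
# Store the IAM user's credentials locally (prompts for Access Key ID,
# Secret Access Key, default region, and output format).
aws configure

# Print the 12-digit AWS account ID for the active credentials;
# keep this handy when signing up for dataset access.
aws sts get-caller-identity --query Account --output text
```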

Results and Impact of the DFDC

The top-performing model on the public dataset achieved 82.56 percent average precision, a common accuracy measure for computer vision tasks. But when the entrants were evaluated against the black box dataset, the ranking of the top-performing models changed significantly. The highest-performing entrant, a model submitted by Selim Seferbekov, achieved an average precision of 65.18 percent against the black box dataset; on the public dataset, this model had ranked fourth. Similarly, the models that finished second through fifth against the black box dataset had also ranked lower on the public leaderboard (37th, 6th, 10th, and 17th, respectively). This outcome reinforces the importance of learning to generalize to unforeseen examples when addressing the challenges of deepfake detection. The competition was hosted on Kaggle, and winners were selected using the log-loss score against the private test set.
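Log loss (binary cross-entropy) rewards well-calibrated probabilities and heavily penalizes confident mistakes, which is why small differences separate the leaderboard scores below. A minimal sketch of the metric follows; this is an illustration of the formula, not the competition’s actual scoring code.

```python
import math

def log_loss(y_true, y_pred, eps=1e-15):
    """Binary cross-entropy (log loss): lower is better.

    y_true holds 0/1 labels (1 = fake), y_pred holds predicted
    probabilities of the positive class. Predictions are clipped
    to [eps, 1 - eps] so log(0) is never evaluated.
    """
    total = 0.0
    for y, p in zip(y_true, y_pred):
        p = min(max(p, eps), 1.0 - eps)
        total += y * math.log(p) + (1 - y) * math.log(1 - p)
    return -total / len(y_true)

# Three mostly-correct, fairly confident predictions score well:
print(round(log_loss([1, 0, 1], [0.9, 0.1, 0.8]), 5))  # → 0.14462
```

Note that a model predicting 0.5 for everything scores ln(2) ≈ 0.693, so the winning scores around 0.43 are substantially better than chance while still far from perfect calibration.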

Competition Leaderboard

Rank  Team                  Log loss
1     Selim Seferbekov      0.42798
2     \WM/                  0.42842
3     NtechLab              0.43452
4     Eighteen years old    0.43476
5     The Medics            0.43711
6     Konstantin Simonchik  0.44289
7     All Faces Are Real    0.445315
8     ID R&D                0.44837
9     名侦探柯西            0.44911
10    deeeepface            0.45149

This list shows the top 10 results on the final leaderboard, where participants were evaluated using the black box data set.

The Future of DFDC

As the research community looks to build upon the results of the challenge, we should all think more broadly and consider solutions that go beyond analyzing images and video. Considering context, provenance, and other signals may be the way to improve deepfake detection models.

Once the challenge was completed, we worked with several of our academic partners to stress-test the winning models. We wanted to better understand any specific vulnerabilities in the models before they were open-sourced. The University of Oxford, the National University of Singapore, and the Technical University of Munich all participated and utilized different techniques to try to trick the models. The participants presented methods to trigger false positives and false negatives, and relevant insights on the generalization capabilities of the winning models. These results were presented at the Media Forensics Workshop @ CVPR.