December 11, 2020
AI has made progress in detecting hate speech, but important and difficult technical challenges remain. Back in May 2020, Facebook AI partnered with Getty Images and DrivenData to launch the Hateful Memes Challenge, a first-of-its-kind $100K competition and dataset to accelerate research on the problem of detecting hate speech that combines images and text. As part of the challenge, Facebook AI created a unique dataset of 10,000+ new multimodal examples, using licensed images from Getty Images so that researchers could easily use them in their work.
More than 3,300 participants from around the world entered the Hateful Memes Challenge, and we are now sharing details on the winning entries. The top-performing teams were:
Ron Zhu - link to code
Niklas Muennighoff - link to code
Team HateDetectron: Riza Velioglu and Jewgeni Rose - link to code
Team Kingsterdam: Phillip Lippe, Nithin Holla, Shantanu Chandra, Santhosh Rajamanickam, Georgios Antoniou, Ekaterina Shutova and Helen Yannakoudakis - link to code
Vlad Sandulescu - link to code
You can see the full leaderboard here. As part of the NeurIPS 2020 competition track, the top five winners discussed their solutions, and we facilitated a Q&A with participants from around the world. Each of these five implementations has been open-sourced and is available now.
We're pleased with the level of engagement on this difficult task. The top-performing submission achieved an AUC ROC of 0.8450, substantially exceeding the baseline models we provided as part of the challenge. Participants used the public dataset to train and validate their models, but final rankings were determined on a new, unseen test set.
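AUC ROC, the challenge's evaluation metric, can be read as the probability that a randomly chosen hateful example is scored above a randomly chosen benign one. The sketch below computes it from that rank-based definition; the labels and scores are toy values, not taken from any challenge submission.

```python
def roc_auc(labels, scores):
    """AUC ROC via the Mann-Whitney U formulation: the fraction of
    (positive, negative) pairs where the positive is ranked higher
    (ties count as half a win)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Toy example: hypothetical "hateful" probabilities for six memes
# (1 = hateful, 0 = benign).
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.1]
print(round(roc_auc(labels, scores), 4))  # → 0.8889
```

A perfect ranking scores 1.0 and random guessing about 0.5, which puts the winning 0.8450 well above chance while leaving clear room for improvement.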
Hate speech can come in many forms, including memes that combine text and images. This kind of multimodal content can be particularly challenging for AI to detect because it requires a holistic understanding of the meme. In one example created by Facebook AI, the text “Look how many people love you” appears over a picture of an empty desert. Considered separately, the phrase and the image are each innocuous; to detect the meme's actual meaning, an AI system must analyze them together. To learn more about the difficulty of this task, please see the Hateful Memes paper.
The top five submissions employed a variety of methods, including: (1) ensembles of state-of-the-art vision-and-language models such as VILLA, UNITER, ERNIE-ViL, and VL-BERT; (2) rule-based add-ons; and (3) external knowledge, including labels derived from public object detection pipelines.
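A common way to ensemble several fine-tuned classifiers is to average (optionally with weights) the per-example probabilities each model assigns. The sketch below shows that combination step only; the model names and scores are illustrative placeholders, not values from the winning entries.

```python
def ensemble_probs(prob_lists, weights=None):
    """Combine per-model 'hateful' probabilities by (weighted) averaging.

    prob_lists: one list of probabilities per model, aligned by example.
    weights:    optional per-model weights; defaults to a uniform average.
    """
    n_models = len(prob_lists)
    weights = weights or [1.0 / n_models] * n_models
    n_examples = len(prob_lists[0])
    return [sum(w * probs[i] for w, probs in zip(weights, prob_lists))
            for i in range(n_examples)]

# Hypothetical scores from three fine-tuned models on four memes.
model_a = [0.91, 0.20, 0.65, 0.05]
model_b = [0.85, 0.30, 0.55, 0.10]
model_c = [0.80, 0.25, 0.70, 0.15]
averaged = ensemble_probs([model_a, model_b, model_c])
```

Averaging tends to cancel out the idiosyncratic errors of individual models, which is why ensembles of diverse pretrained architectures dominated the leaderboard.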
As a requirement for prize eligibility, the winning teams were asked to open-source their code and write an academic paper outlining how to reproduce their results. We hope others across the AI research community will build on this work to improve their own systems. We'll also share learnings from the challenge and from the Hateful Memes discussion at NeurIPS 2020.
Open challenges and shared datasets are some of the AI research community's most effective tools to accelerate progress on fundamental problems. Hate speech continues to be an important challenge, and multimodal hate speech remains an especially difficult machine learning problem. For example, the Workshop on Online Abuse and Harms (WOAH), to be held at ACL 2021 and co-organized by Facebook AI researchers, will have a special multimodal hate speech theme.
The Hateful Memes Challenge competition is over, but the real challenge is far from solved: a lot of work remains to be done in multimodal AI research, and we hope that the dataset can play an important role in evaluating new solutions the field comes up with. The dataset's design makes it a good candidate for evaluating the power of next-generation multimodal pretrained models, as well as advances not yet imagined. We hope that the Hateful Memes dataset will continue to inform new approaches and methods going forward.