The Hateful Memes Challenge and Data Set is a competition and open-source data set designed to measure progress in multimodal vision-and-language classification.
In order for AI to become a more effective tool for detecting hate speech, it must be able to understand content the way people do: holistically. When viewing a meme, for example, we don’t think about the words and photo independently of each other; we understand the combined meaning. This is extremely challenging for machines, however, because it means they can’t analyze the text and the image separately. They must combine these different modalities and understand how the meaning changes when they are presented together.
To address this challenge, the research community is building tools that take the different modalities present in a piece of content and fuse them early in the classification process. This approach lets the system analyze the modalities together, as people do. The graphic below illustrates three types of tasks for understanding content. Only the middle category requires early fusion to understand the overall meaning.
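As a toy illustration of why fusing early matters, the sketch below compares two strategies on a made-up four-example data set. Everything in it is an assumption for illustration only: the binary "text" and "image" features, the XOR labeling rule, and the helper names (`unimodal_score`, `late_fusion`, `early_fusion`) are hypothetical and do not come from the Hateful Memes data set or from Facebook AI's models. The lookup table stands in for any classifier trained on the joint representation.

```python
from itertools import product

# Toy data: each meme is (text_feature, image_feature), both binary.
# Assumption for illustration: the label depends only on the combination
# of the two features (XOR), so neither modality alone predicts it.
EXAMPLES = [((t, i), t ^ i) for t, i in product((0, 1), repeat=2)]


def unimodal_score(feature_index):
    """Return a scorer that sees one modality: P(label = 1 | that feature)."""
    def score(x):
        labels = [y for f, y in EXAMPLES if f[feature_index] == x[feature_index]]
        return sum(labels) / len(labels)
    return score


def late_fusion(x):
    """Score each modality independently, then average the two scores."""
    return (unimodal_score(0)(x) + unimodal_score(1)(x)) / 2


def early_fusion(x):
    """Classify the joint (text, image) representation directly.

    A lookup table is a stand-in for a model trained on fused features.
    """
    joint_table = {f: y for f, y in EXAMPLES}
    return float(joint_table[x])


def accuracy(model):
    """Fraction of toy examples classified correctly at a 0.5 threshold."""
    correct = sum((model(f) > 0.5) == bool(y) for f, y in EXAMPLES)
    return correct / len(EXAMPLES)


if __name__ == "__main__":
    print("late fusion accuracy: ", accuracy(late_fusion))
    print("early fusion accuracy:", accuracy(early_fusion))
```

In this toy setup each unimodal scorer is stuck at chance, so averaging their outputs cannot recover the label, while the model that sees both features together can. Real systems are far more complex, but the same dependence on the combination of text and image is what the challenge is designed to test.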
Facebook AI’s work in this area is summarized in this research paper. To catalyze further research, Facebook AI has created the first data set to help build systems that better understand multimodal hate speech. We have released this Hateful Memes data set to the broader research community and launched the associated Hateful Memes Challenge, hosted by DrivenData, with a $100,000 prize pool. The competition launches for researchers on May 12, 2020 and closes on October 31, 2020. See the competition site for full details.
For more information, please see our latest blog post.
The Hateful Memes Challenge: Detecting Hate Speech in Multimodal Memes
The Hateful Memes Challenge is a data science competition hosted on DrivenData based on the Hateful Memes data set. This competition has been accepted for the NeurIPS 2020 Competition track, and we look forward to discussing the challenge results in more depth there.
The Hateful Memes data set, created in partnership with Getty Images, focuses on detecting hate speech in multimodal memes. It is available for researchers to download only through the DrivenData competition.