Detecting manipulated images: The Image Similarity Challenge results and winners

December 6, 2021

  • At NeurIPS 2021 we’re sharing results of the Image Similarity Challenge, an online AI research competition to create systems that can assess the similarity of two images in order to identify manipulated images.

  • The challenge leverages a new million-image data set built and shared by Meta AI, providing an open, global benchmark for systems that combat harmful image manipulation and abuses online. The Image Similarity data set is the largest of its kind, and we are making it available to the public for academic and research use.

  • The competition drew more than 200 participants, who trained and tested models using a first-of-its-kind data set. The challenge was supported by Pinterest, BBC, Getty Images, iStock, and Shutterstock.

Most of us edit our online photos for fun: to enhance a sunset or give a friend cartoon bunny ears. But some people manipulate pictures to spread misinformation or to evade tools designed to detect harmful content. Even after a platform removes an image for breaking its rules, automatic tools might miss modified copies reposted by others.

To help address the risks of misinformation and abuse on social media associated with image manipulation, in June 2021 Meta AI launched the Image Similarity Challenge, an online competition that invited participants from all over the world to train models to predict the similarity of two pieces of visual content. Hosted by DrivenData, the contest drew more than 200 participants. The top-scoring teams, VisionForce and Iyakaap, each won $50,000. Pinterest, BBC, Getty Images, iStock, and Shutterstock supported the challenge. Details on the winning entrants are available on the DrivenData site here and links to their papers and code are available on GitHub here.

The winning models were based on feature extraction with ensembles of convolutional neural network or transformer models. They were trained in an unsupervised way, with strong data augmentation to simulate the copy detection task. Participants also used traditional SIFT feature matching and explicit detection of regions of interest. The models will be open-sourced to the research community, so others can leverage this work to better detect manipulated media.
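
To make the training strategy concrete, here is a minimal sketch of the kind of random augmentation pipeline used to simulate the copy detection task: an original image is turned into a training "copy" by flipping, cropping, and color jitter. The function name, parameters, and exact transformations are illustrative, not the winners' actual code.

```python
import random
import numpy as np

def augment_copy(img: np.ndarray, rng: random.Random) -> np.ndarray:
    """Derive an edited 'copy' of img (H x W x 3, uint8) by applying a few
    random transformations, in the spirit of the augmentations used to
    generate training pairs for copy detection."""
    out = img
    if rng.random() < 0.5:  # horizontal flip
        out = out[:, ::-1]
    if rng.random() < 0.5:  # random crop to at least 60% of each side
        h, w = out.shape[:2]
        ch = int(h * rng.uniform(0.6, 1.0))
        cw = int(w * rng.uniform(0.6, 1.0))
        y = rng.randrange(h - ch + 1)
        x = rng.randrange(w - cw + 1)
        out = out[y:y + ch, x:x + cw]
    if rng.random() < 0.5:  # brightness jitter
        out = np.clip(out.astype(np.float32) * rng.uniform(0.7, 1.3),
                      0, 255).astype(np.uint8)
    return out

rng = random.Random(0)
original = np.random.default_rng(0).integers(0, 256, (64, 64, 3), dtype=np.uint8)
copy_img = augment_copy(original, rng)
```

In self-supervised training, the model is then asked to produce similar descriptors for `original` and `copy_img`, and dissimilar descriptors for unrelated images.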

As part of our effort to make social media safer and more trustworthy, we’ve also built the Image Similarity data set to provide a global benchmark for recognizing copies of harmful images. We are now making the data set available to the public for academic and research use.

If you are registered for NeurIPS, please join us on Dec 10 at 10:25 GMT for the breakout session we are organizing about this challenge. More information is available here now, and we will add video of the session when it becomes available.

Manipulated images: A challenge at scale

We invited contestants to train AI models to determine whether all or part of an image had been reproduced from a different image, a task called image similarity detection. The obstacle is that copies may not be exact replicas of the original: people can flip or crop a picture, distort its angles, alter its colors, or combine it with other images. These manipulations fall on a continuum from the benign — humorous text added to a cat photo, for example — to the malicious.

Because we believe that including many different perspectives is the best way to find a solution, we offered the broader research community the opportunity to collaborate on the problem by developing models for the Image Similarity Challenge. AI models look for patterns in pixels — for example, areas of contrast implying a distinctive edge — that appear in both the source and its duplicate. Doctoring or cropping out those features, then, can obscure the origin of a photo. The more extreme the manipulation, the harder it can be to identify the copy. The most frequent type of alteration used to evade detection was pasting a fragment of one photo onto another, presumably because this type of edit preserves the fewest pixels from the original.
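
The comparison step such models perform can be sketched in a few lines: once each image has been reduced to a descriptor vector, a query is matched against the references by cosine similarity. This is a simplified illustration with hypothetical names, not any team's actual pipeline.

```python
import numpy as np

def match_score(query_desc: np.ndarray, ref_descs: np.ndarray):
    """Return (index, cosine similarity) of the reference descriptor closest
    to the query. Descriptors are L2-normalized so the dot product equals
    cosine similarity."""
    q = query_desc / np.linalg.norm(query_desc)
    r = ref_descs / np.linalg.norm(ref_descs, axis=1, keepdims=True)
    sims = r @ q                  # cosine similarity to every reference
    best = int(np.argmax(sims))
    return best, float(sims[best])

# Toy 2-D descriptors: the query is closest to the third reference.
refs = np.array([[1.0, 0.0], [0.0, 1.0], [0.7, 0.7]])
idx, sim = match_score(np.array([0.6, 0.8]), refs)
```

A query whose best similarity falls below a tuned threshold would be declared to have no match in the reference collection.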

To take action quickly and at scale, Meta AI relies on algorithms to help flag or remove content that violates our policies. To that end, Meta AI researchers and engineers recently developed SSN++, an AI model that can detect near-exact duplicates. As part of our end-to-end image indexing and matching system, SSN++ weeds out new images that match any of the hundreds of millions that we have already found to be in violation of our rules. But no current solution can detect every copied image among billions of uploads. The task becomes even more difficult when people purposefully try to sneak content past our moderation system — sometimes resubmitting different versions until it is accepted.

A new, open-source image similarity data set

Compounding the challenge, the technology industry has lacked a large, standardized, open-license data set to help researchers develop new systems to identify duplicates of harmful images at scale. As with our work on the Deepfake Detection Challenge, the Hateful Memes Challenge, and the ML Code Completeness Checklist, Meta AI believes the best solutions will come from open collaboration. So we created the Image Similarity 2021 Data Set (DISC21) and shared it with the broader research community. With 1 million reference images and an additional 50,000 query images, it is the largest publicly available data set on image similarity. One-fifth of the query images were derived from a reference image, using human and automated edits that mimic real-world behavior. The rest have no match in the reference collection.

We worked with Giorgos Tolias, Tomas Jenicek, and Ondrej Chum, image-matching experts at the Czech Technical University in Prague, to calibrate the difficulty of the transformations. We purposefully made the copies harder to detect than the ones we typically observe on our platforms, both to inspire researchers and to prioritize cases that are not already solved in production.

For more information on the data set and competition, find our paper here.

The Image Similarity Challenge

With the Image Similarity Challenge, as with the data set, we hope to advance the industry closer to at-scale detection of copied images. The competition, which ended October 28, ran in two tracks. In the matching track, participants tried to create models — optimized for accuracy — that could directly detect whether a given query image was derived from any of the reference images. The VisionForce team, composed of researchers Wenhao Wang, Yifan Sun, Weipu Zhang, and Yi Yang, beat the 96 other entries in the matching track.

For contestants in the descriptor track, on the other hand, the goal was to generate vector representations for all the images in the data set. Out of 46 entries, the best open-source model was from Shuhei Yokoo, and a closed-source approach from the titanshield2 team scored the highest overall. The models in the descriptor track reflect techniques that could be used in practice at Meta, because vector representations can be indexed and searched at the scale that Meta systems need.
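
To illustrate why descriptors scale well: they can be loaded once into an index and then searched in batches. The toy class below does exact brute-force search; in production, an approximate-nearest-neighbor library such as FAISS replaces the matrix multiply so search stays fast over hundreds of millions of vectors. All names here are illustrative.

```python
import numpy as np

class ExactDescriptorIndex:
    """Toy exact-search index over L2-normalized descriptors."""

    def __init__(self, descriptors: np.ndarray):
        # Normalize once at build time so dot products are cosine similarities.
        self.db = descriptors / np.linalg.norm(descriptors, axis=1, keepdims=True)

    def search(self, queries: np.ndarray, k: int):
        """Return the indices and similarities of the k best matches per query."""
        q = queries / np.linalg.norm(queries, axis=1, keepdims=True)
        sims = q @ self.db.T                    # all query-reference similarities
        top = np.argsort(-sims, axis=1)[:, :k]  # k highest per row, descending
        return top, np.take_along_axis(sims, top, axis=1)

rng = np.random.default_rng(0)
index = ExactDescriptorIndex(rng.normal(size=(1000, 64)))  # 1,000 references
ids, scores = index.search(rng.normal(size=(3, 64)), k=5)  # 3 queries
```

Evaluating the descriptor track amounts to fixing the representation and letting a standardized search like this determine the matches.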

What’s next?

The approaches used in the competition provide inspiration for techniques that could be applied in production at Meta. However, this is a long journey: the winning models are often too computationally demanding to be applied directly at full scale.

We are announcing the winners of the challenge at NeurIPS 2021. Looking ahead, we are planning a video similarity challenge — the natural next step in our effort to work with others in the AI research community to root out harmful content.

Written By

Larry Anazia

Program Manager

Matthijs Douze

Research Scientist

Cristian Canton Ferrer

Research Science Manager

Zoe Papakipos

Research Engineer