August 06, 2018
In this paper we present a deployed, scalable optical character recognition (OCR) system, which we call Rosetta , designed to process images uploaded daily at Facebook scale. Sharing of image content has become one of the primary ways to communicate information among internet users within social networks such as Facebook, and the understanding of such media, including its textual information, is of paramount importance to facilitate search and recommendation applications. We present modeling techniques for efficient detection and recognition of text in images and describe Rosetta ‘s system architecture. We perform extensive evaluation of presented technologies, explain useful practical approaches to build an OCR system at scale, and provide insightful intuitions as to why and how certain components work based on the lessons learnt during the development and deployment of the system.
Written by
Fedor Borisyuk
Albert Gordo
Viswanath Sivakumar
Publisher
KDD
Research Topics
April 18, 2024
Jonas Kohler, Albert Pumarola, Edgar Schoenfeld, Artsiom Sanakoyeu, Roshan Sumbaly, Peter Vajda, Ali Thabet
April 18, 2024
March 20, 2024
Armen Avetisyan, Chris Xie, Henry Howard-Jenkins, Tsun-Yi Yang, Samir Aroudj, Suvam Patra, Fuyang Zhang, Duncan Frost, Luke Holland, Campbell Orme, Jakob Julian Engel, Edward Miller, Richard Newcombe, Vasileios Balntas
March 20, 2024
February 13, 2024
Luke Melas-Kyriazi, Iro Laina, Christian Rupprecht, Natalia Neverova, Andrea Vedaldi, Oran Gafni, Filippos Kokkinos
February 13, 2024
January 25, 2024
Felix Xu, Di Lin, Jianjun Zhao, Jianlang Chen, Lei Ma, Qing Guo, Wei Feng, Xuhong Ren
January 25, 2024
Product experiences
Foundational models
Product experiences
Latest news
Foundational models