Marcus Rohrbach

Marcus is a research scientist at Facebook AI Research. Previously, he was a postdoc at the University of California, Berkeley, in EECS and at ICSI, with Trevor Darrell (2014-2017). Marcus did his PhD at the Max Planck Institute for Informatics with Bernt Schiele (2010-2014). His interests include computer vision, computational linguistics, and machine learning, and how these areas can best work together.

Marcus's Publications

October 27, 2020

COMPUTER VISION

ML APPLICATIONS

Adversarial Continual Learning

Continual learning aims to learn new tasks without forgetting previously learned ones. We hypothesize that representations learned to solve each task in a sequence have a shared…

Sayna Ebrahimi, Franziska Meier, Roberto Calandra, Trevor Darrell, Marcus Rohrbach

October 27, 2020

RESEARCH

COMPUTER VISION

Probabilistic Neural-symbolic Models for Interpretable Visual Question Answering

We propose a new class of probabilistic neural-symbolic models that have symbolic functional programs as a latent, stochastic variable. Instantiated in the context of visual question answering, our probabilistic formulation offers two key…

Ramakrishna Vedantam, Karan Desai, Stefan Lee, Marcus Rohrbach, Dhruv Batra, Devi Parikh

October 27, 2020

RESEARCH

COMPUTER VISION

Iterative Answer Prediction with Pointer-Augmented Multimodal Transformers for TextVQA

Many visual scenes contain text that carries crucial information, and it is thus essential to understand text in images for downstream reasoning tasks. For example, a deep water label on a warning sign warns people about the danger in the…

Ronghang Hu, Amanpreet Singh, Trevor Darrell, Marcus Rohrbach

October 27, 2020

RESEARCH

COMPUTER VISION

Decoupling Representation and Classifier for Long-Tailed Recognition

The long-tail distribution of the visual world poses great challenges for deep learning based classification models on how to handle the class imbalance problem.

Bingyi Kang, Saining Xie, Marcus Rohrbach, Zhicheng Yan, Albert Gordo, Jiashi Feng, Yannis Kalantidis

October 27, 2020

RESEARCH

COMPUTER VISION

In Defense of Grid Features for Visual Question Answering

Popularized as ‘bottom-up’ attention, bounding box (or region) based visual features have recently surpassed vanilla grid-based convolutional features as the de facto standard for vision and language tasks like visual question answering (VQA).

Huaizu Jiang, Ishan Misra, Marcus Rohrbach, Erik Learned-Miller, Xinlei Chen

October 27, 2020

COMPUTER VISION

RESEARCH

12-in-1: Multi-Task Vision and Language Representation Learning

Much of vision-and-language research focuses on a small but diverse set of independent tasks and supporting datasets often studied in isolation; however, the visually grounded language understanding skills required for success at these tasks overlap significantly.…

Jiasen Lu, Vedanuj Goswami, Marcus Rohrbach, Devi Parikh, Stefan Lee

October 27, 2020

COMPUTER VISION

NLP

TextCaps: a Dataset for Image Captioning with Reading Comprehension

Image descriptions can help visually impaired people to quickly understand the image content. While we made significant progress in automatically describing images…

Oleksii Sidorov, Ronghang Hu, Marcus Rohrbach, Amanpreet Singh

October 27, 2020

COMPUTER VISION

Learning to Generate Grounded Visual Captions without Localization Supervision

When automatically generating a sentence description for an image or video, it often remains unclear how well the generated caption is grounded, that is, whether the model…

Chih-Yao Ma, Yannis Kalantidis, Ghassan AlRegib, Peter Vajda, Marcus Rohrbach, Zsolt Kira

October 27, 2020

RESEARCH

COMPUTER VISION

Selfless Sequential Learning

Sequential learning, also called lifelong learning, studies the problem of learning tasks in a sequence with access restricted to only the data of the current task. In this paper we look at a scenario with fixed model capacity, and postulate…

Rahaf Aljundi, Marcus Rohrbach, Tinne Tuytelaars

October 27, 2020

RESEARCH

NLP

CoDraw: Collaborative Drawing as a Testbed for Grounded Goal-driven Communication

In this work, we propose a goal-driven collaborative task that combines language, perception, and action. Specifically, we develop a Collaborative image-Drawing game between two agents, called CoDraw. Our game is grounded in a virtual world…

Jin-Hwa Kim, Nikita Kitaev, Xinlei Chen, Marcus Rohrbach, Byoung-Tak Zhang, Yuandong Tian, Dhruv Batra, Devi Parikh
