A new generative QA model that learns to answer the whole question

July 02, 2019

What the research is:

A new question answering (QA) model that determines the correct response by reverse-engineering the question. Current state-of-the-art models are trained discriminatively, which means they stop learning as soon as any single clue lets them predict the right answer. We instead train our model to generate the question from an answer, which teaches it to explain all the clues. For example, if shown a picture and asked what color the grass is, most current models will simply memorize the answer "green" without looking at the image. Our model, by contrast, must predict the question, which requires it to look at the image and learn to recognize grass. Given the color green, the model is more likely to generate a question about grass color than it would be if given the color blue.

How it works:

Our QA model has two learned components: a prior distribution over possible answers and a conditional language model that generates the question given the answer. To answer, the model picks the candidate answer that, weighted by its prior, makes the observed question most likely. Because the question is generated word by word, this approach allows scalable and interpretable multi-hop reasoning.
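To make the idea concrete, here is a minimal sketch of answer selection with a prior and a question generator. It is not the model's actual implementation: the probability tables, the bigram simplification, and the names ANSWER_PRIOR, BIGRAM_PROBS, question_log_likelihood, and answer_posterior are all hypothetical, chosen only to illustrate scoring each candidate answer by its prior times the word-by-word likelihood of the question.

```python
import math

# Hypothetical toy numbers, purely illustrative (not taken from the research).
# Prior over candidate answers given the context: p(answer | context).
ANSWER_PRIOR = {"green": 0.5, "blue": 0.5}

# Toy stand-in for a conditional language model:
# p(next_word | answer, previous_word). A real model would condition on the
# full question prefix and on the image or document as well.
SHARED = {
    ("<s>", "what"): 0.9,
    ("what", "color"): 0.8,
    ("color", "is"): 0.9,
    ("is", "the"): 0.9,
}
BIGRAM_PROBS = {
    "green": {**SHARED, ("the", "grass"): 0.6, ("the", "sky"): 0.05},
    "blue": {**SHARED, ("the", "grass"): 0.05, ("the", "sky"): 0.6},
}
FLOOR = 1e-6  # probability assigned to word pairs missing from the toy tables


def question_log_likelihood(question_words, answer):
    """Sum log p(word | answer, previous word) over the question."""
    total = 0.0
    prev = "<s>"
    for word in question_words:
        total += math.log(BIGRAM_PROBS[answer].get((prev, word), FLOOR))
        prev = word
    return total


def answer_posterior(question_words, prior=ANSWER_PRIOR):
    """Apply Bayes' rule: p(a | q) is proportional to p(q | a) * p(a)."""
    scores = {
        a: math.log(p) + question_log_likelihood(question_words, a)
        for a, p in prior.items()
    }
    # Normalize in log space (log-sum-exp) for numerical stability.
    top = max(scores.values())
    log_z = top + math.log(sum(math.exp(s - top) for s in scores.values()))
    return {a: math.exp(s - log_z) for a, s in scores.items()}


question = "what color is the grass".split()
posterior = answer_posterior(question)
best = max(posterior, key=posterior.get)
print(best, posterior)
```

In this toy setup, every word of the question contributes to the score, so an answer wins only if it explains the whole question, including the final word "grass." The per-word log probabilities also show which word shifted the decision.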

The image above maps word probabilities during question generation on a validation question, at the point where the model predicts the highlighted word. For instance, when predicting the word "object," the model has already written the words "Is there a rubber" and therefore knows that the next word must refer to something made of rubber. The model then looks only at rubber objects, as opposed to metal ones.

Why it matters:

This is the first model to perform well on both language understanding and question answering tasks focused on difficult reasoning. It outperforms previous work at adversarial question answering, in which documents are intentionally modified with added sentences designed to mislead the model. More specifically, our model achieves performance competitive with specialized discriminative models on key benchmarks, suggesting that it is a more general architecture for language understanding and reasoning than previous work. This behavior is important for our research not only in natural language processing but also in building more intuitive AI systems.