We are introducing a new method to improve the performance of long-form question answering (QA) systems by enabling them to search relevant text more efficiently. The method builds on Facebook AI’s work on long-form QA, a natural language processing (NLP) research task in which models must answer a natural language question, such as “What is Albert Einstein famous for?”, by using the top 100 web search results. Though the answer is typically present in those results, sequence-to-sequence (seq2seq) models struggle to analyze such a large amount of data, which requires processing hundreds of thousands of words. By compressing that text into a knowledge graph and introducing a more fine-grained attention mechanism, our technique allows models to use the entirety of the web search results to interpret relevant information.
Our method is a two-step process that condenses the available training data and then extracts the most relevant information. The first step uses an open information extraction system to analyze the results of a web search. The system identifies “triples,” each of which contains a subject, a relationship, and one or more objects within a given sentence. For example, the sentence “General relativity is a theory of gravitation” is converted to “general relativity” (the subject), “is” (the relationship), and “a theory of gravitation” (the object). The system turns those triples into a local knowledge graph, with subjects and objects represented as nodes and the relationship that links them acting as an edge. As the system analyzes additional search results, it discards irrelevant and redundant information. This approach reduces the amount of text that the model has to process by nearly 30x, from 300,000 words to 10,000, while also improving accuracy.
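To make the graph-building step concrete, here is a minimal Python sketch. It assumes the triples have already been produced by an open information extraction system (not shown), and the helper names `normalize` and `build_local_graph` are hypothetical, chosen only for illustration. It covers only the merging of redundant triples; filtering out irrelevant ones is omitted, and this is not the implementation used in the paper.

```python
from collections import defaultdict

def normalize(text):
    """Lowercase and collapse whitespace so repeated mentions map to one node."""
    return " ".join(text.lower().split())

def build_local_graph(triples):
    """Build a small graph from (subject, relationship, object) triples.

    Returns (nodes, edges): nodes maps each node to how often it was
    mentioned, and edges maps a (subject, object) pair to the set of
    relationships that link the two nodes.
    """
    nodes = defaultdict(int)
    edges = defaultdict(set)
    for subj, rel, obj in triples:
        s, r, o = normalize(subj), normalize(rel), normalize(obj)
        nodes[s] += 1
        nodes[o] += 1
        edges[(s, o)].add(r)  # a repeated triple adds no new edge
    return dict(nodes), dict(edges)

# Hypothetical triples, as if extracted from several search results.
triples = [
    ("General relativity", "is", "a theory of gravitation"),
    ("general relativity", "is", "a theory of gravitation"),  # redundant
    ("Albert Einstein", "developed", "general relativity"),
]
nodes, edges = build_local_graph(triples)
```

In this sketch the duplicated sentence collapses into a single edge, while the mention count on each node records how often it appeared, which is one simple signal a system could use to judge which facts matter most.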
That knowledge graph is then linearized into a sequence and passed to a standard seq2seq model to generate an answer to the original question. We also modify the attention mechanism of the seq2seq model so that it can focus on the most salient information rather than on every position in the sequence. This process of knowledge condensation and distillation enables the system to turn a number of facts drawn from a variety of sources into one coherent multi-sentence answer, such as “Albert Einstein made many contributions to the field of physics, including the general theory of relativity. General relativity is the theory that refines the law of universal gravitation to describe gravity as a property of space and time.”
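The two ideas in this step, linearizing the graph and attending only to salient positions, can be sketched as follows. Everything here is an assumption made for illustration: the marker tokens `<sub>`, `<rel>`, and `<obj>`, the function names, and the choice of a top-k softmax as the way to restrict attention. This post does not specify the exact linearization format or attention variant, so treat this as one plausible reading rather than the paper's mechanism.

```python
import math

def linearize(edges):
    """Flatten graph edges into one token sequence with marker tokens."""
    tokens = []
    for (subj, obj), relations in sorted(edges.items()):
        for rel in sorted(relations):
            tokens += ["<sub>"] + subj.split()
            tokens += ["<rel>"] + rel.split()
            tokens += ["<obj>"] + obj.split()
    return tokens

def top_k_attention(scores, k):
    """Softmax over the k highest scores only (k >= 1).

    Every other position receives a weight of exactly 0, so the model
    attends to a few salient positions instead of the whole sequence.
    """
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    keep = order[:k]
    peak = max(scores[i] for i in keep)  # subtract the max for stability
    exps = {i: math.exp(scores[i] - peak) for i in keep}
    total = sum(exps.values())
    return [exps[i] / total if i in exps else 0.0 for i in range(len(scores))]

# Hypothetical graph edges: (subject, object) -> set of relationships.
edges = {
    ("general relativity", "a theory of gravitation"): {"is"},
    ("albert einstein", "general relativity"): {"developed"},
}
sequence = linearize(edges)

# Hypothetical attention scores for four positions; keep the top two.
weights = top_k_attention([2.0, 0.1, 1.0, -3.0], k=2)
```

Here `sequence` is a flat list of tokens a seq2seq encoder could consume, and `weights` is nonzero only at the two highest-scoring positions, with the nonzero weights summing to 1.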
By searching the internet more efficiently and effectively to answer long-form questions, this work could lead to AI assistants whose large-scale text processing mirrors how humans derive answers or summaries from multiple sources. Such assistants could give answers that are more nuanced and useful than those current technology can produce. Our results could also make other kinds of online text, particularly long sources that mix relevant and irrelevant details, more useful for training NLP models, and could potentially lead to better reading comprehension skills for a wide range of AI systems.
The paper is below, and those attending EMNLP 2019 can learn more at conference session 8 (“Machine Learning”) on Wednesday, November 6, from 4:30 to 6:00 p.m. local time.
Using local knowledge graph construction to scale seq2seq models to multi-document inputs