Research Area

Natural Language Processing

Our team advances the state of the art in natural language understanding and generation, and deploys these systems at scale to break down language barriers, enable people to understand and communicate with anyone, and provide a safe experience, no matter what language they speak.

The opportunities and challenges of this work are immense. Billions of people use our services to connect and communicate in their preferred language, but many of these languages lack traditional NLP resources, and our systems need to be robust to the informal tone, slang, and typos often found in daily communication.

Our research spans multiple areas across NLP and machine learning, including deep learning/neural networks, machine translation, natural language understanding and generation, low-resource NLP, question answering, dialogue, and cross-lingual and cross-domain transfer learning.

Latest Publications


Cross-lingual Transfer Learning for Multilingual Task Oriented Dialog (NAACL 2019)

In this paper, we present a new multilingual intent and slot-filling dataset of around 57,000 utterances for task-oriented dialog, and evaluate the performance of different methods for cross-lingual transfer learning, including a novel method using cross-lingual contextual word representations.


Pay Less Attention with Lightweight and Dynamic Convolutions

In this paper, we show that a very lightweight convolution can perform competitively with the best reported self-attention results.


Phrase-Based & Neural Unsupervised Machine Translation (EMNLP 2018, best paper)

In this work, we propose two methods for training translation models using only large monolingual corpora in each language, achieving state-of-the-art results for both high-resource and low-resource languages.


Hierarchical Neural Story Generation (ACL 2018, best paper honorable mention)

We explore story generation: creative systems that can build coherent and fluent passages of text about a topic. Our approach is hierarchical: the model first generates a premise and then transforms it into a passage of text.


XNLI: Evaluating Cross-lingual Sentence Representations

We introduce XNLI, a dataset intended to catalyze research in cross-lingual sentence understanding by providing an informative standard evaluation task in 15 languages, including low-resource languages such as Swahili and Urdu.
