NLP

On the Idiosyncrasies of the Mandarin Chinese Classifier System

June 16, 2019

Abstract

While idiosyncrasies of the Chinese classifier system have been a richly studied topic among linguists (Adams and Conklin, 1973; Erbaugh, 1986; Lakoff, 1986), not much work has been done to quantify them with statistical methods. In this paper, we introduce an information-theoretic approach to measuring idiosyncrasy: we examine how much the uncertainty in Mandarin Chinese classifiers can be reduced by knowing semantic information about the nouns that the classifiers modify. Using the empirical distribution of classifiers from the parsed Chinese Gigaword corpus (Graff et al., 2005), we compute the mutual information (in bits) between the distribution over classifiers and distributions over other linguistic quantities. We investigate whether semantic classes of nouns and adjectives differ in how much they reduce uncertainty in classifier choice, and find that classifier choice is not fully idiosyncratic: while there are no obvious trends for the majority of semantic classes, shape nouns reduce uncertainty in classifier choice the most.
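As a rough illustration of the quantity the abstract refers to, the sketch below estimates mutual information in bits from co-occurrence counts with a simple plug-in estimator. It is not the authors' implementation: the function name `mutual_information_bits`, the toy observations, and the noun-class labels "long" and "flat" are all hypothetical, and the paper may use a different estimator or preprocessing.

```python
import math
from collections import Counter


def mutual_information_bits(pairs):
    """Plug-in estimate of I(C; N) in bits from (classifier, noun_class) pairs.

    Hypothetical helper for illustration only; not taken from the paper.
    """
    pairs = list(pairs)
    total = len(pairs)
    if total == 0:
        raise ValueError("need at least one observation")

    joint = Counter(pairs)                       # counts of (c, n)
    classifier_counts = Counter(c for c, _ in pairs)
    noun_counts = Counter(n for _, n in pairs)

    mi = 0.0
    for (c, n), count in joint.items():
        # p(c, n) * log2( p(c, n) / (p(c) * p(n)) ), written with raw counts
        ratio = count * total / (classifier_counts[c] * noun_counts[n])
        mi += (count / total) * math.log2(ratio)
    return mi


# Made-up data: the noun class fully determines the classifier.
predictive = mutual_information_bits(
    [("条", "long"), ("条", "long"), ("张", "flat"), ("张", "flat")]
)

# Made-up data: the classifier is independent of the noun class.
independent = mutual_information_bits(
    [("条", "long"), ("条", "flat"), ("张", "long"), ("张", "flat")]
)

print(predictive, independent)
```

In the first toy case, knowing the noun class removes all uncertainty about a uniformly distributed binary classifier choice, so the estimate is one bit; in the second, the noun class tells us nothing and the estimate is zero. A fully idiosyncratic classifier system would sit near the second case.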

Related Publications

August 01, 2019

NLP

Simple and Effective Curriculum Pointer-Generator Networks for Reading Comprehension over Long Narratives

Yi Tay, Shuohang Wang, Luu Anh Tuan, Jie Fu, Minh C. Phan, Xingdi Yuan, Jinfeng Rao, Siu Cheung Hui, Aston Zhang

July 27, 2019

NLP

Unsupervised Question Answering by Cloze Translation

Patrick Lewis, Ludovic Denoyer, Sebastian Riedel

September 10, 2019

NLP

Bridging the Gap Between Relevance Matching and Semantic Matching for Short Text Similarity Modeling

Jinfeng Rao, Linqing Liu, Yi Tay, Wei Yang, Peng Shi, Jimmy Lin

May 17, 2019

NLP

Unsupervised Hyper-alignment for Multilingual Word Embeddings

Jean Alaux, Edouard Grave, Marco Cuturi, Armand Joulin
