AI names colors much as humans do

March 24, 2021

What the research is:

Across the thousands of different languages spoken by humans, the way we use words to represent different colors is remarkably consistent. For example, many languages have two distinct words for red and orange, but no language has many distinct, commonly used words for different tonalities of orange. (Of course, if you visit a paint store, you’ll see dozens of esoteric names for different shades of orange. But these are rarely used in daily conversation.)

Using mathematical tools, linguistic researchers have shown that this consistency in color naming arises because humans optimize language to balance the need for accurate communication against a general biological drive toward minimizing effort. Having extra color words — cantaloupe or burnt sienna, for example — adds complexity without significantly improving how effectively people communicate with each other.

Facebook AI has now shown that cutting-edge AI systems behave similarly. When two artificial neural networks are tasked with creating a way to communicate with each other about what colors they see, they develop systems that balance complexity and accuracy much as people do.

The images on the left show two color-naming systems created entirely by neural networks. Each cluster represents a color denoted by a single word. (The color shown is the average of the RGB values of all color chips in the cluster.) The top-left panel illustrates a three-word AI naming system, while the top-right panel shows a three-word human naming system — namely that of Wobé, a language of the Niger-Congo family. The bottom-left panel illustrates a five-word AI naming system, and the bottom-right panel shows the five-word naming system of the Papuan language Bauzi.

We also found that in order for the color “language” used by these neural networks to be an optimal solution, it must use discrete symbols rather than continuous sounds. This leads to a fascinating speculation about how we communicate. Is it possible that our languages can be optimally structured only if they are made up of discrete symbols rather than, say, continuous whistling?

How it works:

We built two neural networks, a Speaker and a Listener, and tasked them with playing the “communication game” illustrated below. In each round of the game, the Speaker sees one color chip from a continuous color space and then produces a symbol (which can be considered a “word”). The Listener sees the same color chip but also a different one, known as a distractor.

This graphic shows a successful round of the communication game. Based on a given color chip, the Speaker neural network selects a word (in this case, “blap”) from its vocabulary. The Listener neural network receives the selected word, and must decide which color sample it refers to. In this figure, the Listener correctly chooses the chip in position one.

The Listener receives the word produced by the Speaker and then tries to point to the correct color chip. Initially, the Speaker produces words at random, but eventually these naturally come to denote areas of the color space. We repeated this experiment many times while varying the difficulty of the task by making target and distractor chips more similar or less so. These variations produced a number of different color-naming “vocabularies.”
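
To make the setup concrete, here is a minimal sketch of such a Speaker/Listener game. This is not the code used in the paper: it assumes colors are represented as RGB vectors, a small fixed vocabulary, and a discrete channel trained with the straight-through Gumbel-Softmax estimator, and all names (Speaker, Listener, V, and so on) are purely illustrative.

```python
# Toy Speaker/Listener color game -- an illustrative sketch, not the authors' code.
import torch
import torch.nn as nn
import torch.nn.functional as F

V, HID, BATCH = 5, 32, 64   # vocabulary size, hidden size, batch size (all arbitrary)

class Speaker(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(3, HID), nn.ReLU(), nn.Linear(HID, V))
    def forward(self, target, tau=1.0):
        # Map the target RGB chip to a differentiable one-hot "word"
        return F.gumbel_softmax(self.net(target), tau=tau, hard=True)

class Listener(nn.Module):
    def __init__(self):
        super().__init__()
        self.word_emb = nn.Linear(V, HID)
        self.chip_emb = nn.Linear(3, HID)
    def forward(self, word, chips):
        # Score each candidate chip against the received word
        w = self.word_emb(word).unsqueeze(1)   # (batch, 1, HID)
        c = self.chip_emb(chips)               # (batch, 2, HID)
        return (w * c).sum(-1)                 # (batch, 2) logits

speaker, listener = Speaker(), Listener()
opt = torch.optim.Adam(list(speaker.parameters()) + list(listener.parameters()), lr=1e-3)

for step in range(5000):
    target = torch.rand(BATCH, 3)              # target color chips
    distractor = torch.rand(BATCH, 3)          # distractor chips
    pos = torch.randint(0, 2, (BATCH,))        # position of the target chip (0 or 1)
    chips = torch.stack([target, distractor], dim=1)
    chips[pos == 1] = chips[pos == 1].flip(1)  # move the target to slot 1 when pos == 1
    word = speaker(target)
    loss = F.cross_entropy(listener(word, chips), pos)  # reward pointing at the target
    opt.zero_grad(); loss.backward(); opt.step()
```

Varying how close to the target the distractor is drawn (rather than sampling it uniformly, as above) is one simple way to implement the difficulty manipulation described in the paragraph above.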

At the end of training, we analyzed these vocabularies and consistently found that the neural networks developed color terms with properties similar to those of human languages. In particular, when organizing the resulting systems according to quantitative measures of complexity and accuracy—as we do in the chart below—we find that the distribution of the neural-network languages is virtually identical to that of real human languages. Moreover, both types of languages are near the boundary that formally defines the set of possible optimal balances between complexity and accuracy (the black line in the figure).

This chart shows color-naming systems created by human languages (shown in blue) and by neural networks (shown in orange). The black curve defines the theoretical limit on accuracy given complexity, as computed in “Efficient compression in color naming and its evolution.” Both human languages and neural network color-naming systems achieve near-optimal efficiency.
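
For readers who want a feel for the axes in the chart, the snippet below computes one standard complexity measure: the mutual information, in bits, between color chips and the words assigned to them. It is a toy illustration in the spirit of the information-theoretic framework used in the paper, not the paper’s exact computation, and the function and variable names are hypothetical.

```python
# Toy complexity measure: mutual information I(Chip; Word) in bits,
# estimated from a naming distribution p(word | chip) and a prior p(chip).
import numpy as np

def complexity_bits(p_word_given_chip, p_chip):
    """p_word_given_chip: (n_chips, n_words), rows sum to 1.
    p_chip: (n_chips,), sums to 1."""
    p_joint = p_chip[:, None] * p_word_given_chip        # joint p(chip, word)
    p_word = p_joint.sum(axis=0, keepdims=True)          # marginal p(word)
    with np.errstate(divide="ignore", invalid="ignore"):
        ratio = np.where(p_joint > 0, p_joint / (p_chip[:, None] * p_word), 1.0)
    return float((p_joint * np.log2(ratio)).sum())

# Example: 4 chips split deterministically into 2 words costs exactly 1 bit.
p_w_given_c = np.array([[1, 0], [1, 0], [0, 1], [0, 1]], dtype=float)
p_c = np.full(4, 0.25)
print(complexity_bits(p_w_given_c, p_c))   # -> 1.0
```

A naming system that lumps many chips under one word has low complexity but limits how precisely the Listener can be told which chip is meant; one that gives nearly every chip its own word is highly complex. The black curve in the chart traces the best accuracy achievable at each level of complexity.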

In further experiments, we removed various components of the simulation. We found that, crucially, when we allowed neural networks to communicate through continuous symbols instead of discrete ones, the optimal trade-off between complexity and accuracy no longer emerged. The networks still succeeded at the communication game, but their systems became highly inefficient.
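
In terms of the toy game sketched earlier, the discrete-versus-continuous manipulation amounts to changing only what the Speaker is allowed to transmit. The version below is again an illustrative sketch, not the paper’s implementation:

```python
# Illustrative only: the single change that turns the discrete channel above
# into a continuous one (reuses torch and F from the earlier sketch).
def speaker_message(logits, discrete=True, tau=1.0):
    if discrete:
        # Hard one-hot: the Listener only learns *which* symbol was chosen.
        return F.gumbel_softmax(logits, tau=tau, hard=True)
    # Continuous: the Listener sees the full real-valued vector.
    return torch.softmax(logits, dim=-1)
```

One intuition for the result above is that a continuous message can carry arbitrarily fine-grained information about the exact chip, so nothing pushes the agents toward a small, shared inventory of color words.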

Why it matters:

Language is perhaps humanity’s most distinctive feature, but we still have a poor understanding of many of its core properties. Our study shows that advanced AI models, such as the ones developed at Facebook, are useful not only for practical applications, but also as experimental tools to answer scientific questions about human language (and cognition in general).

Recent research in linguistics and cognitive science points to the fact that language is a highly efficient system—but how did it evolve to that state, and why? By studying and dissecting computational models that mimic natural behavior, as in our research, we can shed light on the precise conditions under which such an efficient communication system is likely to arise in nature.

The results are also exciting from the perspective of building AI systems that can communicate with us through natural language, as they show that neural networks trained to collaborate on a common task can develop communication systems that share core properties of human language.

Read the full paper:

Communicating artificial neural networks develop efficient color-naming systems

Written By

Marco Baroni

Research Scientist

Rahma Chaabouni

Research Assistant

Evgeny Kharitonov

Research Engineer

Emmanuel Dupoux

Research Scientist