Introducing the Habitat-Matterport 3D research data set for training embodied AI

June 30, 2021


AI models in computer vision and natural language processing are typically trained with text, images, audio, and video curated from the internet. But embodied AI, the study and development of intelligent systems with a physical or virtual embodiment (robots and egocentric personal assistants), has different needs.

Imagine walking up to a home robot and asking, “Hey, robot, can you go check if my laptop is on my desk, and if so, bring it to me?” Or asking an AI assistant operating on your AR glasses: “Hey, where did I last see my keys?” These tasks require AI to understand and interact with the physical world much as people do: recognizing different objects from any viewpoint, distinguishing between a countertop and a desk, and much more. In order to develop such robots and egocentric personal assistants safely and at scale, we have to train them in rich, realistic 3D spaces in simulation.

Unfortunately, such 3D data is scarce. While 2D image data sets have grown over the years to include billions of images, data sets of 3D spaces are limited to dozens of buildings. As a result, progress in embodied AI has been stunted.

Today, Facebook AI, in collaboration with Matterport, is making available the largest-ever data set of indoor 3D scans, released under a Matterport open source license, to benefit the research community. The Habitat-Matterport 3D Research Dataset (HM3D) is a collection of 1,000 Habitat-compatible 3D scans made up of accurately scaled residential spaces such as apartments, multifamily housing, and single-family homes, as well as commercial spaces like office buildings and retail stores.

We believe that HM3D will play a significant role in advancing research in embodied AI. With this data set, embodied AI agents like home robots and AI assistants can be trained to understand the complexities of real-world environments, recognizing objects, rooms, and spaces, or learning how to navigate and follow instructions — all in contexts that are vastly different from one another. To carry out complex tasks like finding misplaced objects or retrieving physical ones, an embodied AI agent needs to construct maps and episodic memory representations (to recall what it already has observed), understand speech and audio cues, and exhibit sophisticated motor control if it has to go up and down stairs.

Facebook AI is partnering with Matterport to provide access to Matterport’s open source HM3D data set out of a shared interest in addressing a critical need in embodied AI: data to spur greater innovation. Matterport is the spatial data company leading the digital transformation of the built world. Its all-in-one 3D data platform has transformed millions of physical spaces into immersive digital twins. AI Habitat is Facebook AI’s state-of-the-art open simulation platform for training embodied agents for a range of tasks. Its faster-than-real-time speed allows researchers to run large numbers of repeated trials easily and efficiently. Thanks to our partnership with Matterport, researchers can now gain the critical scale required to train robots and other intelligent systems.

In the future, we hope to expand the data set to include scans from many more countries; augment the data set with semantic annotations so that we can take on high-level understanding tasks like object retrieval; and study dynamic and changing environments so that our simulations will be fluid rather than static. This would bring simulated training environments closer to the real world, where people and pets freely move around and where everyday objects such as mobile phones, wallets, and shoes are not always in the same spot throughout the day.

We believe that advancements in embodied AI could help developers build and train assistants with deep contextual understanding and give them the ability to navigate the world around them. One day, embodied agents will unlock entirely new experiences that improve the quality of life for the people who use them — on any device, no matter where they are.

HM3D is free and available now for academic, non-commercial research. Access it here:

We’d like to acknowledge the contributions of Aaron Gokaslan, Alexander William Clegg, Erik Wijmans, John Turner, Oleksandr Maksymets, Wojtek Galuba, Yili Zhao, Eric Undersander, Santhosh Kumar Ramakrishnan, and Jitendra Malik, as well as many partners at UT-Austin, Simon Fraser University, and Georgia Tech, in helping build HM3D.

Written By

Research Scientist

Andrew Westbury

Research Program Manager