Habitat 2.0: Training home assistant robots with faster simulation and new benchmarks

June 30, 2021

One of the most powerful ways to train robots to accomplish useful tasks in the real world is to teach them in simulation. Exploring virtual worlds allows AI agents to practice a task thousands or even millions of times faster than they could in a real physical space. Today, we are announcing Habitat 2.0, a next-generation simulation platform that lets AI researchers teach machines not only to navigate through photo-realistic 3D virtual environments but also to interact with objects just as they would in an actual kitchen, dining room, or other commonly used space.


Habitat 2.0 builds on our original open source release of AI Habitat with much faster speeds as well as interactivity, so AI agents can easily perform the equivalent of many years of real-world actions — a billion or more frames of experience — such as picking items up, opening and closing drawers and doors, and much more. We believe Habitat 2.0 is the fastest publicly available simulator of its kind.

Habitat 2.0 also includes a new fully interactive 3D data set of indoor spaces and new benchmarks for training virtual robots in these complex physics-enabled scenarios. With this new data set and platform, AI researchers can go beyond just building virtual agents in static 3D environments and move closer to creating robots that can easily and reliably perform useful tasks like stocking the fridge, loading the dishwasher, or fetching objects on command and returning them to their usual place.

Introducing ReplicaCAD: Interactive Digital Twins of Real Spaces

Habitat 2.0’s new data set, ReplicaCAD, is a mirror of Replica, Facebook Reality Labs’ previously released data set of 3D environments, now rebuilt to support the movement and manipulation of objects. In ReplicaCAD, previously static 3D scans have been converted into individual 3D models with physical parameters, collision proxy shapes, and semantic annotations, enabling training for movement and manipulation for the first time. To create the new data set, we worked with a team of 3D artists to faithfully recreate the spaces in Replica, with full attention to each object’s material composition, geometry, and texture. The interactive recreations also incorporate information about size and friction, whether an object (such as a refrigerator or door) has compartments that can open or close, and how those mechanisms work, among other considerations.
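
To make this concrete, here is a minimal sketch of the kind of per-object description a physics-enabled data set like ReplicaCAD requires. The field names loosely follow habitat-sim’s object configuration convention, but the paths and values below are purely illustrative, not actual ReplicaCAD entries.

```python
import json

# Illustrative only: a physics-enabled object description pairing a render mesh
# with a simplified collision proxy and physical/semantic metadata.
fridge_config = {
    "render_asset": "objects/fridge.glb",        # full-detail mesh used for rendering
    "collision_asset": "objects/fridge_cv.glb",  # simplified collision proxy shape
    "mass": 80.0,                                # kilograms, used by the physics engine
    "friction_coefficient": 0.5,
    "semantic_id": 12,                           # ties the object to a semantic category
}

with open("fridge.object_config.json", "w") as f:
    json.dump(fridge_config, f, indent=2)
```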

We also created a number of plausible variations of each scene and developed a pipeline for introducing realistic clutter such as kitchen utensils, books, and furniture. ReplicaCAD features 111 unique layouts of a single living space and 92 artist-created objects. It was created with the consent of, and compensation to, the artists, and will be shared under a Creative Commons license for non-commercial use with attribution (CC BY-NC). Rich indoor layouts like those in ReplicaCAD enable researchers to build and train agents capable of entirely new interactive household tasks.

Habitat 2.0 Simulator: Faster speeds open up more research possibilities

Habitat 2.0 builds on and expands the capabilities of the Habitat-Sim simulation engine by supporting piecewise-rigid objects such as cabinets and drawers that can rotate on an axis or slide; articulated robots that include mobile manipulators like Fetch, fixed-base arms like Franka, and quadrupeds like AlienGo; and rigid-body physics (via Bullet physics engine).
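
As a rough illustration of how these pieces fit together, the sketch below loads a physics-enabled scene and adds an articulated robot from a URDF file through habitat-sim’s Python API. The file paths are placeholders and exact API details can differ between habitat-sim versions, so treat this as a sketch rather than a verbatim recipe.

```python
import habitat_sim

# Configure the backend with a ReplicaCAD-style scene and enable rigid-body physics.
# Paths are placeholders for wherever the assets live on disk.
backend_cfg = habitat_sim.SimulatorConfiguration()
backend_cfg.scene_dataset_config_file = "data/replica_cad/replicaCAD.scene_dataset_config.json"
backend_cfg.scene_id = "apt_0"
backend_cfg.enable_physics = True

# A single agent with an RGB camera sensor.
rgb_spec = habitat_sim.CameraSensorSpec()
rgb_spec.uuid = "rgb"
rgb_spec.sensor_type = habitat_sim.SensorType.COLOR
agent_cfg = habitat_sim.agent.AgentConfiguration(sensor_specifications=[rgb_spec])

sim = habitat_sim.Simulator(habitat_sim.Configuration(backend_cfg, [agent_cfg]))

# Add an articulated mobile manipulator (e.g., Fetch) from its URDF description.
art_obj_mgr = sim.get_articulated_object_manager()
robot = art_obj_mgr.add_articulated_object_from_urdf("data/robots/fetch.urdf")

# Step the physics simulation and pull sensor observations.
sim.step_physics(1.0 / 30.0)
observations = sim.get_sensor_observations()
```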

In building Habitat 2.0, we prioritized speed and performance over a wider range of simulation capabilities, because this allows the research community to test new approaches and iterate more effectively. For instance, rather than simulating wheel-ground contact, we use a navigation mesh to move the robot. The platform also does not currently support non-rigid dynamics (deformables, liquids, films, cloths, and ropes) or audio and tactile sensing. This streamlined focus makes the Habitat 2.0 simulator two orders of magnitude faster than most 3D simulators available to academics and industry professionals.
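
To show what the navigation-mesh shortcut looks like in practice, here is a small sketch that plans waypoints for the robot base with habitat-sim’s pathfinder instead of simulating wheels; the exact calls may vary between versions, and the usage comments assume a simulator with a navmesh already loaded.

```python
import habitat_sim

def plan_base_path(sim: habitat_sim.Simulator, start, goal):
    """Plan waypoints for the robot base on the navmesh instead of simulating wheels."""
    path = habitat_sim.nav.ShortestPath()
    path.requested_start = start
    path.requested_end = goal
    found = sim.pathfinder.find_path(path)  # geodesic query against the navmesh
    return list(path.points) if found else []

# Usage (assuming `sim` has a navmesh loaded and `robot_base_position` is known):
# goal = sim.pathfinder.get_random_navigable_point()
# waypoints = plan_base_path(sim, robot_base_position, goal)
# The base is then moved along these waypoints directly, with no wheel-ground contact.
```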

Habitat 2.0’s speed and enhanced capabilities mean that researchers can now use the platform to train agents on complex tasks that were impractical at slower speeds, such as cleaning up the kitchen or setting the table. The platform can simulate a Fetch robot interacting in ReplicaCAD scenes at 1,200 steps per second (SPS), while existing platforms typically run at 10 to 400 SPS. Habitat 2.0 also scales well, achieving 8,200 SPS (273× real time) multi-process on a single GPU and nearly 26,000 SPS (850× real time) on a single node with 8 GPUs. Such speeds significantly cut down experimentation time, allowing researchers to complete experiments that would typically take over 6 months in as little as 2 days. Shorter experiment cycles let researchers try new ideas earlier and more often, run a far greater number of experiments, and pave the way for more advances in the field.
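
As a quick sanity check on these figures, the snippet below converts the reported steps per second into real-time multiples, assuming the simulator advances simulated time at roughly 30 steps per second (so 30 SPS corresponds to real time). That 30 Hz reference is our assumption, implied by rather than stated in the numbers above.

```python
# Rough arithmetic on the reported throughput, assuming 30 SPS == real time.
REAL_TIME_SPS = 30

reported = {
    "single process (Fetch in ReplicaCAD)": 1_200,
    "multi-process, single GPU": 8_200,
    "single node, 8 GPUs": 26_000,
}

for setup, sps in reported.items():
    print(f"{setup}: {sps:,} SPS ~ {sps / REAL_TIME_SPS:.0f}x real time")
# -> 8,200 SPS ~ 273x real time; 26,000 SPS ~ 867x (the post rounds this to ~850x)

# "6 months of experiments in about 2 days" corresponds to roughly a 90x speedup:
print(f"~{6 * 30 / 2:.0f}x end-to-end speedup")
```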

The Habitat 2.0 simulator is open-sourced under the MIT license.

Home Assistant Benchmark: New milestones for home assistant training tasks

The ReplicaCAD data set and Habitat 2.0 simulator make it possible to create a new library of household assistive tasks called the Home Assistant Benchmark (HAB). HAB spans high-level household chores, such as setting the table, cleaning the fridge, and cleaning the house; the individual robot skills they are built from, such as navigating, picking, placing, opening a cabinet drawer, and opening the fridge door; and agent configurations for common household errands. HAB requires that robots assume no prior knowledge about the environment (so they can handle new environments and radical changes to known ones) and operate exclusively from onboard sensing, such as RGB-D cameras, egomotion, and joint-position sensors.
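
To illustrate what operating exclusively from onboard sensors means in practice, here is a hypothetical sketch of the observation an HAB-style agent would see at each step, together with the skills named above. The class and field names are ours, not the actual HAB specification.

```python
from dataclasses import dataclass
import numpy as np

@dataclass
class OnboardObservation:
    """Everything the agent is allowed to see each step: no maps, no ground-truth object poses."""
    rgb: np.ndarray              # (H, W, 3) uint8 head-camera image
    depth: np.ndarray            # (H, W, 1) float32 depth image
    egomotion: np.ndarray        # (4, 4) relative base transform since the previous step
    joint_positions: np.ndarray  # (num_joints,) current arm and gripper joint angles

# The individual skills that HAB tasks are composed of:
SKILLS = ["navigate", "pick", "place", "open_cabinet_drawer", "open_fridge_door"]
```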

What's next?

In the future, Habitat will seek to model living spaces in more places around the world, enabling more varied training that takes into account cultural- and region-specific layouts of furniture, types of furniture, and types of objects. We acknowledge these representational challenges and are working to improve the diversity and geographic inclusion of the 3D environments currently available for research. Also, while Habitat 2.0 is a fast simulator, we are working to speed it up even more by addressing potential bottlenecks, such as its handling of synchronized parallel environments and need to reload assets when an episode resets. Holistically reorganizing the rendering+physics+reinforcement learning interplay would be an exciting direction for future work.

Our experiments suggest that complex, multi-step tasks such as setting the table or taking out the trash remain highly challenging. Although we were able to train individual skills (pick, place, navigate, open a drawer, and so on) with large-scale model-free reinforcement learning to reasonable degrees of success, training a single agent that can accomplish all of these skills and chain them without cascading errors remains an open challenge. We believe that HAB presents a research agenda for interactive embodied AI for years to come.
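
The sketch below, under our own simplifying assumptions (it is not the paper’s training setup and the environment interface is hypothetical), shows why naive chaining is fragile: each skill runs as a separate policy, and whatever state error one skill leaves behind becomes the starting state of the next, where it can compound.

```python
from typing import Callable, Dict, List

Policy = Callable[[dict], dict]  # maps an observation dict to an action dict

def run_chained_task(env, skills: Dict[str, Policy], plan: List[str], max_steps: int = 500) -> bool:
    """Run skill policies back to back, e.g. plan = ["navigate", "pick", "navigate", "place"]."""
    obs = env.reset()
    for name in plan:
        policy = skills[name]
        for _ in range(max_steps):
            obs, skill_done = env.step(policy(obs))  # hypothetical environment interface
            if skill_done:
                break
        # No recovery step here: any drift or failure left by this skill is
        # inherited by the next one, which never saw such states during training.
    return env.task_success()
```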

We hope that the reconstructed ReplicaCAD data set, significantly improved speeds, and the new Home Assistant Benchmark in Habitat 2.0 will empower other research teams to train next-generation AI agents with even greater success. Although some of our experiments highlighted open challenges, we know that the new capabilities of Habitat 2.0 will give more researchers tools to begin tackling them. We share this work for all to use, aiming for a collaborative, open, and responsible approach to advancing the state of the art in embodied AI. We hope that the ability to perform more complex tasks in simulation will bring us closer to AI that can help make our everyday lives easier and better.

Read the full paper

ReplicaCAD, the Habitat 2.0 simulation platform, and HAB represent the work of a multi-disciplinary team of researchers with expertise in computer graphics, rendering, physics simulation, video game design, 3D art, computer vision, robotics, and deep learning — from Facebook AI Research, Georgia Tech, Intel Labs, Simon Fraser University, and the University of California, Berkeley.
