COMPUTER VISION

Single-Network Whole-Body Pose Estimation

October 27, 2019

Abstract

We present the first single-network approach for 2D whole-body pose estimation, which entails simultaneous localization of body, face, hand, and foot keypoints. Because of its bottom-up formulation, our method maintains constant real-time performance regardless of the number of people in the image. The network is trained in a single stage using multi-task learning, through an improved architecture that can handle the scale differences between body/foot and face/hand keypoints. Our approach considerably improves upon OpenPose [9], the only work so far capable of whole-body pose estimation, in terms of both speed and global accuracy. Unlike [9], our method does not need to run an additional network for each hand and face candidate, making it substantially faster in multi-person scenarios. This work directly reduces the computational complexity of applications that require 2D whole-body information (e.g., VR/AR, re-targeting). In addition, it yields higher accuracy, especially for occluded, blurry, and low-resolution faces and hands. For code, trained models, and validation benchmarks, visit our project page.
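
The speed argument in the abstract can be illustrated with a rough count of network forward passes per image. The sketch below is not taken from the paper: the function names are hypothetical, and it assumes a multi-stage pipeline that runs one body pass plus one extra pass for every hand and face candidate, with two hands and one face per person. It counts network runs only and ignores keypoint-assembly cost and actual timings.

```python
def forward_passes_multistage(num_people, hands_per_person=2, faces_per_person=1):
    """Network runs for a pipeline that processes each hand/face candidate separately.

    Assumption (not from the paper): one body pass per image, then one
    additional pass per hand candidate and per face candidate.
    """
    body_pass = 1
    return body_pass + num_people * (hands_per_person + faces_per_person)


def forward_passes_single_network(num_people):
    """Network runs for a single-network bottom-up approach: one per image."""
    return 1


for n in (1, 5, 20):
    print(n, forward_passes_multistage(n), forward_passes_single_network(n))
# prints:
# 1 4 1
# 5 16 1
# 20 61 1
```

Under these assumptions the multi-stage count grows linearly with the number of people while the single-network count stays at one, which is the behavior the abstract describes for multi-person scenes.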

Related Publications

May 17, 2019

COMPUTER VISION

SPEECH & AUDIO

GLoMo: Unsupervised Learning of Transferable Relational Graphs

Modern deep transfer learning approaches have mainly focused on learning generic feature vectors from one task that are transferable to other tasks, such as word embeddings in language and pretrained convolutional features in vision. However,…

Zhilin Yang, Jake (Junbo) Zhao, Bhuwan Dhingra, Kaiming He, William W. Cohen, Ruslan Salakhutdinov, Yann LeCun

May 06, 2019

COMPUTER VISION

NLP

No Training Required: Exploring Random Encoders for Sentence Classification

We explore various methods for computing sentence representations from pre-trained word embeddings without any training, i.e., using nothing but random parameterizations. Our aim is to put sentence embeddings on more solid footing by 1) looking…

John Wieting, Douwe Kiela

May 06, 2019

NLP

COMPUTER VISION

Efficient Lifelong Learning with A-GEM

In lifelong learning, the learner is presented with a sequence of tasks, incrementally building a data-driven prior which may be leveraged to speed up learning of a new task. In this work, we investigate the efficiency of current lifelong…

Arslan Chaudhry, Marc'Aurelio Ranzato, Marcus Rohrbach, Mohamed Elhoseiny

May 06, 2019

COMPUTER VISION

Learning Exploration Policies for Navigation

Numerous past works have tackled the problem of task-driven navigation. But how to effectively explore a new environment to enable a variety of downstream tasks has received much less attention. In this work, we study how agents can…

Tao Chen, Saurabh Gupta, Abhinav Gupta
