May 02, 2018
Artificial intelligence has the potential to transform lives, creating new digital experiences while safeguarding our existing online interactions. Balancing rapid and responsible progress requires foundational investment, as well as a strategic vision for how to build and use this evolving technology.
On Day 2 of F8, Facebook’s annual developer conference, we detailed the company’s approach to advancing AI through our open development frameworks and collaborative relationship with the wider AI community. Our keynote speakers, including Facebook’s Chief Technology Officer Mike Schroepfer, also unveiled cutting-edge research in computer vision, natural language understanding, and reinforcement learning, and shared our commitment to the ethical development and deployment of AI.
The path for taking AI from research to production has historically involved paid licenses, multiple development tools, and other hurdles. Closed and scattered development frameworks can make it complicated and time-intensive to test new approaches, deploy them, and iterate to improve accuracy and performance. To help accelerate and optimize this process, we are announcing our plan for the next version of our open source AI framework, PyTorch 1.0, which will be available in beta in the coming months. With PyTorch 1.0, developers will be able to experiment rapidly in a flexible, immediate execution mode, and then seamlessly transition to a highly optimizable, graph-based mode for deployment. The PyTorch framework has quickly become one of the most popular frameworks for AI researchers. Since we released the original version (0.1.6) just over a year ago, it has taken off on GitHub, and it was the second-most cited framework in papers at ICLR.
We also announced the expansion of ONNX (Open Neural Network Exchange), an open format for representing deep learning models, to allow AI engineers to more easily move models between frameworks without having to do resource-intensive custom engineering. ONNX has been open source since it was released in 2017, and it now supports additional tools, including a production-ready Core ML converter, Baidu’s PaddlePaddle platform, and Qualcomm SNPE, in addition to Amazon Web Services’ Apache MXNet; Facebook’s Caffe2, PyTorch, and PyTorch 1.0; Google’s TensorFlow; and Microsoft’s Cognitive Toolkit. ONNX is tightly woven into PyTorch 1.0.
In addition to these broadly applicable development frameworks, Facebook is releasing resources for more specific uses of AI. Our ResNext3D model, with state-of-the-art accuracy and efficiency for video understanding, will be available in June (a collection of other video models, including Res 2+1 D, is being released today). We’re also open-sourcing Translate — a PyTorch language library — for fast, flexible machine translations that scale, and sharing our early work on a project called Multilingual Unsupervised and Supervised Embeddings (MUSE), to help expand the number of languages available for translation.
All of these open-source frameworks and models will be available on Facebook.ai. This new site — launched today — is intended to help researchers and engineers outside of Facebook take their AI work from research to production more quickly, employing the exact same tools that we use to serve 2 billion people. By sharing this growing suite of open source tools, we’re underscoring our commitment to work collaboratively with the developer community to build meaningful new products and services.
Facebook’s AI researchers pursue a wide range of topics and questions, but at F8 we showcased recent work in computer vision, natural language understanding, and reinforcement learning. In a collaborative effort by the Applied Machine Learning (AML) and Facebook AI Research (FAIR) teams, we trained a new image recognition model on an unprecedented 3.5 billion publicly available photos, using their hashtags as training labels. Because hashtags are noisy and imprecise labels, the researchers had to create a new technique that makes them useful for AI training. Using a 1-billion-image version of this dataset enabled our model to achieve the highest score ever, 85.4 percent accuracy, on the widely used ImageNet image recognition benchmark. This performance reveals the long-term potential not only of improving image recognition by training on larger datasets, but also of using existing labels (rather than annotations applied specifically for the purposes of AI training). We plan to open-source the embeddings of these models in the near future.
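To give a sense of what turning noisy hashtags into training labels can involve, here is a minimal sketch in plain Python. It assumes one plausible recipe: normalize each tag, merge synonyms into a canonical label, drop tags outside a fixed vocabulary, and spread the target probability evenly over the labels that remain. The vocabulary, the synonym table, and the function name are hypothetical, and this is not necessarily the technique the researchers developed.

```python
# Hypothetical label vocabulary and synonym table, for illustration only.
VOCAB = ["dog", "cat", "beach"]
SYNONYMS = {"puppy": "dog", "doggo": "dog", "kitten": "cat"}


def hashtags_to_target(hashtags):
    """Turn a photo's raw hashtags into a soft target vector over VOCAB.

    Tags are stripped of '#', lowercased, mapped through SYNONYMS,
    deduplicated, and dropped if they are not in VOCAB. Each surviving
    label receives probability 1/k, where k is the number of survivors.
    Returns None when no usable label remains.
    """
    canonical = set()
    for tag in hashtags:
        tag = tag.lstrip("#").lower()
        tag = SYNONYMS.get(tag, tag)
        if tag in VOCAB:
            canonical.add(tag)
    if not canonical:
        return None  # No usable label: the image would be skipped.
    weight = 1.0 / len(canonical)
    return [weight if label in canonical else 0.0 for label in VOCAB]


# "#Puppy" and "#dog" collapse into one label; "#tbt" is ignored.
target = hashtags_to_target(["#Puppy", "#dog", "#beach", "#tbt"])
```

A target built this way can feed a standard cross-entropy loss, so an image tagged with several relevant hashtags contributes to each label without any one tag being treated as the single correct answer.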
In the field of 3D image mapping, Facebook researchers used the PyTorch toolkit to generate full 3D surfaces that can be applied, in real time, to footage of human bodies in motion. The new tool, which we call DensePose, can enable compelling new augmented reality effects, such as transferring a texture onto a moving body. More important, this foundational research points to greater understanding of scenes for computer vision systems. We hope to publish this work as an open source library in the coming weeks.
To promote the development of more useful autonomous agents, whether virtual assistants or robotic systems, FAIR has collaborated with researchers at Georgia Tech to develop a new, multistage AI task, called EmbodiedQA, that pushes the limits of reinforcement learning and natural language understanding. The team created virtual agents that must learn how to navigate computer-generated indoor spaces and how to understand and use natural language in order to answer questions about their environment. FAIR also built the collection of virtual environments used in this research, called House3D, which allows agents to train dramatically faster than a physical robot would in physical spaces. We’ve open-sourced House3D and will be releasing the data and model related to EmbodiedQA in the near future, to help the entire AI research community make faster progress toward creating smarter, more genuinely autonomous intelligent assistants.
FAIR is exploring other learning environments for agents, with two projects that focus on systems that play games. Our StarCraft bot uses reinforcement learning to operate in the cluttered, complex, and partially observable environments of the popular real-time strategy video game StarCraft. We’ve developed a state-of-the-art rule set as a baseline for the bot, which we plan to open-source for use in other StarCraft-related AI research. And while DeepMind’s AlphaGo system has shown impressive performance in beating humans at the game Go and teaching itself without human supervision, the model itself remains a closely held secret. To make this work both reproducible and available to AI researchers around the world, we created an open source Go bot, called ELF OpenGo, that performs well enough to answer some of the key questions unanswered by AlphaGo. Despite a relatively limited amount of training and no supervision, ELF OpenGo has achieved professional status and a 14-0 win record against four different top-30 human Go players.
Our primary goal with these bots, and with most of our research, is to create a foundation for future work by the larger AI community. That’s why we open-source the datasets and models for as much of our research as possible, including the tools and datasets used to create our StarCraft bot (which will be available soon) and both the code and a pretrained model for ELF OpenGo (available now).
As we work to shape the future of AI with our research and our open development tools, it’s important to apply questions of ethics, fairness, and trust to every effort in this field. That includes using existing AI techniques in a responsible way, such as in the blood donations feature that we released last year. More than 8 million people in India, Bangladesh, and Pakistan have chosen to use the feature to identify themselves as blood donors, in order to better connect with people and organizations requesting blood.
But this issue is bigger than a single project, so we shared Facebook’s perspective on the ethical use of AI. Our efforts to eliminate algorithmic bias and irresponsible AI deployment include careful consideration of the code we generate as well as the people we hire to write, manage, and approve those AI systems. Promoting a diverse workforce might prevent some problems in the planning stages of a project by bringing more points of view to the table. Other problems could be caught during development, by our research and privacy reviews. And even after production, we can search for bias with internal tools, such as Fairness Flow, which measures how an algorithm interacts with specific groups of people.
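As a rough illustration of what measuring how an algorithm interacts with specific groups of people can mean in practice, the sketch below compares a classifier's false positive rate across groups. It is not Fairness Flow, whose internals are not described here; the function name, the choice of metric, and the toy data are all assumptions made for the example.

```python
from collections import defaultdict


def false_positive_rate_by_group(records):
    """Compute the false positive rate separately for each group.

    `records` is an iterable of (group, true_label, predicted_label)
    tuples with boolean labels. Groups with no negative examples are
    omitted, because their rate is undefined.
    """
    false_pos = defaultdict(int)
    negatives = defaultdict(int)
    for group, truth, predicted in records:
        if not truth:
            negatives[group] += 1
            if predicted:
                false_pos[group] += 1
    return {group: false_pos[group] / count for group, count in negatives.items()}


# Hypothetical toy data: (group, true label, model prediction).
records = [
    ("A", False, False), ("A", False, True), ("A", True, True), ("A", False, False),
    ("B", False, True), ("B", False, True), ("B", True, True), ("B", False, False),
]
rates = false_positive_rate_by_group(records)

# A large gap between groups is a signal worth investigating.
gap = max(rates.values()) - min(rates.values())
```

In this toy data the model wrongly flags group B twice as often as group A. A real review would look at several metrics and much larger samples before drawing conclusions.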
As AI evolves, so will the ethical concerns it raises. So while Facebook continues to push for the open development and research of this technology, we are also investing in people, processes, and tools to help ensure that we are maximizing the positive impacts of these AI systems and minimizing the negative ones.