Data-Centric AI Infra 2.0

July 06, 2022

Abstract

In this position paper for the DataPerf 2022 workshop at ICML 2022 (paper 0891), we share our considerations for a vision of end-to-end Data-Centric AI infrastructure for implementing Artificial Intelligence (AI). AI models are trained and evaluated on datasets that undergo various changes over their lifecycle (privacy, drift, errors, transformations, etc.). Data-Centric AI Infra helps practitioners understand and iterate on the datasets behind their ML models. By adopting Data-Centric AI infrastructure, our customers could improve model performance through faster, more resource-efficient access to AI data. We hope to connect the scientific community with the AI data problems faced in a real production environment at exabyte scale.
