Lightning Talk in Workshop: Data Centric AI
YMIR: A Rapid Data Development Platform for Long-tailed Vision Applications
This paper introduces an open-source platform for rapid development of long-tailed computer vision applications. The platform puts efficient dataset development at the center of the machine learning development process, integrates active learning methods with data and model version control, and uses concepts such as projects to enable fast, parallel iteration on multiple task-specific datasets. We make it an open platform by abstracting the development process into core states and operations, and by designing open APIs that let third-party tools serve as implementations of those operations. This open design reduces our own development cost and also lowers the adoption cost for ML teams that already have tools covering part of the development process. The platform is scheduled to be open-sourced in the coming weeks and is already used internally to meet the ever-increasing customer demand for custom computer vision applications.