System performance is often the sole success metric for Machine Learning (ML) in both academic and industrial settings. The results of engineers' work are typically distilled into a few polished charts that show the superiority of the proposed solution. In reality, however, a model's performance is just one of many factors that determine its viability as a product.
To gain a clearer picture, we must examine the full Model Development Life Cycle (MDLC). In particular, our 15 years of experience with AI/ML systems at Bloomberg has shown that the traceability of three elements of a model – the code, the training environment and its settings, and the data used to train and test it – can make or break its viability. In this talk, we will discuss the best practices, tooling, and relevant research we adopted at Bloomberg to ensure traceability and reproducibility of our models and systems throughout the MDLC. We will also illustrate how these principles guided the development of various AI-powered products at the firm.
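To make the three traceability elements concrete, the following is a minimal Python sketch (not Bloomberg's actual tooling, which the talk describes) of how a training run might record its code version, environment and settings, and data fingerprints; the file names and hyperparameters are hypothetical.

```python
import hashlib
import json
import platform
import subprocess
import sys
from datetime import datetime, timezone
from pathlib import Path


def git_commit() -> str:
    """Return the current git commit hash, or 'unknown' outside a repo."""
    try:
        return subprocess.check_output(
            ["git", "rev-parse", "HEAD"], text=True
        ).strip()
    except (subprocess.CalledProcessError, FileNotFoundError):
        return "unknown"


def file_sha256(path: Path) -> str:
    """Hash a dataset file so the exact training/testing data can be traced."""
    digest = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def training_manifest(data_files: list[Path], settings: dict) -> dict:
    """Bundle the three traceability elements into one record for a run."""
    return {
        "recorded_at": datetime.now(timezone.utc).isoformat(),
        "code": {"git_commit": git_commit()},
        "environment": {
            "python": sys.version,
            "platform": platform.platform(),
            "settings": settings,  # hyperparameters, random seeds, etc.
        },
        # Only hash files that exist so the sketch runs as-is.
        "data": {str(p): file_sha256(p) for p in data_files if p.exists()},
    }


if __name__ == "__main__":
    # Hypothetical dataset paths and hyperparameters, for illustration only.
    manifest = training_manifest(
        data_files=[Path("train.csv"), Path("test.csv")],
        settings={"learning_rate": 1e-3, "seed": 42, "epochs": 10},
    )
    Path("run_manifest.json").write_text(json.dumps(manifest, indent=2))
```

Persisting a manifest like this alongside each trained artifact is one simple way to reconstruct, months later, exactly which code, configuration, and data produced a given model.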