Poster in Workshop: Interpretable AI: Past, Present and Future
Evaluating Machine Learning Models with NERO: Non-Equivariance Revealed on Orbits
Zhuokai Zhao · Takumi Matsuzawa · William Irvine · Michael Maire · Gordon Kindlmann
Traditional scalar-based error metrics, while quick for assessing machine learning (ML) model performance, often fail to expose weaknesses or offer fair evaluations, particularly with limited test data. To address this growing issue, we introduce "Non-Equivariance Revealed on Orbits" (NERO), a novel evaluation procedure that enhances model analysis by assessing equivariance and robustness. NERO combines a task-agnostic interactive interface with a suite of visualizations to support in-depth model analysis and improve interpretability. We validate the effectiveness of NERO across various applications, including 2D digit recognition, object detection, particle image velocimetry (PIV), and 3D point cloud classification. Our case studies demonstrate that NERO clearly depicts model equivariance and provides detailed insight into model outputs. Additionally, we introduce "consensus" as an alternative to traditional ground truth, extending NERO to unlabeled datasets and enabling broader applications across diverse ML contexts.
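The abstract does not spell out implementation details, but the core orbit idea can be sketched for the 2D digit-recognition case: sample an input's orbit under a transformation group (here, image rotations), run the model on every orbit sample, and plot the resulting scores. This is a minimal, hypothetical sketch, not the authors' code; the function name `nero_orbit_scores`, the choice of the rotation group, and the 5-degree sampling step are all assumptions made for illustration.

```python
import numpy as np
import torch
import torchvision.transforms.functional as TF

def nero_orbit_scores(model, image, label, angles=range(0, 360, 5)):
    """Score a classifier over the rotation orbit of one image (illustrative sketch).

    `image` is a (C, H, W) tensor and `label` is the true class index.
    Returns the softmax confidence in `label` at each rotation angle.
    A flat, high curve suggests the prediction is invariant on this
    orbit; dips reveal non-equivariance at specific angles.
    """
    model.eval()
    scores = []
    with torch.no_grad():
        for angle in angles:
            # One sample of the orbit: the input rotated by `angle` degrees.
            rotated = TF.rotate(image, float(angle)).unsqueeze(0)
            probs = torch.softmax(model(rotated), dim=1)
            scores.append(probs[0, label].item())
    return np.asarray(scores)

# Plotting the scores against angle in polar coordinates yields a
# NERO-style orbit plot for this single input, e.g.:
#   import matplotlib.pyplot as plt
#   theta = np.deg2rad(np.arange(0, 360, 5))
#   plt.polar(theta, nero_orbit_scores(model, image, label))
```

The actual NERO procedure is an interactive, task-agnostic interface spanning several tasks and transformation types; this sketch only illustrates the underlying orbit evaluation for one classifier and one group of transformations.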