

Spotlight Poster

Noisy Label Learning with Instance-Dependent Outliers: Identifiability via Crowd Wisdom

Tri Nguyen · Shahana Ibrahim · Xiao Fu

Thu 12 Dec 11 a.m. PST — 2 p.m. PST

Abstract:

The generation of label noise is often modeled as a process in which a probability transition matrix (often interpreted as the "annotator confusion matrix") is imposed onto the ground-truth label distribution. Under this model, rectifying the label noise and learning the target classifier boil down to identifying the confusion matrix. This line of work has demonstrated appealing empirical performance, yet the identifiability of the model was mostly established under the assumption of an instance-invariant confusion matrix. Allowing the confusion matrix to be (occasionally) instance-dependent across data samples is arguably more realistic, but it inevitably introduces outliers into the model. Our interest lies in confusion matrix-based noisy label learning with such outliers taken into consideration. We begin by pointing out that, under the model of interest, detecting the outliers in the presence of a single confusion matrix is fundamentally insufficient. We then prove that, by employing a crowdsourcing strategy involving multiple annotators, a carefully designed loss function can detect the outliers and identify the desired classifier under reasonable conditions. Our development builds upon a link between the noisy label model and a column-corrupted matrix factorization model, which in turn attests to the importance of crowdsourced data annotation. Experiments show that our learning scheme substantially improves the outlier detection probability and the testing accuracy of the learned neural systems.
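To make the modeling assumption concrete, below is a minimal sketch of the confusion-matrix noise model and its crowdsourced, outlier-aware extension as described in the abstract. The notation (A^{(m)}, f, O^{(m)}, K, M, N) is illustrative and may differ from the paper's own symbols.

```latex
% Confusion-matrix noise model (notation illustrative, not taken from the paper):
% f(x) is the clean class-posterior vector on the K-dimensional simplex,
% A^{(m)} is annotator m's K x K confusion matrix.
\[
  \Pr\!\big[\tilde{y}^{(m)} = j \mid \boldsymbol{x}\big]
  = \sum_{k=1}^{K} A^{(m)}_{jk}\, \big[f(\boldsymbol{x})\big]_{k},
  \qquad m = 1, \dots, M .
\]
% Stacking the noisy posteriors of N samples column-wise yields a factorization
% that becomes column-corrupted when some samples follow instance-dependent
% confusion matrices:
\[
  \underbrace{\tilde{\boldsymbol{P}}^{(m)}}_{K \times N}
  = \boldsymbol{A}^{(m)} \boldsymbol{F} + \boldsymbol{O}^{(m)},
  \qquad
  \boldsymbol{F} = \big[f(\boldsymbol{x}_1), \dots, f(\boldsymbol{x}_N)\big],
\]
% where O^{(m)} has nonzero columns only at outlier samples. With a single
% annotator (M = 1), separating A F from the outlier columns is not identifiable;
% multiple annotators sharing the same F make outlier detection and recovery of
% the classifier f possible under the conditions stated in the paper.
```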
