

Poster in Workshop: Medical Imaging meets NeurIPS

Performance-based Wisdom of the Crowd Algorithms for Medical Image Dataset Labeling

Eeshan Hasan · Erik Duhaime · Jennifer Trueblood


Abstract:

A crucial bottleneck in medical artificial intelligence is the scarcity of high-quality labeled medical datasets. In this paper, we test a wide variety of wisdom-of-the-crowd algorithms for labeling medical images that were initially classified by individuals recruited through an app-based platform. These individuals classified skin lesions from the International Skin Lesion Challenge 2018 into seven categories, and they varied widely in geographical location, experience, training, and performance. We test 168 wisdom-of-the-crowd algorithms of varying complexity, from a simple unweighted average to more complex Bayesian models that account for individual patterns of errors. Using a switchboard analysis, we observe that the best-performing algorithms rely on selecting top performers, weighting decisions by training accuracy, and considering the task environment. These algorithms also far exceed expert performance. We conclude by discussing the implications of these approaches for the development of medical AI.
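As a rough illustration of the simpler end of the range the abstract describes, the sketch below aggregates crowd labels for a single image in three ways: an unweighted plurality vote, a vote weighted by each rater's training accuracy, and a weighted vote restricted to the top performers. It is a minimal sketch under stated assumptions, not the authors' implementation: the function name `aggregate_votes`, the rater IDs, the labels, and the accuracy values are all hypothetical, and the Bayesian models and task-environment factors mentioned in the abstract are not covered.

```python
from collections import defaultdict


def aggregate_votes(votes, train_accuracy=None, top_k=None):
    """Combine crowd labels for one image (hypothetical helper).

    votes:          dict mapping rater id -> chosen label
    train_accuracy: optional dict mapping rater id -> accuracy in [0, 1],
                    measured on separate training cases; used as vote weight
    top_k:          optional int; keep only the k most accurate raters
    """
    raters = list(votes)

    # Optionally select top performers by training accuracy.
    if top_k is not None:
        if train_accuracy is None:
            raise ValueError("top_k requires train_accuracy")
        raters = sorted(raters, key=lambda r: train_accuracy[r], reverse=True)[:top_k]

    # Sum the (possibly weighted) votes per label.
    scores = defaultdict(float)
    for r in raters:
        weight = train_accuracy[r] if train_accuracy is not None else 1.0
        scores[votes[r]] += weight

    # Iterate labels in sorted order so ties resolve deterministically.
    return max(sorted(scores), key=lambda label: scores[label])


# Hypothetical data: five raters label the same lesion image.
votes = {"a": "nevus", "b": "nevus", "c": "melanoma", "d": "melanoma", "e": "nevus"}
accuracy = {"a": 0.55, "b": 0.50, "c": 0.90, "d": 0.85, "e": 0.45}

unweighted = aggregate_votes(votes)                  # plurality vote
weighted = aggregate_votes(votes, accuracy)          # accuracy-weighted vote
top3 = aggregate_votes(votes, accuracy, top_k=3)     # top performers only
```

With this made-up data, the unweighted vote follows the three-rater majority, while the weighted and top-performer variants side with the two raters who were more accurate in training, which is the kind of divergence that makes comparing such algorithms worthwhile.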
