Poster in Affinity Workshop: Women in Machine Learning
Evaluation of Active Learning and Domain Adaptation on Health Data
Kristina Holsapple · Haoran Zhang · Marzyeh Ghassemi
Machine learning (ML) uses data to make decisions and predictions. Labeled data is necessary for ML models to learn how previous decisions and predictions were made, particularly in healthcare settings. Unfortunately, such data is prohibitively expensive to obtain, and labeling it requires subject-specific expertise. Active learning offers the possibility of training accurate ML models with less labeled data. Dataset shifts also pose a challenge to the performance of ML systems for healthcare, and domain adaptation aims to mitigate their effects. This work applies existing active learning and domain adaptation techniques to healthcare data to evaluate how well these general-purpose solutions perform in this specific setting. We use three labeled datasets: eICU, collected from intensive care units across the United States, and MIMIC-III and MIMIC-IV, both drawn from hospital admissions to Beth Israel Deaconess Medical Center in Boston, MA. All three datasets exhibit shifts that we investigate. Overall, this research reports on a series of tests with existing active learning and domain adaptation techniques to evaluate appropriate future uses of these methods in the field of ML for healthcare.