Poster in Workshop: Learning Meaningful Representations of Life
Learning representations of cell populations for image-based profiling using contrastive learning
Robert van Dijk · John Arevalo · Shantanu Singh · Anne Carpenter
Image-based cell profiling is a powerful tool that compares differently perturbed cell populations by measuring thousands of single-cell features and summarizing them into vectors (or profiles). So-called average profiling, in which all single-cell features are aggregated using measures of center, remains the most commonly used approach despite its simplicity. However, this method fails to capture the heterogeneity of cell populations, and capturing that heterogeneity has been shown to improve the phenotypic strength of profiles. A recent study proposed a method that does capture cell population heterogeneity, but it is difficult to use in practice. We therefore propose a Deep Sets-based method that learns how best to aggregate single-cell feature data into a profile that predicts a compound’s mechanism of action better than average profiling does. This is achieved by applying weakly supervised contrastive learning in a multiple instance learning setting. The proposed model is both more accessible and better performing at aggregating single-cell feature data than previously published strategies and the average profiling baseline. The model likely achieves this by performing a form of quality control, filtering out noisy cells and prioritizing less noisy ones. The model cannot be transferred directly to data from unseen batches; however, because the labels required for training are naturally available in cell profiling experiments, it can readily be trained on new data and then used immediately to infer the improved profiles. Applying this method could help improve the effectiveness of future cell profiling studies.
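The contrast between average profiling and a learned, permutation-invariant aggregation can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the single ReLU layer, the sum pooling, the linear output map, the temperature value, and the use of integer perturbation labels as the weak supervision signal are all assumptions made for illustration, and the function names are hypothetical. In practice the weights would be trained by minimizing the contrastive loss; here they are random.

```python
import numpy as np


def average_profile(cells):
    """Baseline: average profiling, the per-feature mean over all cells."""
    return cells.mean(axis=0)


def deep_sets_profile(cells, w_phi, w_rho):
    """Deep Sets style aggregation rho(sum_i phi(x_i)).

    The layer sizes and the choice of sum pooling are assumptions.
    """
    hidden = np.maximum(cells @ w_phi, 0.0)  # phi: per-cell ReLU layer
    pooled = hidden.sum(axis=0)              # order-independent pooling over cells
    return pooled @ w_rho                    # rho: linear map to the profile


def supervised_contrastive_loss(profiles, labels, temperature=0.1):
    """Contrastive loss that pulls together profiles sharing a label.

    Labels are assumed to be perturbation identifiers (weak supervision).
    """
    z = profiles / np.linalg.norm(profiles, axis=1, keepdims=True)
    sim = z @ z.T / temperature
    not_self = ~np.eye(len(labels), dtype=bool)

    # Log-softmax of each anchor's similarities over all *other* profiles.
    masked = np.where(not_self, sim, -np.inf)
    row_max = masked.max(axis=1, keepdims=True)
    log_denom = row_max + np.log(np.exp(masked - row_max).sum(axis=1, keepdims=True))
    log_prob = sim - log_denom

    # Positives: other profiles with the same label as the anchor.
    positives = (labels[:, None] == labels[None, :]) & not_self
    pos_count = positives.sum(axis=1)
    valid = pos_count > 0  # skip anchors that have no positive partner
    per_anchor = -(log_prob * positives).sum(axis=1)[valid] / pos_count[valid]
    return per_anchor.mean()


# Toy data: 4 wells (bags) of 50 cells with 8 features each, 2 perturbations.
rng = np.random.default_rng(0)
wells = [rng.normal(size=(50, 8)) for _ in range(4)]
labels = np.array([0, 0, 1, 1])
w_phi = rng.normal(size=(8, 16))  # untrained weights, for illustration only
w_rho = rng.normal(size=(16, 4))

learned = np.stack([deep_sets_profile(w, w_phi, w_rho) for w in wells])
baseline = np.stack([average_profile(w) for w in wells])
loss = supervised_contrastive_loss(learned, labels)
```

Each well is treated as a bag of cells, which is the multiple instance learning view: the label belongs to the bag, not to individual cells, and the aggregation must give the same profile regardless of the order in which cells are listed.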