Poster in Workshop: Learning Meaningful Representations of Life
Machine Learning enabled Pooled Optical Screening in Human Lung Cancer Cells
Srinivasan Sivanandan · Max Salick · Bobby Leitmann · Kara Liu · Navpreet Ranu · Cynthia Hao · Owen Chen · John Bisognano · Eric Lubeck · Ajamete Kaykas · Eilon Sharon · Ci Chu
Pooled CRISPR-based gene knockout (KO) screening has emerged as a powerful method to uncover gene effects on various phenotypes [1, 2]. Recently, an optical pooled CRISPR screening method was developed [3] in which gene-targeting guide RNAs (gRNAs) are identified by in situ sequencing, coupled with microscopy imaging of cellular structure and spatial features [3-6]. Pooled optical screening is scalable and cost-effective, and it can be coupled with different imaging assays to perform large-scale, high-content, image-based CRISPR KO screens. However, automated and general approaches for data processing and analysis are required to unlock its full potential as a tool for drug target discovery. Here, we introduce a machine-learning-enabled computational framework for in situ sequencing, segmentation, and feature representation of cell morphology from pooled optical screens, and apply it to human lung cancer cells (A549). We develop a convolutional neural network (CNN) method for gRNA sequence calling and show that it increases cell yield by 10% and enables automation. We propose self-supervised single-cell embeddings as a method to create informative representations of cell morphology, moderately improving upon commonly used engineered features. We demonstrate that such embeddings, aggregated for each gene KO, are more similar for gene pairs that are known to interact, and that they cluster genetic perturbations by cellular component, biological pathway, and molecular function. We also highlight ways to use the perturbation clusters to generate hypotheses about gene functions that are consistent with results from orthogonal studies. Taken together, we develop a scalable and general computational approach to process and analyze pooled CRISPR-based morphological screens that can be applied to screen for various disease-relevant phenotypes.
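
The step of aggregating single-cell embeddings per gene KO and comparing gene pairs can be illustrated with a minimal sketch. The abstract does not state how embeddings are aggregated or compared, so the mean aggregation, the cosine similarity, the function names, and the toy data below are all assumptions made for illustration, not the authors' implementation.

```python
import numpy as np


def aggregate_by_gene(embeddings, gene_labels):
    """Average single-cell embeddings per gene KO (mean is an assumed choice)."""
    labels = np.asarray(gene_labels)
    genes = sorted(set(gene_labels))
    # One row per gene: the mean embedding of all cells carrying that KO.
    profiles = np.stack([embeddings[labels == g].mean(axis=0) for g in genes])
    return genes, profiles


def cosine_similarity_matrix(profiles):
    """Pairwise cosine similarity between gene-level profiles (assumed metric)."""
    norms = np.linalg.norm(profiles, axis=1, keepdims=True)
    unit = profiles / np.clip(norms, 1e-12, None)  # guard against zero vectors
    return unit @ unit.T


# Hypothetical toy data: five cells, 2-D embeddings, three gene KOs.
cell_embeddings = np.array(
    [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0], [1.0, 1.0]]
)
cell_genes = ["GENE_A", "GENE_A", "GENE_B", "GENE_B", "GENE_C"]

genes, profiles = aggregate_by_gene(cell_embeddings, cell_genes)
similarity = cosine_similarity_matrix(profiles)
```

In a real screen, the resulting similarity matrix could be compared against known gene-gene interactions or passed to a clustering routine; both steps are omitted here because the abstract gives no details on them.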