

Poster in Workshop: AI for New Drug Modalities

Learning multi-cellular representations of single-cell transcriptomics data enables characterization of patient-level disease states

Tianyu Liu · Edward De Brouwer · Tony Kuo · Nathaniel Diamant · Missarova Alsu · Minsheng Hao · Hanchen Wang · Hector Corrada Bravo · Gabriele Scalia · Aviv Regev · Graham Heimberg


Abstract:

Single-cell transcriptomics has emerged as a prominent tool for understanding the mechanisms of human disease. The availability of extensive single-cell RNA sequencing (scRNA-seq) datasets, combined with advanced machine learning techniques, has driven the development of single-cell foundation models that provide informative and versatile cell representations based on gene expression. In this work, we take the next step and generate patient-level representations from the multi-cellular expression context measured with scRNA-seq. Our study leverages large-scale, publicly available single-cell transcriptomics studies encompassing over 5,000 patients and 24.3 million cells. Our model, PaSCient, employs a multi-level representation learning paradigm and provides importance scores at the individual cell and gene levels, enabling a fine-grained analysis of the cell types and gene signatures characteristic of a given disease. Comprehensive benchmarking shows that PaSCient outperforms the compared methods in disease classification and supports multiple downstream applications, including dimensionality reduction, gene and cell type prioritization, and patient subgroup discovery.
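
To make the idea of a multi-level representation concrete, the following is a minimal sketch of one common way to aggregate cell embeddings into a single patient embedding: attention-weighted pooling, where the attention weights double as per-cell importance scores. The abstract does not describe PaSCient's architecture, so everything here is an assumption for illustration only: the function names (`embed_cell`, `pool_patient`), the tanh linear projection, the random weights, and the toy dimensions are all hypothetical and are not taken from the authors' implementation. Gene-level importance scores are not covered by this sketch.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def embed_cell(expression, projection):
    """Map one cell's gene-expression vector to a small embedding.

    Hypothetical cell encoder: tanh of a linear projection.
    """
    return [
        math.tanh(sum(w * x for w, x in zip(row, expression)))
        for row in projection
    ]


def pool_patient(cells, projection, attention_vector):
    """Aggregate all cells of one patient into a patient embedding.

    Returns (patient_embedding, cell_weights). The weights sum to 1 and
    act as illustrative per-cell importance scores.
    """
    embeddings = [embed_cell(cell, projection) for cell in cells]
    # One scalar attention score per cell.
    scores = [
        sum(a * e for a, e in zip(attention_vector, emb))
        for emb in embeddings
    ]
    weights = softmax(scores)
    dim = len(projection)
    # Patient embedding is the attention-weighted mean of cell embeddings.
    patient = [
        sum(w * emb[d] for w, emb in zip(weights, embeddings))
        for d in range(dim)
    ]
    return patient, weights


# Toy data (assumed sizes): 5 cells, 6 genes, 3-dimensional embeddings.
rng = random.Random(0)
n_genes, n_cells, dim = 6, 5, 3
projection = [[rng.gauss(0, 1) for _ in range(n_genes)] for _ in range(dim)]
attention_vector = [rng.gauss(0, 1) for _ in range(dim)]
cells = [[rng.random() for _ in range(n_genes)] for _ in range(n_cells)]

patient_embedding, cell_weights = pool_patient(cells, projection, attention_vector)
```

Because the pooling is a weighted sum over cells, the patient embedding does not depend on the order in which cells are listed, which suits scRNA-seq samples where cells have no natural ordering. In a trained model the projection and attention parameters would be learned, for example from a disease classification objective, rather than drawn at random.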
