

Poster in Workshop: AI for New Drug Modalities

Signals in the Cells: Multimodal and Contextualized Machine Learning Foundations for Therapeutics

Alejandro Velez-Arce · Kexin Huang · Michelle Li · Xiang Lin · Wenhao Gao · Bradley Pentelute · Tianfan Fu · Manolis Kellis · Marinka Zitnik


Abstract:

Datasets and benchmarks for AI-driven drug discovery have not traditionally included single-cell analysis biomarkers. Although several recent benchmarking efforts in single-cell analysis have released collections of single-cell tasks, none has yet comprehensively released datasets, models, and benchmarks that integrate a broad range of therapeutic discovery tasks with cell-type-specific biomarkers. Therapeutics Commons (TDC-2) presents a collection of datasets, tools, models, and benchmarks that integrate cell-type-specific contextual features with ML tasks across the range of therapeutics. In this paper, we present four contextual AI tasks in therapeutics at single-cell resolution: drug-target nomination, genetic perturbation response prediction, chemical perturbation response prediction, and protein-peptide interaction prediction. We introduce a collection of datasets, models, and benchmarks for these four tasks. Finally, we detail the advances and challenges in machine learning and biology that drove the implementation of TDC-2 and how they are reflected in its architecture, its collection of datasets and benchmarks, and its foundation model tooling.
