Poster
in
Workshop: Learning Meaningful Representations of Life
Biological Cartography: Building and Benchmarking Representations of Life
Safiye Celik · Jan-Christian Huetter · Sandra Melo · Nathan Lazar · Rahul Mohan · Conor Tillinghast · Tommaso Biancalani · Marta Fay · Berton Earnshaw · Imran Haque
The continued scaling of genetic perturbation technologies combined with high-dimensional assays (microscopy and RNA-sequencing) has enabled genome-scale reverse-genetics experiments that go beyond single-endpoint measurements of growth or lethality. Datasets emerging from these experiments can be combined to construct “maps of biology”, in which perturbation readouts are placed in unified, relatable embedding spaces to capture known biological relationships and discover new ones. Construction of maps involves many technical choices in both experimental and computational protocols, motivating the design of benchmark procedures by which to evaluate map quality in a systematic, unbiased manner.In this work, we propose a framework for the steps involved in map building and demonstrate key classes of benchmarks to assess the quality of a map. We describe univariate benchmarks assessing perturbation quality and multivariate benchmarks assessing recovery of known biological relationships from large-scale public data sources. We demonstrate the application and interpretation of these benchmarks through example maps of scRNA-seq and phenomic imaging data.