Poster in Workshop: Learning Meaningful Representations of Life
Generative model for Pseudomonad genomes
Manasa Kesapragada
Recent advances in genomic sequencing have produced several thousand full genomes of pseudomonads, a group of bacteria important in many areas of science, ranging from biogeochemical cycling in the environment to bacterial pneumonia in humans. Using these high-quality data sets, combined with tens of thousands of somewhat lower-quality metagenome-assembled genomes, we create a generative model for pseudomonad genomes. We present a GAN model that generates a gene-family presence/absence list for a genome. We also demonstrate that the discriminator of this model can be used as a binary classifier to identify incorrect genomes. In the future, we aim for a model that can generate genomes within a given set of parameters, for example: “Generate a genome that is root-associated, drought-resistant, and salt-tolerant, and that will produce this natural product.”
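To make the data representation concrete, here is a minimal sketch of a genome encoded as a binary presence/absence vector over gene families, with a generator that emits such vectors and a discriminator whose score is thresholded to act as a binary classifier. The abstract does not describe the architecture or training procedure, so everything here is an assumption for illustration: the five-entry vocabulary `GENE_FAMILIES`, the single-layer `Generator` and `Discriminator` classes, the noise dimension, and the 0.5 thresholds are all hypothetical, and the weights are random and untrained.

```python
import numpy as np

# Hypothetical gene-family vocabulary; a real one would hold thousands of families.
GENE_FAMILIES = ["famA", "famB", "famC", "famD", "famE"]


def encode_genome(families_present):
    """Encode a genome as a 0/1 presence/absence vector over GENE_FAMILIES."""
    present = set(families_present)
    return np.array([1.0 if f in present else 0.0 for f in GENE_FAMILIES])


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


class Generator:
    """Assumed single linear layer: noise -> per-family presence probability."""

    def __init__(self, noise_dim, n_families, rng):
        self.W = rng.normal(scale=0.5, size=(noise_dim, n_families))
        self.b = np.zeros(n_families)

    def sample(self, z):
        probs = sigmoid(z @ self.W + self.b)
        # Threshold probabilities into a hard presence/absence list.
        return (probs > 0.5).astype(float)


class Discriminator:
    """Assumed single linear layer: presence/absence vector -> score in (0, 1)."""

    def __init__(self, n_families, rng):
        self.w = rng.normal(scale=0.5, size=n_families)
        self.b = 0.0

    def score(self, x):
        return sigmoid(x @ self.w + self.b)

    def is_plausible(self, x, threshold=0.5):
        # Reusing the discriminator as a binary classifier:
        # scores below the threshold flag a genome as likely incorrect.
        return self.score(x) >= threshold


rng = np.random.default_rng(0)
gen = Generator(noise_dim=8, n_families=len(GENE_FAMILIES), rng=rng)
disc = Discriminator(n_families=len(GENE_FAMILIES), rng=rng)

real = encode_genome(["famA", "famC"])      # one observed genome
fake = gen.sample(rng.normal(size=(3, 8)))  # three generated genomes
scores = disc.score(fake)                   # one score per generated genome
flags = disc.is_plausible(fake)             # boolean classifier output
```

In an actual GAN, the two networks would be trained adversarially on the real genome vectors before the discriminator's scores carry any meaning; this sketch only shows the shapes of the inputs and outputs involved.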