Poster in Workshop: Generative AI and Biology (GenBio@NeurIPS2023)
regLM: Designing realistic regulatory DNA with autoregressive language models
Avantika Lal · Tommaso Biancalani · Gokcen Eraslan
Keywords: [ hyenaDNA ] [ DNA sequence modeling ] [ generative sequence modeling ] [ CRE design ] [ GPT ] [ Autoregressive language modeling ] [ enhancer design ]
Designing cis-regulatory DNA elements (CREs) with desired properties is a challenging task with many therapeutic applications. Here, we used autoregressive language models trained on yeast and human putative CREs, in conjunction with supervised sequence-to-function models, to design regulatory DNA with desired patterns of activity. We showed that our framework, regLM, compares favorably to existing design approaches. regLM facilitates the design of realistic and diverse regulatory DNA while providing insights into the cis-regulatory code.
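
The generate-then-score idea in the abstract can be illustrated with a toy sketch. The abstract gives no implementation details, so everything below is an assumption made for illustration: a small count-based Markov model stands in for the autoregressive language model, a GC-fraction function (`toy_oracle`) stands in for a supervised sequence-to-function model, and all function names and parameters are hypothetical. It is not the regLM implementation.

```python
import random

ALPHABET = "ACGT"


def train_markov(seqs, order=2, pseudocount=1.0):
    """Toy autoregressive model: counts for P(next base | previous `order` bases).

    Stand-in for a trained DNA language model (assumption, not regLM itself).
    """
    counts = {}
    for s in seqs:
        padded = "N" * order + s  # "N" marks the start-of-sequence context
        for i in range(order, len(padded)):
            ctx = padded[i - order:i]
            row = counts.setdefault(ctx, {b: pseudocount for b in ALPHABET})
            row[padded[i]] += 1
    return counts


def sample(model, length, order, rng, pseudocount=1.0):
    """Generate one sequence base by base, each conditioned on the preceding bases."""
    seq = "N" * order
    for _ in range(length):
        ctx = seq[-order:]
        # Unseen contexts fall back to a uniform distribution.
        row = model.get(ctx, {b: pseudocount for b in ALPHABET})
        bases = list(row)
        weights = [row[b] for b in bases]
        seq += rng.choices(bases, weights=weights, k=1)[0]
    return seq[order:]


def toy_oracle(seq):
    """Hypothetical stand-in for a sequence-to-function model: GC fraction."""
    return (seq.count("G") + seq.count("C")) / len(seq)


def design(train_seqs, n_candidates=200, length=20, top_k=5, seed=0, order=2):
    """Sample candidates from the generative model, then keep the top-scoring ones."""
    rng = random.Random(seed)
    model = train_markov(train_seqs, order)
    candidates = {sample(model, length, order, rng) for _ in range(n_candidates)}
    # Sort by descending score; break ties alphabetically for determinism.
    return sorted(candidates, key=lambda s: (-toy_oracle(s), s))[:top_k]


if __name__ == "__main__":
    training = ["ACGTGCGCGCATATGCGC", "GCGCGCTATAAAGCGCGC", "TTGACGTCAGCGCGCATA"]
    for seq in design(training):
        print(seq, round(toy_oracle(seq), 2))
```

Sampling from a model fit to real sequences keeps candidates close to the training distribution, while the separate scoring step selects for the desired activity; in practice both stand-ins would be replaced by trained neural models.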