Poster in Workshop: AIM-FM: Advancements In Medical Foundation Models: Explainability, Robustness, Security, and Beyond
Research Journey of Generative Protein Modeling
Xinhui Chen · Yiwen Yuan · Joseph Liu · Chak Tou Leong · Zhen Xie · Xiaoye Zhu · Ying Chen · Songyue Chen · Chenyi Wang · Kun Li · Jie Zhang · Zuchao Li · Jiaqi Chen
Proteins are fundamental molecules that perform diverse functions in living organisms. Protein engineering, which studies how to give proteins new or improved functions, has therefore become a research focus in biotechnology and medicine. A primary challenge in protein engineering is to efficiently discover and design new proteins with desired functions. Traditional approaches such as directed evolution and rational design, though widely used, are limited by high computational costs and restricted exploration of potential protein structures. The recent success of generative models in efficiently synthesizing high-quality data across various domains has inspired researchers to investigate their potential applications in protein engineering. In this survey, we systematically summarize recent work on generative models for protein engineering, with a particular focus on protein design. Specifically, we group existing generative protein design methods into three main frameworks: sequence-based, structure-based, and joint sequence-structure generation. We also provide a detailed review of representative generative models, including autoregressive models and diffusion models, and their applications to protein sequence prediction and structure generation. Finally, we pinpoint existing challenges and propose future directions, such as leveraging large datasets, improving complex structure validation, and integrating advanced modeling techniques.