

Poster in Workshop: AI for New Drug Modalities

Mixture of Experts Enable Efficient and Effective Protein Understanding and Design

Ning Sun · Shuxian Zou · Tianhua Tao · Sazan Mahbub · Xingyi Cheng · Yonghao Zhuang · Hongyi Wang · Eric Xing · Le Song


Abstract:

Proteins play a fundamental role in life. Understanding the language of proteins offers significant potential for gaining mechanistic insights into biological systems and opens new avenues for treating diseases, enhancing agriculture, and safeguarding the environment. While large protein language models (PLMs) such as ESM2-15B and xTrimoPGLM-100B have achieved remarkable performance on diverse protein understanding and design tasks, these dense transformer models are computationally inefficient to train and deploy. In this work, we introduce Protein-MoE, the first mixture-of-experts (MoE) model in the protein domain, scaled to 16 billion parameters. Leveraging a sparse MoE architecture with 8 experts in each transformer block and selectively activating 2 experts per input token, our model is significantly more efficient in both training and inference. Through pre-training on 1.2 trillion amino acids collected from UniRef90 and ColabFoldDB, our model achieves state-of-the-art results across most tasks in the xTrimoPGLM benchmark. Furthermore, on over 280 ProteinGym Deep Mutational Scanning (DMS) assays, our model achieves nearly 99% of the overall performance of the best MSA-based model and significantly outperforms the previously reported state-of-the-art model that also does not use MSA. We also adapted the model for structure-conditioned protein sequence generation and achieved a new state of the art in this domain. These results indicate that Protein-MoE can serve as a strong foundation model for protein understanding and design.
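The routing scheme described above (8 experts per transformer block, 2 activated per token) can be sketched as a small PyTorch module. This is a minimal illustration under stated assumptions, not the Protein-MoE implementation: the hidden sizes `d_model` and `d_ff` are placeholder values, and details such as load-balancing losses and expert parallelism are omitted.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class Top2MoELayer(nn.Module):
    """Sparse mixture-of-experts feed-forward layer with top-2 routing.

    Expert count and top-k follow the abstract (8 experts, 2 active per token);
    d_model and d_ff are illustrative placeholders, not the Protein-MoE values.
    """

    def __init__(self, d_model=1024, d_ff=4096, num_experts=8, top_k=2):
        super().__init__()
        self.top_k = top_k
        # Router scores each token against every expert.
        self.router = nn.Linear(d_model, num_experts, bias=False)
        # Each expert is an independent feed-forward sub-network.
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, d_ff), nn.GELU(), nn.Linear(d_ff, d_model))
            for _ in range(num_experts)
        )

    def forward(self, x):
        # x: (batch, seq_len, d_model) -> flatten tokens for routing.
        tokens = x.reshape(-1, x.size(-1))
        gate_logits = self.router(tokens)                         # (tokens, experts)
        top_vals, top_idx = gate_logits.topk(self.top_k, dim=-1)  # (tokens, k)
        top_weights = F.softmax(top_vals, dim=-1)                 # renormalize over the chosen experts

        out = torch.zeros_like(tokens)
        for e, expert in enumerate(self.experts):
            # Select the tokens routed to expert e (at most one slot per token).
            mask = top_idx == e
            if mask.any():
                token_ids, slot_ids = mask.nonzero(as_tuple=True)
                expert_out = expert(tokens[token_ids])
                out[token_ids] += top_weights[token_ids, slot_ids].unsqueeze(-1) * expert_out

        return out.reshape_as(x)
```

With 8 experts and top-2 routing, each token incurs roughly 2/8 of the feed-forward FLOPs of a dense layer holding the same total expert parameters, which is the efficiency trade-off the abstract points to.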
