Poster
DePLM: Denoising Protein Language Models for Property Optimization
Zeyuan Wang · Qiang Zhang · Keyan Ding · Ming Qin · Xiaotong Li · Xiang Zhuang · Yu Zhao · Jianhua Yao · Huajun Chen
Protein optimization is a fundamental biological task aimed at enhancing the performance of proteins by modifying their sequences. Computational methods primarily rely on evolutionary information (EI) encoded by protein language models (PLMs) to carry out this optimization. However, these methods suffer from two limitations. (1) Evolutionary processes involve the simultaneous consideration of multiple functional properties, often overshadowing the specific property of interest. (2) Measurements of these properties tend to be tailored to particular experimental conditions, reducing the generalizability of trained models to novel proteins. To address these limitations, we introduce Denoising Protein Language Models (DePLM), a novel approach that refines the evolutionary information embodied in PLMs for improved protein optimization. Specifically, we conceptualize EI as comprising both property-relevant and property-irrelevant information, with the latter acting as “noise” for the optimization task at hand. Our approach denoises this EI in PLMs through a diffusion process conducted in the rank space of property values, thereby enhancing model generalization and ensuring dataset-agnostic learning. Extensive experiments demonstrate that DePLM not only surpasses the state of the art in mutation-effect prediction but also generalizes strongly to novel proteins.
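To make the rank-space denoising idea concrete, below is a minimal conceptual sketch, not the paper's implementation. It converts per-mutation PLM scores into normalized ranks (the dataset-agnostic representation the abstract alludes to) and iteratively refines them with a toy learned denoiser over a fixed number of diffusion steps. The names `RankDenoiser`, `denoise`, and `num_steps`, the conditioning scheme, and the update rule are all assumptions made for illustration.

```python
# Hypothetical sketch of rank-space denoising (not the authors' code).
# Idea: PLM-derived evolutionary likelihoods give a noisy, property-agnostic
# ranking of mutation effects; a denoiser iteratively refines it toward a
# property-specific ranking.
import torch

def scores_to_ranks(scores: torch.Tensor) -> torch.Tensor:
    """Map raw scores to normalized ranks in [0, 1] (dataset-agnostic)."""
    order = scores.argsort()
    ranks = torch.empty_like(scores)
    ranks[order] = torch.arange(len(scores), dtype=scores.dtype)
    return ranks / (len(scores) - 1)

class RankDenoiser(torch.nn.Module):
    """Toy denoiser: predicts a correction to the current rank estimate,
    conditioned on the diffusion step. A real model would also condition on
    sequence/mutation representations from the PLM."""
    def __init__(self, hidden: int = 32):
        super().__init__()
        self.net = torch.nn.Sequential(
            torch.nn.Linear(2, hidden),
            torch.nn.ReLU(),
            torch.nn.Linear(hidden, 1),
        )

    def forward(self, ranks: torch.Tensor, t: float) -> torch.Tensor:
        t_col = torch.full_like(ranks, t).unsqueeze(-1)   # step embedding (scalar here)
        inp = torch.cat([ranks.unsqueeze(-1), t_col], dim=-1)
        return self.net(inp).squeeze(-1)

@torch.no_grad()
def denoise(plm_scores: torch.Tensor, model: RankDenoiser, num_steps: int = 10):
    """Start from evolutionary-information ranks and iteratively refine them."""
    ranks = scores_to_ranks(plm_scores)      # noisy, property-agnostic ranks
    for step in range(num_steps):
        t = 1.0 - step / num_steps           # decreasing noise level
        ranks = ranks + model(ranks, t) / num_steps
    return scores_to_ranks(ranks)            # re-normalize to a clean ranking

# Usage: plm_scores would be per-mutation log-likelihood ratios from a PLM
# such as ESM; random values stand in for them here.
plm_scores = torch.randn(100)
refined = denoise(plm_scores, RankDenoiser())
```

Operating on ranks rather than raw property values is what makes the learning dataset-agnostic: assay-specific scales and offsets vanish under the rank transform, so only the ordering of mutation effects has to be modeled.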