Poster in Workshop: New Frontiers of AI for Drug Discovery and Development

MoleculeGPT: Instruction Following Large Language Models for Molecular Property Prediction

Weitong Zhang · Xiaoyun Wang · Weili Nie · Joe Eaton · Brad Rees · Quanquan Gu

Keywords: [ Drug Discovery ] [ Large Language Models (LLM) ] [ Multi-Modal Training ]


Abstract:

Textual information offers significant advantages in the drug design process: it provides insights into complex molecular structures and facilitates molecule design based on textual instructions. Building on recent advances in applying Large Language Models (LLMs) to multi-modal data, we aim to leverage the capabilities of LLMs for molecular property prediction. We introduce MoleculeGPT, which answers queries about molecular properties given molecular structure inputs. To train MoleculeGPT, we curated a new instruction-following dataset from the raw molecule descriptions in PubChem. We evaluate MoleculeGPT on multiple-choice questions and on several downstream molecular property prediction tasks for drug design. Experimental results show that MoleculeGPT generates responses that approach human-level performance and demonstrates exceptional capabilities across diverse downstream tasks.
