Poster in Workshop: AI for Accelerated Materials Design (AI4Mat-2023)
MTENCODER: A Multi-task Pretrained Transformer Encoder for Materials Representation Learning
Thorben Prein · Elton Pan · Tom Doerr · Elsa Olivetti · Jennifer Rupp
Keywords: [ Representation Learning ] [ Materials Informatics ] [ Inorganic Materials ]
Given the vast spectrum of properties that characterize each compound, learning representations for inorganic materials is challenging. The prevailing trend in the materials informatics community leans towards specialized models that predict a single property. We introduce a \textit{multi-task} learning framework in which a transformer-based encoder is co-trained on diverse materials property datasets together with a denoising objective, yielding robust and generalizable materials representations. Our method not only improves on single-dataset pretraining but also scales and adapts readily to multi-dataset pretraining. Experiments demonstrate that the trained encoder, \textsc{MTEncoder}, captures chemically meaningful representations and surpasses current structure-agnostic materials encoders. This approach paves the way for improvements across a multitude of materials informatics tasks, most prominently materials property prediction and synthesis planning for materials discovery.
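To make the setup concrete, the sketch below shows one plausible way to combine per-property regression heads with a masked-token denoising head on a shared transformer encoder, in PyTorch. All module names, dimensions, task names, and the element-level tokenization are illustrative assumptions, not the paper's actual implementation.

```python
# A minimal PyTorch sketch of multi-task pretraining with a shared
# transformer encoder. All names, sizes, and the element-token input
# format are assumptions for illustration, not the paper's implementation.
import torch
import torch.nn as nn

class MTEncoderSketch(nn.Module):
    def __init__(self, vocab_size=120, d_model=256, n_heads=8, n_layers=4,
                 property_tasks=("band_gap", "formation_energy")):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model)  # element-token embeddings
        layer = nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, n_layers)
        # One regression head per property dataset (the multi-task part).
        self.property_heads = nn.ModuleDict(
            {name: nn.Linear(d_model, 1) for name in property_tasks})
        # Denoising head: predict the identity of masked/corrupted element tokens.
        self.denoise_head = nn.Linear(d_model, vocab_size)

    def forward(self, tokens):
        h = self.encoder(self.embed(tokens))          # (batch, seq, d_model)
        pooled = h.mean(dim=1)                        # pooled composition embedding
        property_preds = {name: head(pooled)
                          for name, head in self.property_heads.items()}
        return property_preds, self.denoise_head(h), pooled

# Toy usage: sum a property loss with the denoising reconstruction loss.
model = MTEncoderSketch()
tokens = torch.randint(0, 120, (8, 12))               # fake tokenized formulas
preds, denoise_logits, embeddings = model(tokens)
targets = torch.randn(8, 1)                           # fake band-gap labels
loss = nn.functional.mse_loss(preds["band_gap"], targets) \
     + nn.functional.cross_entropy(denoise_logits.transpose(1, 2), tokens)
```

In a real pipeline one would corrupt a fraction of the input tokens before computing the denoising loss and apply each property loss only to samples that carry that label; the abstract attributes the robustness of the learned embeddings to exactly this co-training across tasks.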