Poster in Workshop: AI4Mat-2024: NeurIPS 2024 Workshop on AI for Accelerated Materials Design
Dimension Deficit: Is 3D a Step Too Far for Optimizing Molecules?
Andres Guzman-Cordero · Luca Thiede · Gary Tom · Alan Aspuru-Guzik · Felix Strieth-Kalthoff · Agustinus Kristiadi
Keywords: [ Graph Neural Network ] [ Materials Discovery ] [ Bayesian Optimization ] [ Equivariance ]
The discovery of new materials with desirable properties is essential for technological advancements, from pharmaceuticals to renewable energy. Traditional simulation methods like Density Functional Theory (DFT) provide ab initio quantum-mechanical estimates of common properties but are computationally expensive, so candidates for calculation must be selected carefully. Bayesian optimization (BO) is commonly used to find and screen candidates efficiently. However, choosing the right vector representation for a Bayesian regressor is challenging: while molecules are three-dimensional, obtaining 3D features is computationally intensive, so 1D and 2D features are typically used. In this work, we study this discrepancy: are 3D features worth considering for BO over molecules despite their computational cost? To this end, we compare molecular fingerprints, 2D message-passing neural networks, and 3D equivariant attention-based graph neural networks. We evaluate their performance on four datasets, considering both low- and high-data regimes and different types of Bayesian regressors. Finally, we explore the transfer learning capabilities of 2D and 3D graph features by treating the graph networks as foundation models.
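To make the setting concrete, the following is a minimal sketch of a BO loop over a fixed pool of candidates, not the authors' implementation. Everything specific in it is an assumption made for illustration: the surrogate is a Bayesian linear regressor, the acquisition function is an upper confidence bound, random binary vectors stand in for molecular fingerprints, and a synthetic linear function stands in for an expensive DFT calculation. The names `blr_posterior`, `ucb`, and `run_bo` are hypothetical.

```python
import numpy as np


def blr_posterior(X, y, alpha=1.0, noise=0.1):
    """Posterior mean and covariance of the weights of a Bayesian linear
    regressor with prior N(0, I / alpha) and Gaussian noise std `noise`."""
    d = X.shape[1]
    precision = alpha * np.eye(d) + (X.T @ X) / noise**2
    cov = np.linalg.inv(precision)
    mean = cov @ X.T @ y / noise**2
    return mean, cov


def ucb(X_cand, mean, cov, beta=2.0):
    """Upper confidence bound: predictive mean plus beta times predictive std."""
    mu = X_cand @ mean
    var = np.einsum("ij,jk,ik->i", X_cand, cov, X_cand)
    return mu + beta * np.sqrt(np.maximum(var, 0.0))


def run_bo(features, oracle, n_init=5, n_iter=20, seed=0):
    """Pool-based BO: repeatedly fit the surrogate on labeled candidates and
    query the unlabeled candidate with the highest acquisition score."""
    rng = np.random.default_rng(seed)
    n = features.shape[0]
    labeled = [int(i) for i in rng.choice(n, size=n_init, replace=False)]
    values = [oracle(i) for i in labeled]
    for _ in range(n_iter):
        mean, cov = blr_posterior(features[labeled], np.array(values))
        scores = ucb(features, mean, cov)
        scores[labeled] = -np.inf  # never re-query an evaluated candidate
        nxt = int(np.argmax(scores))
        labeled.append(nxt)
        values.append(oracle(nxt))  # the expensive step in a real campaign
    return labeled, values


# Synthetic stand-ins (assumptions): random bit vectors instead of real
# fingerprints, and a linear function instead of a DFT-computed property.
rng = np.random.default_rng(0)
features = rng.integers(0, 2, size=(200, 32)).astype(float)
targets = features @ rng.normal(size=32)
labeled, values = run_bo(features, lambda i: float(targets[i]))
print("best value found:", max(values), "pool maximum:", targets.max())
```

In this framing, swapping the representation (fingerprint, 2D graph features, or 3D graph features) only changes the `features` matrix, which is why the per-candidate cost of computing those features matters alongside the cost of the oracle.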