Poster in Workshop: Foundation Models for Science: Progress, Opportunities, and Challenges
Can we pre-train ICL-based SFMs for the zero-shot inference of the 1D CDR problem with noisy data?
Mingu Kang · Dongseok Lee · Woojin Cho · Kookjin Lee · Anthony Gruber · Nathaniel Trask · Youngjoon Hong · Noseong Park
Keywords: [ zero-shot ] [ in-context learning ] [ scientific foundation model ] [ noisy prior data ]
Recent advances in scientific machine learning have begun to explore the potential of scientific foundation models (SFMs). Inspired by the in-context learning (ICL) framework of large language models (LLMs), we leverage prior data and pre-training techniques to construct our SFM. It has been demonstrated that ICL in LLMs can perform Bayesian inference, yielding strong generalization capabilities. Furthermore, LLMs do not exhibit intrinsic inductive bias; rather, they inherit bias from the prior data, as confirmed experimentally. Building upon these insights, our methodology is structured as follows: (i) we collect prior data in the form of solutions of partial differential equations (PDEs) constructed from arbitrary linear combinations of mathematical dictionaries; (ii) we utilize Transformer architectures with self-attention and cross-attention mechanisms to predict PDE solutions in a zero-shot setting, without knowledge of the governing equations; and (iii) we provide experimental evidence on the one-dimensional convection-diffusion-reaction (CDR) equation demonstrating that pre-training remains robust even with noisy prior data, with only marginal impact on test accuracy. Notably, this finding opens a path to pre-training SFMs with realistic, low-cost data instead of, or in conjunction with, high-cost numerical data. These results support the conjecture that SFMs can improve in a manner similar to LLMs, where it is nearly impossible to fully clean the vast corpus of text crawled from the Internet.
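As a minimal, hypothetical sketch of step (i), the snippet below generates noisy prior data for the 1D CDR equation u_t = D u_xx - c u_x + r u: each sample draws random coefficients over a sine/cosine dictionary to form an initial condition, integrates it with a simple explicit finite-difference scheme, and perturbs the trajectory with Gaussian noise. The dictionary, coefficient ranges, solver, and noise level are illustrative assumptions, not the exact setup used in the paper.

# Hypothetical sketch: noisy prior data for the 1D CDR equation
#     u_t = D u_xx - c u_x + r u,   x in [0, 1),  periodic BCs.
# Dictionary, parameter ranges, and solver are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)

def dictionary(x, n_terms=5):
    """Periodic sine/cosine dictionary evaluated on the grid x."""
    feats = [np.sin(2 * np.pi * (k + 1) * x) for k in range(n_terms)]
    feats += [np.cos(2 * np.pi * (k + 1) * x) for k in range(n_terms)]
    return np.stack(feats, axis=0)                 # shape (2*n_terms, n_x)

def solve_cdr(u0, D, c, r, dx, dt, n_steps):
    """Explicit finite-difference integration of the 1D CDR equation
    with periodic boundary conditions (a simple illustrative scheme)."""
    u = u0.copy()
    snaps = [u.copy()]
    for _ in range(n_steps):
        u_xx = (np.roll(u, -1) - 2 * u + np.roll(u, 1)) / dx**2
        u_x = (np.roll(u, -1) - np.roll(u, 1)) / (2 * dx)
        u = u + dt * (D * u_xx - c * u_x + r * u)
        snaps.append(u.copy())
    return np.stack(snaps, axis=0)                 # shape (n_steps+1, n_x)

def sample_prior_trajectory(n_x=128, n_steps=200, noise_std=0.01):
    """One prior-data sample: a random linear combination of dictionary
    functions as the initial condition, random CDR coefficients, plus
    additive Gaussian noise emulating low-cost, noisy measurements."""
    x = np.linspace(0.0, 1.0, n_x, endpoint=False)
    dx = x[1] - x[0]
    coeffs = rng.normal(size=10)                   # arbitrary linear combination
    u0 = coeffs @ dictionary(x, n_terms=5)
    D = rng.uniform(1e-3, 5e-3)                    # diffusion coefficient
    c = rng.uniform(-1.0, 1.0)                     # convection speed
    r = rng.uniform(-0.5, 0.5)                     # reaction rate
    dt = min(0.2 * dx**2 / D, 0.2 * dx / (abs(c) + 1e-6))  # conservative step
    clean = solve_cdr(u0, D, c, r, dx, dt, n_steps)
    noisy = clean + noise_std * rng.standard_normal(clean.shape)
    return x, noisy                                # noisy prior trajectory

x, traj = sample_prior_trajectory()
print(traj.shape)                                  # e.g. (201, 128)

A large collection of such trajectories could serve as ICL prompts during pre-training, with the Transformer asked to predict held-out solution values without being told D, c, or r; that zero-shot evaluation protocol is paraphrased from the abstract, not specified in code here.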