

Poster in Workshop: Foundation Models for Science: Progress, Opportunities, and Challenges

Provable in-context learning of linear systems and linear elliptic PDEs with transformers

Frank Cole · Yulong Lu · Tianhao Zhang · Riley O'Neill

Keywords: [ transformer ] [ in-context learning ] [ elliptic PDE ] [ scientific foundation model ]


Abstract:

Foundation models for natural language processing, empowered by the transformer architecture, exhibit remarkable in-context learning (ICL) capabilities: pre-trained models can adapt to a downstream task by conditioning on a few-shot prompt, without updating the model weights. Recently, transformer-based foundation models have also emerged as universal tools for solving scientific problems, most notably partial differential equations (PDEs). However, the theoretical underpinnings of the ICL capabilities of these models remain elusive. This work develops a rigorous error analysis for transformer-based ICL of the solution operators associated with a family of linear elliptic PDEs. Specifically, we show that a linear transformer defined by a linear self-attention layer can provably learn in-context to invert the linear systems arising from the spatial discretization of the PDEs. We derive theoretical scaling laws for the proposed linear transformers in terms of the size of the spatial discretization, the number of training tasks, and the lengths of the prompts used during training and inference, under both the in-domain generalization setting and various distribution-shift settings. Empirically, we validate the ICL capabilities of transformers through extensive numerical experiments.
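The flavor of the construction can be conveyed by a minimal sketch (not the authors' exact setup): a single linear self-attention head with hand-chosen weights reproduces one step of gradient descent on the in-context least-squares problem of learning the map f -> u = A^{-1} f from prompt pairs (f_i, u_i), where A is a finite-difference discretization of a 1D elliptic operator. All concrete choices below (grid size d, prompt length n_prompt, step size eta, the operator -u'' + c u) are illustrative assumptions, not quantities from the paper.

# Minimal, illustrative sketch: a linear self-attention head acting on
# prompt tokens [f_i; u_i] and a query token [f_query; 0] matches one
# gradient-descent step on the in-context least-squares problem.
import numpy as np

rng = np.random.default_rng(0)

d = 16                      # size of the spatial discretization (assumed)
n_prompt = 64               # number of in-context examples (assumed)
eta = 1e-3                  # step size of the implicit gradient step (assumed)

# Discretized elliptic operator: -u'' + c*u on a uniform grid, Dirichlet BCs.
h = 1.0 / (d + 1)
c = 1.0
A = (np.diag(2.0 * np.ones(d)) - np.diag(np.ones(d - 1), 1)
     - np.diag(np.ones(d - 1), -1)) / h**2 + c * np.eye(d)

# In-context examples: right-hand sides f_i and solutions u_i = A^{-1} f_i.
F = rng.standard_normal((n_prompt, d))
U = np.linalg.solve(A, F.T).T

f_query = rng.standard_normal(d)
u_query = np.linalg.solve(A, f_query)

# Prompt tokens z_i = [f_i; u_i]; the query token carries [f_query; 0].
Z = np.hstack([np.vstack([F, f_query]), np.vstack([U, np.zeros(d)])]).T  # (2d, n+1)

# Linear self-attention with hand-chosen value and key-query matrices.
# The head computes (W_V Z)(Z^T W_KQ Z)/n and writes into the u-slots of the
# query token; with the masks below this equals the one-step estimate
# W_1 = (eta/n) * sum_i u_i f_i^T applied to f_query.  The query column
# contributes nothing because its u-slot is zero.
W_V = np.zeros((2 * d, 2 * d));  W_V[d:, d:] = eta * np.eye(d)   # read u-part
W_KQ = np.zeros((2 * d, 2 * d)); W_KQ[:d, :d] = np.eye(d)        # match f-parts

attn_out = (W_V @ Z) @ (Z.T @ W_KQ @ Z) / n_prompt
u_pred_attention = attn_out[d:, -1]            # u-slot of the query token

# The same prediction via one explicit gradient step from W = 0 on the
# in-context loss (1/2n) * sum_i ||u_i - W f_i||^2.
W_one_step = eta * (U.T @ F) / n_prompt
u_pred_gd = W_one_step @ f_query

print(np.allclose(u_pred_attention, u_pred_gd))   # True: the head implements the GD step
print(np.linalg.norm(u_pred_gd - u_query))        # error after a single implicit step

A single implicit gradient step is of course a crude approximation of A^{-1}; the point of the sketch is only that the in-context prediction of a linear attention head is expressible in closed form, which is the kind of structure that makes the error and scaling analysis in the paper tractable.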
