Poster
in
Workshop: Machine Learning and the Physical Sciences
Fine-tuning Vision Transformers for the Prediction of State Variables in Ising Models
Onur Kara · Arijit Sehanobish · HECTOR CORZO
Transformers are state-of-the-art deep learning models composed of stacked attention and point-wise, fully connected layers, originally designed for handling sequential data. Transformers are not only ubiquitous throughout Natural Language Processing (NLP), but have also recently inspired a new wave of research on Computer Vision (CV) applications. In this work, a Vision Transformer (ViT) is fine-tuned to predict the state variables of 2-dimensional Ising model simulations. Our experiments show that ViT outperforms state-of-the-art Convolutional Neural Networks (CNNs) when trained on a small number of microstate images from the Ising model corresponding to various boundary conditions and temperatures. This work explores possible applications of ViT to other simulations and introduces interesting research directions on how attention maps can learn the underlying physics governing different phenomena.
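The abstract does not specify how the Ising microstates are generated; a standard approach for the 2-dimensional Ising model is Metropolis Monte Carlo sampling. The sketch below is a minimal illustration under that assumption (zero external field, coupling J = 1, periodic boundary conditions); the function names `metropolis_sweep` and `generate_microstate` are hypothetical, not from the paper.

```python
import numpy as np

def metropolis_sweep(spins, beta, rng):
    """Perform one Metropolis sweep (L*L attempted flips) on an L x L lattice."""
    L = spins.shape[0]
    for _ in range(L * L):
        i, j = rng.integers(0, L, size=2)
        # Sum of the four nearest neighbours with periodic boundary conditions.
        nb = (spins[(i + 1) % L, j] + spins[(i - 1) % L, j]
              + spins[i, (j + 1) % L] + spins[i, (j - 1) % L])
        # Energy change if spin (i, j) were flipped (J = 1, zero field).
        dE = 2.0 * spins[i, j] * nb
        # Accept the flip if it lowers energy, or with Boltzmann probability.
        if dE <= 0 or rng.random() < np.exp(-beta * dE):
            spins[i, j] *= -1
    return spins

def generate_microstate(L=32, temperature=2.27, sweeps=200, seed=0):
    """Return an L x L array of +/-1 spins equilibrated near `temperature`."""
    rng = np.random.default_rng(seed)
    spins = rng.choice(np.array([-1, 1]), size=(L, L))
    for _ in range(sweeps):
        metropolis_sweep(spins, 1.0 / temperature, rng)
    return spins
```

Microstates generated this way at varying temperatures (e.g. around the critical temperature T_c ≈ 2.27) could serve as the input images for a ViT or CNN, with the simulation temperature as the regression target.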