Poster in Workshop: Foundation Models for Science: Progress, Opportunities, and Challenges
Self-supervised Multimodal Model for Astronomy
Mariia Rizhko · Joshua Bloom
Keywords: [ Astronomy ] [ Multimodal ] [ Self-supervised ]
While machine-learned models are now routinely applied to facilitate astronomical inquiry, model inputs tend to be limited to a primary data source (namely images or a time series) and, in the most sophisticated approaches, some metadata. Yet with the growing use of wide-field, multiplexed observational resources, individual sources of interest often have a broad range of observational modes available. Here we construct an astronomical multimodal dataset and propose a self-supervised pre-training approach that enables a model to learn from multiple modalities simultaneously. Specifically, we extend the CLIP (Contrastive Language-Image Pretraining) model to a trimodal setting, allowing the integration of time-series photometry, spectra, and astrophysical metadata. Our results demonstrate that CLIP pre-training improves performance for time-series photometry, where accuracy increases from 84.642% to 91.468%. Furthermore, CLIP boosts classification accuracy by up to 26.8% when labeled data is limited, showing the effectiveness of leveraging unlabeled data. To our knowledge, this is the first construction of an n>2 mode model in astronomy. Extensions to n>3 modes are naturally anticipated with this approach.
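To illustrate how a CLIP-style objective extends to three modalities, the sketch below sums pairwise contrastive (InfoNCE) losses between photometry, spectrum, and metadata embeddings of the same source. The encoder architectures, input dimensions, and class names (`TrimodalCLIP`, `clip_loss`) are illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch of a trimodal CLIP-style contrastive objective, assuming each
# modality (photometry time series, spectrum, metadata) has its own encoder
# mapping a batch into a shared embedding space. All encoder shapes and names
# below are hypothetical placeholders, not the paper's actual architecture.
import torch
import torch.nn as nn
import torch.nn.functional as F


def clip_loss(emb_a: torch.Tensor, emb_b: torch.Tensor, temperature: float = 0.07) -> torch.Tensor:
    """Symmetric InfoNCE loss between two batches of aligned embeddings."""
    a = F.normalize(emb_a, dim=-1)
    b = F.normalize(emb_b, dim=-1)
    logits = a @ b.t() / temperature                      # (batch, batch) similarity matrix
    targets = torch.arange(a.size(0), device=a.device)    # matching pairs lie on the diagonal
    return 0.5 * (F.cross_entropy(logits, targets) + F.cross_entropy(logits.t(), targets))


class TrimodalCLIP(nn.Module):
    """Three encoders projected into one space, trained with pairwise CLIP losses."""

    def __init__(self, dim: int = 128):
        super().__init__()
        # Placeholder encoders: a real model might use a transformer for the
        # light curve, a CNN or transformer for the spectrum, an MLP for metadata.
        self.photometry_enc = nn.Sequential(nn.Linear(200, 256), nn.ReLU(), nn.Linear(256, dim))
        self.spectrum_enc = nn.Sequential(nn.Linear(1000, 256), nn.ReLU(), nn.Linear(256, dim))
        self.metadata_enc = nn.Sequential(nn.Linear(10, 64), nn.ReLU(), nn.Linear(64, dim))

    def forward(self, photometry, spectrum, metadata):
        p = self.photometry_enc(photometry)
        s = self.spectrum_enc(spectrum)
        m = self.metadata_enc(metadata)
        # Summing the three pairwise contrastive losses aligns all modalities
        # of the same astronomical source in a common embedding space.
        return clip_loss(p, s) + clip_loss(p, m) + clip_loss(s, m)


if __name__ == "__main__":
    model = TrimodalCLIP()
    batch = 32
    loss = model(torch.randn(batch, 200), torch.randn(batch, 1000), torch.randn(batch, 10))
    loss.backward()
    print(float(loss))
```

Adding a fourth modality under this formulation simply adds its encoder and the corresponding pairwise loss terms, which is why extension beyond three modes follows naturally.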