Workshop
Foundation Models for Science: Progress, Opportunities, and Challenges
Wuyang Chen · Pu Ren · Elena Massara · Yongji Wang · N. Benjamin Erichson · Laurence Perreault-Levasseur · Bo Li · Swarat Chaudhuri
West Meeting Room 202-204
Sun 15 Dec, 8:30 a.m. PST
The integration of artificial intelligence (AI) and machine learning (ML) into scientific discovery represents a pivotal shift in traditional methodologies. Historically, scientific exploration has been systematic and logical, but AI and ML promise to transform fundamental discoveries. This shift enhances interdisciplinary dialogue and stimulates innovative problem-solving, enriching the scientific community's ability to tackle complex problems. Foundation models, such as GPT-3 and CLIP, have revolutionized computer vision and natural language processing, providing versatile, pre-trained bases for various applications. Leveraging these models addresses critical challenges like long-term planning and multi-modal reasoning, essential for applications in robotics and dialogue systems. The integration of AI-for-Science and foundation models offers a transformative force in scientific domains, solving complex problems and enabling domain-specific adaptations. This synergy is poised to radically improve the modeling of complex phenomena, making it a crucial investment for future scientific advancements. This workshop aims to bring together experts to discuss and collaborate on transformative questions and challenges in advancing scientific problems through foundation models.
Schedule
Sun 8:30 a.m. - 9:10 a.m. | Invited Talk 1: Paris Perdikaris (Invited Talk) | Paris Perdikaris
Sun 9:45 a.m. - 10:25 a.m. | Invited Talk 2: Michael Mahoney (Invited Talk) | Michael Mahoney
Sun 10:30 a.m. - 11:10 a.m. | Invited Talk 3: Laure Zanna (Invited Talk) | Laure Zanna
Sun 11:15 a.m. - 11:55 a.m. | Invited Talk 4: Shirley Ho (Invited Talk) | Shirley Ho
Sun 12:00 p.m. - 12:10 p.m. | Molphenix: A Multimodal Foundation Model for PhenoMolecular Retrieval (Oral) | Philip Fradkin · Puria Azadi Moghadam · Karush Suri · Frederik Wenkel · Maciej Sypetkowski · Dominique Beaini
Sun 12:10 p.m. - 12:20 p.m. | ViTally Consistent: Scaling Biological Representation Learning for Cell Microscopy (Oral) | Kian Kenyon-Dean · Jerry Wang · John Urbanik · Konstantin Donhauser · Jason Hartford · Saber Saberian · Nil Sahin · Ihab Bendidi · Safiye Celik · Marta Fay · Juan Rodriguez · Imran Haque · Oren Kraus
Sun 12:20 p.m. - 12:30 p.m. | GFlowNet Pretraining with Inexpensive Rewards (Oral) | Mohit Pandey · Gopeshh Subbaraj · Emmanuel Bengio
Sun 2:20 p.m. - 3:00 p.m. | Invited Talk 5: Max Welling (Invited Talk) | Max Welling
Sun 3:30 p.m. - 4:10 p.m. | Invited Talk 6: Danielle Maddix Robinson (Invited Talk) | Danielle Maddix
Sun 4:15 p.m. - 4:25 p.m. | Towards Interpretable Scientific Foundation Models: Sparse Autoencoders for Disentangling Dense Embeddings of Scientific Concepts (Oral) | Charles O'Neill · Christine Ye · Kartheik Iyer · John Wu
Sun 4:25 p.m. - 4:35 p.m. | Extralonger: Toward a Unified Perspective of Spatial-Temporal Factors for Extra-Long-Term Traffic Forecasting (Oral) | Zhiwei Zhang · Shaojun E · Fandong Meng · Jie Zhou · Wenjuan Han
Sun 4:35 p.m. - 4:45 p.m. | Learning temperature-aware representations from millions of annotated protein sequences (Oral) | Mingchen Li · Liang Zhang · Zilan Wang · Bozitao Zhong · Pan Tan · Jiabei Cheng · Bingxin Zhou · Liang Hong · Huiqun Yu
Provable in-context learning of linear systems and linear elliptic PDEs with transformers (Poster) | Frank Cole · Yulong Lu · Tianhao Zhang · Riley O'Neill
Specialized Foundation Models Struggle to Beat Traditional Supervised Learning Baselines (Poster) | Ritvik Gupta · Zongzhe Xu · Wenduo Cheng · Alexander Shen · Junhong Shen · Ameet Talwalkar · Misha Khodak
Uncertainty and Generalizability in Foundation Models for Earth Observation (Poster) | Raul Ramos-Pollán · Freddie Kalaitzis · Karthick Panner Selvam
Self-supervised Multimodal Model for Astronomy (Poster) | Mariia Rizhko · Joshua Bloom
In-Context Learning for Function Approximation with DeepSet-ONet (Poster) | Shao-Ting Chiu · Junyuan Hong · Ulisses M. Braga-Neto
Vision foundation models: can they be applied to astrophysics data? (Poster) | Erica Lastufka · Mariia Drozdova · Vitaliy Kinakh · Slava Voloshynovskiy
A Comparative Study of Neural ODE and Universal ODE Models in Solving Chandrasekhar’s White Dwarf Equation (Poster) | Raymundo Vazquez Martinez · Raj Dandekar · Rajat Dandekar · Sreedath Panat
Leveraging foundation models for data-limited ecological applications (Poster) | Kyle Doherty · Max Gurinas · Erik Samsoe · Charles Casper · Beau Larkin · Philip Ramsey · Brandon Trabucco · Ruslan Salakhutdinov
Bio-xLSTM: Generative modeling, representation and in-context learning of biological and chemical sequences (Poster) | Niklas Schmidinger · Lisa Schneckenreiter · Philipp Seidl · Johannes Schimunek · Sohvi Luukkonen · Pieter-Jan Hoedt · Johannes Brandstetter · Andreas Mayr · Sepp Hochreiter · Günter Klambauer
Generating and Validating Agent and Environment Code for Simulating Realistic Personality Profiles with Large Language Models (Poster) | Nathan Cloos · M Ganesh Kumar · Adam Manoogian · Christopher Cueva · Shawn Rhoads
SciRIFF: A Resource to Enhance Language Model Instruction-Following over Scientific Literature (Poster) | David Wadden · Kejian Shi · Jacob Morrison · Aakanksha Naik · Shruti Singh · Nitzan Barzilay · Kyle Lo · Tom Hope · Luca Soldaini · Zejiang Shen · Doug Downey · Hannaneh Hajishirzi · Arman Cohan
VSMNO: Solving PDE by Utilizing Spectral Patterns of Different Neural Operators (Poster) | Fengrui Jing · Hongzhen Ding · Taosong
Scalable Universal T-Cell Receptor Embeddings from Adaptive Immune Repertoires (Poster) | Paidamoyo Chapfuwa · Ilker Demirel · Lorenzo Pisani · Javier Zazo · Elon Portugaly · H. Zahid · Julia Greissl
Enhancing Detail Recovery in ICF Radiographs: A Transformer-based Approach with ViXReg (Poster) | Nga T Nguyen-Fotiadis · Bradley Wolfe · Zhehui Wang
Small Molecule Optimization with Large Language Models (Poster) | Menua Bedrosian · Philipp Guevorguian · Tigran Fahradyan · Gayane Chilingaryan · Hrant Khachatrian · Armen Aghajanyan
Scientific Knowledge Graph and Ontology Generation using Open Large Language Models (Poster) | Alexandru Oarga · Matthew Hart · Andres M Bran · Magdalena Lederbauer · Philippe Schwaller
Metalic: Meta-Learning In-Context with Protein Language Models (Poster) | Jacob Beck · Shikha Surana · Manus McAuliffe · Oliver Bent · Tom Barrett · Juan Jose Garau-Luis · Paul Duckworth
SciLitLLM: How to Adapt LLMs for Scientific Literature Understanding (Poster) | Sihang Li · Jin Huang · Jiaxi Zhuang · Yaorui Shi · Cai Xiaochen · Mingjun Xu · Xiang Wang · Linfeng Zhang · Guolin Ke · Hengxing Cai
CLOUD: A Scalable Scientific Foundation Model for Crystal Representation Learning (Poster) | Changwen Xu · Zhu · Venkatasubramanian Viswanathan
A Mamba-Based Foundation Model for Chemistry (Poster) | Emilio Vital Brazil · Eduardo Soares · Victor Yukio Shirasuna · Renato Cerqueira · Dmitry Zubarev · Kristin Schmidt
MAMORX: Multi-agent Multi-Modal Scientific Review Generation with External Knowledge (Poster) | Guanchao Wang · Pawin Taechoyotin · Tong Zeng · Bradley Sides · Daniel Acuna
ChemDFM: A Large Language Foundation Model for Chemistry (Poster) | Zihan Zhao · Da Ma · Lu Chen · Liangtai Sun · Zihao Li · Yi Xia · Hongshen Xu · Zichen Zhu · Su Zhu · Shuai Fan · Guodong Shen · Kai Yu · Xin Chen
Bridging biomolecular modalities for knowledge transfer in bio-language models (Poster) | Mangal Prakash · Artem Moskalev · Peter DiMaggio · Steven Combs · Tommaso Mansi · Justin Scheer · Rui Liao
Improving generalisability of 3D binding affinity models in low data regimes (Poster) | Julia Milena Buhmann · Ward Haddadin · Alan Bilsland · Lukáš Pravda · Hagen Triendl
AtmosArena: Benchmarking Foundation Models for Atmospheric Sciences (Poster) | Tung Nguyen · Prateik Sinha · Advit Deepak · Karen A McKinnon · Aditya Grover
SysCaps: Language Interfaces for Simulation Surrogates of Complex Systems (Poster) | Patrick Emami · Zhaonan Li · Saumya Sinha · Truc Nguyen
SciDFM: A Large Language Model with Mixture-of-Experts for Science (Poster) | Liangtai Sun · Danyu Luo · Da Ma · Zihan Zhao · Baocai Chen · Zhennan Shen · Su Zhu · Lu Chen · Xin Chen · Kai Yu
Survey: Adaptive Physics-informed Neural Networks (Poster) | Edgar Torres Rios · Mathias Niepert
Contextualizing biological perturbation experiments through language (Poster) | Menghua Wu · Russell Littman · Jacob Levine · Lin Qiu · Tommaso Biancalani · David Richmond · Jan-Christian Huetter
Assessing interaction recovery of predicted protein-ligand poses (Poster) | David Errington · Constantin Schneider · Cédric Bouysset · Frédéric Dreyer
Generative Models in Protein Engineering: A Comprehensive Survey (Poster) | Xinhui Chen · Yiwen Yuan · Joseph Liu · Chak Tou Leong · Xiaoye Zhu · Jiaqi Chen
IgBlend: Unifying 3D Structure and Sequence for Antibody LLMs (Poster) | Cédric Malherbe · Talip Ucar
SeisLM: a Foundation Model for Seismic Waveforms (Poster) | Tianlin Liu · Jannes Münchmeyer · Laura Laurenti · Chris Marone · Maarten V. de Hoop · Ivan Dokmanić
Agnostic Causality-Driven Enhancement of Chemical Foundation Models on Downstream Tasks (Poster) | Victor Yukio Shirasuna · Eduardo Soares · Emilio Vital Brazil · Karen Fiorella Gutierrez · Renato Cerqueira · Dmitry Zubarev · Kristin Schmidt
Can we pre-train ICL-based SFMs for the zero-shot inference of the 1D CDR problem with noisy data? (Poster) | Mingu Kang · Dongseok Lee · Woojin Cho · Kookjin Lee · Anthony Gruber · Nathaniel Trask · Youngjoon Hong · Noseong Park
Maven: A Multimodal Foundation Model for Supernova Science (Poster) | Gemma Zhang · Thomas Helfer · Alex Gagliano · Siddharth Mishra-Sharma · V Villar
ProtDiff: Function-Conditioned Masked Diffusion Models for Robust Directed Protein Generation (Poster) | Vishrut Thoutam
Understanding Protein-DNA Interactions by Paying Attention to Protein and Genomics Foundation Models (Poster) | Dhruva Rajwade · Erica Wang · Aryan Satpathy · Alexander Brace · Hongyu Guo · Arvind Ramanathan · Shengchao Liu · Animashree Anandkumar
Multi-View Mixture-of-Experts for Predicting Molecular Properties Using SMILES, SELFIES, and Graph-Based Representations (Poster) | Eduardo Soares · Indra Priyadarsini S · Emilio Vital Brazil · Victor Yukio Shirasuna · Seiji Takeda
A Foundation Model for Metagenomic Sequences (Poster) | Ollie Liu · Sami Jaghouar · Johannes Hagemann · Jeff Kaufman · Willie Neiswanger
A Large Encoder-Decoder Polymer-Based Foundation Model (Poster) | Eduardo Soares · Nathaniel Park · Emilio Vital Brazil · Victor Yukio Shirasuna
OPI: An Open Instruction Dataset for Adapting Large Language Models to Protein-Related Tasks (Poster) | Hongwang Xiao · Wenjun Lin · Hui Wang · Zheng Liu · Qiwei Ye
BiRNA-BERT: Adaptive Tokenization for Efficient RNA Language Modeling (Poster) | Toki Tahmid · Haz Sameen Shahgir · Sazan Mahbub · Yue Dong · Md. Shamsuzzoha Bayzid
Solaris: A Foundation Model for the Sun (Poster) | Harris Abdul Majid · Pietro Sittoni · Francesco Tudisco
Is Tokenization Needed for Masked Particle Modelling? (Poster) | Matthew Leigh · Samuel Klein · Francois Charton · Tobias Golling · Lukas Heinrich · Michael Kagan · Margarita Osadchy
DiffBatt: A Diffusion Model for Battery Degradation Prediction and Synthesis (Poster) | Hamidreza Eivazi Kourabbaslou · André Hebenbrock · Raphael Ginster · Steffen Blömeke · Stefan Wittek · Christoph Hermann · Thomas Spengler · Thomas Turek · Andreas Rausch
ChatCite: LLM Agent with Human Workflow Guidance for Comparative Literature Summary (Poster) | Yutong Li · Lu Chen · Aiwei Liu · Kai Yu · Lijie Wen
Understanding Drought through Spatial-Temporal Learning (Poster) | Xuwei Tan · Qian Zhao · Yanlan Liu · Xueru Zhang
LLM Agent for Fire Dynamics Simulations (Poster) | Leidong Xu · Danyal Mohaddes · Yi Wang
Language Models for Text-guided Protein Evolution (Poster) | Zhanghan Ni · Shengchao Liu · Animashree Anandkumar
Cell ontology guided transcriptome foundation model (Poster) | Xinyu Yuan · Zhihao Zhan · Zuobai Zhang · Manqi Zhou · Jianan Zhao · Boyu Han · Yue Li · Jian Tang
Developing a Foundation Model for Predicting Material Failure (Poster) | Agnese Marcato · Javier E. Santos · Aleksandra Pachalieva · Kai Gao · Ryley Hill · Esteban Rougier · Qinjun Kang · Jeffrey Hyman · Abigail Hunter · Janel Chua · Earl Lawrence · Hari Viswanathan · Daniel O'Malley
A Safety-aware Framework for Generative Enzyme Design with Foundation Models (Poster) | Xiaoyi Fu · Tao Han · Yuan Yao · Song Guo
Scale-consistent learning with neural operators (Poster) | Zongyi Li · Samuel Lanthaler · Catherine Deng · Yixuan Wang · Kamyar Azizzadenesheli · Animashree Anandkumar
Solving Out-of-Distribution Challenges in Optical Foundation Models using Self-Improving Data Augmentation (Poster) | Mingqian Ma · Taigao Ma · L. Jay Guo
Pulsar Candidate Classification with Multimodal Large Language Models (Poster) | Fuyong Zhao · Yuyang Li · Yanhao Wang · Hui Li · Mei Chen · Panfeng Chen · Ningchen Sun · Cunshi Wang · Jifeng Liu
PROSE-FD: A Multimodal PDE Foundation Model for Learning Multiple Operators for Forecasting Fluid Dynamics (Poster) | Yuxuan Liu · Jingmin Sun · Xinjie He · Griffin Pinney · Zecheng Zhang · Hayden Schaeffer
Stylish and Functional: Guided Interpolation Subject to Physical Constraints (Poster) | Yan-Ying Chen · Nikos Arechiga · Chenyang Yuan · Matthew Hong · Matt Klenk · Charlene C. Wu
BarcodeMamba: State Space Models for Biodiversity Analysis (Poster) | Tiancheng Gao · Graham Taylor
Weighted Diversified Sampling for Efficient Data-Driven Single-Cell Gene-Gene Interaction Discovery (Poster) | Yifan Wu · Yuntao Yang · Zirui Liu · Zhao Li · Khushbu Pahwa · Rongbin Li · W. Jim Zheng · Xia Hu · Zhaozhuo Xu
Adapting Segment Anything Model (SAM) to Experimental Datasets via Fine-Tuning on GAN-based Simulation: A Case Study in Additive Manufacturing (Poster) | Anika Tabassum · Amir K Ziabari
Learning temperature-aware representations from millions of annotated protein sequences (Poster) | Mingchen Li · Liang Zhang · Zilan Wang · Bozitao Zhong · Pan Tan · Jiabei Cheng · Bingxin Zhou · Liang Hong · Huiqun Yu
Towards Interpretable Scientific Foundation Models: Sparse Autoencoders for Disentangling Dense Embeddings of Scientific Concepts (Poster) | Charles O'Neill · Christine Ye · Kartheik Iyer · John Wu
Extralonger: Toward a Unified Perspective of Spatial-Temporal Factors for Extra-Long-Term Traffic Forecasting (Poster) | Zhiwei Zhang · Shaojun E · Fandong Meng · Jie Zhou · Wenjuan Han
GFlowNet Pretraining with Inexpensive Rewards (Poster) | Mohit Pandey · Gopeshh Subbaraj · Emmanuel Bengio
ViTally Consistent: Scaling Biological Representation Learning for Cell Microscopy (Poster) | Kian Kenyon-Dean · Jerry Wang · John Urbanik · Konstantin Donhauser · Jason Hartford · Saber Saberian · Nil Sahin · Ihab Bendidi · Safiye Celik · Marta Fay · Juan Rodriguez · Imran Haque · Oren Kraus
Molphenix: A Multimodal Foundation Model for PhenoMolecular Retrieval (Poster) | Philip Fradkin · Puria Azadi Moghadam · Karush Suri · Frederik Wenkel · Maciej Sypetkowski · Dominique Beaini
SpectraFM: Tuning into Stellar Foundation Models (Poster) | Nolan Koblischke · Jo Bovy