Towards Foundation Models for Scientific Machine Learning: Characterizing Scaling and Transfer Behavior

by   Shashank Subramanian, et al.

Pre-trained machine learning (ML) models have shown great performance for a wide range of applications, in particular in natural language processing (NLP) and computer vision (CV). Here, we study how pre-training could be used for scientific machine learning (SciML) applications, specifically in the context of transfer learning. We study the transfer behavior of these models as (i) the pre-trained model size is scaled, (ii) the downstream training dataset size is scaled, (iii) the physics parameters are systematically pushed out of distribution, and (iv) how a single model pre-trained on a mixture of different physics problems can be adapted to various downstream applications. We find that-when fine-tuned appropriately-transfer learning can help reach desired accuracy levels with orders of magnitude fewer downstream examples (across different tasks that can even be out-of-distribution) than training from scratch, with consistent behavior across a wide range of downstream examples. We also find that fine-tuning these models yields more performance gains as model size increases, compared to training from scratch on new downstream tasks. These results hold for a broad range of PDE learning tasks. All in all, our results demonstrate the potential of the "pre-train and fine-tune" paradigm for SciML problems, demonstrating a path towards building SciML foundation models. We open-source our code for reproducibility.


page 2

page 5

page 12


SMART: Robust and Efficient Fine-Tuning for Pre-trained Natural Language Models through Principled Regularized Optimization

Transfer learning has fundamentally changed the landscape of natural lan...

Statistical Foundations of Prior-Data Fitted Networks

Prior-data fitted networks (PFNs) were recently proposed as a new paradi...

The Lottery Ticket Hypothesis for Pre-trained BERT Networks

In natural language processing (NLP), enormous pre-trained models like B...

Modular Deep Learning

Transfer learning has recently become the dominant paradigm of machine l...

SHiFT: An Efficient, Flexible Search Engine for Transfer Learning

Transfer learning can be seen as a data- and compute-efficient alternati...

Pre-Train Your Loss: Easy Bayesian Transfer Learning with Informative Priors

Deep learning is increasingly moving towards a transfer learning paradig...

Scalable Transfer Learning with Expert Models

Transfer of pre-trained representations can improve sample efficiency an...

Please sign up or login with your details

Forgot password? Click here to reset