On the choice of training data for machine learning of geostrophic mesoscale turbulence

by F. E. Yan, et al.

'Data' plays a central role in data-driven methods, but it is rarely the focus of investigations into machine learning algorithms applied to Earth System Modeling problems. Here we consider the case of eddy-mean interaction in rotating stratified turbulence in the presence of lateral boundaries, a problem of relevance to ocean modeling, where the eddy fluxes contain dynamically inert rotational components that are expected to contaminate the learning process. A commonly used choice in the literature is to learn from the divergence of the eddy fluxes. Here we provide theoretical arguments and numerical evidence that learning from the eddy fluxes with the rotational component appropriately filtered out results in models with comparable or better skill and substantially improved robustness. If we simply want a data-driven model to have predictive skill, then the choice and/or quality of data may not be critical, but we argue that a careful choice is highly desirable, and perhaps even necessary, if we want to leverage data-driven methods to aid in discovering unknown or hidden physical processes within the data itself.
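To make the idea of "filtering out the rotational component" concrete, the following is a minimal sketch of a Helmholtz decomposition of a two-dimensional flux field. It is not the authors' procedure: it assumes a doubly periodic square domain, where the split is unique and can be done with FFTs, whereas the abstract concerns domains with lateral boundaries, where the decomposition is non-unique and depends on the boundary conditions chosen. The function names `helmholtz_split` and `divergence` and the synthetic test field are hypothetical, chosen only for illustration.

```python
import numpy as np

def wavenumbers(n, L):
    """Angular wavenumber grids for an n x n doubly periodic domain of side L."""
    k = 2 * np.pi * np.fft.fftfreq(n, d=L / n)
    return np.meshgrid(k, k, indexing="ij")

def divergence(Fx, Fy, L=2 * np.pi):
    """Spectral divergence of a periodic vector field (axis 0 is x, axis 1 is y)."""
    kx, ky = wavenumbers(Fx.shape[0], L)
    div_hat = 1j * kx * np.fft.fft2(Fx) + 1j * ky * np.fft.fft2(Fy)
    return np.real(np.fft.ifft2(div_hat))

def helmholtz_split(Fx, Fy, L=2 * np.pi):
    """Split F into a divergent (curl-free) part and the remainder.

    Assumption: doubly periodic domain. The domain-mean of F has neither
    divergence nor curl and is left in the remainder here.
    """
    kx, ky = wavenumbers(Fx.shape[0], L)
    k2 = kx**2 + ky**2
    k2[0, 0] = 1.0  # avoid 0/0; the numerator is zero at the mean mode anyway
    # Projecting each Fourier mode onto its wavevector isolates the divergent part
    proj = (kx * np.fft.fft2(Fx) + ky * np.fft.fft2(Fy)) / k2
    Dx = np.real(np.fft.ifft2(kx * proj))
    Dy = np.real(np.fft.ifft2(ky * proj))
    return (Dx, Dy), (Fx - Dx, Fy - Dy)

# Synthetic flux: gradient of phi plus a rotational part built from psi
n, L = 64, 2 * np.pi
x = np.arange(n) * L / n
X, Y = np.meshgrid(x, x, indexing="ij")
phi_x = np.cos(X) * np.cos(2 * Y)        # d/dx of phi = sin(x) cos(2y)
phi_y = -2 * np.sin(X) * np.sin(2 * Y)   # d/dy of phi
psi_x = -3 * np.sin(3 * X) * np.sin(Y)   # d/dx of psi = cos(3x) sin(y)
psi_y = np.cos(3 * X) * np.cos(Y)        # d/dy of psi
Fx = phi_x - psi_y
Fy = phi_y + psi_x

(Dx, Dy), (Rx, Ry) = helmholtz_split(Fx, Fy, L)
print("max |div(rotational part)|:", np.abs(divergence(Rx, Ry, L)).max())
```

In this sketch the filtered flux `(Dx, Dy)` has the same divergence as the full flux `(Fx, Fy)`, while the removed part `(Rx, Ry)` is divergence-free. That is the sense in which the rotational component is dynamically inert: it does not contribute to the flux divergence that acts on the mean state.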




