Optimally Weighted Ensembles of Regression Models: Exact Weight Optimization and Applications

06/22/2022
by Patrick Echtenbruck, et al.

Automated model selection is often offered to users as a way to choose which machine learning model (or method) to apply to a given regression task. In this paper, we show that combining different regression models can yield better results than selecting a single ('best') regression model, and we outline an efficient method that obtains an optimally weighted convex linear combination from a heterogeneous set of regression models. More specifically, a heuristic weight optimization, used in a preceding conference paper, is replaced by an exact optimization algorithm based on convex quadratic programming. We prove convexity of the quadratic program both for the straightforward formulation and for a formulation with weighted data points. The novel weight optimization is not only exact but also more efficient than the heuristic. The methods developed in this paper are implemented and made available as open source on GitHub. They can be executed on commonly available hardware and offer a transparent, easy-to-interpret interface. The results indicate that the approach outperforms model selection methods on a range of data sets, including data sets with mixed variable types from drug discovery applications.
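To illustrate the kind of problem the abstract describes, the following is a minimal sketch of fitting convex combination weights by minimizing squared error over the probability simplex. It is not the authors' implementation: the function name `optimal_convex_weights`, the optional `sample_weight` argument, and the choice of SciPy's general-purpose SLSQP solver (used here in place of a dedicated quadratic programming solver) are all assumptions made for illustration.

```python
import numpy as np
from scipy.optimize import minimize


def optimal_convex_weights(P, y, sample_weight=None):
    """Hypothetical helper: weights w >= 0 with sum(w) == 1 minimizing
    the (optionally data-weighted) squared error of the ensemble P @ w.

    P: (n_samples, n_models) array of per-model predictions.
    y: (n_samples,) array of targets.
    sample_weight: optional non-negative per-sample weights (assumption).
    """
    P = np.asarray(P, dtype=float)
    y = np.asarray(y, dtype=float)
    n, m = P.shape
    d = np.ones(n) if sample_weight is None else np.asarray(sample_weight, dtype=float)

    # Quadratic form of the objective: w^T Q w - 2 c^T w + const.
    # Q is a (weighted) Gram matrix, hence positive semidefinite,
    # which makes the objective convex.
    Q = P.T @ (d[:, None] * P)
    c = P.T @ (d * y)
    const = y @ (d * y)

    def objective(w):
        return w @ Q @ w - 2.0 * (c @ w) + const

    def gradient(w):
        return 2.0 * (Q @ w - c)

    w0 = np.full(m, 1.0 / m)  # start from the uniform ensemble
    result = minimize(
        objective,
        w0,
        jac=gradient,
        method="SLSQP",  # generic solver, swapped in for a QP solver
        bounds=[(0.0, 1.0)] * m,
        constraints=[{"type": "eq", "fun": lambda w: np.sum(w) - 1.0}],
    )
    return result.x


if __name__ == "__main__":
    # Synthetic demo: three noisy "models" predicting the same target.
    rng = np.random.default_rng(0)
    y = rng.normal(size=200)
    P = np.column_stack([y + rng.normal(scale=s, size=200) for s in (0.5, 0.5, 1.0)])
    w = optimal_convex_weights(P, y)
    print("weights:", w)
    print("ensemble MSE:", np.mean((P @ w - y) ** 2))
    print("single-model MSEs:", np.mean((P - y[:, None]) ** 2, axis=0))
```

Because every single model corresponds to a corner of the simplex, the optimally weighted combination can never have a larger training error than the best individual model on the same data; whether that advantage carries over to unseen data is an empirical question.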

