Sensitivity Analysis of RF+clust for Leave-one-problem-out Performance Prediction

05/30/2023
by   Ana Nikolikj, et al.

Leave-one-problem-out (LOPO) performance prediction requires machine learning (ML) models to extrapolate algorithms' performance from a set of training problems to a previously unseen problem. LOPO is a very challenging task even for state-of-the-art approaches: models that work well in the easier leave-one-instance-out scenario often fail to generalize to the LOPO setting. To address the LOPO problem, recent work suggested enriching standard random forest (RF) performance regression models with a weighted average of algorithms' performance on training problems that are considered similar to the test problem. More precisely, in this RF+clust approach, the weights are chosen proportionally to the distances of the problems in some feature space. In this work, we extend the RF+clust approach by adjusting the distance-based weights according to the importance of the features for performance regression. That is, instead of considering the plain cosine distance in the feature space, we consider a weighted distance measure, with weights depending on the relevance of each feature for the regression model. Our empirical evaluation of the modified RF+clust approach on the CEC 2014 benchmark suite confirms its advantages over the naive distance measure. However, we also observe room for improvement, in particular with respect to more expressive feature portfolios.
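To make the idea of an importance-weighted measure concrete, the following is a minimal sketch and not the authors' implementation. It assumes non-negative per-feature importance weights (for example, taken from a fitted regression model), uses cosine similarity directly as the averaging weight, and selects similar training problems with a fixed threshold. The function names, the `threshold` parameter, and its default value are hypothetical. How the resulting average is combined with the RF prediction is not specified in the abstract and is left out here.

```python
import math


def weighted_cosine_similarity(x, y, importance):
    """Cosine similarity in which each feature is scaled by a non-negative
    importance weight (assumption: weights come from the regression model)."""
    dot = sum(w * a * b for w, a, b in zip(importance, x, y))
    norm_x = math.sqrt(sum(w * a * a for w, a in zip(importance, x)))
    norm_y = math.sqrt(sum(w * b * b for w, b in zip(importance, y)))
    if norm_x == 0.0 or norm_y == 0.0:
        return 0.0  # undefined for zero vectors; treat as "not similar"
    return dot / (norm_x * norm_y)


def similarity_weighted_performance(test_features, train_features,
                                    train_performance, importance,
                                    threshold=0.9):
    """Similarity-weighted mean performance over training problems whose
    weighted similarity to the test problem reaches `threshold`.

    Returns None when no training problem is similar enough.
    The threshold value is a hypothetical choice for illustration.
    """
    if threshold <= 0.0:
        raise ValueError("threshold must be positive")
    selected = []
    for features, perf in zip(train_features, train_performance):
        sim = weighted_cosine_similarity(test_features, features, importance)
        if sim >= threshold:
            selected.append((sim, perf))
    if not selected:
        return None
    total = sum(sim for sim, _ in selected)
    return sum(sim * perf for sim, perf in selected) / total
```

Setting every importance weight to 1 recovers the unweighted cosine similarity, which corresponds to the naive measure the abstract compares against; setting a weight to 0 removes that feature from the comparison entirely.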


research
01/23/2023

RF+clust for Leave-One-Problem-Out Performance Prediction

Per-instance automated algorithm configuration and selection are gaining...
research
02/25/2022

MUC-driven Feature Importance Measurement and Adversarial Analysis for Random Forest

The broad adoption of Machine Learning (ML) in security-critical fields ...
research
04/22/2021

Personalizing Performance Regression Models to Black-Box Optimization Problems

Accurately predicting the performance of different optimization algorith...
research
05/17/2023

Optimal Weighted Random Forests

The random forest (RF) algorithm has become a very popular prediction me...
research
05/11/2023

How to out-perform default random forest regression: choosing hyperparameters for applications in large-sample hydrology

Predictions are a central part of water resources research. Historically...
research
10/08/2020

Exploring Sensitivity of ICF Outputs to Design Parameters in Experiments Using Machine Learning

Building a sustainable burn platform in inertial confinement fusion (ICF...
research
03/08/2023

A path in regression Random Forest looking for spatial dependence: a taxonomy and a systematic review

Random Forest (RF) is a well-known data-driven algorithm applied in seve...
