Hyperparameters and Tuning Strategies for Random Forest

by Philipp Probst et al.

The random forest algorithm (RF) has several hyperparameters that have to be set by the user, e.g., the number of observations drawn randomly for each tree and whether they are drawn with or without replacement, the number of variables drawn randomly for each split, the splitting rule, the minimum number of samples that a node must contain, and the number of trees. In this paper, we first provide a literature review on the influence of these hyperparameters on prediction performance and on variable importance measures, also considering interactions between hyperparameters. It is well known that in most cases RF works reasonably well with the default hyperparameter values specified in software packages. Nevertheless, tuning the hyperparameters can improve the performance of RF. In the second part of this paper, after a brief overview of tuning strategies, we demonstrate the application of one of the most established tuning strategies, model-based optimization (MBO). To make this easy to apply in practice, we provide the tuneRanger R package, which tunes RF with MBO automatically. In a benchmark study on several datasets, we compare the prediction performance and runtime of tuneRanger with other tuning implementations in R and with RF using default hyperparameters.
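The hyperparameters listed in the abstract all have direct counterparts in common RF implementations. The paper's own tooling, tuneRanger, is an R package; as a purely illustrative sketch (not the authors' method), the scikit-learn analogue below maps each hyperparameter from the abstract to its Python name and runs a simple randomized search in place of MBO:

```python
# Illustrative sketch only: the paper uses the R package tuneRanger with
# model-based optimization (MBO); here a scikit-learn random forest and a
# plain randomized search stand in, to show which knobs the abstract names.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import RandomizedSearchCV

X, y = make_classification(n_samples=300, n_features=10, random_state=0)

# Hyperparameters from the abstract, in scikit-learn's terminology:
param_distributions = {
    "n_estimators": [100, 300, 500],         # number of trees
    "max_features": ["sqrt", "log2", None],  # variables drawn per split ("mtry")
    "min_samples_leaf": [1, 5, 10],          # minimum samples a node must contain
    "max_samples": [0.5, 0.8, None],         # observations drawn for each tree
    "bootstrap": [True],                     # drawn with replacement
}

search = RandomizedSearchCV(
    RandomForestClassifier(random_state=0),
    param_distributions,
    n_iter=10,
    cv=3,
    random_state=0,
)
search.fit(X, y)
print(search.best_params_)
```

Unlike random search, MBO fits a surrogate model over the evaluated configurations and proposes the next one to try, which is why it typically needs far fewer evaluations; the search space above, however, is the same in either case.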




