Histogram Transform Ensembles for Density Estimation

by Hanyuan Hang, et al.

We investigate a density estimator named histogram transform ensembles (HTE), whose effectiveness is supported by both theoretical analysis and strong empirical performance. On the theoretical side, we decompose the error into an approximation error and an estimation error, which allows the following analysis. First, we establish universal consistency under the L_1(μ)-norm. Second, under the assumption that the underlying density function lies in the Hölder space C^{0,α}, we prove almost optimal convergence rates for both single and ensemble density estimators under the L_1(μ)-norm and the L_∞(μ)-norm for different tail distributions. In contrast, for the subspace C^{1,α} of smoother functions, almost optimal convergence rates can only be established for the ensembles, and a lower bound for the single estimators illustrates the benefit of ensembles over single density estimators. In the experiments, we first carry out simulations showing that histogram transform ensembles outperform single histogram transforms, which supports the theoretical results in the space C^{1,α}. Moreover, to further improve empirical performance, we propose an adaptive version of HTE and study its parameters on several synthetic datasets that vary in dimension and distribution. Finally, real-data experiments comparing against other state-of-the-art density estimators demonstrate the accuracy of the adaptive HTE algorithm.
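
The idea of averaging histogram density estimates over random transforms can be illustrated with a short sketch. The code below is not the authors' implementation: it assumes each transform is a random rotation followed by a uniform rescaling by 1 / `bin_width` and a random shift, and it omits the adaptive variant entirely. The names `random_histogram_transform`, `fit_hte`, `predict_hte`, and the parameters `n_estimators` and `bin_width` are hypothetical.

```python
import numpy as np


def random_histogram_transform(d, rng, bin_width):
    """Draw one transform x -> A x + b (assumed form: rotation, uniform scaling, shift)."""
    # Random orthogonal matrix from the QR decomposition of a Gaussian matrix.
    q, r = np.linalg.qr(rng.standard_normal((d, d)))
    q = q * np.sign(np.diag(r))  # sign fix so the draw is not biased by QR conventions
    shift = rng.uniform(0.0, 1.0, size=d)
    return q / bin_width, shift


def bin_ids(x, transform):
    """Map points to integer cell indices of the unit grid in the transformed space."""
    a, b = transform
    return np.floor(x @ a.T + b).astype(np.int64)


def fit_hte(data, n_estimators=20, bin_width=0.5, seed=0):
    """Fit an ensemble of histogram density estimators, one per random transform."""
    rng = np.random.default_rng(seed)
    n, d = data.shape
    model = []
    for _ in range(n_estimators):
        t = random_histogram_transform(d, rng, bin_width)
        counts = {}
        for key in map(tuple, bin_ids(data, t)):
            counts[key] = counts.get(key, 0) + 1
        model.append((t, counts))
    # A rotation preserves volume, so each cell has volume bin_width ** d.
    return model, n, bin_width ** d


def predict_hte(fitted, x):
    """Average the single-histogram density estimates over the ensemble."""
    model, n, cell_volume = fitted
    x = np.atleast_2d(x)
    total = np.zeros(len(x))
    for t, counts in model:
        total += np.array([counts.get(tuple(k), 0) for k in bin_ids(x, t)])
    return total / (len(model) * n * cell_volume)


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    sample = rng.standard_normal((2000, 2))
    fitted = fit_hte(sample)
    # Estimated density at the origin and at a point far from the data.
    print(predict_hte(fitted, np.array([[0.0, 0.0], [100.0, 100.0]])))
```

Setting `n_estimators=1` gives a single histogram transform estimator, which is piecewise constant on one rotated grid; averaging over many random grids smooths out the cell boundaries, which is the intuition behind the ensemble's advantage for smoother densities.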


