Tweedie Gradient Boosting for Extremely Unbalanced Zero-inflated Data

11/26/2018
by He Zhou, et al.

Tweedie's compound Poisson model is a popular method for modeling insurance premiums, whose distribution has a probability mass at zero and is otherwise nonnegative and highly right-skewed. For extremely unbalanced zero-inflated insurance data, we propose an alternative, the zero-inflated Tweedie model, which assumes that with probability q the claim loss is 0, and with probability 1-q a Tweedie-distributed insurance amount is claimed. The mixture model is straightforward to fit using the EM algorithm. We make a nonparametric assumption on the logarithmic mean of the Tweedie part and propose a gradient tree-boosting algorithm to fit it, which is capable of capturing nonlinearities, discontinuities, and complex, higher-order interactions among predictors. A simulation study confirms the excellent prediction performance of our method on zero-inflated data sets. As an application, we apply our method to zero-inflated auto-insurance claim data and show that it is superior to existing gradient boosting methods in the sense that it generates more accurate premium predictions. A heuristic score test with a threshold is presented to decide whether the Tweedie model should be extended to the zero-inflated Tweedie model.
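To make the mixture structure concrete, the following is a minimal sketch of the zero-inflated Tweedie model and of an EM update for q. It is not the authors' algorithm: it assumes a constant Tweedie mean mu, a known dispersion phi, and a known power p in (1, 2), whereas the abstract describes fitting the log-mean nonparametrically with gradient tree boosting. All function names are hypothetical. The sketch relies on the standard compound Poisson-gamma representation of the Tweedie distribution, under which the Tweedie part itself produces a zero with probability exp(-lambda), where lambda = mu^(2-p) / (phi (2-p)).

```python
import numpy as np


def tweedie_zero_prob(mu, phi, p):
    """P(Y = 0) under a Tweedie compound Poisson model with 1 < p < 2."""
    lam = mu ** (2 - p) / (phi * (2 - p))  # Poisson rate of the claim count
    return np.exp(-lam)


def simulate_zi_tweedie(n, q, mu, phi, p, rng):
    """Draw n losses: 0 with probability q, otherwise Tweedie(mu, phi, p)."""
    lam = mu ** (2 - p) / (phi * (2 - p))
    shape = (2 - p) / (p - 1)              # gamma shape of one claim size
    scale = phi * (p - 1) * mu ** (p - 1)  # gamma scale of one claim size
    counts = rng.poisson(lam, size=n)
    y = np.zeros(n)
    pos = counts > 0
    # A sum of k iid Gamma(shape, scale) variables is Gamma(k * shape, scale).
    y[pos] = rng.gamma(shape * counts[pos], scale)
    # Overwrite with structural zeros (the inflated component).
    y[rng.random(n) < q] = 0.0
    return y


def e_step(y, q, mu, phi, p):
    """Posterior probability that each observation is a structural zero."""
    p0 = tweedie_zero_prob(mu, phi, p)
    post = np.zeros_like(y, dtype=float)
    zero = y == 0
    post[zero] = q / (q + (1 - q) * p0)  # positive losses cannot be structural zeros
    return post


def em_for_q(y, mu, phi, p, q_init=0.5, n_iter=50):
    """EM iterations for q only, holding the Tweedie parameters fixed (assumption)."""
    q = q_init
    for _ in range(n_iter):
        q = e_step(y, q, mu, phi, p).mean()  # M-step: average posterior weight
    return q
```

For example, calling `simulate_zi_tweedie(20000, 0.7, 1.0, 1.0, 1.5, np.random.default_rng(0))` and passing the result to `em_for_q` with the same mu, phi, and p should recover a value of q near 0.7. In a full fit, the M-step would also update the Tweedie part using the weights 1 - post, which is where a boosting step would enter.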

research
07/15/2023

CatBoost Versus XGBoost and LightGBM: Developing Enhanced Predictive Models for Zero-Inflated Insurance Claim Data

In the property and casualty insurance industry, some challenges are pre...
research
03/26/2022

Estimating the Ratio of Means in a Zero-inflated Poisson Mixture Model

The problem of estimating the ratio of the means of a two-component Pois...
research
03/06/2018

Accelerated Gradient Boosting

Gradient tree boosting is a prediction algorithm that sequentially produ...
research
03/30/2020

Exponential Dispersion Models for Overdispersed Zero-Inflated Count Data

We consider three new classes of exponential dispersion models of discre...
research
12/12/2012

Staged Mixture Modelling and Boosting

In this paper, we introduce and evaluate a data-driven staged mixture mo...
research
08/27/2022

Modelling structural zeros in compositional data via a zero-censored multivariate normal model

We present a new model for analyzing compositional data with structural ...
research
07/26/2020

Iterative Boosting Deep Neural Networks for Predicting Click-Through Rate

The click-through rate (CTR) reflects the ratio of clicks on a specific ...
