Bayesian additive regression trees for probabilistic programming
Bayesian additive regression trees (BART) is a non-parametric method to approximate functions. It is a black-box method based on the sum of many trees where priors are used to regularize inference, mainly by restricting trees' learning capacity so that no individual tree is able to explain the data, but rather the sum of trees. We discuss BART in the context of probabilistic programming languages (PPLs), i.e. we present BART as a primitive that can be used to build probabilistic models rather than a standalone model. Specifically, we introduce a BART implementation extending PyMC, a Python library for probabilistic programming. We present a few examples of models that can be built using this probabilistic programming-oriented version of BART, discuss recommendations for sample diagnostics and selection of model hyperparameters, and finally we close with limitations of the current approach and future extensions.
READ FULL TEXT