Ranking variables and interactions using predictive uncertainty measures
For complex nonlinear supervised learning models, assessing the relevance of input variables or their interactions is not straightforward, because such models lack a direct measure of relevance analogous to the regression coefficients of generalized linear models. One can assess the relevance of input variables locally through the mean prediction or its derivative, but this disregards the predictive uncertainty. In this work, we present a Bayesian method for identifying input variables with relevant main effects and interactions by differentiating the Kullback-Leibler divergence between predictive distributions. The method averages over local measures of relevance and is conservative in that it accounts for the uncertainty in the predictive distribution. Our empirical results on simulated and real data sets with nonlinearities demonstrate accurate and efficient identification of relevant main effects and interactions compared to alternative methods.
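As a rough illustration of the idea, and not the authors' implementation, the sketch below assumes a model with a Gaussian predictive distribution (here a scikit-learn Gaussian process) and approximates a KL-based relevance score by finite differences: perturb one input variable, measure the KL divergence between the predictive distributions before and after the perturbation, and average over the data set. The helper names `gauss_kl` and `kl_relevance`, the perturbation size `delta`, and the use of sqrt(2·KL)/delta as a local distance-like measure are illustrative assumptions.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

def gauss_kl(m1, s1, m2, s2):
    """KL divergence KL( N(m1, s1^2) || N(m2, s2^2) ) between univariate Gaussians."""
    return np.log(s2 / s1) + (s1**2 + (m1 - m2)**2) / (2.0 * s2**2) - 0.5

def kl_relevance(model, X, delta=1e-4):
    """Finite-difference KL relevance score for each input variable (illustrative).

    For each data point and each variable j, perturb x_j by delta, compare the
    predictive distributions before and after, and average sqrt(2*KL)/delta
    over the data set; unlike a mean-prediction derivative, this also reacts
    to changes in the predictive uncertainty.
    """
    n, d = X.shape
    m0, s0 = model.predict(X, return_std=True)
    scores = np.zeros(d)
    for j in range(d):
        Xp = X.copy()
        Xp[:, j] += delta
        m1, s1 = model.predict(Xp, return_std=True)
        kl = gauss_kl(m0, s0, m1, s1)
        # Clip tiny negative values caused by floating-point error.
        scores[j] = np.mean(np.sqrt(2.0 * np.maximum(kl, 0.0)) / delta)
    return scores

# Toy example: y depends on x0 and an x0*x1 interaction; x2 is irrelevant.
rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=(200, 3))
y = np.sin(3 * X[:, 0]) + X[:, 0] * X[:, 1] + 0.1 * rng.standard_normal(200)

gp = GaussianProcessRegressor(kernel=RBF([1.0] * 3) + WhiteKernel(0.01),
                              normalize_y=True).fit(X, y)
print(kl_relevance(gp, X))  # larger scores flag x0 and x1 as relevant
```

Because the score is built from the full predictive distribution rather than the posterior mean alone, a variable whose effect is uncertain contributes less sharply than one whose effect the model is confident about, which is the conservative behavior the abstract refers to.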