Bayesian variable selection in hierarchical difference-in-differences models
The difference-in-differences (DiD) model is a popular method for estimating a causal treatment effect from observational data. In this work, we extend the classical DiD setting to a hierarchical context in which data cannot be matched at the most granular level (e.g., individual-level differences are unobservable). We propose a Bayesian hierarchical difference-in-differences (HDiD) model, which estimates the treatment effect by regressing a latent variable representing the mean change in group-level outcome on the treatment. We present theoretical and empirical results showing that the treatment effect estimate is biased when an HDiD model fails to adjust for a particular class of confounding variables or for confounding with the baseline (pre-treatment) outcomes. We propose and implement several approaches to variable selection using a structured Bayesian spike-and-slab model in the HDiD context. These methods leverage the temporal structure of the DiD setting to select the covariates that lead to unbiased and efficient estimation of the causal treatment effect. We evaluate the methods' properties through theoretical results and simulation, and we use them to assess the impact of primary care redesign in Minnesota clinics on diabetes management outcomes from 2008 to 2017.
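For orientation, the classical DiD contrast that the hierarchical model builds on can be sketched with group-level means alone, which is the kind of data available when individual-level differences are unobservable. The sketch below is not the HDiD model or the spike-and-slab selection procedure described above; it only illustrates the basic two-period estimator, without any covariate adjustment. The function name `did_estimate` and the toy clinic-level numbers are hypothetical and chosen purely for illustration.

```python
import numpy as np


def did_estimate(pre_mean, post_mean, treated):
    """Classical two-period DiD estimate from group-level means.

    pre_mean, post_mean: mean outcome per group before and after treatment.
    treated: boolean flag per group (True if the group received treatment).
    Returns the mean change among treated groups minus the mean change
    among control groups. No confounder adjustment is performed.
    """
    pre_mean = np.asarray(pre_mean, dtype=float)
    post_mean = np.asarray(post_mean, dtype=float)
    treated = np.asarray(treated, dtype=bool)

    # Group-level change in the outcome between the two periods.
    change = post_mean - pre_mean

    # Difference between the average changes of the two arms.
    return change[treated].mean() - change[~treated].mean()


# Hypothetical toy data: four groups, two control and two treated.
pre = [7.0, 7.5, 8.0, 8.5]
post = [7.5, 8.0, 8.0, 8.5]
treated = [False, False, True, True]

estimate = did_estimate(pre, post, treated)
print(estimate)
```

In this toy example the control groups each rise by 0.5 while the treated groups stay flat, so the contrast attributes a change of -0.5 to treatment. That attribution is only valid under the usual parallel-trends assumption, and it is exactly the unadjusted estimate that the abstract warns can be biased when relevant confounders or baseline outcomes are ignored.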