Joint Mean-Covariance Estimation via the Horseshoe with an Application in Genomic Data Analysis
Seemingly unrelated regression is a natural framework for regressing multiple correlated responses on multiple predictors. The model is very flexible, with multiple linear regression and covariance selection models being special cases. However, its practical deployment in genomic data analysis under a Bayesian framework is limited by both statistical and computational challenges. The statistical challenge is that one needs to infer both the mean vector and the covariance matrix, a problem inherently more complex than estimating either one separately. The computational challenge is that the dimension of the parameter space routinely exceeds the sample size. We propose the use of horseshoe priors on both the mean vector and the inverse covariance matrix. This prior has demonstrated excellent performance when estimating a mean vector or covariance matrix separately. The current work shows that these advantages persist when both are estimated simultaneously. A full Bayesian treatment is proposed, with a sampling algorithm that is linear in the number of predictors. MATLAB code implementing the algorithm is freely available on GitHub at https://github.com/liyf1988/HS_GHS. Extensive performance comparisons are provided with both frequentist and Bayesian alternatives, and both estimation and prediction performance are verified on a genomic data set.
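For orientation, the setup described above can be sketched in symbols. The notation here is an assumption and does not come from the abstract: n samples, p predictors, q responses, a coefficient matrix B, and an inverse covariance matrix \Omega. The half-Cauchy scale-mixture form is the standard way of writing a horseshoe prior; placing it elementwise on B and on the off-diagonal entries of \Omega is one plausible reading of "horseshoe priors on both the mean vector and the inverse covariance matrix", and the full text may parameterize it differently.

```latex
% Assumed notation (not from the abstract):
%   Y is n x q (responses), X is n x p (predictors),
%   B = (\beta_{jk}) is p x q, \Omega = (\omega_{kl}) is the q x q inverse covariance,
%   C^{+}(0,1) denotes the standard half-Cauchy distribution.
\begin{align*}
Y &= XB + E, \qquad E_{i\cdot} \overset{\text{iid}}{\sim} \mathcal{N}_q\!\left(0, \Omega^{-1}\right), \quad i = 1, \dots, n, \\
\beta_{jk} \mid \lambda_{jk}, \tau &\sim \mathcal{N}\!\left(0, \lambda_{jk}^{2}\tau^{2}\right), \qquad \lambda_{jk} \sim C^{+}(0,1), \quad \tau \sim C^{+}(0,1), \\
\omega_{kl} \mid \nu_{kl}, \zeta &\sim \mathcal{N}\!\left(0, \nu_{kl}^{2}\zeta^{2}\right) \ \ (k < l), \qquad \nu_{kl} \sim C^{+}(0,1), \quad \zeta \sim C^{+}(0,1),
\end{align*}
% \Omega is constrained to be symmetric positive definite; the prior on its
% diagonal entries is left unspecified in this sketch.
```

Under this notation, the two special cases named in the abstract correspond to q = 1 (multiple linear regression with a single response) and B = 0 (covariance selection, where only \Omega is estimated).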