An Empirical Bayes Regression for Multi-tissue eQTL Data Analysis
The Genotype-Tissue Expression (GTEx) project collects samples from multiple human tissues to study the relationship between genetic variation, such as single nucleotide polymorphisms (SNPs), and gene expression in each tissue. However, most existing expression quantitative trait locus (eQTL) analyses use information from only a single tissue. In this paper, we develop a multi-tissue eQTL analysis that improves on single-tissue analysis of cis-SNP associations with gene expression by borrowing information across tissues. Specifically, we propose an empirical Bayes regression model for SNP-expression association analysis using data across multiple tissues. To allow the effects of SNPs to vary greatly among tissues, we use a mixture prior consisting of a multivariate Gaussian distribution and a Dirac mass at zero. The model allows us to assess the association between cis-SNPs and gene expression in each tissue by calculating Bayes factors. We show that the proposed estimator of the cis-SNP effects on gene expression achieves the minimum Bayes risk among all estimators. Analyses of the GTEx data show that our proposed method outperforms traditional simple regression methods in the accuracy of predicting gene expression levels from cis-SNPs in test data sets. Moreover, we find that although genetic effects on expression are extensively shared among tissues, effect sizes still vary greatly across tissues.
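To make the mixture prior concrete, the following is a minimal sketch and not the model fitted in the paper. It assumes a single SNP whose per-tissue effect estimates `b_hat` are Gaussian around the true effects with known sampling covariance `V`, and a prior that places probability `pi_slab` on a zero-mean multivariate Gaussian with covariance `Sigma` and the remainder on a point mass at zero for all tissues jointly. The function names, the joint (rather than per-tissue) Bayes factor, and the treatment of `V`, `Sigma`, and `pi_slab` as known are all illustrative assumptions; in an empirical Bayes analysis the prior parameters would be estimated from the data.

```python
import numpy as np


def log_mvn_pdf(x, cov):
    """Log density of a zero-mean multivariate Gaussian evaluated at x."""
    k = x.shape[0]
    _, logdet = np.linalg.slogdet(cov)
    quad = x @ np.linalg.solve(cov, x)
    return -0.5 * (k * np.log(2.0 * np.pi) + logdet + quad)


def spike_slab_posterior(b_hat, V, Sigma, pi_slab):
    """Posterior summaries under a Gaussian / point-mass-at-zero mixture prior.

    b_hat   : per-tissue effect estimates, shape (T,)
    V       : sampling covariance of b_hat, shape (T, T)  (assumed known)
    Sigma   : covariance of the Gaussian prior component, shape (T, T)
    pi_slab : prior probability that the effects are non-zero
    """
    # Marginal of b_hat is N(0, V + Sigma) under the Gaussian component
    # and N(0, V) under the point mass at zero.
    log_bf = log_mvn_pdf(b_hat, V + Sigma) - log_mvn_pdf(b_hat, V)

    # Posterior probability of the Gaussian component, via the log odds.
    log_odds = np.log(pi_slab) - np.log1p(-pi_slab) + log_bf
    post_prob = 1.0 / (1.0 + np.exp(-log_odds))

    # Posterior mean: conditional Gaussian mean, weighted by post_prob
    # (the point-mass component contributes zero).
    slab_mean = Sigma @ np.linalg.solve(V + Sigma, b_hat)
    return log_bf, post_prob, post_prob * slab_mean


# Hypothetical numbers for three tissues with correlated prior effects.
b_hat = np.array([0.5, 0.4, 0.6])
V = 0.01 * np.eye(3)
Sigma = 0.25 * (0.5 * np.eye(3) + 0.5 * np.ones((3, 3)))
log_bf, post_prob, post_mean = spike_slab_posterior(b_hat, V, Sigma, pi_slab=0.1)
```

Under squared-error loss the posterior mean is the Bayes estimator for a given prior, which is the sense in which an estimator of this form can attain the minimum Bayes risk. The off-diagonal entries of `Sigma` are what let one tissue's estimate inform another's.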