Distance-based regression analysis for measuring associations
The distance-based regression model, a nonparametric multivariate method, is widely used to detect associations between the variation in a distance or dissimilarity matrix for the outcomes and the predictor variables of interest. Under this model, a pseudo-F statistic that partitions the variation in the distance matrix is often constructed to test for such associations. To the best of our knowledge, the statistical properties of the pseudo-F statistic have not yet been well established in the literature. To fill this gap, we study the asymptotic null distribution of the pseudo-F statistic and show that it is asymptotically equivalent to a mixture of chi-squared random variables. Because the pseudo-F test has unsatisfactory power when the correlations among the response variables are large, we propose a square-root F-type test statistic that replaces the similarity matrix with its square root. The asymptotic null distribution of the new test statistic and the power of both tests are also investigated. Simulation studies are conducted to validate the asymptotic distributions of the tests and to demonstrate that the proposed test has more robust power than the pseudo-F test. Both test statistics are illustrated with a gene expression dataset for a prostate cancer pathway.

Keywords: Asymptotic distribution, Chi-squared-type mixture, Nonparametric test, Pseudo-F test, Similarity matrix.
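As a rough illustration of how such a statistic can be computed, the sketch below uses the commonly cited form of the pseudo-F statistic: a Gower-centered similarity matrix G is built from the distance matrix and partitioned with the hat matrix H of the predictors. The abstract does not state the formula, so this form, the function names `pseudo_f` and `psd_sqrt`, and the `use_sqrt` flag are assumptions made for illustration. In particular, the square-root variant shown here (a symmetric matrix square root of G with negative eigenvalues clipped to zero) is only one plausible reading of "replaces the similarity matrix with its square root" and may differ from the authors' construction.

```python
import numpy as np


def psd_sqrt(G):
    """Symmetric square root of G, clipping negative eigenvalues to zero.

    Assumption: non-Euclidean distances can give an indefinite G, and
    clipping is one simple way to handle that; the paper may do otherwise.
    """
    G = (G + G.T) / 2.0  # enforce symmetry before the eigendecomposition
    w, V = np.linalg.eigh(G)
    w = np.clip(w, 0.0, None)
    return (V * np.sqrt(w)) @ V.T


def pseudo_f(D, X, use_sqrt=False):
    """Pseudo-F statistic for an n x n distance matrix D and predictors X.

    X holds the predictors without an intercept column; one is added here.
    """
    D = np.asarray(D, dtype=float)
    X = np.asarray(X, dtype=float)
    if X.ndim == 1:
        X = X[:, None]
    n = D.shape[0]

    # Gower-centered similarity matrix: G = C (-D^2 / 2) C
    C = np.eye(n) - np.ones((n, n)) / n
    G = C @ (-0.5 * D**2) @ C
    if use_sqrt:
        G = psd_sqrt(G)  # hypothetical square-root variant

    # Hat matrix of the design matrix (intercept plus predictors)
    Xd = np.column_stack([np.ones(n), X])
    H = Xd @ np.linalg.pinv(Xd)
    R = np.eye(n) - H
    m = Xd.shape[1]

    explained = np.trace(H @ G @ H) / (m - 1)
    residual = np.trace(R @ G @ R) / (n - m)
    return explained / residual


# Usage on simulated data: one predictor, three correlated outcomes
rng = np.random.default_rng(0)
x = rng.normal(size=40)
Y = np.outer(x, [1.0, 0.5, 0.0]) + rng.normal(size=(40, 3))
D = np.linalg.norm(Y[:, None, :] - Y[None, :, :], axis=2)  # Euclidean distances
print(pseudo_f(D, x), pseudo_f(D, x, use_sqrt=True))
```

With a single outcome and Euclidean distance, this form reduces to the classical regression F statistic, which gives a convenient sanity check. The sketch computes only the statistic; obtaining a p-value would additionally require the null distribution studied in the paper or a permutation procedure.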