Edgeworth correction for the largest eigenvalue in a spiked PCA model
We study improved approximations to the distribution of the largest eigenvalue ℓ̂ of the sample covariance matrix of n zero-mean Gaussian observations in dimension p+1. We assume that one population principal component has variance ℓ > 1 and the remaining "noise" components have common variance 1. In the high-dimensional limit p/n → γ > 0, we begin the study of Edgeworth corrections to the limiting Gaussian distribution of ℓ̂ in the supercritical case ℓ > 1 + √γ. The skewness correction involves a quadratic polynomial, as in classical settings, but the coefficients reflect the high-dimensional structure. The methods involve Edgeworth expansions for sums of independent, non-identically distributed variates, obtained by conditioning on the sample noise eigenvalues, together with the limiting bulk properties and fluctuations of these noise eigenvalues.
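For orientation, the following minimal sketch computes the first-order Gaussian approximation that such a correction would refine, and shows the classical one-term Edgeworth form with its quadratic skewness polynomial. The centering ρ = ℓ + γℓ/(ℓ − 1) and the variance σ² = 2ℓ²(1 − γ/(ℓ − 1)²) are the standard first-order results from the spiked-model literature and are an assumption here, not taken from this abstract. The function names are hypothetical, and the high-dimensional coefficients of the paper's correction are not reproduced.

```python
import math


def spike_limit(ell, gamma):
    """First-order limit rho of the largest sample eigenvalue (supercritical case).

    Assumed standard result: rho = ell + gamma * ell / (ell - 1).
    """
    if ell <= 1 + math.sqrt(gamma):
        raise ValueError("requires the supercritical case ell > 1 + sqrt(gamma)")
    return ell + gamma * ell / (ell - 1)


def spike_asymptotic_sd(ell, gamma):
    """sigma such that sqrt(n) * (ell_hat - rho) is approximately N(0, sigma^2).

    Assumed standard result: sigma^2 = 2 * ell^2 * (1 - gamma / (ell - 1)^2).
    """
    if ell <= 1 + math.sqrt(gamma):
        raise ValueError("requires the supercritical case ell > 1 + sqrt(gamma)")
    return ell * math.sqrt(2 * (1 - gamma / (ell - 1) ** 2))


def norm_pdf(x):
    """Standard normal density."""
    return math.exp(-0.5 * x * x) / math.sqrt(2 * math.pi)


def norm_cdf(x):
    """Standard normal distribution function."""
    return 0.5 * (1 + math.erf(x / math.sqrt(2)))


def edgeworth_cdf_classical(x, skew, n):
    """Classical one-term Edgeworth approximation for a standardized mean of
    n i.i.d. variates with skewness `skew`:

        Phi(x) - skew * (x^2 - 1) * phi(x) / (6 * sqrt(n)).

    This is the textbook i.i.d. form only; the spiked-PCA correction has the
    same quadratic shape but different coefficients.
    """
    return norm_cdf(x) - skew * (x * x - 1) * norm_pdf(x) / (6 * math.sqrt(n))


if __name__ == "__main__":
    ell, gamma = 3.0, 0.5
    print("rho   =", spike_limit(ell, gamma))
    print("sigma =", spike_asymptotic_sd(ell, gamma))
    print("Edgeworth CDF at 0:", edgeworth_cdf_classical(0.0, skew=1.0, n=100))
```

In the classical form the correction vanishes at x = ±1, where the quadratic x² − 1 is zero, and positive skewness raises the approximate distribution function at x = 0 above 1/2.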