Optimal Eigenvalue Shrinkage in the Semicircle Limit
Recent studies of high-dimensional covariance estimation often assume the proportional growth asymptotic, where the sample size n and dimension p are comparable, with n, p → ∞ and γ_n ≡ p/n → γ > 0. Yet many datasets have very different numbers of rows and columns. Consider instead disproportional growth, where n, p → ∞ and γ_n → 0 or γ_n → ∞. With far fewer dimensions than observations, the disproportional limit γ_n → 0 may seem similar to classical fixed-p asymptotics. In fact, either disproportional limit induces novel phenomena distinct from the proportional and fixed-p limits. We study the spiked covariance model, deriving optimal shrinkage and thresholding rules for each of 15 different loss functions. Readers who initially view the disproportional limit γ_n → 0 as similar to classical fixed-p asymptotics may expect, given the dominance of the sample covariance estimator in that setting, that no alternative is available here. On the contrary, our optimal procedures demand extensive eigenvalue shrinkage and offer substantial performance benefits. The sample covariance is similarly improvable in the disproportional limit γ_n → ∞. Practitioners may worry about how to choose between the proportional and disproportional growth frameworks. Conveniently, under the spiked covariance model there is no conflict between the two and no choice is needed; one unified set of closed forms (used with the aspect ratio γ_n of the practitioner's data) offers full asymptotic optimality in both regimes. At the heart of these phenomena is the spiked Wigner model. Via a connection to the spiked covariance model as γ_n → 0, we derive optimal shrinkers for the Wigner setting.
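To see why eigenvalue shrinkage is needed in the Wigner setting at all, the following is a minimal simulation sketch, not the paper's procedure. It assumes a rank-one spiked Wigner matrix Y = W + θ v vᵀ with W a Gaussian symmetric noise matrix scaled so that its spectrum follows the semicircle law on [-2, 2]. For θ > 1, the standard result for this model is that the top eigenvalue of Y concentrates near θ + 1/θ rather than θ, and the squared cosine between the top eigenvector and v is near 1 - 1/θ². The function names (`spiked_wigner`, `debias_eigenvalue`, `demo`) and all parameter values are hypothetical, chosen for illustration. The debiasing map below only inverts the eigenvalue inflation; it is not one of the loss-specific optimal shrinkers derived in the paper.

```python
import numpy as np


def spiked_wigner(n, theta, rng):
    """Return (Y, v) with Y = W + theta * v v^T (illustrative assumption).

    W is symmetric Gaussian noise with off-diagonal variance 1/n, so its
    eigenvalues fill roughly [-2, 2] for large n (semicircle law).
    """
    g = rng.standard_normal((n, n))
    w = (g + g.T) / np.sqrt(2.0 * n)
    v = np.zeros(n)
    v[0] = 1.0  # unit-norm spike direction
    return w + theta * np.outer(v, v), v


def debias_eigenvalue(lam):
    """Invert lam = theta + 1/theta for lam > 2; return 0 inside the bulk.

    This only undoes the eigenvalue inflation. It is NOT a loss-optimal
    shrinker from the paper.
    """
    if lam <= 2.0:
        return 0.0
    return (lam + np.sqrt(lam * lam - 4.0)) / 2.0


def demo(n=500, theta=3.0, seed=0):
    """Simulate one matrix and return (top eigenvalue, debiased value, cos^2)."""
    rng = np.random.default_rng(seed)
    y, v = spiked_wigner(n, theta, rng)
    vals, vecs = np.linalg.eigh(y)  # eigenvalues in ascending order
    lam_top = float(vals[-1])
    cos2 = float(np.dot(vecs[:, -1], v) ** 2)
    return lam_top, float(debias_eigenvalue(lam_top)), cos2


if __name__ == "__main__":
    lam_top, theta_hat, cos2 = demo()
    print("top eigenvalue:", lam_top)
    print("debiased spike estimate:", theta_hat)
    print("squared cosine with true spike:", cos2)
```

Running `demo()` with a larger `n` should bring the top eigenvalue closer to θ + 1/θ. Because the top eigenvector is only partially aligned with v, simply debiasing the eigenvalue is generally not the best choice for a given loss, which is the motivation for loss-specific shrinkers.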