Sketching for Two-Stage Least Squares Estimation
When a data set is so large that it becomes a computational burden, it is not uncommon to compute quantities of interest using a sketch of the data of size m instead of the full sample of size n. This paper investigates the implications for two-stage least squares (2SLS) estimation when the sketches are obtained by a computationally efficient method known as CountSketch. We obtain three results. First, we establish conditions under which, given the full sample, a sketched 2SLS estimate can be arbitrarily close to the full-sample 2SLS estimate with high probability. Second, we give conditions under which the sketched 2SLS estimator converges in probability to the true parameter at a rate of m^(-1/2) and is asymptotically normal. Third, we show that the asymptotic variance can be consistently estimated using the sketched sample and suggest methods for determining an inference-conscious sketch size m. The sketched 2SLS estimator is used to estimate returns to education.
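To make the idea concrete, the following is a minimal illustrative sketch in Python, not the paper's implementation. It assumes the standard form of CountSketch, in which each of the n rows is assigned to one of m buckets uniformly at random, multiplied by a random sign, and summed within its bucket. The simulated data-generating process, the parameter values, and the function names `countsketch`, `tsls`, and `sketched_2sls_demo` are all hypothetical choices made for illustration.

```python
import numpy as np


def countsketch(A, m, buckets, signs):
    """Compress the n rows of A into m rows: row i is multiplied by
    signs[i] and added to output row buckets[i]."""
    SA = np.zeros((m, A.shape[1]))
    np.add.at(SA, buckets, signs[:, None] * A)  # unbuffered scatter-add
    return SA


def tsls(y, X, Z):
    """Two-stage least squares: regress X on Z, then y on the fitted X."""
    first_stage = np.linalg.lstsq(Z, X, rcond=None)[0]
    X_hat = Z @ first_stage
    return np.linalg.lstsq(X_hat, y, rcond=None)[0]


def sketched_2sls_demo(n=100_000, m=5_000, seed=0):
    """Compare full-sample and sketched 2SLS on simulated data.
    The design below (true coefficients 1.0 and 0.5) is an assumption."""
    rng = np.random.default_rng(seed)
    z = rng.standard_normal(n)            # instrument
    v = rng.standard_normal(n)
    u = 0.5 * v + rng.standard_normal(n)  # error correlated with x through v
    x = z + v                             # endogenous regressor
    y = 1.0 + 0.5 * x + u

    X = np.column_stack([np.ones(n), x])
    Z = np.column_stack([np.ones(n), z])
    beta_full = tsls(y, X, Z)

    # Apply the SAME sketch to y, X and Z by sketching the stacked matrix.
    buckets = rng.integers(0, m, size=n)
    signs = rng.choice([-1.0, 1.0], size=n)
    D = countsketch(np.column_stack([y, X, Z]), m, buckets, signs)
    y_s, X_s, Z_s = D[:, 0], D[:, 1:3], D[:, 3:5]
    beta_sketch = tsls(y_s, X_s, Z_s)
    return beta_full, beta_sketch


if __name__ == "__main__":
    full, sketch = sketched_2sls_demo()
    print("full-sample 2SLS:", full)
    print("sketched 2SLS:   ", sketch)
```

Under these assumptions the sketched estimate is computed from m rows rather than n, so its sampling error is expected to be larger than that of the full-sample estimate, in line with the m^(-1/2) rate stated above. Note that the intercept column is sketched along with the other variables, so it is no longer a column of ones in the sketched data.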