Behavior of Lasso and Lasso-based inference under limited variability
We study the nonasymptotic behavior of Lasso and Lasso-based inference when the covariates exhibit limited variability, a setting that does not appear to have been considered in the literature despite its prevalence in applied research. In settings generally considered favorable to Lasso, we show that if the absolute value of a nonzero regression coefficient is smaller than or equal to a threshold, Lasso fails to select the corresponding covariate with high probability (approaching 1 asymptotically). In particular, limited variability can render Lasso unable to select even covariates whose coefficients are well separated from zero. Moreover, using simple theoretical examples, we show that post double Lasso and debiased Lasso can exhibit size distortions under limited variability. Monte Carlo simulations corroborate our theoretical results and further demonstrate that, under limited variability, the performance of Lasso and Lasso-based inference methods is very sensitive to the choice of the penalty parameter. This raises the question of how to conduct statistical inference (e.g., construct confidence intervals) under limited variability. In moderately high-dimensional problems, where the number of covariates is large but still smaller than the sample size, OLS constitutes a natural alternative to Lasso-based inference methods. In empirically relevant settings, our simulation results show that, under limited variability, OLS with recently developed standard errors that are robust to the inclusion of many covariates exhibits superior finite-sample performance relative to Lasso-based inference methods.
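To make the selection-failure mechanism concrete, the following minimal Monte Carlo sketch (our illustration, not the paper's code) generates a design in which the first covariate is a rare binary dummy, so it has limited variability even though its coefficient is well separated from zero. The sample size, the 5% dummy frequency, and the theory-motivated penalty scale lambda = sqrt(2 log(p) / n) are all illustrative assumptions, not values taken from the paper.

```python
# Illustrative sketch: Lasso fails to select a low-variability covariate
# whose coefficient is well separated from zero. All design choices
# (n, p, the 5% dummy, the penalty rule) are assumptions for illustration.
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)
n, p, reps = 100, 20, 500
selected = 0
for _ in range(reps):
    X = rng.standard_normal((n, p))
    # First covariate: rare binary dummy -> limited variability
    X[:, 0] = (rng.uniform(size=n) < 0.05).astype(float)
    beta = np.zeros(p)
    beta[0] = 1.0  # nonzero coefficient, well separated from zero
    y = X @ beta + rng.standard_normal(n)
    # Common theory-motivated penalty scale (sigma = 1 here)
    lam = np.sqrt(2 * np.log(p) / n)
    fit = Lasso(alpha=lam, fit_intercept=True).fit(X, y)
    selected += fit.coef_[0] != 0
print(f"Selection frequency of the low-variability covariate: {selected / reps:.2f}")
```

Because the dummy equals one for only about 5% of observations, its empirical correlation with the outcome is small in absolute terms and falls below the soft-thresholding level implied by the penalty, so the reported selection frequency is close to zero even though the coefficient itself is large.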