De-Biasing The Lasso With Degrees-of-Freedom Adjustment
This paper studies schemes to de-bias the Lasso in sparse linear regression, where the goal is to estimate and construct confidence intervals for a low-dimensional projection of the unknown coefficient vector in a preconceived direction a_0. We assume that the design matrix has iid Gaussian rows with known covariance matrix Σ. Our analysis reveals that previous proposals to de-bias the Lasso require a modification in order to enjoy asymptotic efficiency over the full range of sparsity levels. This modification takes the form of a degrees-of-freedom adjustment that accounts for the dimension of the model selected by the Lasso. Let s_0 denote the number of nonzero coefficients of the true coefficient vector. The unadjusted de-biasing schemes proposed in previous studies enjoy efficiency if s_0 ≪ n^{2/3}, up to logarithmic factors. However, if s_0 ≫ n^{2/3}, the unadjusted scheme cannot be efficient in certain directions a_0. In the latter regime, it is necessary to modify existing procedures by an adjustment that accounts for the degrees of freedom of the Lasso. The proposed degrees-of-freedom adjustment grants asymptotic efficiency for any direction a_0. This holds under a Sparse Riesz Condition on the covariance matrix Σ and the sample size requirements s_0/p → 0 and s_0 log(p/s_0)/n → 0. Our analysis also highlights that the degrees-of-freedom adjustment is not necessary when the initial bias of the Lasso in the direction a_0 is small, which is granted under more stringent conditions on Σ^{-1}. This explains why the necessity of the degrees-of-freedom adjustment did not appear in some previous studies. The main proof argument involves a Gaussian interpolation path similar to that used to derive Slepian's lemma. It yields a sharp ℓ_∞ error bound for the Lasso under Gaussian design, which is of independent interest.
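To make the adjustment concrete, the following is a minimal sketch rather than the paper's exact estimator. The abstract does not state a formula, so the form used here is an assumption: the usual de-biasing correction along a score vector z_0 = X u_0, with u_0 = Σ^{-1} a_0 / ⟨a_0, Σ^{-1} a_0⟩, is divided by 1 − |Ŝ|/n, where |Ŝ| is the number of nonzero Lasso coefficients. The function name `debiased_projection` and its parameters are hypothetical, and the Lasso estimate `beta_hat` is assumed to be computed elsewhere.

```python
import numpy as np


def debiased_projection(X, y, beta_hat, a0, Sigma, adjust=True):
    """De-biased estimate of <a0, beta> from a Lasso fit beta_hat.

    Assumed form (not quoted from the paper): the correction term is
    rescaled by 1 / (1 - df / n), with df the size of the Lasso support.
    With adjust=False the rescaling is skipped (unadjusted scheme).
    """
    n = X.shape[0]

    # u0 = Sigma^{-1} a0, normalised so that <a0, u0> = 1.
    u0 = np.linalg.solve(Sigma, a0)
    u0 = u0 / (a0 @ u0)

    # Score vector for the direction a0 and the Lasso residuals.
    z0 = X @ u0
    residual = y - X @ beta_hat

    # Degrees of freedom of the Lasso: number of selected variables.
    df = np.count_nonzero(beta_hat) if adjust else 0
    if df >= n:
        raise ValueError("support size must be smaller than n")

    correction = (z0 @ residual) / ((z0 @ z0) * (1.0 - df / n))
    return a0 @ beta_hat + correction
```

Calling the function with `adjust=False` gives the unadjusted correction, so the two variants can be compared on the same fit. The gap between them grows with the ratio |Ŝ|/n, which matches the abstract's point that the adjustment matters in the less sparse regime.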