On the sign recovery given by the thresholded LASSO and thresholded Basis Pursuit

12/13/2018
by Patrick J. C. Tardivel, et al.

We consider the regression model in which the number of observations is smaller than the number of explanatory variables. It is well known that the popular Least Absolute Shrinkage and Selection Operator (LASSO) can recover the sign of the regression coefficients only if a very stringent irrepresentable condition is satisfied. In this article, we first provide a new result about the irrepresentable condition: when it does not hold, the probability that the LASSO recovers the sign is smaller than 1/2. Next, we revisit the properties of the thresholded LASSO and provide new theoretical results in the asymptotic setup where the design matrix is fixed and the magnitudes of the nonzero regression coefficients tend to infinity. Apart from the LASSO, our results also cover basis pursuit, which can be thought of as a limiting case of the LASSO when the tuning parameter tends to 0. Compared with the classical asymptotics, our approach reduces the technical burden. We formulate a simple identifiability condition which turns out to be necessary and sufficient for the thresholded LASSO to recover the sign of a sufficiently large signal. Our simulation study illustrates the difference between the irrepresentable and identifiability conditions. Interestingly, while the irrepresentable condition becomes more difficult to satisfy for strongly correlated designs, this does not seem to be the case for the identifiability condition. In fact, when the correlations are positive and the nonzero coefficients have the same sign, the identifiability condition allows the number of nonzero coefficients to be larger than when the regressors are independent.
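As a rough illustration of the procedure discussed in the abstract (not the authors' code), the sketch below simulates a design with fewer observations than variables, fits the LASSO, thresholds the estimated coefficients, and checks whether the sign pattern of the true coefficients is recovered. The design dimensions, signal magnitude, penalty alpha, and threshold tau are all illustrative assumptions.

```python
# Minimal sketch of sign recovery with the thresholded LASSO.
# All numerical settings below are assumptions chosen for illustration only.
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)
n, p, k = 50, 200, 5            # fewer observations (n) than explanatory variables (p)
X = rng.standard_normal((n, p))
beta = np.zeros(p)
beta[:k] = 10.0                 # large nonzero coefficients (large-signal regime)
y = X @ beta + rng.standard_normal(n)

# Fit the LASSO with an illustrative tuning parameter.
lasso = Lasso(alpha=0.1, fit_intercept=False, max_iter=50_000)
lasso.fit(X, y)

# Threshold small estimated coefficients to zero before reading off the signs.
tau = 1.0
beta_thr = np.where(np.abs(lasso.coef_) > tau, lasso.coef_, 0.0)

recovered = np.array_equal(np.sign(beta_thr), np.sign(beta))
print("sign pattern recovered:", recovered)
```

In this sketch, basis pursuit would correspond to letting the tuning parameter alpha tend to 0; the thresholding step is what allows sign recovery even when the LASSO itself keeps some small spurious coefficients.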
