A Power Analysis for Knockoffs with the Lasso Coefficient-Difference Statistic

07/30/2020
by Asaf Weinstein, et al.

In a linear model with possibly many predictors, we consider variable selection procedures of the form {1 ≤ j ≤ p : |β_j(λ)| > t}, where β(λ) is the Lasso estimate of the regression coefficients, and where λ and t may be data dependent. Ordinary Lasso selection is recovered by setting t = 0, so that only λ is controlled, whereas thresholded-Lasso selection allows control of both λ and t. The potential power advantages of the latter over the former (figuratively, the possibility of looking further down the Lasso path) have recently been quantified by leveraging advances in approximate message-passing (AMP) theory, but the resulting prescriptions are actionable only under substantial knowledge of the underlying signal. In this work we theoretically study the power of a knockoffs-calibrated counterpart of thresholded-Lasso, which enables FDR control in the realistic situation where no prior information about the signal is available. Although the basic AMP framework remains the same, our analysis requires a significant technical extension of existing theory in order to handle the pairing between original variables and their knockoffs. Relying on this extension, we obtain exact asymptotic predictions for the true positive proportion achievable at a prescribed type I error level. In particular, we show that the knockoffs version of thresholded-Lasso can perform much better than ordinary Lasso selection when λ is chosen by cross-validation on the augmented matrix.
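To make the selection rule {1 ≤ j ≤ p : |β_j(λ)| > t} concrete, below is a minimal numpy sketch, not the authors' code: the Lasso is solved by plain ISTA (proximal gradient), and the toy data, the signal strength, and the choices of λ and t are all illustrative assumptions. With t = 0 the rule reduces to ordinary Lasso selection; a positive t discards variables whose estimated coefficients are small in magnitude.

```python
import numpy as np

def lasso_ista(X, y, lam, n_iter=500):
    """Solve (1/2n)||y - Xb||^2 + lam*||b||_1 by ISTA (illustrative solver)."""
    n, p = X.shape
    b = np.zeros(p)
    L = np.linalg.norm(X, 2) ** 2 / n  # Lipschitz constant of the gradient
    for _ in range(n_iter):
        grad = X.T @ (X @ b - y) / n
        z = b - grad / L
        b = np.sign(z) * np.maximum(np.abs(z) - lam / L, 0.0)  # soft-threshold
    return b

def thresholded_lasso_select(X, y, lam, t):
    """Selection rule {1 <= j <= p : |beta_j(lam)| > t} from the abstract."""
    b = lasso_ista(X, y, lam)
    return np.flatnonzero(np.abs(b) > t)

# Hypothetical sparse setting: 5 strong signals among p = 50 variables.
rng = np.random.default_rng(0)
n, p = 200, 50
X = rng.standard_normal((n, p))
beta = np.zeros(p)
beta[:5] = 3.0
y = X @ beta + rng.standard_normal(n)

sel_ordinary = thresholded_lasso_select(X, y, lam=0.1, t=0.0)  # t = 0: ordinary Lasso
sel_threshold = thresholded_lasso_select(X, y, lam=0.1, t=0.5)  # t > 0: thresholded
print("ordinary Lasso selection:", sel_ordinary)
print("thresholded-Lasso selection:", sel_threshold)
```

Raising t can only shrink the selected set, which is the sense in which one trades discoveries for fewer false positives; the paper's contribution is to calibrate this trade-off via knockoffs so that FDR is controlled without prior knowledge of the signal.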
