Robust adaptive variable selection in ultra-high dimensional regression models based on the density power divergence loss
We consider the problem of simultaneous model selection and estimation of regression coefficients in ultra-high dimensional linear regression models, where the number of covariates may grow at a non-polynomial (exponential) rate with the sample size, a problem of great contemporary importance. Adaptive penalty functions are used in this context to achieve the oracle model-selection property while keeping the computational burden manageable. However, the usual adaptive procedures (e.g., the adaptive LASSO), being based on the squared-error loss function, are extremely non-robust in the presence of data contamination, a common problem with large-scale data such as noisy gene-expression data. In this paper, we present a regularization procedure for ultra-high dimensional data that combines a robust loss function based on the popular density power divergence (DPD) measure with the adaptive LASSO penalty. We theoretically study the robustness and large-sample properties of the proposed adaptive robust estimator for a general class of error distributions; in particular, we show that the proposed adaptive DPD-LASSO estimator is highly robust and satisfies the oracle model-selection property, and that the corresponding estimators of the regression coefficients are consistent and asymptotically normal under an easily verifiable set of assumptions. Illustrations are also provided for the important special case of the normal error density.
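To make the objective concrete, the following is a minimal sketch of an adaptive DPD-LASSO fit for the normal-error case, using the standard normal-model DPD loss (in the spirit of Basu et al.'s density power divergence) plus an adaptively weighted L1 penalty. The function names, the ridge pilot estimator, and all tuning defaults are illustrative assumptions, not the paper's actual implementation or choice of algorithm.

```python
import numpy as np
from scipy.optimize import minimize

def dpd_loss(params, X, y, alpha):
    """Empirical DPD loss for the normal-error linear model (alpha > 0).

    H_n(beta, sigma) = (2*pi*sigma^2)^(-alpha/2) *
        [ (1+alpha)^(-1/2) - (1 + 1/alpha) * mean(exp(-alpha * r_i^2 / 2)) ],
    with standardized residuals r_i = (y_i - x_i' beta) / sigma.
    As alpha -> 0 this approaches the negative log-likelihood.
    """
    beta, log_sigma = params[:-1], params[-1]
    sigma = np.exp(log_sigma)                        # keeps sigma positive
    r = (y - X @ beta) / sigma
    c = (2.0 * np.pi * sigma**2) ** (-alpha / 2.0)
    return c * (1.0 / np.sqrt(1.0 + alpha)
                - (1.0 + 1.0 / alpha) * np.mean(np.exp(-alpha * r**2 / 2.0)))

def adaptive_dpd_lasso(X, y, alpha=0.3, lam=0.1, gamma=1.0, beta_init=None):
    """Minimize DPD loss + lam * sum_j |beta_j| / |beta_init_j|^gamma (a sketch)."""
    n, p = X.shape
    if beta_init is None:                            # hypothetical pilot: ridge
        beta_init = np.linalg.solve(X.T @ X + np.eye(p), X.T @ y)
    w = 1.0 / (np.abs(beta_init) ** gamma + 1e-10)   # adaptive weights
    def objective(params):
        return dpd_loss(params, X, y, alpha) + lam * np.sum(w * np.abs(params[:-1]))
    x0 = np.concatenate([beta_init, [0.0]])          # start at pilot, log(sigma)=0
    # Derivative-free search, since the L1 penalty is non-smooth; fine for small p.
    res = minimize(objective, x0, method="Nelder-Mead",
                   options={"maxiter": 20000, "xatol": 1e-6, "fatol": 1e-8})
    return res.x[:-1], np.exp(res.x[-1])             # (beta_hat, sigma_hat)
```

This toy solver conveys only the shape of the criterion; a serious ultra-high dimensional implementation would use a specialized coordinate-descent or proximal algorithm for the weighted L1 penalty rather than a generic optimizer.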