Model-free Feature Screening with Projection Correlation and FDR Control with Knockoff Features

08/19/2019
by Wanjun Liu, et al.

This paper proposes a model-free and data-adaptive feature screening method for ultra-high dimensional datasets. The method is based on the projection correlation, which measures the dependence between two random vectors. It does not require specifying a regression model, and it applies to data with heavy-tailed errors and multivariate responses. It enjoys both the sure screening and rank consistency properties under weak assumptions. Further, a two-step approach is proposed to control the false discovery rate (FDR) in feature screening with the help of knockoff features. It can be shown that the two-step approach achieves both sure screening and FDR control if the pre-specified FDR level α is greater than or equal to 1/s, where s is the number of active features. The superior empirical performance of the proposed methods is demonstrated through numerical experiments and real data applications.
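The abstract does not spell out the selection rule, so the following is only a minimal sketch of the generic knockoff+ threshold from the knockoff filter literature, not the paper's own procedure. It assumes a hypothetical statistic W_j for each feature, for example the screening utility of feature j minus that of its knockoff copy, so that large positive values suggest an active feature. The function names and the toy numbers are illustrative assumptions.

```python
import numpy as np


def knockoff_plus_threshold(w, alpha):
    """Smallest t among the nonzero |W_j| such that
    (1 + #{j: W_j <= -t}) / max(#{j: W_j >= t}, 1) <= alpha.
    Returns np.inf when no candidate satisfies the bound."""
    w = np.asarray(w, dtype=float)
    # Candidate thresholds are the distinct nonzero magnitudes, ascending.
    candidates = np.unique(np.abs(w[w != 0]))
    for t in candidates:
        # Knockoff "wins" (W_j <= -t) estimate the number of false discoveries.
        fdp_hat = (1 + np.sum(w <= -t)) / max(np.sum(w >= t), 1)
        if fdp_hat <= alpha:
            return float(t)
    return np.inf


def knockoff_select(w, alpha):
    """Indices of the features whose statistic reaches the threshold."""
    w = np.asarray(w, dtype=float)
    return np.flatnonzero(w >= knockoff_plus_threshold(w, alpha))


# Hypothetical statistics (assumed values): four clear signals, two noise features.
w_demo = [5.0, 4.0, 3.0, 2.0, -1.0, 0.5]
selected_loose = knockoff_select(w_demo, alpha=0.25)
selected_strict = knockoff_select(w_demo, alpha=0.20)
```

Because the numerator of the estimated false discovery proportion is at least 1, a non-empty selection at level α must contain at least 1/α features. In the toy data, the four signals are selected at α = 0.25, while α = 0.20 selects nothing. This gives one intuition, under the assumptions above, for why a condition of the form α ≥ 1/s appears when all s active features are to be retained.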


