Double-estimation-friendly inference for high-dimensional misspecified models

09/24/2019
by Rajen D. Shah, et al.

All models may be wrong, but that is not necessarily a problem for inference. Consider the standard t-test for the significance of a variable X for predicting response Y whilst controlling for p other covariates Z in a random design linear model. This yields correct asymptotic type I error control for the null hypothesis that X is conditionally independent of Y given Z under an arbitrary regression model of Y on (X, Z), provided that a linear regression model for X on Z holds. An analogous robustness to misspecification, which we term the "double-estimation-friendly" (DEF) property, also holds for Wald tests in generalised linear models, with some small modifications. In this expository paper we explore this phenomenon, and propose methodology for high-dimensional regression settings that respects the DEF property. We advocate specifying (sparse) generalised linear regression models for both Y and the covariate of interest X; our framework gives valid inference for the conditional independence null if either of these holds. In the special case where both specifications are linear, our proposal amounts to a small modification of the popular debiased Lasso test. We also investigate constructing confidence intervals for the regression coefficient of X by inverting our tests; these have coverage guarantees even in partially linear models where the contribution of Z to Y can be arbitrary. Numerical experiments demonstrate the effectiveness of the methodology.
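
The robustness of the ordinary t-test described above can be checked with a small simulation. The sketch below is illustrative only and is not taken from the paper: the function names, the sample size, the coefficient vector `gamma`, and the nonlinear form chosen for Y are all assumptions. X is generated from a linear model in Z with independent Gaussian errors, while Y depends on Z nonlinearly and is conditionally independent of X given Z, so the linear model for Y is misspecified. For simplicity the test uses the normal critical value 1.96 rather than a t quantile.

```python
import numpy as np


def t_stat_for_x(y, x, Z):
    """OLS t-statistic for the coefficient of x in a regression of y on (1, x, Z)."""
    n = len(y)
    D = np.column_stack([np.ones(n), x, Z])       # design: intercept, X, Z
    coef, *_ = np.linalg.lstsq(D, y, rcond=None)  # least-squares fit
    resid = y - D @ coef
    dof = n - D.shape[1]
    sigma2 = resid @ resid / dof                  # usual homoscedastic variance estimate
    cov = sigma2 * np.linalg.inv(D.T @ D)
    return coef[1] / np.sqrt(cov[1, 1])


def rejection_rate(n=200, p=3, reps=2000, crit=1.96, seed=0):
    """Fraction of simulated data sets in which the t-test rejects at the nominal 5% level.

    Hypothetical setup: X is linear in Z; Y is nonlinear in Z and
    conditionally independent of X given Z (so the null holds).
    Requires p >= 2 because Y uses the first two columns of Z.
    """
    rng = np.random.default_rng(seed)
    gamma = np.linspace(1.0, -1.0, p)             # assumed coefficients for X on Z
    rejections = 0
    for _ in range(reps):
        Z = rng.standard_normal((n, p))
        x = Z @ gamma + rng.standard_normal(n)    # linear model for X on Z holds
        y = np.sin(2 * Z[:, 0]) + Z[:, 1] ** 2 + rng.standard_normal(n)  # misspecified for OLS
        if abs(t_stat_for_x(y, x, Z)) > crit:
            rejections += 1
    return rejections / reps


if __name__ == "__main__":
    print(f"Empirical type I error: {rejection_rate():.3f}")
```

Under these assumptions the empirical rejection rate should sit close to the nominal 5% level even though the linear model for Y is wrong. Replacing the linear model for X with a nonlinear one (so that neither specification holds) is a natural way to see where the guarantee can fail.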

research
01/18/2021

Conditional Independence Testing in Hilbert Spaces with Applications to Functional Data Analysis

We study the problem of testing the null hypothesis that X and Y are con...
research
08/09/2019

Goodness-of-fit testing in high-dimensional generalized linear models

We propose a family of tests to assess the goodness-of-fit of a high-dim...
research
04/15/2023

Tests for ultrahigh-dimensional partially linear regression models

In this paper, we consider tests for ultrahigh-dimensional partially lin...
research
01/27/2020

Shapley value confidence intervals for variable selection in regression models

Multiple linear regression is a commonly used inferential and predictive...
research
05/07/2020

High-Dimensional Inference Based on the Leave-One-Covariate-Out LASSO Path

We propose a new measure of variable importance in high-dimensional regr...
research
10/17/2020

Markov Neighborhood Regression for High-Dimensional Inference

This paper proposes an innovative method for constructing confidence int...
research
06/22/2021

Inference in High-dimensional Linear Regression

We develop an approach to inference in a linear regression model when th...
