Adaptive debiased machine learning using data-driven model selection techniques

07/24/2023
by Lars van der Laan, et al.

Debiased machine learning estimators for nonparametric inference of smooth functionals of the data-generating distribution can suffer from excessive variability and instability. For this reason, practitioners may resort to simpler models based on parametric or semiparametric assumptions. However, such simplifying assumptions may fail to hold, and the resulting estimates may then be biased due to model misspecification. To address this problem, we propose Adaptive Debiased Machine Learning (ADML), a nonparametric framework that combines data-driven model selection and debiased machine learning techniques to construct asymptotically linear, adaptive, and superefficient estimators for pathwise differentiable functionals. By learning model structure directly from data, ADML avoids the bias introduced by model misspecification and remains free from the restrictions of parametric and semiparametric models. Although ADML estimators may exhibit irregular behavior for the target parameter in a nonparametric statistical model, we demonstrate that they provide regular and locally uniformly valid inference for a projection-based oracle parameter. Importantly, this oracle parameter agrees with the original target parameter for distributions within an unknown but correctly specified oracle statistical submodel that is learned from the data. This finding implies that there is no penalty, in a local asymptotic sense, for conducting data-driven model selection compared to having prior knowledge of the oracle submodel and oracle parameter. To demonstrate the practical applicability of our theory, we provide a broad class of ADML estimators for estimating the average treatment effect in adaptive partially linear regression models.
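The idea of pairing data-driven model selection with a debiased estimate of the average treatment effect in a partially linear model can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the estimator from the paper: the function names (`ols_fit_predict`, `cross_fit`, `adaptive_plm_ate`), the two candidate working models for the treatment effect (constant and linear in the covariates), the least-squares nuisance learners, and the cross-validated residual-on-residual loss used for selection are all hypothetical choices made for brevity.

```python
import numpy as np

def ols_fit_predict(X_train, y_train, X_test):
    # Least squares with an intercept; a stand-in for any ML learner (assumption).
    Xtr = np.column_stack([np.ones(len(X_train)), X_train])
    Xte = np.column_stack([np.ones(len(X_test)), X_test])
    coef, *_ = np.linalg.lstsq(Xtr, y_train, rcond=None)
    return Xte @ coef

def cross_fit(X, y, folds):
    # Out-of-fold predictions: each point is predicted by a model
    # that was fit without the fold containing that point.
    pred = np.empty(len(y))
    for k in np.unique(folds):
        test = folds == k
        pred[test] = ols_fit_predict(X[~test], y[~test], X[test])
    return pred

def adaptive_plm_ate(W, A, Y, n_folds=5, seed=0):
    """Hypothetical sketch: W is (n, d) covariates, A is (n,) treatment, Y is (n,) outcome."""
    rng = np.random.default_rng(seed)
    n = len(Y)
    folds = rng.permutation(n) % n_folds

    # Cross-fitted nuisance estimates of E[A | W] and E[Y | W].
    feats = np.column_stack([W, W ** 2])
    A_res = A - cross_fit(feats, A, folds)
    Y_res = Y - cross_fit(feats, Y, folds)

    # Candidate working models for the conditional treatment effect (assumption).
    candidates = {
        "constant": np.ones((n, 1)),
        "linear": np.column_stack([np.ones(n), W]),
    }

    # Model selection: cross-validated residual-on-residual squared error.
    cv_loss = {}
    for name, phi in candidates.items():
        Z = A_res[:, None] * phi
        loss = 0.0
        for k in range(n_folds):
            test = folds == k
            beta, *_ = np.linalg.lstsq(Z[~test], Y_res[~test], rcond=None)
            loss += np.sum((Y_res[test] - Z[test] @ beta) ** 2)
        cv_loss[name] = loss / n
    selected = min(cv_loss, key=cv_loss.get)

    # Refit the selected working model on all data and average the
    # implied treatment effect over the sample covariates.
    phi = candidates[selected]
    beta, *_ = np.linalg.lstsq(A_res[:, None] * phi, Y_res, rcond=None)
    ate = float(phi.mean(axis=0) @ beta)
    return ate, selected
```

Calling `adaptive_plm_ate(W, A, Y)` returns the point estimate together with the name of the selected working model. In practice the least-squares nuisance fits would be replaced by flexible learners, and the candidate set would be richer than the two models assumed here.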


