Debiased Machine Learning without Sample-Splitting for Stable Estimators

06/03/2022
by Qizhao Chen, et al.
Estimation and inference on causal parameters are typically reduced to a generalized method of moments problem that involves auxiliary functions, each of which is the solution to a regression or classification problem. A recent line of work on debiased machine learning shows how one can use generic machine learning estimators for these auxiliary problems while maintaining asymptotic normality and root-n consistency of the target parameter of interest, requiring only mean-squared-error guarantees from the auxiliary estimation algorithms. The literature typically requires that these auxiliary problems be fitted on a separate sample or in a cross-fitting manner. We show that when these auxiliary estimation algorithms satisfy natural leave-one-out stability properties, sample splitting is not required. This allows for sample re-use, which can be beneficial in moderately sized sample regimes. For instance, we show that the stability properties we propose are satisfied by ensemble bagged estimators built via sub-sampling without replacement, a popular technique in machine learning practice.
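To make the idea concrete, the following is a minimal sketch, not the paper's own implementation. It assumes a partially linear model Y = theta*T + g(X) + noise, a residual-on-residual (Robinson-style) moment, and a hypothetical base learner (cubic polynomial least squares) bagged over sub-samples drawn without replacement. The function names, the sub-sample size of roughly sqrt(n), the number of bags, and the simulated data are all illustrative assumptions. The point it shows is that both auxiliary regressions are fitted and evaluated on the same full sample, with no sample splitting or cross-fitting.

```python
import numpy as np

def poly_features(x, degree=3):
    # Columns 1, x, ..., x^degree for a 1-D covariate (illustrative base features).
    return np.vander(x, degree + 1, increasing=True)

def bagged_predict(x, y, n_bags=200, subsample=None, degree=3, rng=None):
    """In-sample predictions from a bagged estimator built on sub-samples
    drawn WITHOUT replacement. Base learner and sizes are assumptions."""
    rng = np.random.default_rng(rng)
    n = len(x)
    m = subsample or int(np.sqrt(n))  # assumed sub-sample size, much smaller than n
    feats = poly_features(x, degree)
    preds = np.zeros(n)
    for _ in range(n_bags):
        idx = rng.choice(n, size=m, replace=False)  # sub-sample without replacement
        coef, *_ = np.linalg.lstsq(feats[idx], y[idx], rcond=None)
        preds += feats @ coef  # predict on the full sample, including training points
    return preds / n_bags

def dml_no_split(y, t, x, seed=None, **kw):
    """Residual-on-residual estimate of theta, re-using the full sample
    for both auxiliary fits and for the final moment equation."""
    rng = np.random.default_rng(seed)
    y_res = y - bagged_predict(x, y, rng=rng, **kw)  # Y minus estimate of E[Y | X]
    t_res = t - bagged_predict(x, t, rng=rng, **kw)  # T minus estimate of E[T | X]
    theta = np.sum(t_res * y_res) / np.sum(t_res ** 2)
    # Plug-in standard error from the empirical influence function.
    psi = t_res * (y_res - theta * t_res)
    se = np.sqrt(np.mean(psi ** 2) / np.mean(t_res ** 2) ** 2 / len(y))
    return theta, se

# Simulated data with a known effect of 1.5 (purely illustrative).
data_rng = np.random.default_rng(0)
n = 2000
x = data_rng.uniform(-1, 1, n)
t = np.sin(2 * x) + data_rng.normal(0, 1, n)
y = 1.5 * t + x ** 2 + data_rng.normal(0, 1, n)

theta_hat, se = dml_no_split(y, t, x, seed=1)
print(f"theta_hat = {theta_hat:.3f}, 95% CI half-width = {1.96 * se:.3f}")
```

Because each bag sees only a small fraction of the data, removing any single observation changes the averaged prediction very little, which is the intuition behind the leave-one-out stability the abstract refers to. Whether a given base learner and sub-sample size satisfy the paper's formal conditions is not checked by this sketch.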
