Matching on What Matters: A Pseudo-Metric Learning Approach to Matching Estimation in High Dimensions

by Gentry Johnson, et al.

When pre-processing observational data via matching, we seek to approximate each unit with maximally similar peers that had an alternative treatment status, essentially replicating a randomized block design. However, as the number of continuous features grows, a curse of dimensionality applies, making asymptotically valid inference impossible (Abadie and Imbens, 2006). The alternative of ignoring plausibly relevant features is certainly no better, and the resulting trade-off substantially limits the application of matching methods to "wide" datasets. Instead, Li and Fu (2017) recast the problem of matching in a metric learning framework that maps features to a low-dimensional space, which facilitates "closer matches" while still capturing important aspects of unit-level heterogeneity. However, that method lacks key theoretical guarantees and can produce inconsistent estimates in cases of heterogeneous treatment effects. Motivated by a straightforward extension of existing results in the matching literature, we present alternative techniques that learn latent matching features through either MLPs or siamese neural networks trained on a carefully selected loss function. We benchmark the resulting alternative methods in simulations as well as against two experimental data sets (including the canonical NSW worker training program data set) and find superior performance of the neural-net-based methods.
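The trade-off described above can be illustrated with a minimal sketch. It is not the paper's method: instead of an MLP or siamese network, it uses a simple linear projection fitted on control outcomes as a stand-in for learned latent matching features. The simulated data, the `match_att` helper, and all parameter values are assumptions made purely for illustration.

```python
import numpy as np


def match_att(Z, treated, y):
    """1-nearest-neighbor matching estimate of the effect on the treated,
    using Euclidean distance in the feature space Z (hypothetical helper)."""
    Zt, Zc = Z[treated], Z[~treated]
    # Pairwise squared distances, shape (n_treated, n_control).
    d2 = ((Zt[:, None, :] - Zc[None, :, :]) ** 2).sum(axis=2)
    nearest = d2.argmin(axis=1)  # index of the closest control per treated unit
    return float(np.mean(y[treated] - y[~treated][nearest]))


# Assumed data-generating process: 20 features, only the first two matter.
rng = np.random.default_rng(0)
n, p = 1000, 20
X = rng.normal(size=(n, p))
propensity = 1.0 / (1.0 + np.exp(-(X[:, 0] + X[:, 1])))
treated = rng.random(n) < propensity
true_effect = 1.0
y = 2.0 * X[:, 0] + 2.0 * X[:, 1] + true_effect * treated \
    + rng.normal(scale=0.1, size=n)

# Stand-in for a learned representation: a least-squares fit of the outcome
# on control units, used to project X onto a single latent feature.
beta, *_ = np.linalg.lstsq(X[~treated], y[~treated], rcond=None)
Z_low = (X @ beta)[:, None]

att_raw = match_att(X, treated, y)      # matching on all 20 raw features
att_low = match_att(Z_low, treated, y)  # matching on the 1-d latent feature

print(f"true effect: {true_effect}")
print(f"matching on raw features:   {att_raw:.3f}")
print(f"matching on latent feature: {att_low:.3f}")
```

In this simulated setting, matching on the one-dimensional feature should land closer to the true effect than matching on all 20 raw features, because distances in the raw space are dominated by irrelevant coordinates. A neural approach would replace the linear projection with a trained network.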




