Hyperparameter Tuning and Model Evaluation in Causal Effect Estimation

by Damian Machlanski et al.

The performance of most causal effect estimators relies on accurate predictions of high-dimensional non-linear functions of the observed data. The remarkable flexibility of modern Machine Learning (ML) methods is well suited to this task. However, data-driven hyperparameter tuning of ML methods requires effective model evaluation to avoid large errors in causal estimates, a task made more challenging because causal inference involves unobservable counterfactuals. Multiple performance-validation metrics have recently been proposed, so practitioners now have to make complex decisions not only about which causal estimators, ML learners and hyperparameters to choose, but also about which evaluation metric to use. This paper, motivated by the lack of clear recommendations, investigates the interplay among these four aspects of model evaluation for causal effect estimation: causal estimators, ML learners, hyperparameters, and evaluation metrics. We develop a comprehensive experimental setup that covers many commonly used causal estimators, ML methods and evaluation approaches, and apply it to four well-known causal inference benchmark datasets. Our results suggest that optimal hyperparameter tuning of ML learners is enough to reach state-of-the-art performance in effect estimation, regardless of the choice of estimator and learner; in other words, most causal estimators are roughly equivalent in performance if tuned thoroughly enough. We also find that hyperparameter tuning and model evaluation matter far more than the choice of causal estimator or ML method. Finally, given the significant gap we find between the estimation performance of popular evaluation metrics and that of optimal model selection, we call for more research into causal model evaluation to unlock the optimum performance not currently being delivered even by state-of-the-art procedures.
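To make the setup concrete, here is a minimal illustrative sketch (not the paper's actual pipeline) of the kind of workflow the abstract describes: an ML learner is hyperparameter-tuned inside a simple causal estimator (a T-learner, i.e. one outcome model per treatment arm), and the resulting effect estimates are scored with PEHE (Precision in Estimation of Heterogeneous Effects), a standard benchmark metric in this literature. The data, parameter grid and model choices below are assumptions made purely for the example.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import GridSearchCV

# Synthetic data with a KNOWN heterogeneous treatment effect tau(x) = 1 + x0,
# so we can score the estimator against the ground truth (only possible on
# synthetic/benchmark data, which is exactly why model evaluation is hard in
# practice: real counterfactuals are unobservable).
rng = np.random.default_rng(0)
n = 500
X = rng.normal(size=(n, 5))
T = rng.binomial(1, 0.5, size=n)
tau = 1.0 + X[:, 0]
y = X[:, 1] + T * tau + rng.normal(scale=0.1, size=n)

# Hyperparameter grid for the base ML learner (illustrative values).
param_grid = {"max_depth": [2, 4, None], "n_estimators": [50, 100]}

def tuned_model(X_sub, y_sub):
    """Tune a random forest on one treatment arm via cross-validated grid search."""
    gs = GridSearchCV(RandomForestRegressor(random_state=0), param_grid, cv=3)
    gs.fit(X_sub, y_sub)
    return gs.best_estimator_

# T-learner: fit separately tuned outcome models for treated and control arms.
m1 = tuned_model(X[T == 1], y[T == 1])
m0 = tuned_model(X[T == 0], y[T == 0])

# Estimated conditional average treatment effect (CATE) and PEHE against truth.
cate_hat = m1.predict(X) - m0.predict(X)
pehe = np.sqrt(np.mean((cate_hat - tau) ** 2))
print(f"PEHE: {pehe:.3f}")
```

The paper's point can be read directly off this skeleton: the quality of `cate_hat` depends less on the estimator wrapper (here a T-learner) than on how well `tuned_model` is tuned, and on whether the selection criterion inside the tuning loop (here plain cross-validated prediction error) is actually a good proxy for PEHE, which cannot be computed on real data.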


Machine learning in policy evaluation: new tools for causal inference

While machine learning (ML) methods have received a lot of attention in ...

Can predictive models be used for causal inference?

Supervised machine learning (ML) and deep learning (DL) algorithms excel...

Out-of-sample scoring and automatic selection of causal estimators

Recently, many causal estimators for Conditional Average Treatment Effec...

High Per Parameter: A Large-Scale Study of Hyperparameter Tuning for Machine Learning Algorithms

Hyperparameters in machine learning (ML) have received a fair amount of ...

Empirical Analysis of Model Selection for Heterogeneous Causal Effect Estimation

We study the problem of model selection in causal inference, specificall...

Optimization-based Causal Estimation from Heterogeneous Environments

This paper presents a new optimization approach to causal estimation. Gi...

Why Do Machine Learning Practitioners Still Use Manual Tuning? A Qualitative Study

Current advanced hyperparameter optimization (HPO) methods, such as Baye...
