Causal Data Integration

05/15/2023
by Brit Youngmann, et al.

Causal inference is fundamental to empirical scientific discovery in the natural and social sciences; however, data management problems that arise while conducting causal inference can lead to false discoveries. Two such problems are (i) not having all attributes required for the analysis, and (ii) misidentifying which attributes should be included in the analysis. Analysts often have access only to partial data, and to decide which attributes to include they rely critically on domain knowledge, which is typically expressed as a causal DAG and is often unavailable or incomplete. We argue that data management techniques can address both of these challenges. In this work, we introduce the Causal Data Integration (CDI) problem, in which unobserved attributes are mined from external sources and a corresponding causal DAG is automatically built. We identify key challenges and research opportunities in designing a CDI system, and present a system architecture for solving the CDI problem. Our preliminary experimental results demonstrate that solving CDI is achievable and pave the way for future research.
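To make problem (i) concrete, the following is a minimal sketch, not the authors' system, of how a missing attribute can produce a false discovery and how joining it in from an external source removes it. Everything here is an illustrative assumption: the synthetic linear data, the table layout, and the names `simulate`, `estimate_effects`, `observed`, and `external` are hypothetical and do not come from the paper. The adjustment uses ordinary least squares with residualization (the Frisch-Waugh-Lovell approach), chosen only because it needs no third-party libraries.

```python
import random
from statistics import fmean


def slope(x, y):
    """Least-squares slope of y regressed on x (with intercept)."""
    mx, my = fmean(x), fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


def residuals(target, covariate):
    """Part of `target` not linearly explained by `covariate`."""
    b = slope(covariate, target)
    mt, mc = fmean(target), fmean(covariate)
    return [(t - mt) - b * (c - mc) for t, c in zip(target, covariate)]


def simulate(n=5000, seed=0):
    """Hypothetical data: z confounds x and y; x has NO effect on y."""
    rng = random.Random(seed)
    observed, external = {}, {}
    for i in range(n):
        z = rng.gauss(0, 1)          # confounder, absent from the analyst's table
        x = z + rng.gauss(0, 1)      # treatment depends on z
        y = 2 * z + rng.gauss(0, 1)  # outcome depends on z only
        observed[i] = (x, y)         # what the analyst has
        external[i] = z              # what an external source could provide
    return observed, external


def estimate_effects(observed, external):
    """Return (naive, adjusted) estimates of the effect of x on y."""
    # Integrate the external attribute by joining on the shared key.
    ids = sorted(observed.keys() & external.keys())
    x = [observed[i][0] for i in ids]
    y = [observed[i][1] for i in ids]
    z = [external[i] for i in ids]

    # Without z, the slope absorbs the confounding path x <- z -> y.
    # In expectation it equals cov(x, y) / var(x) = 2 / 2 = 1.
    naive = slope(x, y)

    # With z, regress the z-residual of y on the z-residual of x,
    # which gives the coefficient of x in the regression y ~ x + z.
    adjusted = slope(residuals(x, z), residuals(y, z))
    return naive, adjusted


if __name__ == "__main__":
    naive, adjusted = estimate_effects(*simulate())
    print(f"naive estimate:    {naive:.3f}")
    print(f"adjusted estimate: {adjusted:.3f}")
```

In this toy setting the naive estimate suggests a sizable effect of `x` on `y` even though none exists, while the estimate that adjusts for the integrated attribute is close to zero. Knowing that `z` is the attribute to adjust for is exactly the domain knowledge a causal DAG would encode, which corresponds to problem (ii).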


