From Observational Studies to Causal Rule Mining

by Jiuyong Li, et al.

Randomised controlled trials (RCTs) are the most effective approach to causal discovery, but in many circumstances it is impossible to conduct them. Observational studies based on passively observed data are therefore widely accepted as an alternative to RCTs. However, observational studies require prior knowledge to generate the hypotheses about cause-effect relationships to be tested, so they can only be applied to problems with available domain knowledge and a handful of variables. In practice, many data sets are high-dimensional, which puts causal discovery from such a wealth of data sources beyond the reach of observational studies. In another direction, many efficient data mining methods have been developed to identify associations among variables in large data sets. The problem is that causal relationships imply associations, but associations do not always imply causal relationships. The two paradigms are nevertheless complementary: association rule mining can deal with the high-dimensionality problem, while observational studies can be used to eliminate non-causal associations. In this paper we propose the concept of causal rules (CRs) and develop an algorithm for mining CRs in large data sets. We use the idea of retrospective cohort studies to detect CRs based on the results of association rule mining. Experiments with both synthetic and real-world data sets demonstrate the effectiveness and efficiency of CR mining. Compared with commonly used causal discovery methods, the proposed approach is in general faster and performs better or competitively in finding correct or sensible causes. It is also capable of finding a cause consisting of multiple variables, a feature that other causal discovery methods do not possess.
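To make the two-stage idea concrete, the following is a minimal sketch of how a candidate association might be screened with a retrospective cohort-style test. It is an illustration under stated assumptions, not the authors' algorithm: the abstract does not specify the test, so the sketch assumes exact matching on control variables and the standard odds ratio on discordant matched pairs with a 95% lower confidence bound. All function names (`matched_pairs`, `matched_odds_ratio`, `is_causal_rule`) and the dictionary-of-binary-variables record format are hypothetical, and the candidate (exposure, outcome) pair is assumed to come from a separate association rule mining step.

```python
import math
from collections import defaultdict


def matched_pairs(records, exposure, controls):
    """Pair exposed and unexposed records that agree on every control variable.

    `records` is a list of dicts mapping variable names to 0/1 values
    (an assumed format, chosen only for this illustration).
    """
    # Each stratum holds (exposed records, unexposed records) for one
    # combination of control-variable values.
    strata = defaultdict(lambda: ([], []))
    for r in records:
        key = tuple(r[c] for c in controls)
        strata[key][0 if r[exposure] else 1].append(r)
    pairs = []
    for exposed, unexposed in strata.values():
        # zip stops at the shorter list, so unmatched records are dropped.
        pairs.extend(zip(exposed, unexposed))
    return pairs


def matched_odds_ratio(pairs, outcome, z=1.96):
    """Odds ratio on discordant pairs and its lower confidence bound.

    Returns (None, None) when either discordant count is zero, because the
    ratio or its log-scale standard error is then undefined.
    """
    n12 = sum(1 for e, u in pairs if e[outcome] and not u[outcome])
    n21 = sum(1 for e, u in pairs if not e[outcome] and u[outcome])
    if n12 == 0 or n21 == 0:
        return None, None
    odds = n12 / n21
    lower = math.exp(math.log(odds) - z * math.sqrt(1 / n12 + 1 / n21))
    return odds, lower


def is_causal_rule(records, exposure, outcome, controls):
    """Assumed decision rule: keep the candidate if the lower bound exceeds 1."""
    pairs = matched_pairs(records, exposure, controls)
    _, lower = matched_odds_ratio(pairs, outcome)
    return lower is not None and lower > 1.0
```

In this sketch an association that disappears once records are matched on the control variables yields no usable discordant pairs and is rejected, which is the sense in which a cohort-style comparison can filter out non-causal associations. Returning `(None, None)` for zero discordant counts is a simplification; a fuller treatment would apply a continuity correction instead.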




Mining Combined Causes in Large Data Sets

In recent years, many methods have been developed for detecting causal r...

Discovering Context Specific Causal Relationships

With the increasing need of personalised decision making, such as person...

Causal Decision Trees

Uncovering causal relationships in data is a major objective of data ana...

Causal Inference in Observational Data

Our aging population increasingly suffers from multiple chronic diseases...

HNet: Graphical Hypergeometric Networks

Motivation: Real-world data often contain measurements with both continu...

Discovering Reliable Causal Rules

We study the problem of deriving policies, or rules, that when enacted o...

Searching for consistent associations with a multi-environment knockoff filter

This paper develops a method based on model-X knockoffs to find conditio...
