Inference for a Large Directed Graphical Model with Interventions

10/07/2021
by Chunlin Li, et al.

Inferring directed relations under unspecified interventions, that is, interventions whose targets are unknown, is important yet challenging. For instance, it is of high interest to unravel the regulatory roles of genes using inherited genetic variants such as single-nucleotide polymorphisms (SNPs), which can act as unspecified interventions because they regulate some unknown genes. In this article, we test hypothesized directed relations with unspecified interventions. First, we derive conditions that yield an identifiable model. Unlike classical inference, hypothesis testing here requires identifying ancestral relations and relevant interventions for each hypothesis-specific primary variable, a task referred to as causal discovery. Towards this end, we propose a peeling algorithm to establish a hierarchy of primary variables as nodes, starting with leaf nodes at the hierarchy's bottom, for which we derive a difference-of-convex (DC) algorithm for nonconvex minimization. Moreover, we prove that the peeling algorithm yields consistent causal discovery, and that the DC algorithm is a low-order polynomial algorithm capable of finding a global minimizer almost surely under the data-generating distribution. Second, we propose a modified likelihood ratio test that eliminates nuisance parameters to increase power. To enhance finite-sample performance, we integrate the modified likelihood ratio test with a data perturbation scheme that accounts for the uncertainty of identifying ancestral relations and relevant interventions. We also show that the distribution of the data-perturbation test statistic converges to the target distribution in high dimensions. Numerical examples demonstrate the utility and effectiveness of the proposed methods, including an application to inferring gene regulatory networks.
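To make the idea of testing a hypothesized directed relation concrete, the following is a minimal sketch of an ordinary (unmodified) likelihood ratio test for a single edge in a linear Gaussian model. It is not the modified test or the data perturbation scheme described above, and it assumes the adjustment set is already known rather than discovered by a peeling step. The names `edge_lrt`, `rss`, `snp`, `gene_a`, `gene_b`, and `gene_c`, along with the simulated data, are hypothetical and chosen only for illustration.

```python
import math
import numpy as np

def rss(y, X):
    """Residual sum of squares from a least-squares fit of y on the columns of X."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return float(resid @ resid)

def edge_lrt(y, x_test, Z):
    """Plain Gaussian likelihood ratio test of H0: x_test has no direct effect on y.

    Z holds the adjustment variables (assumed known here), shape (n, k), k >= 0.
    Returns the statistic n * log(RSS0 / RSS1) and its asymptotic chi-square(1) p-value.
    """
    n = y.shape[0]
    X0 = np.hstack([np.ones((n, 1)), Z])          # null model: intercept + Z
    X1 = np.hstack([X0, x_test.reshape(-1, 1)])   # alternative: adds the tested edge
    stat = n * math.log(rss(y, X0) / rss(y, X1))
    # For one degree of freedom, P(chi2 > s) = erfc(sqrt(s / 2)).
    p_value = math.erfc(math.sqrt(max(stat, 0.0) / 2.0))
    return stat, p_value

# Simulated toy data (assumption): a SNP perturbs gene_a, and gene_a regulates gene_b.
rng = np.random.default_rng(0)
n = 500
snp = rng.binomial(2, 0.3, size=n).astype(float)
gene_a = 0.8 * snp + rng.normal(size=n)
gene_b = 0.5 * gene_a + rng.normal(size=n)   # true edge gene_a -> gene_b
gene_c = rng.normal(size=n)                  # unrelated to gene_a

no_adjustment = np.empty((n, 0))
stat_ab, p_ab = edge_lrt(gene_b, gene_a, no_adjustment)
stat_ac, p_ac = edge_lrt(gene_c, gene_a, no_adjustment)
print(f"gene_a -> gene_b: stat={stat_ab:.2f}, p={p_ab:.3g}")
print(f"gene_a -> gene_c: stat={stat_ac:.2f}, p={p_ac:.3g}")
```

In this sketch the adjustment set is empty for simplicity. In the setting of the abstract it would contain the ancestors and relevant interventions of the primary variable, and the uncertainty in selecting them is what the data perturbation scheme is meant to account for.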
