Single-cell gene regulatory network analysis for mixed cell populations with applications to COVID-19 single cell data
Gene regulatory network (GRN) refers to the complex network formed by regulatory interactions between genes in living cells. In this paper, we consider inferring GRNs in single cells based on single cell RNA sequencing (scRNA-seq) data. In scRNA-seq, single cells are often profiled from mixed populations and their cell identities are unknown. A common practice for single cell GRN analysis is to first cluster the cells and infer GRNs for every cluster separately. However, this two-step procedure ignores uncertainty in the clustering step and thus could lead to inaccurate estimation of the networks. To address this problem, we propose to model scRNA-seq by the mixture multivariate Poisson log-normal (MPLN) distribution. The precision matrices of the MPLN are the GRNs of different cell types and can be jointly estimated by maximizing MPLN's lasso-penalized log-likelihood. We show that the MPLN model is identifiable and the resulting penalized log-likelihood estimator is consistent. To avoid the intractable optimization of the MPLN's log-likelihood, we develop an algorithm called VMPLN based on the variational inference method. Comprehensive simulation and real scRNA-seq data analyses reveal that VMPLN performs better than the state-of-the-art single cell GRN methods.
READ FULL TEXT