Community Detection with Known, Unknown, or Partially Known Auxiliary Latent Variables
Empirical observations suggest that, in practice, community membership does not completely explain the dependency between the edges of an observation graph. This paper models the residual dependence of the graph edges, to first order, by auxiliary node latent variables that affect the statistics of the graph edges but carry no information about the communities of interest. We then study community detection in graphs obeying the stochastic block model and the censored block model with auxiliary latent variables. We analyze the conditions for exact recovery when these auxiliary latent variables are unknown, representing unknown nuisance parameters or model mismatch. We also analyze exact recovery when these auxiliary latent variables have been either fully or partially revealed. Finally, we propose a semidefinite programming algorithm for recovering the desired labels when the auxiliary labels are either known or unknown. We show that exact recovery is possible by semidefinite programming down to the respective maximum likelihood exact recovery threshold.
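To make the model concrete, the following is a minimal sketch of sampling a graph whose edge statistics depend on both a community label and an auxiliary label per node. It is an illustration under stated assumptions, not the paper's construction: the two-valued labels, the uniform label priors, the function name `sample_sbm_with_auxiliary`, and the `probs` parametrization are all hypothetical choices made here.

```python
import random


def sample_sbm_with_auxiliary(n, probs, seed=0):
    """Sample labels and an undirected graph on n nodes.

    Assumption (not from the paper): labels are binary and uniform, and
    `probs` maps (same_community, same_auxiliary) -> edge probability.
    """
    rng = random.Random(seed)
    x = [rng.choice((-1, 1)) for _ in range(n)]  # community labels of interest
    y = [rng.choice((-1, 1)) for _ in range(n)]  # auxiliary (nuisance) labels
    edges = set()
    for i in range(n):
        for j in range(i + 1, n):
            # The edge probability depends on agreement in BOTH label types,
            # so community membership alone does not explain the edges.
            p = probs[(x[i] == x[j], y[i] == y[j])]
            if rng.random() < p:
                edges.add((i, j))
    return x, y, edges
```

Calling `sample_sbm_with_auxiliary(100, probs)` with four distinct probabilities yields a graph in which the auxiliary labels `y` shift the edge statistics without carrying information about `x`. A detector that ignores `y` corresponds to the unknown-variable setting, and one that is given `y` corresponds to the fully revealed setting.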