Community detection through likelihood optimization: in search of a sound model
Community detection has recently become one of the most important problems in network analysis, and many algorithms have been proposed to solve it. Among these, methods based on statistical inference are of particular interest: such algorithms are mathematically sound and have been shown to provide partitions of good quality. The main idea of statistical inference methods is to assume some underlying random graph model and then fit this model to the observed graph by maximizing the likelihood. The choice of this underlying model (also known as the null model) is very important for such methods, and this choice is the focus of the current study. We provide an extensive theoretical and empirical analysis to compare several models: the stochastic block model, which is currently the most widely used model for likelihood optimization; the recently proposed degree-corrected modification of this model; and a new one-parameter null model which has some desirable statistical properties. We also develop and compare two likelihood optimization algorithms suitable for the null models under consideration. An extensive empirical analysis on a variety of datasets shows, in particular, that real-world networks, unlike synthetic datasets, are diverse in nature and that no single model is suitable for all of them.
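To make the "fit the model by maximizing the likelihood" step concrete, the following is a minimal sketch, not the algorithms developed in the paper. It assumes the textbook Bernoulli stochastic block model on a simple undirected graph, where each pair of blocks (r, s) has its own edge probability, estimated as the fraction of possible node pairs that are actually linked. The function name `sbm_profile_log_likelihood`, the helper `xlogy`, and the toy graph are illustrative assumptions and do not come from the source.

```python
import math
from collections import Counter


def xlogy(x, y):
    """Return x * log(y), using the convention 0 * log(0) = 0."""
    return 0.0 if x == 0 else x * math.log(y)


def sbm_profile_log_likelihood(edges, labels):
    """Log-likelihood of a partition under a Bernoulli SBM (illustrative).

    edges  : list of (u, v) pairs, undirected, no self-loops or duplicates
    labels : labels[u] is the block of node u (nodes are 0..n-1)

    Each block-pair probability is replaced by its maximum-likelihood
    estimate, so the result depends only on the partition.
    """
    sizes = Counter(labels)

    # Count observed edges for every unordered pair of blocks.
    edge_counts = Counter()
    for u, v in edges:
        r, s = sorted((labels[u], labels[v]))
        edge_counts[(r, s)] += 1

    blocks = sorted(sizes)
    total = 0.0
    for i, r in enumerate(blocks):
        for s in blocks[i:]:
            # Number of node pairs that could be linked between r and s.
            if r == s:
                pairs = sizes[r] * (sizes[r] - 1) // 2
            else:
                pairs = sizes[r] * sizes[s]
            if pairs == 0:
                continue
            e = edge_counts[(r, s)]
            p = e / pairs  # maximum-likelihood edge probability
            total += xlogy(e, p) + xlogy(pairs - e, 1 - p)
    return total


# Toy graph: two triangles joined by a single edge.
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
by_triangle = [0, 0, 0, 1, 1, 1]  # one block per triangle
mixed = [0, 0, 1, 0, 1, 1]        # a partition that splits both triangles

print(sbm_profile_log_likelihood(edges, by_triangle))
print(sbm_profile_log_likelihood(edges, mixed))
```

On this toy graph the partition that follows the two triangles scores higher than the mixed one, which is the behaviour a likelihood-based method relies on. A full method would also need a search procedure over partitions and a way to choose the number of blocks, neither of which is attempted here.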