A Bootstrap Method for Goodness of Fit and Model Selection with a Single Observed Network
Network models are applied in numerous domains where data can be represented as a system of interactions among pairs of actors. While both statistical and mechanistic network models are increasingly capable of capturing various dependencies among these actors, those same dependencies mean that the observations are not independent. This poses statistical challenges for analyzing such data, especially when there is only a single observed network, and often leads to intractable likelihoods regardless of the modeling paradigm, which limits the application of existing statistical methods for networks. We explore a subsampling bootstrap procedure to serve as the basis for goodness of fit and model selection with a single observed network that circumvents the intractability of such likelihoods. Our approach is based on flexible resampling distributions formed from the single observed network, allowing for finer and higher-dimensional comparisons than point estimates of quantities of interest alone. We include worked examples of model selection, using simulation, and of goodness-of-fit assessment, using duplication-divergence model fits to yeast (S. cerevisiae) protein-protein interaction data from the literature. The proposed procedure produces a flexible resampling distribution that can be based on any statistics of one's choosing and can be employed regardless of the choice of model.
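To make the idea of a resampling distribution formed from a single network concrete, the following is a minimal sketch and not the procedure as specified in the full text. It assumes node-induced subsampling without replacement, uses edge density as the statistic, and uses an Erdős–Rényi generator as a stand-in for both the observed network and a network simulated from a candidate model. All function names and parameter values here are hypothetical.

```python
import random
from itertools import combinations


def induced_subgraph(adj, nodes):
    """Return the subgraph induced on `nodes` as an adjacency dict of sets."""
    keep = set(nodes)
    return {u: adj[u] & keep for u in keep}


def density(adj):
    """Edge density of an undirected graph without self-loops."""
    n = len(adj)
    if n < 2:
        return 0.0
    m = sum(len(nbrs) for nbrs in adj.values()) / 2
    return m / (n * (n - 1) / 2)


def subsample_distribution(adj, stat, size, reps, rng):
    """Evaluate `stat` on `reps` induced subgraphs of `size` nodes,
    each drawn uniformly without replacement (an assumed scheme)."""
    nodes = sorted(adj)
    return [stat(induced_subgraph(adj, rng.sample(nodes, size)))
            for _ in range(reps)]


def erdos_renyi(n, p, rng):
    """Toy generator used only as a placeholder for real or model data."""
    adj = {i: set() for i in range(n)}
    for u, v in combinations(range(n), 2):
        if rng.random() < p:
            adj[u].add(v)
            adj[v].add(u)
    return adj


rng = random.Random(0)
observed = erdos_renyi(60, 0.1, rng)   # stand-in for the single observed network
candidate = erdos_renyi(60, 0.1, rng)  # stand-in for a network simulated from a fitted model

# Resampling distributions of the chosen statistic for each network.
obs_dist = subsample_distribution(observed, density, size=20, reps=200, rng=rng)
sim_dist = subsample_distribution(candidate, density, size=20, reps=200, rng=rng)
```

The two lists can then be compared as whole distributions (for example with histograms or a distance between empirical distributions) rather than through a single point estimate, and `density` can be swapped for any other function of the subgraph.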