On the Consistency of Graph-based Bayesian Learning and the Scalability of Sampling Algorithms
A popular approach to semi-supervised learning proceeds by endowing the input data with a graph structure in order to extract geometric information and incorporate it into a Bayesian framework. We introduce new theory that gives appropriate scalings of graph parameters that provably lead to a well-defined limiting posterior as the size of the unlabeled data set grows. Furthermore, we show that these consistency results have profound algorithmic implications. When consistency holds, carefully designed graph-based Markov chain Monte Carlo algorithms are proved to have a uniform spectral gap, independent of the number of unlabeled inputs. Several numerical experiments corroborate both the statistical consistency and the algorithmic scalability established by the theory.
READ FULL TEXT