Sampling unknown large networks restricted by low sampling rates
Graph sampling plays an important role in data mining for large networks. Specifically, larger networks often correspond to lower sampling rates. Under the situation, traditional traversal-based samplings for large networks usually have an excessive preference for densely-connected network core nodes. Aim at this issue, this paper proposes a sampling method for unknown networks at low sampling rates, called SLSR, which first adopts a random node sampling to evaluate a degree threshold, utilized to distinguish the core from periphery, and the average degree in unknown networks, and then runs a double-layer sampling strategy on the core and periphery. SLSR is simple that results in a high time efficiency, but experimental evaluation confirms that the proposed method can accurately preserve many critical structures of unknown large networks at sampling rates not exceeding 10
READ FULL TEXT