PP-DBLP: Modeling and Generating Attributed Public-Private Networks with DBLP
In many online social networks (e.g., Facebook, Google+, Twitter, and Instagram), users often prefer to hide some or all of their relationships, making these private relationships invisible to the public and even to friends. This leads to a graph model called public-private networks, in which each user has their own view of the network that includes their private connections. Recently, public-private network analysis has attracted significant research interest. Many important graph computing problems (e.g., shortest paths, centrality, PageRank, and reachability trees) have been studied in this setting. However, because of limited data sources and privacy concerns, the proposed approaches have been tested not on real-world datasets but on synthetic ones built by randomly selecting vertices as private. Real-world public-private network datasets are therefore urgently needed to evaluate the efficiency and effectiveness of such algorithms. In this paper, we generate four public-private networks from real-world DBLP records, called PP-DBLP. We treat published articles as public information and ongoing collaborations as hidden information known only to the authors involved. The released PP-DBLP datasets make it possible to evaluate public-private analysis algorithms in a fair way. In addition, motivated by the prevalence of attributed graphs, we propose an advanced model of attributed public-private graphs, in which vertices have not only private edges but also private attributes. We also discuss open problems on attributed public-private graphs. Preliminary experimental results on our generated real-world datasets verify the effectiveness and efficiency of public-private models and state-of-the-art algorithms.
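To make the model concrete, the following is a minimal Python sketch of a public-private co-authorship graph. It is an illustration under stated assumptions, not the construction used in the paper: the class name PublicPrivateGraph, the function build_from_records, and the record format (a list of author names plus a published flag) are all hypothetical. Published papers contribute public edges; an unpublished collaboration contributes edges that are stored only in the private graph of each participating author.

```python
from collections import defaultdict
from itertools import combinations


class PublicPrivateGraph:
    """Hypothetical container: one public graph plus one private graph per user."""

    def __init__(self):
        # Public adjacency, visible to every viewer.
        self.public = defaultdict(set)
        # owner -> adjacency of edges that only this owner can see.
        self.private = defaultdict(lambda: defaultdict(set))

    def add_public_edge(self, u, v):
        self.public[u].add(v)
        self.public[v].add(u)

    def add_private_edge(self, owner, u, v):
        adj = self.private[owner]
        adj[u].add(v)
        adj[v].add(u)

    def neighbors(self, viewer, u):
        """Neighbors of u in the viewer's view: public edges plus the viewer's private edges."""
        result = set(self.public.get(u, ()))
        own = self.private.get(viewer)
        if own is not None:
            result |= own.get(u, set())
        return result


def build_from_records(records):
    """Build a graph from (authors, published) pairs.

    Assumed record format, not the DBLP schema: `authors` is a list of
    names and `published` is a bool. Unpublished collaborations become
    private edges known only to the authors of that collaboration.
    """
    graph = PublicPrivateGraph()
    for authors, published in records:
        for u, v in combinations(authors, 2):
            if published:
                graph.add_public_edge(u, v)
            else:
                for owner in authors:
                    graph.add_private_edge(owner, u, v)
    return graph


if __name__ == "__main__":
    demo = build_from_records([
        (["alice", "bob"], True),    # published paper: public edge
        (["bob", "carol"], False),   # ongoing work: private to bob and carol
    ])
    print(sorted(demo.neighbors("alice", "bob")))
    print(sorted(demo.neighbors("bob", "bob")))
```

Under these assumptions, a viewer who is not part of an ongoing collaboration sees only the public co-authorship edges, while each collaborator's view additionally contains the hidden edge. An attributed variant could keep a per-owner dictionary of private vertex attributes alongside the private adjacency.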