Temporal Activity Path Based Character Correction in Social Networks
Vast amounts of multimedia data contain massive and multifarious social information, which is used to construct large-scale social networks. In a complex social network, a character should ideally be denoted by one and only one vertex. However, a character is often denoted by two or more vertices with different names and is therefore treated as multiple, different characters. This problem causes incorrect results in network analysis and mining. The practical challenge is that character uniqueness is hard to confirm because of complicating factors such as name changes and anonymization, which lead to character duplication. The limited earlier research shows that previous methods depended heavily on supplementary attribute information from databases. In this paper, we propose a novel method to merge character vertices that refer to the same entity but are denoted with different names. With this method, we first build the relationship network among characters based on records of the social activities they participated in, which are extracted from multimedia sources. We then define temporal activity paths (TAPs) for each character over time. After that, we measure the similarity of the TAPs for any two characters. If the similarity is high enough, the two vertices are considered to be the same character. Based on TAPs, we can thus determine whether to merge the two character vertices. Our experiments show that this solution can accurately confirm character uniqueness in large-scale social networks.
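The pipeline described in the abstract (build per-character paths from activity records, compare paths pairwise, merge above a threshold) can be illustrated with a minimal Python sketch. The abstract does not specify the record format, the similarity measure, or the threshold, so everything here is an assumption made for illustration: records are hypothetical `(name, time, activity)` tuples, similarity is a plain Jaccard overlap of `(time, activity)` steps, and the function names and the `0.5` threshold are invented.

```python
from itertools import combinations


def build_taps(records):
    """Group (name, time, activity) records into a time-ordered path per name.

    The record layout is an assumption; the abstract only says that
    activity records are extracted from multimedia sources.
    """
    taps = {}
    for name, time, activity in records:
        taps.setdefault(name, []).append((time, activity))
    return {name: sorted(path) for name, path in taps.items()}


def tap_similarity(path_a, path_b):
    """Jaccard overlap of the (time, activity) steps of two paths.

    This is a stand-in measure chosen for illustration, not the
    similarity defined in the paper.
    """
    steps_a, steps_b = set(path_a), set(path_b)
    if not steps_a and not steps_b:
        return 0.0
    return len(steps_a & steps_b) / len(steps_a | steps_b)


def merge_candidates(records, threshold=0.5):
    """Return name pairs whose path similarity reaches the threshold."""
    taps = build_taps(records)
    return [
        (a, b)
        for a, b in combinations(sorted(taps), 2)
        if tap_similarity(taps[a], taps[b]) >= threshold
    ]


if __name__ == "__main__":
    # Hypothetical records: two spellings of one person plus an unrelated one.
    example = [
        ("J. Smith", 2001, "film_a"),
        ("J. Smith", 2003, "film_b"),
        ("John Smith", 2001, "film_a"),
        ("John Smith", 2003, "film_b"),
        ("John Smith", 2005, "film_c"),
        ("A. Lee", 2002, "film_d"),
    ]
    print(merge_candidates(example))
```

A set-based overlap ignores the order of steps along a path; a sequence-aware measure (for example an edit distance over the ordered steps) would be a natural alternative if ordering matters for the data at hand.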