TwiBot-22: Towards Graph-Based Twitter Bot Detection

by Shangbin Feng, et al.

Twitter bot detection has become an increasingly important task to combat misinformation, facilitate social media moderation, and preserve the integrity of online discourse. State-of-the-art bot detection methods generally leverage the graph structure of the Twitter network, and they exhibit promising performance when confronting novel Twitter bots that traditional methods fail to detect. However, very few of the existing Twitter bot detection datasets are graph-based, and even these few suffer from limited dataset scale, incomplete graph structure, and low annotation quality. The lack of a large-scale graph-based Twitter bot detection benchmark that addresses these issues has seriously hindered the development and evaluation of novel graph-based bot detection approaches. In this paper, we propose TwiBot-22, a comprehensive graph-based Twitter bot detection benchmark that presents the largest dataset to date, provides diversified entities and relations on the Twitter network, and has considerably better annotation quality than existing datasets. In addition, we re-implement 35 representative Twitter bot detection baselines and evaluate them on 9 datasets, including TwiBot-22, to promote a fair comparison of model performance and a holistic understanding of research progress. To facilitate further research, we consolidate all implemented code and datasets into the TwiBot-22 evaluation framework, where researchers can consistently evaluate new models and datasets. The TwiBot-22 Twitter bot detection benchmark and evaluation framework are publicly available at


LMBot: Distilling Graph Knowledge into Language Model for Graph-less Deployment in Twitter Bot Detection

As malicious actors employ increasingly advanced and widespread bots to ...

BIC: Twitter Bot Detection with Text-Graph Interaction and Semantic Consistency

Twitter bot detection is an important and meaningful task. Existing text...

Interpreting Graph-based Sybil Detection Methods as Low-Pass Filtering

Online social networks (OSNs) are threatened by Sybil attacks, which cre...

Attacking Graph-based Classification via Manipulating the Graph Structure

Graph-based classification methods are widely used for security and priv...

BadLink: Combining Graph and Information-Theoretical Features for Online Fraud Group Detection

Frauds severely hurt many kinds of Internet businesses. Group-based frau...

LogoDet-3K: A Large-Scale Image Dataset for Logo Detection

Logo detection has been gaining considerable attention because of its wi...

Manga109Dialog: A Large-scale Dialogue Dataset for Comics Speaker Detection

The expanding market for e-comics has spurred interest in the developmen...
