Evaluating BERT-based Pre-training Language Models for Detecting Misinformation

by Rini Anggrainingsih, et al.

Controlling the quality of online information is challenging because not everything posted online can be supervised. Manual checking is almost impossible given the vast number of posts made on online media and how quickly they spread. Therefore, automated rumour detection techniques are needed to limit the adverse effects of spreading misinformation. Previous studies mainly focused on finding and extracting the significant features of text data. However, feature extraction is time-consuming and not highly effective. This study proposes using BERT-based pre-trained language models to encode text data into vectors, and neural network models to classify these vectors to detect misinformation. Furthermore, the performance of different language models (LMs) with different trainable parameters was compared. The proposed technique was tested on several short-text and long-text datasets, and its results were compared with those of state-of-the-art techniques on the same datasets. The results show that the proposed technique performs better than the state-of-the-art techniques. We also tested the proposed technique on the combined datasets. The results demonstrated that larger training and testing data sizes considerably improve the technique's performance.
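The encode-then-classify pipeline described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: to keep the example runnable without downloading a model, a hashed bag-of-words function stands in for the BERT-based encoder, and a single-layer logistic classifier stands in for the neural network. The names `encode`, `train`, `predict`, `DIM`, and the four example texts are all hypothetical and chosen only for illustration.

```python
import hashlib
import math
from typing import List, Tuple

DIM = 256  # toy vector size (assumption); a real language model yields its own size


def encode(text: str) -> List[float]:
    """Stand-in encoder: hashed bag-of-words, L2-normalised.

    This is NOT BERT. In the pipeline the abstract describes, this step
    would be performed by a pre-trained language model.
    """
    vec = [0.0] * DIM
    for token in text.lower().split():
        digest = hashlib.md5(token.encode("utf-8")).hexdigest()
        vec[int(digest, 16) % DIM] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def train(vectors: List[List[float]], labels: List[int],
          epochs: int = 200, lr: float = 0.5) -> Tuple[List[float], float]:
    """Fit a single-layer classifier with plain stochastic gradient descent."""
    w = [0.0] * DIM
    b = 0.0
    for _ in range(epochs):
        for x, y in zip(vectors, labels):
            p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)
            grad = p - y  # derivative of the log loss w.r.t. the logit
            w = [wi - lr * grad * xi for wi, xi in zip(w, x)]
            b -= lr * grad
    return w, b


def predict(text: str, w: List[float], b: float) -> float:
    """Return the estimated probability that `text` is misinformation."""
    x = encode(text)
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)


# Made-up toy examples (1 = misinformation, 0 = not); not from any dataset.
TEXTS = [
    "shocking miracle cure doctors hate",
    "aliens secretly control government",
    "council approves new library budget",
    "university publishes peer reviewed study",
]
LABELS = [1, 1, 0, 0]

weights, bias = train([encode(t) for t in TEXTS], LABELS)
for t in TEXTS:
    print(f"{predict(t, weights, bias):.2f}  {t}")
```

Swapping `encode` for a real pre-trained language model leaves the rest of the sketch unchanged, which is the point of separating the encoding step from the classification step.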



Incorporating Word Sense Disambiguation in Neural Language Models

We present two supervised (pre-)training methods to incorporate gloss de...

Spread Love Not Hate: Undermining the Importance of Hateful Pre-training for Hate Speech Detection

Pre-training large neural language models, such as BERT, has led to impr...

Are You Robert or RoBERTa? Deceiving Online Authorship Attribution Models Using Neural Text Generators

Recently, there has been a rise in the development of powerful pre-train...

Deep Entity Matching with Pre-Trained Language Models

We present Ditto, a novel entity matching system based on pre-trained Tr...

Improving negation detection with negation-focused pre-training

Negation is a common linguistic feature that is crucial in many language...

Extensive Evaluation of Transformer-based Architectures for Adverse Drug Events Extraction

Adverse Event (ADE) extraction is one of the core tasks in digital pharm...

Robin: A Novel Online Suicidal Text Corpus of Substantial Breadth and Scale

Suicide is a major public health crisis. With more than 20,000,000 suici...
