Using Word Embeddings in Twitter Election Classification

06/22/2016
by Xiao Yang, et al.

Word embeddings and convolutional neural networks (CNNs) have attracted extensive attention in various Twitter classification tasks, e.g. sentiment classification. However, the effect of the configuration used to train and generate the word embeddings on classification performance has not been studied in the existing literature. In this paper, using a Twitter election classification task that aims to detect election-related tweets, we investigate the impact of the background dataset used to train the embedding models, the context window size and the dimensionality of the word embeddings on classification performance. By comparing the classification results of two word embedding models trained on different background corpora (e.g. Wikipedia articles and Twitter microposts), we show that the type of background data should align with the Twitter classification dataset to achieve better performance. Moreover, by evaluating word embedding models trained with various context window sizes and dimensionalities, we find that larger context windows and higher dimensionalities are preferable for improving performance. Our experimental results also show that using word embeddings with a CNN leads to statistically significant improvements over various baselines such as a random classifier, SVM with TF-IDF and SVM with word embeddings.
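
To make the embedding configuration concrete, the sketch below shows how a sweep over context window size and dimensionality might look. It is a minimal illustration, not the authors' pipeline: it assumes a word2vec-style model trained with gensim on a hypothetical tokenised tweet corpus, and the window and dimension values are placeholders rather than the settings evaluated in the paper.

    # Minimal sketch (assumptions, not the authors' exact setup): train
    # word2vec embeddings with different context window sizes and
    # dimensionalities so their effect on a downstream classifier
    # can be compared.
    from gensim.models import Word2Vec

    # hypothetical pre-tokenised background corpus of tweets
    background_tweets = [
        ["vote", "early", "on", "election", "day"],
        ["watching", "the", "debate", "tonight"],
    ]

    for window in (1, 5, 10):          # placeholder context window sizes
        for dim in (100, 300, 500):    # placeholder embedding dimensionalities
            model = Word2Vec(
                sentences=background_tweets,
                vector_size=dim,       # dimensionality of the word vectors
                window=window,         # context window size
                min_count=1,
                workers=4,
            )
            model.save(f"w2v_window{window}_dim{dim}.model")

Each saved model could then be used to initialise the embedding layer of the CNN (or to build features for the SVM baselines), keeping the classifier fixed so that differences in accuracy can be attributed to the embedding configuration.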


Related research

11/30/2020 - Blind signal decomposition of various word embeddings based on join and individual variance explained
In recent years, natural language processing (NLP) has become one of the...

10/10/2016 - A Dynamic Window Neural Network for CCG Supertagging
Combinatory Category Grammar (CCG) supertagging is a task to assign lexi...

10/17/2017 - RETUYT in TASS 2017: Sentiment Analysis for Spanish Tweets using SVM and CNN
This article presents classifiers based on SVM and Convolutional Neural ...

05/03/2017 - On the effectiveness of feature set augmentation using clusters of word embeddings
Word clusters have been empirically shown to offer important performance...

04/16/2018 - A Deeper Look into Dependency-Based Word Embeddings
We investigate the effect of various dependency-based word embeddings on...

07/25/2020 - Effect of Text Processing Steps on Twitter Sentiment Classification using Word Embedding
Processing of raw text is the crucial first step in text classification ...

03/03/2016 - MGNC-CNN: A Simple Approach to Exploiting Multiple Word Embeddings for Sentence Classification
We introduce a novel, simple convolution neural network (CNN) architectu...
