Sentiment Expression via Emoticons on Social Media
Emoticons (e.g., :) and :( ) have been widely used in sentiment analysis and other NLP tasks as features to ma- chine learning algorithms or as entries of sentiment lexicons. In this paper, we argue that while emoticons are strong and common signals of sentiment expression on social media the relationship between emoticons and sentiment polarity are not always clear. Thus, any algorithm that deals with sentiment polarity should take emoticons into account but extreme cau- tion should be exercised in which emoticons to depend on. First, to demonstrate the prevalence of emoticons on social media, we analyzed the frequency of emoticons in a large re- cent Twitter data set. Then we carried out four analyses to examine the relationship between emoticons and sentiment polarity as well as the contexts in which emoticons are used. The first analysis surveyed a group of participants for their perceived sentiment polarity of the most frequent emoticons. The second analysis examined clustering of words and emoti- cons to better understand the meaning conveyed by the emoti- cons. The third analysis compared the sentiment polarity of microblog posts before and after emoticons were removed from the text. The last analysis tested the hypothesis that removing emoticons from text hurts sentiment classification by training two machine learning models with and without emoticons in the text respectively. The results confirms the arguments that: 1) a few emoticons are strong and reliable signals of sentiment polarity and one should take advantage of them in any senti- ment analysis; 2) a large group of the emoticons conveys com- plicated sentiment hence they should be treated with extreme caution.
READ FULL TEXT