Characterizing Interconnections and Linguistic Patterns in Twitter

by Johnnatan Messias, et al.

Social media is often considered a democratic space in which people connect and interact with each other regardless of gender, race, or any other demographic aspect. Despite numerous efforts to explore demographic aspects of social media, it is still unclear whether social media perpetuates old inequalities from the offline world. In this dissertation, we attempt to identify the gender and race of Twitter users located in the United States using advanced image processing algorithms from Face++. We investigate how different demographic groups connect with each other, and we differentiate them by linguistic style and by interests. We quantify the extent to which each group follows and interacts with the others, and the extent to which these connections and interactions reflect inequalities on Twitter. We also extract linguistic features from six categories (affective attributes, cognitive attributes, lexical density and awareness, temporal references, social and personal concerns, and interpersonal focus) in order to identify similarities and differences in the messages these groups share on Twitter. Furthermore, we extract the absolute ranking difference of top phrases between demographic groups. As a dimension of diversity, we use the topics of interest that we retrieve for each user. Our analysis shows that users identified as white and male tend to attain higher positions on Twitter, in terms of the number of followers and the number of times they appear in other users' lists. There are clear differences in writing style across demographic groups, in both the gender and race domains, as well as in topics of interest. We hope our effort can stimulate the development of new theories of demographic information in the online space. Finally, we developed a Web-based system that leverages the demographic aspects of users to provide transparency to the Twitter trending topics system.
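To make the "absolute ranking difference of top phrases" concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the dissertation's implementation: the abstract does not say how phrases are extracted, ranked, or tie-broken. The sketch assumes each group is given as a flat list of phrases, ranks them by raw frequency (ties broken alphabetically), and compares only phrases that appear in both groups. The function names `rank_phrases` and `absolute_rank_difference` and the toy data are hypothetical.

```python
from collections import Counter


def rank_phrases(phrases):
    """Map each phrase to its 1-based rank by descending frequency.

    Assumption: ties are broken alphabetically; the source does not
    specify a tie-breaking rule.
    """
    counts = Counter(phrases)
    ordered = sorted(counts, key=lambda p: (-counts[p], p))
    return {phrase: position + 1 for position, phrase in enumerate(ordered)}


def absolute_rank_difference(group_a, group_b):
    """Return |rank in A - rank in B| for phrases present in both groups.

    Assumption: phrases seen in only one group are skipped.
    """
    ranks_a = rank_phrases(group_a)
    ranks_b = rank_phrases(group_b)
    shared = ranks_a.keys() & ranks_b.keys()
    return {p: abs(ranks_a[p] - ranks_b[p]) for p in shared}


# Toy data, invented purely for illustration.
group_a = ["good morning"] * 3 + ["game day"] * 2 + ["thank you"]
group_b = ["thank you"] * 3 + ["good morning"] * 2 + ["game day"]

differences = absolute_rank_difference(group_a, group_b)
for phrase in sorted(differences):
    print(phrase, differences[phrase])
```

In this toy example a larger value marks a phrase whose popularity differs more between the two groups; real data would also need a policy for phrases that are missing from one group's ranking.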




