Multilingual Detection of Personal Employment Status on Twitter

by Manuel Tonneau, et al.

Detecting disclosures of individuals' employment status on social media can provide valuable information to match job seekers with suitable vacancies, offer social protection, or measure labor market flows. However, identifying such personal disclosures is a challenging task due to their rarity in a sea of social media content and the variety of linguistic forms used to describe them. Here, we examine three Active Learning (AL) strategies in real-world settings of extreme class imbalance, and identify five types of disclosures about individuals' employment status (e.g. job loss) in three languages using BERT-based classification models. Our findings show that, even under extreme imbalance settings, a small number of AL iterations is sufficient to obtain large and significant gains in precision, recall, and diversity of results compared to a supervised baseline with the same number of labels. We also find that no AL strategy consistently outperforms the rest. Qualitative analysis suggests that AL helps focus the attention mechanism of BERT on core terms and adjust the boundaries of semantic expansion, highlighting the importance of interpretable models to provide greater control and visibility into this dynamic learning process.
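
As a rough illustration of the kind of loop the abstract refers to, the sketch below runs a few Active Learning iterations on an imbalanced pool. It is not the authors' method: the abstract does not name the three AL strategies, so uncertainty sampling is assumed here, and a toy one-feature logistic model stands in for the BERT-based classifiers. All function names, the synthetic pool, and the parameters are hypothetical.

```python
import math


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def train_logistic(xs, ys, epochs=200, lr=0.5):
    # Toy stand-in for the classifier: one-feature logistic regression via SGD.
    w, b = 0.0, 0.0
    for _ in range(epochs):
        for x, y in zip(xs, ys):
            p = sigmoid(w * x + b)
            w += lr * (y - p) * x
            b += lr * (y - p)
    return w, b


def uncertainty_sample(probs, labeled, k):
    # Pick the k unlabeled items whose predicted probability is closest to 0.5.
    candidates = [i for i in range(len(probs)) if i not in labeled]
    candidates.sort(key=lambda i: abs(probs[i] - 0.5))
    return candidates[:k]


def active_learning(pool, oracle, seed, rounds=3, k=5):
    # Each round: retrain on current labels, score the pool, label the k most uncertain items.
    labels = {i: oracle(i) for i in seed}
    for _ in range(rounds):
        idx = sorted(labels)
        w, b = train_logistic([pool[i] for i in idx], [labels[i] for i in idx])
        probs = [sigmoid(w * x + b) for x in pool]
        for i in uncertainty_sample(probs, labels, k):
            labels[i] = oracle(i)
    return labels


# Synthetic, imbalanced pool (assumption): about 10% of items are positive.
pool = [i / 99 for i in range(100)]
true_labels = [1 if x > 0.9 else 0 for x in pool]
labels = active_learning(pool, lambda i: true_labels[i], seed=[0, 50, 99])
print(len(labels), "items labeled;", sum(labels.values()), "positives found")
```

The supervised baseline mentioned in the abstract would correspond to spending the same labeling budget without the model-driven selection step, which is the comparison such a loop is meant to make possible.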




