Depression Symptoms Modelling from Social Media Text: An Active Learning Approach

by Nawshad Farruque, et al.

A fundamental component of user-level clinical depression modelling based on social media language is depression symptoms detection (DSD). Unfortunately, no existing DSD dataset reflects both clinical insights and the distribution of depression symptoms in samples from the self-disclosed depressed population. In our work, we describe an Active Learning (AL) framework that starts from an initial supervised learning model and couples 1) a state-of-the-art language model, pre-trained on a large body of mental health forum text and further fine-tuned on a clinician-annotated DSD dataset, with 2) a Zero-Shot learning model for DSD, in order to harvest depression-symptom-related samples from our large self-curated Depression Tweets Repository (DTR). Our clinician-annotated dataset is the largest of its kind. Furthermore, DTR is created from samples of tweets in the Twitter timelines of self-disclosed depressed users, drawn from two datasets, including one of the largest benchmark datasets for user-level depression detection from Twitter. This helps preserve the depression symptoms distribution of self-disclosed Twitter users' tweets. Subsequently, we iteratively retrain our initial DSD model with the harvested data. We discuss the stopping criteria and limitations of this AL process, and elaborate on the underlying constructs that play a vital role in it. We show that we can produce a final dataset which is the largest of its kind. Furthermore, a DSD model and a Depression Post Detection (DPD) model trained on it achieve significantly better accuracy than their initial versions.
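The harvest-and-retrain loop described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. Everything named here is a hypothetical stand-in: `harvest`, `active_learning_loop`, `toy_fit`, and `toy_zero_shot` are illustrative names, the keyword scorers replace the fine-tuned language model and the Zero-Shot model, and both the agreement rule (keep a tweet only when the two scorers pass the same threshold for a symptom) and the stopping rule (stop when a round harvests too few samples) are assumptions, since the abstract does not specify them.

```python
from typing import Callable, Dict, List, Tuple

Scorer = Callable[[str], Dict[str, float]]   # text -> {symptom: score in [0, 1]}
Labelled = Tuple[str, List[str]]             # (text, symptom labels)


def harvest(pool: List[str], supervised: Scorer, zero_shot: Scorer,
            threshold: float = 0.5) -> Tuple[List[Labelled], List[str]]:
    """Assumed coupling rule: keep a symptom only when both scorers agree."""
    harvested, remaining = [], []
    for text in pool:
        sup, zs = supervised(text), zero_shot(text)
        labels = sorted(s for s in sup
                        if sup[s] >= threshold and zs.get(s, 0.0) >= threshold)
        if labels:
            harvested.append((text, labels))
        else:
            remaining.append(text)
    return harvested, remaining


def active_learning_loop(pool: List[str], train_set: List[Labelled],
                         fit: Callable[[List[Labelled]], Scorer],
                         zero_shot: Scorer, max_rounds: int = 5,
                         min_new: int = 1) -> Tuple[Scorer, List[Labelled]]:
    """Retrain on harvested samples until a round yields fewer than min_new."""
    model = fit(train_set)
    for _ in range(max_rounds):
        new, pool = harvest(pool, model, zero_shot)
        if len(new) < min_new:          # assumed stopping criterion
            break
        train_set = train_set + new
        model = fit(train_set)
    return model, train_set


def toy_fit(train_set: List[Labelled]) -> Scorer:
    """Hypothetical 'model': scores 1.0 on any word overlap with seen texts."""
    vocab: Dict[str, set] = {}
    for text, labels in train_set:
        for label in labels:
            vocab.setdefault(label, set()).update(text.lower().split())

    def score(text: str) -> Dict[str, float]:
        words = set(text.lower().split())
        return {label: 1.0 if words & seen else 0.0
                for label, seen in vocab.items()}
    return score


def toy_zero_shot(text: str) -> Dict[str, float]:
    """Hypothetical zero-shot scorer based on fixed cue words."""
    t = text.lower()
    return {"sleep": 1.0 if "sleep" in t or "awake" in t else 0.0,
            "fatigue": 1.0 if "tired" in t or "exhausted" in t else 0.0}


seed = [("cannot sleep at night", ["sleep"]), ("so tired today", ["fatigue"])]
pool = ["awake all night again", "exhausted and tired", "nice weather outside"]
```

Calling `active_learning_loop(pool, seed, toy_fit, toy_zero_shot)` on this toy data grows the training set with the two symptom-bearing tweets and leaves the unrelated one in the pool. In a real setting the two scorers would be trained models, and the thresholds and stopping rule would need to be chosen empirically.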



Semi-Supervised Approach to Monitoring Clinical Depressive Symptoms in Social Media

With the rise of social media, millions of people are routinely expressi...

No Rumours Please! A Multi-Indic-Lingual Approach for COVID Fake-Tweet Detection

The sudden widespread menace created by the present global pandemic COVI...

Decay No More: A Persistent Twitter Dataset for Learning Social Meaning

With the proliferation of social media, many studies resort to social me...

Monitoring Depression Trend on Twitter during the COVID-19 Pandemic

The COVID-19 pandemic has severely affected people's daily lives and cau...

Named Entity Recognition in Twitter: A Dataset and Analysis on Short-Term Temporal Shifts

Recent progress in language model pre-training has led to important impr...

Integrating Crowdsourcing and Active Learning for Classification of Work-Life Events from Tweets

Social media, especially Twitter, is being increasingly used for researc...

Misogynistic Tweet Detection: Modelling CNN with Small Datasets

Online abuse directed towards women on the social media platform Twitter...
