belabBERT: a Dutch RoBERTa-based language model applied to psychiatric classification

06/02/2021
by Joppe Wouts, et al.

Natural language processing (NLP) is becoming an important means for the automatic recognition of human traits and states, such as intoxication, the presence of psychiatric disorders, the presence of airway disorders, and states of stress. Such applications have the potential to become an important pillar of online help lines and may gradually be introduced into eHealth modules. However, NLP is language-specific, and for languages such as Dutch, NLP models are scarce. As a result, recent Dutch NLP models capture long-range semantic dependencies across sentences poorly. To overcome this, we present belabBERT, a new Dutch language model that extends the RoBERTa architecture. belabBERT is trained on a large Dutch corpus (+32 GB) of web-crawled texts. We applied belabBERT to the classification of psychiatric illnesses. First, we evaluated the strength of text-based classification using belabBERT and compared the results to the existing RobBERT model. Then, we compared the performance of belabBERT to audio-based classification of psychiatric disorders. Finally, we briefly explored extending the framework to a hybrid text- and audio-based classification. Our results show that belabBERT outperformed RobBERT, the current best text classification network for Dutch. belabBERT also outperformed classification based on audio alone.
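
The abstract does not say how the text and audio classifiers are combined in the hybrid setting. As a purely illustrative sketch, one common option is late fusion: each classifier outputs class probabilities, and the two are mixed with a weight. The function names, the `text_weight` parameter, and the placeholder labels `class_a` and `class_b` below are assumptions made for this example, not the authors' method.

```python
from typing import Dict


def late_fusion(
    text_probs: Dict[str, float],
    audio_probs: Dict[str, float],
    text_weight: float = 0.5,
) -> Dict[str, float]:
    """Mix per-class probabilities from two classifiers (hypothetical helper).

    text_weight is an assumed tuning parameter: 1.0 trusts only the text
    classifier, 0.0 trusts only the audio classifier.
    """
    if not 0.0 <= text_weight <= 1.0:
        raise ValueError("text_weight must lie in [0, 1]")
    if text_probs.keys() != audio_probs.keys():
        raise ValueError("both classifiers must score the same labels")

    # Weighted average of the two probability vectors, label by label.
    fused = {
        label: text_weight * text_probs[label]
        + (1.0 - text_weight) * audio_probs[label]
        for label in text_probs
    }

    # Renormalise so the fused scores sum to 1 even if the inputs did not.
    total = sum(fused.values())
    if total == 0:
        raise ValueError("fused scores sum to zero")
    return {label: score / total for label, score in fused.items()}


def predict(fused: Dict[str, float]) -> str:
    """Return the label with the highest fused probability."""
    return max(fused, key=fused.get)


# Placeholder scores for two hypothetical classes.
text_scores = {"class_a": 0.7, "class_b": 0.3}
audio_scores = {"class_a": 0.4, "class_b": 0.6}
fused_scores = late_fusion(text_scores, audio_scores, text_weight=0.5)
predicted_label = predict(fused_scores)
```

In a sketch like this, `text_weight` would typically be chosen on held-out data. Setting it to 1.0 gives text-only classification, which makes it easy to check whether the audio signal adds anything.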


