Natural language processing for achieving sustainable development: the case of neural labelling to enhance community profiling

by Costanza Conforti, et al.

In recent years, there has been an increasing interest in the application of Artificial Intelligence - and especially Machine Learning - to the field of Sustainable Development (SD). However, until now, NLP has not been applied in this context. In this research paper, we show the high potential of NLP applications to enhance the sustainability of projects. In particular, we focus on the case of community profiling in developing countries, where, in contrast to the developed world, a notable data gap exists. In this context, NLP could help to address the cost and time barrier of structuring qualitative data that prohibits its widespread use and associated benefits. We propose the new task of Automatic UPV classification, which is an extreme multi-class multi-label classification problem. We release Stories2Insights, an expert-annotated dataset, provide a detailed corpus analysis, and implement a number of strong neural baselines to address the task. Experimental results show that the problem is challenging, and leave plenty of room for future research at the intersection of NLP and SD.



The Elephant in the Room: Analyzing the Presence of Big Tech in Natural Language Processing Research

Recent advances in deep learning methods for natural language processing...

Beqi: Revitalize the Senegalese Wolof Language with a Robust Spelling Corrector

The progress of Natural Language Processing (NLP), although fast in rece...

How Good Is NLP? A Sober Look at NLP Tasks through the Lens of Social Impact

Recent years have seen many breakthroughs in natural language processing...

RaFoLa: A Rationale-Annotated Corpus for Detecting Indicators of Forced Labour

Forced labour is the most common type of modern slavery, and it is incre...

Citizen Participation and Machine Learning for a Better Democracy

The development of democratic systems is a crucial task as confirmed by ...

HumSet: Dataset of Multilingual Information Extraction and Classification for Humanitarian Crisis Response

Timely and effective response to humanitarian crises requires quick and ...
