Evaluating the Robustness of Machine Reading Comprehension Models to Low Resource Entity Renaming

04/06/2023
by Clemencia Siro, et al.

Question answering (QA) models have shown compelling results on the task of Machine Reading Comprehension (MRC). Recently, these systems have outperformed humans on the held-out test sets of datasets such as SQuAD, but their robustness is not guaranteed. A drop in performance on adversarially generated examples exposes the models' brittleness. In this study, we explore the robustness of MRC models to entity renaming, using entities from low-resource regions such as Africa. We propose EntSwap, a test-time perturbation method that creates a test set whose entities have been renamed. In particular, we rename entities of the types country, person, nationality, location, organization, and city to create AfriSQuAD2. Using the perturbed test set, we evaluate the robustness of three popular MRC models. We find that large models perform comparatively better than base models on novel entities. Furthermore, our analysis indicates that the person entity type is the most challenging for the MRC models.
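The abstract does not describe how EntSwap is implemented, so the following is only a minimal sketch of the general idea of test-time entity renaming, not the authors' method. It assumes that entity mentions have already been identified and that a mapping from original names to replacement names is given. The function name `swap_entities`, the SQuAD-style dictionary layout, and the example names are all hypothetical choices made for illustration.

```python
import re


def swap_entities(example, replacements):
    """Rename entities consistently in the context, question, and answers.

    `example` is a SQuAD-style dict (assumed layout) and `replacements`
    maps original entity strings to new ones (hypothetical mapping).
    """
    if not replacements:
        return example

    # Try longer names first so a full name wins over any shorter name it contains.
    names = sorted(replacements, key=len, reverse=True)
    pattern = re.compile(r"\b(" + "|".join(re.escape(n) for n in names) + r")\b")

    def rename(text):
        return pattern.sub(lambda m: replacements[m.group(1)], text)

    old_context = example["context"]
    new_context = rename(old_context)

    new_answers = []
    for answer in example["answers"]:
        new_text = rename(answer["text"])
        # The new offset is the length of the renamed text preceding the answer.
        new_start = len(rename(old_context[: answer["answer_start"]]))
        # Guard against answers that cut through an entity mention.
        if new_context[new_start : new_start + len(new_text)] != new_text:
            raise ValueError("answer span is not aligned with entity boundaries")
        new_answers.append({"text": new_text, "answer_start": new_start})

    return {
        "context": new_context,
        "question": rename(example["question"]),
        "answers": new_answers,
    }


# Illustrative example; the names below are made up, not taken from AfriSQuAD2.
example = {
    "context": "Marie Curie was born in Warsaw. Marie Curie later moved to Paris.",
    "question": "Where was Marie Curie born?",
    "answers": [{"text": "Warsaw", "answer_start": 24}],
}
replacements = {"Marie Curie": "Wanjiru Kamau", "Warsaw": "Nakuru", "Paris": "Kisumu"}
perturbed = swap_entities(example, replacements)
print(perturbed["context"])
```

Renaming the context, question, and answers with the same mapping keeps each example answerable, so any change in a model's score between the original and perturbed sets can be attributed to the new entity names rather than to a broken answer span.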

research
05/12/2022

DTW at Qur'an QA 2022: Utilising Transfer Learning with Transformers for Question Answering in a Low-resource Domain

The task of machine reading comprehension (MRC) is a useful benchmark to...
research
08/27/2020

Relation/Entity-Centric Reading Comprehension

Constructing a machine that understands human language is one of the mos...
research
04/23/2020

DuReader_robust: A Chinese Dataset Towards Evaluating the Robustness of Machine Reading Comprehension Models

Machine Reading Comprehension (MRC) is a crucial and challenging task in...
research
04/29/2020

Benchmarking Robustness of Machine Reading Comprehension Models

Machine Reading Comprehension (MRC) is an important testbed for evaluati...
research
09/30/2020

A Vietnamese Dataset for Evaluating Machine Reading Comprehension

Over 97 million inhabitants speak Vietnamese as the native language in t...
research
11/01/2021

Introspective Distillation for Robust Question Answering

Question answering (QA) models are well-known to exploit data bias, e.g....
research
06/28/2022

Collecting high-quality adversarial data for machine reading comprehension tasks with humans and models in the loop

We present our experience as annotators in the creation of high-quality,...
