ParaShoot: A Hebrew Question Answering Dataset

09/23/2021
by Omri Keren, et al.

NLP research in Hebrew has largely focused on morphology and syntax, where rich annotated datasets in the spirit of Universal Dependencies are available. Semantic datasets, however, are in short supply, which hinders crucial advances in Hebrew NLP technology. In this work, we present ParaShoot, the first question answering dataset in modern Hebrew. The dataset follows the format and crowdsourcing methodology of SQuAD and contains approximately 3,000 annotated examples, a size similar to that of other question answering datasets in low-resource languages. We provide the first baseline results using recently released BERT-style models for Hebrew, showing that there is significant room for improvement on this task.
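Since the abstract says the dataset follows the SQuAD format, the following is a minimal sketch of what one such extractive record could look like and how a predicted answer span might be scored. The field names follow the flattened layout commonly used for SQuAD-style data. The record text is invented for illustration and is not taken from ParaShoot. The whitespace tokenization is a simplifying assumption, and the abstract does not say which metrics or normalization the authors use.

```python
from collections import Counter

# Hypothetical SQuAD-style record; the text is invented, not from ParaShoot.
context = "Dana bought three books at the market on Sunday."
answer = "three"
example = {
    "context": context,
    "question": "How many books did Dana buy?",
    "answers": {
        "text": [answer],
        # Character offset of the answer span inside the context.
        "answer_start": [context.index(answer)],
    },
}


def exact_match(prediction: str, gold: str) -> bool:
    """Return True when the two strings are identical after trimming."""
    return prediction.strip() == gold.strip()


def token_f1(prediction: str, gold: str) -> float:
    """Token-level F1 over whitespace tokens (a simplification)."""
    pred_tokens = prediction.split()
    gold_tokens = gold.split()
    # Multiset intersection counts each shared token at most as often
    # as it appears in both strings.
    overlap = sum((Counter(pred_tokens) & Counter(gold_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(gold_tokens)
    return 2 * precision * recall / (precision + recall)


# The stored offset should recover the answer text from the context.
start = example["answers"]["answer_start"][0]
gold = example["answers"]["text"][0]
span = example["context"][start:start + len(gold)]
```

A prediction such as "three books" would fail exact match against the gold answer "three" but still receive partial credit from the token-level score, which is why the two measures are typically reported together for extractive question answering.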


