Neural information retrieval often adopts a retrieve-and-rerank framework...
While human evaluation remains best practice for accurately judging the...
Models trained via empirical risk minimization (ERM) are known to rely on...
Language models trained on massive prompted multitask datasets like T0 (...
Natural language processing models often exploit spurious correlations between...
Readers of academic research papers often read with the goal of answering...
When training most modern reading comprehension models, all the question...
Question Answering (QA) tasks requiring information from multiple documents...
Humans often have to read multiple documents to address their information...
High-quality and large-scale data are key to success for AI systems. However...
Standard test sets for supervised learning evaluate in-distribution generalization...
Machine comprehension of texts longer than a single sentence often requires...
Reading comprehension has recently seen rapid progress, with systems matching...
This paper describes AllenNLP, a platform for research on deep learning...
Type-level word embeddings use the same set of parameters to represent a...
We propose a deep learning model for identifying structure within experi...