Towards Benchmarking and Improving the Temporal Reasoning Capability of Large Language Models

06/15/2023
by   Qingyu Tan, et al.

Reasoning about time is of fundamental importance. Many facts are time-dependent: for example, athletes change teams from time to time, and different government officials are elected periodically. Previous time-dependent question answering (QA) datasets tend to be biased in either their coverage of time spans or their question types. In this paper, we introduce a comprehensive probing dataset to evaluate the temporal reasoning capability of large language models. Our dataset includes questions at three levels of temporal reasoning. In addition, we propose a novel learning framework to improve the temporal reasoning capability of large language models, based on temporal span extraction and time-sensitive reinforcement learning. We conducted experiments in closed-book QA, open-book QA, and reasoning QA settings and demonstrated the effectiveness of our approach. Our code and data are released at https://github.com/DAMO-NLP-SG/TempReason.
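To make the notion of a time-dependent fact concrete, the sketch below shows a minimal temporal QA lookup over (subject, relation, object, start, end) facts, in the spirit of "athletes change teams from time to time". The facts, the `TemporalFact` type, and the `answer_at` helper are illustrative assumptions, not the actual TempReason data or training framework.

```python
from dataclasses import dataclass

@dataclass
class TemporalFact:
    subject: str
    relation: str
    obj: str
    start: int  # first year the fact holds (inclusive)
    end: int    # first year the fact no longer holds (exclusive)

# Hypothetical time-dependent facts for illustration only.
FACTS = [
    TemporalFact("Lionel Messi", "member of", "FC Barcelona", 2004, 2021),
    TemporalFact("Lionel Messi", "member of", "Paris Saint-Germain", 2021, 2023),
]

def answer_at(subject: str, relation: str, year: int) -> list[str]:
    """Return all objects for (subject, relation) that are valid in `year`."""
    return [
        f.obj
        for f in FACTS
        if f.subject == subject
        and f.relation == relation
        and f.start <= year < f.end
    ]

print(answer_at("Lionel Messi", "member of", 2019))  # ['FC Barcelona']
```

A model with correct temporal reasoning must condition its answer on the queried year, not just on the (subject, relation) pair; here the same question yields different answers for 2019 and 2022.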


Related research

- A Dataset for Answering Time-Sensitive Questions (08/13/2021)
- Careful Selection of Knowledge to solve Open Book Question Answering (07/24/2019)
- Narrative Question Answering with Cutting-Edge Open-Domain QA Techniques: A Comprehensive Study (06/07/2021)
- Benchmarking Large Language Models on CMExam – A Comprehensive Chinese Medical Exam Dataset (06/05/2023)
- Event Knowledge Incorporation with Posterior Regularization for Event-Centric Question Answering (05/08/2023)
- Complex QA and language models hybrid architectures, Survey (02/17/2023)
- Mitigating Temporal Misalignment by Discarding Outdated Facts (05/24/2023)
