Improving Question Answering Performance Using Knowledge Distillation and Active Learning

09/26/2021
by Yasaman Boreshban, et al.

Contemporary question answering (QA) systems, including transformer-based architectures, suffer from increasing computational and model complexity, which renders them inefficient for real-world applications with limited resources. Furthermore, training or even fine-tuning such models requires a vast amount of labeled data, which is often not available for the task at hand. In this manuscript, we conduct a comprehensive analysis of these challenges and introduce suitable countermeasures. We propose a novel knowledge distillation (KD) approach to reduce the parameter and model complexity of a pre-trained BERT system, and utilize multiple active learning (AL) strategies to drastically reduce annotation effort. In particular, we demonstrate that our model matches the performance of 6-layer TinyBERT and DistilBERT, whilst using only 2% of their total parameters. Finally, by integrating our AL approaches into the BERT framework, we show that state-of-the-art results on the SQuAD dataset can be achieved when we use only 20% of the training data.
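For intuition, the sketch below illustrates the two generic ingredients the abstract refers to, not the authors' exact formulation: a standard knowledge-distillation objective that blends a temperature-scaled KL term on teacher/student logits with cross-entropy on gold labels, and a least-confidence score often used for active-learning sample selection. The function names and the `temperature` and `alpha` hyperparameters are illustrative assumptions.

```python
import torch
import torch.nn.functional as F


def distillation_loss(student_logits, teacher_logits, labels,
                      temperature=2.0, alpha=0.5):
    """Generic KD objective (assumed form): soft-target KL + hard-label CE."""
    # Soft targets: compare teacher and student distributions at raised temperature.
    soft_teacher = F.softmax(teacher_logits / temperature, dim=-1)
    log_soft_student = F.log_softmax(student_logits / temperature, dim=-1)
    kd_term = F.kl_div(log_soft_student, soft_teacher,
                       reduction="batchmean") * (temperature ** 2)
    # Hard targets: ordinary cross-entropy against the gold labels
    # (e.g., answer-span start/end indices in extractive QA).
    ce_term = F.cross_entropy(student_logits, labels)
    return alpha * kd_term + (1.0 - alpha) * ce_term


def least_confidence_scores(logits):
    """One common AL acquisition score: 1 - max predicted probability.

    Higher scores indicate less confident predictions; the top-scoring
    unlabeled examples would be sent for annotation in each AL round.
    """
    probs = F.softmax(logits, dim=-1)
    return 1.0 - probs.max(dim=-1).values
```

In a typical loop, the student is trained with `distillation_loss` on the currently labeled pool, then `least_confidence_scores` ranks the unlabeled pool to pick the next batch to annotate; the paper's specific KD and AL strategies may differ in detail.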
