Building Accurate Simple Models with Multihop

by   Amit Dhurandhar, et al.

Knowledge transfer from a complex high performing model to a simpler and potentially low performing one in order to enhance its performance has been of great interest over the last few years as it finds applications in important problems such as explainable artificial intelligence, model compression, robust model building and learning from small data. Known approaches to this problem (viz. Knowledge Distillation, Model compression, ProfWeight, etc.) typically transfer information directly (i.e. in a single/one hop) from the complex model to the chosen simple model through schemes that modify the target or reweight training examples on which the simple model is trained. In this paper, we propose a meta-approach where we transfer information from the complex model to the simple model by dynamically selecting and/or constructing a sequence of intermediate models of decreasing complexity that are less intricate than the original complex model. Our approach can transfer information between consecutive models in the sequence using any of the previously mentioned approaches as well as work in 1-hop fashion, thus generalizing these approaches. In the experiments on real data, we observe that we get consistent gains for different choices of models over 1-hop, which on average is more than 2% and reaches up to 8% in a particular case. We also empirically analyze conditions under which the multi-hop approach is likely to be beneficial over the traditional 1-hop approach, and report other interesting insights. To the best of our knowledge, this is the first work that proposes such a multi-hop approach to perform knowledge transfer given a single high performing complex model, making it in our opinion, an important methodological contribution.


Avoiding Reasoning Shortcuts: Adversarial Evaluation, Training, and Model Development for Multi-Hop QA

Multi-hop question answering requires a model to connect multiple pieces...

Classification of Diabetic Retinopathy Using Unlabeled Data and Knowledge Distillation

Knowledge distillation allows transferring knowledge from a pre-trained ...

Memory Injections: Correcting Multi-Hop Reasoning Failures during Inference in Transformer-Based Language Models

Answering multi-hop reasoning questions requires retrieving and synthesi...

A Complex KBQA System using Multiple Reasoning Paths

Multi-hop knowledge based question answering (KBQA) is a complex task fo...

An Empirical Study on Few-shot Knowledge Probing for Pretrained Language Models

Prompt-based knowledge probing for 1-hop relations has been used to meas...

HOP, UNION, GENERATE: Explainable Multi-hop Reasoning without Rationale Supervision

Explainable multi-hop question answering (QA) not only predicts answers ...

Please sign up or login with your details

Forgot password? Click here to reset