Measuring and Reducing Model Update Regression in Structured Prediction for NLP

02/07/2022
by Deng Cai, et al.

Recent advances in deep learning have led to the rapid adoption of machine-learning-based NLP models in a wide range of applications. Despite continuous gains in accuracy, backward compatibility is also an important aspect for industrial applications, yet it has received little research attention. Backward compatibility requires that the new model does not regress on cases that were correctly handled by its predecessor. This work studies model update regression in structured prediction tasks. We choose syntactic dependency parsing and conversational semantic parsing as representative examples of structured prediction tasks in NLP. First, we measure and analyze model update regression in different model update settings. Next, we explore and benchmark existing techniques for reducing model update regression, including model ensembles and knowledge distillation. We further propose a simple and effective method, Backward-Congruent Re-ranking (BCR), which takes the characteristics of structured output into account. Experiments show that BCR can better mitigate model update regression than model ensemble and knowledge distillation approaches.
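The two core ideas in the abstract can be sketched in code. The snippet below is an illustrative sketch, not the authors' implementation: `negative_flip_rate` measures update regression as the fraction of examples the old model handled correctly that the new model gets wrong, and `bcr_predict` mimics Backward-Congruent Re-ranking by letting the old model re-score the new model's n-best candidates. All names (`new_nbest`, `old_score`, the toy string "parses") are hypothetical stand-ins for structured outputs and model scoring functions.

```python
def negative_flip_rate(old_preds, new_preds, gold):
    """Share of examples the old model got right that the new model gets wrong."""
    correct_old = sum(o == g for o, g in zip(old_preds, gold))
    flips = sum(o == g and n != g for o, n, g in zip(old_preds, new_preds, gold))
    return flips / correct_old if correct_old else 0.0


def bcr_predict(x, new_nbest, old_score, k=5):
    """Backward-Congruent Re-ranking (sketch): take the new model's k-best
    candidates and return the one the old model scores highest."""
    candidates = new_nbest(x, k)
    return max(candidates, key=lambda y: old_score(x, y))


# Toy demo with strings standing in for structured outputs (e.g. parse trees).
old = ["A", "B", "C", "D"]
new = ["A", "X", "C", "Y"]
gold = ["A", "B", "C", "C"]
print(negative_flip_rate(old, new, gold))  # 1 flip (B) out of 3 old-correct -> 0.333...

scores = {"y1": 0.1, "y2": 0.9, "y3": 0.5}
pick = bcr_predict("sentence", lambda x, k: ["y1", "y2", "y3"], lambda x, y: scores[y])
print(pick)  # "y2": the candidate the old model scores highest
```

Note that `bcr_predict` only needs the old model at inference time for re-ranking, which is why it applies naturally to structured prediction, where a model can emit an n-best list of candidate structures.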


Related research

- Regression Bugs Are In Your Model! Measuring, Reducing and Analyzing Regressions In NLP Model Updates (05/07/2021): Behavior of deep neural networks can be inconsistent between different v...
- Improving Prediction Backward-Compatiblility in NLP Model Upgrade with Gated Fusion (02/04/2023): When upgrading neural models to a newer version, new errors that were no...
- Structured Knowledge Distillation for Semantic Segmentation (03/11/2019): In this paper, we investigate the knowledge distillation strategy for tr...
- Mimic and Conquer: Heterogeneous Tree Structure Distillation for Syntactic NLP (09/16/2020): Syntax has been shown useful for various NLP tasks, while existing work ...
- Distilling Knowledge for Search-based Structured Prediction (05/29/2018): Many natural language processing tasks can be modeled into structured pr...
- Positive-Congruent Training: Towards Regression-Free Model Updates (11/18/2020): Reducing inconsistencies in the behavior of different versions of an AI ...
- Backpropagating through Structured Argmax using a SPIGOT (05/12/2018): We introduce the structured projection of intermediate gradients optimiz...
