Understanding How BERT Learns to Identify Edits

11/28/2020
by Samuel Stevens, et al.

Pre-trained transformer language models such as BERT are ubiquitous in NLP research, leading to work on understanding how and why these models work. Attention mechanisms have been proposed as a means of interpretability, with varying conclusions. We propose applying BERT-based models to a sequence classification task and using the data set's labeling schema to measure each model's interpretability. We find that classification performance scores do not always correlate with interpretability. Despite this, BERT's attention weights are interpretable for over 70% of examples.
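
The abstract does not spell out the implementation, but a minimal sketch of the kind of setup it describes, a BERT-based sequence classifier with attention weights exposed for inspection, might look like the following. The model name, label count, and example sentence pair are assumptions for illustration, using the Hugging Face transformers library:

```python
# Minimal sketch (assumptions: bert-base-uncased, 2 edit-type labels, toy input).
# Shows a BERT sequence classifier configured to return per-layer attention
# weights, the kind of setup the abstract describes for interpretability analysis.
import torch
from transformers import BertTokenizer, BertForSequenceClassification

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertForSequenceClassification.from_pretrained(
    "bert-base-uncased",
    num_labels=2,            # assumed number of edit classes
    output_attentions=True,  # expose attention weights in the forward pass
)
model.eval()

# Hypothetical input: an original sentence paired with an edited version.
inputs = tokenizer(
    "The cat sat on the mat.",
    "The cat sat quietly on the mat.",
    return_tensors="pt",
)

with torch.no_grad():
    outputs = model(**inputs)

pred = outputs.logits.argmax(dim=-1).item()  # predicted edit class
attentions = outputs.attentions              # tuple of per-layer tensors,
                                             # each (batch, heads, seq, seq)
print("predicted class:", pred)
print("layers:", len(attentions), "attention shape:", tuple(attentions[0].shape))
```

The per-layer attention tensors are the quantities an analysis like the one described above would compare against the data set's labeling schema; the checkpoint and the two-sentence input here are placeholders, not the authors' actual configuration.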
