Probing for the Usage of Grammatical Number

by Karim Lasri et al.

A central quest of probing is to uncover how pre-trained models encode a linguistic property within their representations. An encoding, however, might be spurious: the model might not rely on it when making predictions. In this paper, we try to find encodings that the model actually uses, introducing a usage-based probing setup. We first choose a behavioral task which cannot be solved without using the linguistic property. Then, we attempt to remove the property by intervening on the model's representations. We contend that, if an encoding is used by the model, its removal should harm the performance on the chosen behavioral task. As a case study, we focus on how BERT encodes grammatical number, and on how it uses this encoding to solve the number agreement task. Experimentally, we find that BERT relies on a linear encoding of grammatical number to produce the correct behavioral output. We also find that BERT uses a separate encoding of grammatical number for nouns and verbs. Finally, we identify in which layers information about grammatical number is transferred from a noun to its head verb.
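The setup described above — fit a linear probe for the property, erase the probed direction from the representations, and check whether the property can still be decoded — can be illustrated with a minimal sketch on toy data. This is an assumption-laden illustration, not the paper's actual method: the synthetic "representations", the single-direction erasure, and all function names below are invented for exposition, whereas the paper intervenes on real BERT representations and measures number-agreement behavior.

```python
# Minimal sketch of a linear-probe intervention, assuming (hypothetically)
# that grammatical number lies along a single linear direction.
import numpy as np

rng = np.random.default_rng(0)

# Toy "contextual representations": axis 0 carries the number signal
# (singular = +1, plural = -1) plus noise; axis 1 is pure noise.
n = 200
labels = rng.integers(0, 2, n)                # 0 = singular, 1 = plural
signs = np.where(labels == 0, 1.0, -1.0)
reps = np.stack([signs + 0.1 * rng.standard_normal(n),
                 rng.standard_normal(n)], axis=1)

def probe_direction(X, y):
    """Least-squares linear probe for the number label (+1/-1 targets)."""
    t = np.where(y == 0, 1.0, -1.0)
    w, *_ = np.linalg.lstsq(X, t, rcond=None)
    return w / np.linalg.norm(w)

def remove_direction(X, w):
    """Intervention: project representations onto the probe's nullspace."""
    return X - np.outer(X @ w, w)

def probe_accuracy(X, y, w):
    pred = (X @ w < 0).astype(int)            # negative score -> plural
    return (pred == y).mean()

w = probe_direction(reps, labels)
acc_before = probe_accuracy(reps, labels, w)

reps_erased = remove_direction(reps, w)
w_refit = probe_direction(reps_erased, labels)
acc_after = probe_accuracy(reps_erased, labels, w_refit)
# Before erasure the probe decodes number almost perfectly; after erasure
# even a re-fitted probe hovers near chance, i.e. the encoding is gone.
```

The paper's argument is then causal: if the model *uses* this encoding, the same erasure applied to its internal representations should degrade its number-agreement predictions, not just the probe's accuracy.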




