Measuring and Manipulating Knowledge Representations in Language Models

by Evan Hernandez et al.

Neural language models (LMs) represent facts about the world described by text. Sometimes these facts derive from training data (in most LMs, a representation of the word banana encodes the fact that bananas are fruits). Sometimes facts derive from input text itself (a representation of the sentence "I poured out the bottle" encodes the fact that the bottle became empty). Tools for inspecting and modifying LM fact representations would be useful almost everywhere LMs are used: making it possible to update them when the world changes, to localize and remove sources of bias, and to identify errors in generated text. We describe REMEDI, an approach for querying and modifying factual knowledge in LMs. REMEDI learns a map from textual queries to fact encodings in an LM's internal representation system. These encodings can be used as knowledge editors: by adding them to LM hidden representations, we can modify downstream generation to be consistent with new facts. REMEDI encodings can also be used as model probes: by comparing them to LM representations, we can ascertain what properties LMs attribute to mentioned entities, and predict when they will generate outputs that conflict with background knowledge or input text. REMEDI thus links work on probing, prompting, and model editing, and offers steps toward general tools for fine-grained inspection and control of knowledge in LMs.
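The two uses of REMEDI encodings described above can be illustrated with a toy sketch. Everything here is a hypothetical, simplified stand-in: the vectors, the additive edit, and the cosine-similarity probe are illustrative assumptions about the general mechanism (adding a fact encoding to a hidden representation, or comparing the two), not the paper's exact parameterization.

```python
# Hypothetical sketch of REMEDI-style editing and probing.
# Vectors, dimensions, and the threshold are illustrative assumptions.
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def cosine(u, v):
    # Cosine similarity between two vectors (small epsilon avoids /0).
    return dot(u, v) / (math.sqrt(dot(u, u)) * math.sqrt(dot(v, v)) + 1e-9)

def edit_hidden_state(hidden, fact_encoding, scale=1.0):
    """Editor use: add the learned fact encoding to the LM's hidden
    representation of the mentioned entity, steering downstream generation."""
    return [h + scale * e for h, e in zip(hidden, fact_encoding)]

def probe(hidden, fact_encoding, threshold=0.5):
    """Probe use: compare the encoding to the hidden representation to ask
    whether the LM already attributes this property to the entity."""
    return cosine(hidden, fact_encoding) > threshold

# Toy 4-d hidden state for "banana" and an encoding for "is a fruit".
h_banana = [0.3, -0.5, 0.1, -0.2]
enc_is_fruit = [0.5, 0.0, 0.8, 0.6]

edited = edit_hidden_state(h_banana, enc_is_fruit)
print(probe(h_banana, enc_is_fruit))  # before the edit: attribute absent
print(probe(edited, enc_is_fruit))    # after the edit: attribute present
```

The same encoding thus serves double duty: added to a representation it acts as an editor, and compared against a representation it acts as a probe.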




