Knowledge Unlearning for Mitigating Privacy Risks in Language Models

by Joel Jang, et al.

Pretrained Language Models (LMs) memorize a vast amount of knowledge during initial pretraining, including information that may violate the privacy of personal lives and identities. Previous work addressing privacy issues for language models has mostly focused on data preprocessing and differential privacy methods, both of which require re-training the underlying LM. We propose knowledge unlearning as an alternative method to reduce privacy risks for LMs post hoc. We show that simply applying the unlikelihood training objective to target token sequences is effective at forgetting them with little to no degradation of general language modeling performance; it sometimes even substantially improves the underlying LM with just a few iterations. We also find that sequential unlearning is better than trying to unlearn all the data at once, and that unlearning is highly dependent on which kind of data (domain) is forgotten. Comparing against a previous data preprocessing method known to mitigate privacy risks for LMs, we show that unlearning can give a stronger empirical privacy guarantee in scenarios where the data vulnerable to extraction attacks are known a priori, while being orders of magnitude more computationally efficient. We release the code and dataset needed to replicate our results at .
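To make the idea of unlearning a target token sequence concrete, here is a minimal sketch. It assumes one simple reading of the objective: take the usual negative log-likelihood (NLL) of the target sequence and ascend its gradient instead of descending it. The toy bigram model, `VOCAB_SIZE`, `memorize`, and `unlearn_step` are all hypothetical names invented for illustration. They are not taken from the authors' released code, and a real LM would be unlearned through a deep-learning framework rather than a hand-written logit table.

```python
import math

VOCAB_SIZE = 6  # hypothetical toy vocabulary: token ids 0..5


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sequence_nll(table, tokens):
    """NLL of `tokens` under a toy bigram model; table[prev] holds next-token logits."""
    nll = 0.0
    for prev, nxt in zip(tokens, tokens[1:]):
        nll -= math.log(softmax(table[prev])[nxt])
    return nll


def memorize(table, tokens, strength=3.0):
    """Stand-in for pretraining: make each observed transition highly likely."""
    for prev, nxt in zip(tokens, tokens[1:]):
        table[prev][nxt] = strength


def unlearn_step(table, tokens, lr=0.5):
    """One gradient ASCENT step on the NLL of `tokens` (assumed objective)."""
    grads = [[0.0] * VOCAB_SIZE for _ in range(VOCAB_SIZE)]
    for prev, nxt in zip(tokens, tokens[1:]):
        probs = softmax(table[prev])
        for k in range(VOCAB_SIZE):
            # d(NLL)/d(logit_k) = p_k - 1[k == nxt]
            grads[prev][k] += probs[k] - (1.0 if k == nxt else 0.0)
    for i in range(VOCAB_SIZE):
        for k in range(VOCAB_SIZE):
            table[i][k] += lr * grads[i][k]  # "+" raises the NLL of the target


table = [[0.0] * VOCAB_SIZE for _ in range(VOCAB_SIZE)]
forget_seq = [0, 1, 2]  # sequence we want the model to forget
retain_seq = [3, 4, 5]  # unrelated sequence we want to keep
memorize(table, forget_seq)
memorize(table, retain_seq)

forget_before = sequence_nll(table, forget_seq)
retain_before = sequence_nll(table, retain_seq)
for _ in range(5):  # "just a few iterations"
    unlearn_step(table, forget_seq)
forget_after = sequence_nll(table, forget_seq)
retain_after = sequence_nll(table, retain_seq)

print("forget NLL:", forget_before, "->", forget_after)
print("retain NLL:", retain_before, "->", retain_after)
```

In this toy the retained sequence is untouched by construction, because the two sequences share no bigram rows. That says nothing about real LMs, where parameters are shared across all sequences and the effect on general performance has to be measured empirically, as the abstract describes.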




