Membership Inference on Word Embedding and Beyond

06/21/2021
by   Saeed Mahloujifar, et al.

In text processing, most ML models are built on top of word embeddings. These embeddings are themselves trained on datasets that may contain sensitive data. In some cases this training is done independently; in others, it occurs as part of training a larger, task-specific model. In either case, it is of interest to consider membership inference attacks based on the embedding layer as a way of understanding sensitive information leakage. Somewhat surprisingly, however, membership inference attacks on word embeddings, and their effect on downstream natural language processing (NLP) tasks that use these embeddings, have remained relatively unexplored. In this work, we show that word embeddings are vulnerable to black-box membership inference attacks under realistic assumptions. Furthermore, we show that this leakage persists through two other major NLP applications, classification and text generation, even when the embedding layer is not exposed to the attacker. We show that our MI attack achieves high attack accuracy against a classifier model and an LSTM-based language model. Our attack is also a cheaper membership inference attack on text-generative models: it does not require knowledge of the target model or the expensive training of text-generative models as shadow models.
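To make the black-box setting concrete, here is a minimal sketch of one way such an attack could work; it is an illustration under assumptions, not the paper's exact procedure. The attacker is assumed to only be able to look up embedding vectors for individual words, and scores a candidate text by how tightly its co-occurring words cluster in embedding space, on the intuition that training the embedding pulls co-occurring words from the training corpus closer together. The names `embed`, `window`, and `threshold` are hypothetical and introduced only for this sketch.

```python
# Sketch of a similarity-based membership inference attack on a word
# embedding (illustrative assumption, not the paper's exact method).
import numpy as np

def cosine(u, v):
    # Cosine similarity between two embedding vectors.
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v) + 1e-12))

def membership_score(tokens, embed, window=5):
    """Average cosine similarity of word pairs co-occurring within `window`.

    `embed` is a hypothetical dict-like lookup: token -> np.ndarray.
    Texts seen during embedding training tend to score higher, because the
    embedding was optimized to bring their co-occurring words together.
    """
    sims = []
    for i, w in enumerate(tokens):
        if w not in embed:
            continue
        for c in tokens[i + 1 : i + 1 + window]:
            if c in embed and c != w:
                sims.append(cosine(embed[w], embed[c]))
    return float(np.mean(sims)) if sims else 0.0

def infer_membership(tokens, embed, threshold):
    # Predict "member" if the score exceeds a threshold the attacker
    # calibrates on texts known to lie outside the training corpus.
    return membership_score(tokens, embed) > threshold
```

In this sketch the attacker needs no shadow models and no knowledge of the downstream classifier or language model, only embedding lookups and a small calibration set, which is what makes this style of attack comparatively cheap.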


