PromptAttack: Prompt-based Attack for Language Models via Gradient Search

09/05/2022
by Yundi Shi, et al.

As pre-trained language models (PLMs) continue to grow, so do the hardware and data requirements for fine-tuning them. Researchers have therefore proposed a lighter-weight alternative, prompt learning. However, in our investigation we observe that prompt learning methods are vulnerable: maliciously constructed prompts can easily attack them, causing classification errors and serious security problems for PLMs. Most current research ignores the security of prompt-based methods. In this paper, we therefore propose a malicious prompt template construction method, PromptAttack, to probe the security of PLMs. We investigate several malicious template construction approaches that guide the model to misclassify its inputs. Extensive experiments on three datasets and three PLMs demonstrate the effectiveness of PromptAttack. We also conduct experiments to verify that our method is applicable in few-shot scenarios.


research
04/19/2021

Probing for Bridging Inference in Transformer Language Models

We probe pre-trained transformer language models for bridging inference....
research
09/30/2022

What Makes Pre-trained Language Models Better Zero/Few-shot Learners?

In this paper, we propose a theoretical framework to explain the efficac...
research
04/29/2023

POUF: Prompt-oriented unsupervised fine-tuning for large pre-trained models

Through prompting, large-scale pre-trained models have become more expre...
research
09/28/2021

Template-free Prompt Tuning for Few-shot NER

Prompt-based methods have been successfully applied in sentence-level fe...
research
05/16/2023

UOR: Universal Backdoor Attacks on Pre-trained Language Models

Backdoors implanted in pre-trained language models (PLMs) can be transfe...
research
06/09/2023

COVER: A Heuristic Greedy Adversarial Attack on Prompt-based Learning in Language Models

Prompt-based learning has been proved to be an effective way in pre-trai...
research
10/29/2022

STPrompt: Semantic-guided and Task-driven prompts for Effective Few-shot Classification

The effectiveness of prompt learning has been demonstrated in different ...
