Towards Crafting Text Adversarial Samples

by Suranjana Samanta, et al.

Adversarial samples are strategically modified samples crafted to fool a given classifier. An attacker feeds these specially crafted samples to a deployed classifier, which misclassifies them. However, a human observer perceives the samples as belonging to a different class than the one the classifier assigns, which makes the adversarial samples hard to detect. Most prior work has focused on synthesizing adversarial samples in the image domain. In this paper, we propose a new method of crafting adversarial text samples by modifying the original samples. The modifications consist of deleting or replacing the important (salient) words in the text, or introducing new words into it. Our algorithm works best for datasets that have sub-categories within each class of examples. A key constraint while crafting adversarial samples is to generate meaningful sentences that can pass as legitimate English. Experimental results on the IMDB movie review dataset for sentiment analysis and a Twitter dataset for gender detection show the efficiency of our proposed method.
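The delete-or-replace idea described above can be illustrated with a small sketch. The abstract does not say how word saliency is computed or where replacement candidates come from, so everything below is an assumption made for illustration: the toy lexicon classifier, the leave-one-out saliency score, the `REPLACEMENTS` table, and all function names are hypothetical, and the insertion of new words is omitted.

```python
import math

# Hypothetical toy sentiment lexicon; a real attack would query a trained classifier.
LEXICON = {"great": 2.0, "good": 1.0, "boring": -1.5, "awful": -2.0}
# Hypothetical replacement candidates; the abstract does not specify their source.
REPLACEMENTS = {"great": ["fine"], "good": ["okay"]}


def positive_probability(words):
    """Toy classifier: sigmoid of the summed lexicon scores."""
    score = sum(LEXICON.get(w, 0.0) for w in words)
    return 1.0 / (1.0 + math.exp(-score))


def saliency(words, index):
    """Leave-one-out saliency (an assumption): change in P(positive)
    caused by removing the word at `index`."""
    reduced = words[:index] + words[index + 1:]
    return positive_probability(words) - positive_probability(reduced)


def craft_adversarial(words, max_edits=3):
    """Greedily replace (or, failing that, delete) the word that contributes
    most to the original label, until the predicted label flips."""
    words = list(words)
    original_label = positive_probability(words) >= 0.5
    sign = 1.0 if original_label else -1.0
    for _ in range(max_edits):
        flipped = (positive_probability(words) >= 0.5) != original_label
        if flipped or not words:
            break
        # Contribution of each word toward the original label.
        scores = [sign * saliency(words, i) for i in range(len(words))]
        target = max(range(len(words)), key=scores.__getitem__)
        if scores[target] <= 0:
            break  # no remaining word supports the original label
        candidates = REPLACEMENTS.get(words[target])
        if candidates:
            words[target] = candidates[0]  # replace the salient word
        else:
            del words[target]  # no candidate available: delete it
    return words


review = "a great movie but boring ending".split()
adversarial = craft_adversarial(review)
print(" ".join(adversarial), round(positive_probability(adversarial), 2))
```

In this toy setting a single replacement of the most salient word is enough to move the prediction across the 0.5 threshold. A realistic attack would also have to check that the edited sentence remains meaningful English, which is the constraint the abstract highlights and which this sketch does not attempt.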




