Protecting Anonymous Speech: A Generative Adversarial Network Methodology for Removing Stylistic Indicators in Text

by Rishi Balakrishnan, et al.

Internet users constantly leave a trail of text through blogs, emails, and social media posts. As a result, the ability to write and protest anonymously is being eroded: given a sample of previous work, artificial intelligence can match a text with its author out of hundreds of possible candidates. Existing approaches to authorship anonymization, also known as authorship obfuscation, often focus on protecting binary demographic attributes rather than identity as a whole. Even those that do focus on obfuscating identity require manual feedback, lose the coherence of the original sentence, or perform well only on a limited subset of authors. In this paper, we develop a new approach to authorship anonymization by constructing a generative adversarial network that protects identity and optimizes for three different losses corresponding to anonymity, fluency, and content preservation. Our fully automatic method achieves results comparable to other methods in content preservation and fluency, and greatly outperforms baselines in anonymization. Moreover, our approach generalizes well to an open-set context and can anonymize sentences from authors it has not encountered before.
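As a rough illustration of optimizing for three losses at once, the sketch below combines an anonymity, a fluency, and a content-preservation term into a single generator objective using a weighted sum. The abstract does not say how the losses are combined, so the weighted sum, the names `LossWeights` and `combined_generator_loss`, and the default weights are all assumptions made for illustration, not the authors' formulation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LossWeights:
    """Hypothetical trade-off weights; the abstract gives no values."""
    anonymity: float = 1.0
    fluency: float = 1.0
    content: float = 1.0


def combined_generator_loss(anonymity_loss: float,
                            fluency_loss: float,
                            content_loss: float,
                            weights: LossWeights = LossWeights()) -> float:
    """Weighted sum of the three loss terms (an assumed combination rule).

    Each argument is a scalar loss that is already computed elsewhere,
    for example by an authorship classifier (anonymity), a language
    model (fluency), and a semantic similarity measure (content).
    """
    return (weights.anonymity * anonymity_loss
            + weights.fluency * fluency_loss
            + weights.content * content_loss)
```

Under this assumed scheme, raising `weights.anonymity` relative to the other two would push the generator toward stronger obfuscation at the possible expense of fluency and content preservation.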




