Robust Detection of Adaptive Spammers by Nash Reinforcement Learning

by Yingtong Dou, et al.

Online reviews provide product evaluations that help customers make decisions. Unfortunately, the evaluations can be manipulated with fake reviews ("spams") by professional spammers, who have learned increasingly insidious and powerful spamming strategies by adapting to the deployed detectors. Spamming strategies are hard to capture: they can vary quickly over time, differ across spammers and target products, and, more critically, remain unknown in most cases. Furthermore, most existing detectors focus on detection accuracy, which is not well aligned with the goal of maintaining the trustworthiness of product evaluations. To address these challenges, we formulate a minimax game in which the spammers and spam detectors compete with each other on their practical goals, which are not solely based on detection accuracy. Nash equilibria of the game lead to stable detectors that are agnostic to any mixed detection strategies. However, the game has no closed-form solution and is not differentiable, which rules out typical gradient-based algorithms. We turn the game into two dependent Markov Decision Processes (MDPs) to allow efficient stochastic optimization based on multi-armed bandit and policy gradient. We experiment on three large review datasets using various state-of-the-art spamming and detection strategies and show that the optimization algorithm can reliably find an equilibrial detector that can robustly and effectively prevent spammers with any mixed spamming strategies from attaining their practical goal. Our code is available at
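The notion of a mixed-strategy equilibrium between spammers and detectors can be illustrated with a toy zero-sum matrix game. The sketch below is not the optimization algorithm described in the abstract: it swaps in plain Hedge (multiplicative weights) self-play with full-information feedback in place of the multi-armed bandit and policy gradient, and the payoff matrix `PAYOFF` and the helper names (`hedge_self_play`, `exploitability`) are hypothetical, chosen only for illustration.

```python
import math

# Hypothetical payoff matrix (made-up numbers): PAYOFF[i][j] is the spammer's
# practical gain, scaled to [0, 1], when attack i meets detector j.
# The detector's loss is the same number, so the game is zero-sum.
PAYOFF = [
    [0.8, 0.2],
    [0.3, 0.6],
]


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def hedge_self_play(payoff, rounds=5000, eta=0.05):
    """Run Hedge for both players and return their time-averaged strategies."""
    n_att, n_det = len(payoff), len(payoff[0])
    gain = [0.0] * n_att   # cumulative expected gain of each attack
    loss = [0.0] * n_det   # cumulative expected loss of each detector
    avg_p = [0.0] * n_att  # averaged spammer mixture
    avg_q = [0.0] * n_det  # averaged detector mixture
    for _ in range(rounds):
        # Spammer favors attacks with high gain, detector those with low loss.
        p = softmax([eta * g for g in gain])
        q = softmax([-eta * l for l in loss])
        for i in range(n_att):
            gain[i] += sum(payoff[i][j] * q[j] for j in range(n_det))
            avg_p[i] += p[i] / rounds
        for j in range(n_det):
            loss[j] += sum(payoff[i][j] * p[i] for i in range(n_att))
            avg_q[j] += q[j] / rounds
    return avg_p, avg_q


def exploitability(payoff, p, q):
    """Gap between the best attack against q and the best defense against p."""
    n_att, n_det = len(payoff), len(payoff[0])
    best_attack = max(
        sum(payoff[i][j] * q[j] for j in range(n_det)) for i in range(n_att)
    )
    best_defense = min(
        sum(payoff[i][j] * p[i] for i in range(n_att)) for j in range(n_det)
    )
    return best_attack - best_defense


if __name__ == "__main__":
    p, q = hedge_self_play(PAYOFF)
    print("spammer mixture:", p)
    print("detector mixture:", q)
    print("exploitability:", exploitability(PAYOFF, p, q))
```

An exploitability near zero means neither side can gain much by deviating from its averaged mixture, which is the sense in which an equilibrium detector is robust to any mixed spamming strategy. For this particular made-up matrix, the exact game value is 7/15.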




