Detecting Adversarial Examples Based on Steganalysis
Deep Neural Networks (DNNs) have recently led to significant improvement in many fields, such as image classification. However, these machine learning models are vulnerable to adversarial examples which can mislead machine learning classifiers to give incorrect classifications. Adversarial examples pose security concerns in areas where privacy requirements are strict, such as face recognition, autonomous cars and malware detection. What's more, they could be used to perform an attack on machine learning systems, even if the adversary has no access to the underlying model. In this paper, we focus on detecting adversarial examples. We propose to augment deep neural networks with a detector. The detector is constructed by modeling the differences between adjacent pixels in natural images. And then we identify deviations from this model and assume that such deviations are due to adversarial attack. We construct the detector based on steganalysis which can detect minor modifications to an image because the adversarial attack can be treated as a sort of accidental steganography.
READ FULL TEXT 
  
  
     share
 share