Exploring Adversarial Attacks on Neural Networks: An Explainable Approach

by Justus Renkhoff, et al.

Deep Learning (DL) is being applied in various domains, especially in safety-critical applications such as autonomous driving. Consequently, it is of great significance to ensure the robustness of these methods and thus counteract uncertain behaviors caused by adversarial attacks. In this paper, we use gradient heatmaps to analyze the response characteristics of the VGG-16 model when the input images are mixed with adversarial noise and with statistically similar Gaussian random noise. In particular, we compare the network's responses layer by layer to determine where errors occur. Several interesting findings are derived. First, compared to Gaussian random noise, intentionally generated adversarial noise causes severe behavioral deviations by distracting the networks' areas of concentration. Second, in many cases, adversarial examples only need to compromise a few intermediate blocks to mislead the final decision. Third, our experiments reveal that specific blocks are more vulnerable and easier for adversarial examples to exploit. Finally, we demonstrate that the layers Block4_conv1 and Block5_conv1 of the VGG-16 model are particularly susceptible to adversarial attacks. Our work could provide valuable insights into developing more reliable Deep Neural Network (DNN) models.


NoiseCAM: Explainable AI for the Boundary Between Noise and Adversarial Attacks

Deep Learning (DL) and Deep Neural Networks (DNNs) are widely used in va...

Exposing the Robustness and Vulnerability of Hybrid 8T-6T SRAM Memory Architectures to Adversarial Attacks in Deep Neural Networks

Deep Learning is able to solve a plethora of once impossible problems. H...

Comment on Transferability and Input Transformation with Additive Noise

Adversarial attacks have verified the existence of the vulnerability of ...

A Theoretical Perspective on Subnetwork Contributions to Adversarial Robustness

The robustness of deep neural networks (DNNs) against adversarial attack...

Explainable Adversarial Attacks in Deep Neural Networks Using Activation Profiles

As neural networks become the tool of choice to solve an increasing vari...

Adversarial TCAV – Robust and Effective Interpretation of Intermediate Layers in Neural Networks

Interpreting neural network decisions and the information learned in int...

Relationship between manifold smoothness and adversarial vulnerability in deep learning with local errors

Artificial neural networks can achieve impressive performances, and even...
