Adversarial Examples and the Deeper Riddle of Induction: The Need for a Theory of Artifacts in Deep Learning

by Cameron Buckner, et al.

Deep learning is currently the most widespread and successful technology in artificial intelligence. It promises to push the frontier of scientific discovery beyond current limits. However, skeptics have worried that deep neural networks are black boxes, and have called into question whether these advances can really be deemed scientific progress if humans cannot understand them. Relatedly, these systems also possess bewildering new vulnerabilities, most notably a susceptibility to "adversarial examples." In this paper, I argue that adversarial examples will become a flashpoint of debate in philosophy and diverse sciences. Specifically, new findings concerning adversarial examples have challenged the consensus view that the networks' verdicts on these cases are caused by overfitting idiosyncratic noise in the training set; the verdicts may instead be the result of detecting predictively useful "intrinsic features of the data geometry" that humans cannot perceive (Ilyas et al., 2019). These results should cause us to re-examine responses to one of the deepest puzzles at the intersection of philosophy and science: Nelson Goodman's "new riddle" of induction. In particular, they raise the possibility that progress in a number of sciences will depend upon the detection and manipulation of useful features that humans find inscrutable. Before we can evaluate this possibility, however, we must decide which (if any) of these inscrutable features are real but available only to "alien" perception and cognition, and which are distinctive artifacts of deep learning, since artifacts like lens flares or Gibbs phenomena can be similarly useful for prediction but are usually seen as obstacles to scientific theorizing. Thus, machine learning researchers urgently need to develop a theory of artifacts for deep neural networks, and I conclude by sketching some initial directions for this area of research.




