Generalizing and Improving Jacobian and Hessian Regularization

12/01/2022
by Chenwei Cui, et al.

Jacobian and Hessian regularization aim to reduce the magnitude of the first- and second-order partial derivatives of a neural network's outputs with respect to its inputs, and they are predominantly used to ensure the adversarial robustness of image classifiers. In this work, we generalize previous efforts by extending the target matrix from zero to any matrix that admits efficient matrix-vector products. The proposed paradigm allows us to construct novel regularization terms that enforce symmetry or diagonality on square Jacobian and Hessian matrices. The major obstacle to Jacobian and Hessian regularization, however, has been its high computational cost. We introduce Lanczos-based spectral norm minimization to tackle this difficulty. The technique uses a parallelized implementation of the Lanczos algorithm and is capable of effective and stable regularization of large Jacobian and Hessian matrices. Theoretical justifications and empirical evidence are provided for the proposed paradigm and technique. We carry out exploratory experiments to validate the effectiveness of our novel regularization terms, and comparative experiments to evaluate Lanczos-based spectral norm minimization against prior methods. The results show that the proposed methodologies are advantageous for a wide range of tasks.
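To make the generalized-target paradigm concrete, here is a minimal PyTorch sketch (not the authors' implementation) of one regularizer in this family: taking the target matrix to be the transpose J^T, which admits cheap matrix-vector products via reverse-mode autodiff, and penalizing the gap between Jv and J^T v for random probe vectors v pushes a square Jacobian toward symmetry without ever materializing J. The function name `symmetry_regularizer` is an illustrative assumption, and `f` is assumed to map R^n to R^n.

```python
import torch

def symmetry_regularizer(f, x, num_samples=1):
    """Hutchinson-style penalty E_v ||J v - J^T v||^2 for the square Jacobian
    J = df/dx, computed only through matrix-vector products (J is never built).

    Illustrative sketch of the generalized-target paradigm with target T = J^T,
    i.e. a term that encourages the Jacobian to be symmetric."""
    penalty = 0.0
    for _ in range(num_samples):
        v = torch.randn_like(x)
        # J v via forward-over-reverse JVP, J^T v via a reverse-mode VJP.
        _, jv = torch.autograd.functional.jvp(f, x, v, create_graph=True)
        _, jtv = torch.autograd.functional.vjp(f, x, v, create_graph=True)
        penalty = penalty + ((jv - jtv) ** 2).sum()
    return penalty / num_samples
```

The second ingredient, Lanczos-based spectral norm minimization, can be sketched in the same matrix-free style. The snippet below is an assumption-laden illustration rather than the paper's parallelized implementation: it runs a few Golub-Kahan (Lanczos) bidiagonalization steps using only JVP/VJP calls and returns the largest singular value of the resulting small bidiagonal matrix as a differentiable estimate of the Jacobian's spectral norm, which a training loop could then weight and add to the task loss.

```python
def jacobian_spectral_norm(f, x, num_iters=8):
    """Differentiable estimate of the largest singular value of J = df/dx
    via a few Golub-Kahan (Lanczos) bidiagonalization steps.

    Sketch only: no reorthogonalization, stopping criterion, or other
    stabilization that a production implementation would need."""
    v = torch.randn_like(x)
    v = v / v.norm()
    u_prev, beta = None, None
    alphas, betas = [], []
    for _ in range(num_iters):
        _, u = torch.autograd.functional.jvp(f, x, v, create_graph=True)  # u = J v
        if u_prev is not None:
            u = u - beta * u_prev
        alpha = u.norm()
        u = u / alpha
        _, w = torch.autograd.functional.vjp(f, x, u, create_graph=True)  # w = J^T u
        w = w - alpha * v
        beta = w.norm()
        alphas.append(alpha)
        betas.append(beta)
        v = w / beta
        u_prev = u
    # Small upper-bidiagonal matrix whose singular values approximate those of J.
    B = torch.diag(torch.stack(alphas))
    if len(betas) > 1:
        B = B + torch.diag(torch.stack(betas[:-1]), diagonal=1)
    return torch.linalg.svdvals(B)[0]
```

In use, either term would be multiplied by a weighting coefficient and added to the ordinary training loss before calling backward; the coefficient, the number of probe vectors, and the number of Lanczos steps are hyperparameters left unspecified here.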
