Information Plane Analysis of Deep Neural Networks via Matrix-Based Renyi's Entropy and Tensor Kernels

by Kristoffer Wickstrøm, et al.

Analyzing deep neural networks (DNNs) via information plane (IP) theory has gained tremendous attention recently as a tool to gain insight into, among other things, their generalization ability. However, it is by no means obvious how to estimate the mutual information (MI) between each hidden layer and the input/desired output needed to construct the IP. For instance, hidden layers with many neurons require MI estimators that are robust to the high dimensionality associated with such layers. MI estimators should also handle convolutional layers naturally, while remaining computationally tractable enough to scale to large networks. No existing IP method to date has been able to study truly deep convolutional neural networks (CNNs), such as VGG-16. In this paper, we propose an IP analysis using the new matrix-based Rényi's entropy coupled with tensor kernels over convolutional layers, leveraging the power of kernel methods to represent properties of the probability distribution independently of the dimensionality of the data. The obtained results shed new light on previous findings for small-scale DNNs, albeit via a completely new approach. Importantly, the new framework enables us to provide the first comprehensive IP analysis of contemporary large-scale DNNs and CNNs, investigating the different training phases and providing new insights into the training dynamics of large-scale neural networks.
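To make the estimator concrete, below is a minimal sketch of the matrix-based Rényi α-entropy and the resulting MI estimate between two sets of layer activations. The function names and the RBF kernel bandwidth are illustrative assumptions; the paper's tensor-kernel extension for convolutional layers is not reproduced here.

```python
import numpy as np

def gram_matrix(X, sigma=1.0):
    # RBF kernel Gram matrix; X has shape (N, d), e.g. flattened layer activations.
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * X @ X.T
    return np.exp(-d2 / (2.0 * sigma ** 2))

def renyi_entropy(K, alpha=1.01):
    # Matrix-based Renyi alpha-entropy: normalize the Gram matrix to unit
    # trace, then S_alpha = 1/(1-alpha) * log2(sum_i lambda_i^alpha),
    # where lambda_i are the eigenvalues of the normalized matrix.
    A = K / np.trace(K)
    lam = np.linalg.eigvalsh(A)
    lam = np.clip(lam, 0.0, None)  # guard against tiny negative eigenvalues
    return (1.0 / (1.0 - alpha)) * np.log2(np.sum(lam ** alpha))

def mutual_information(Kx, Ky, alpha=1.01):
    # I(X;Y) = S(X) + S(Y) - S(X,Y); the joint entropy uses the
    # trace-normalized Hadamard (elementwise) product of the two Grams.
    Kxy = Kx * Ky
    return (renyi_entropy(Kx, alpha) + renyi_entropy(Ky, alpha)
            - renyi_entropy(Kxy, alpha))
```

Note the key property exploited in the paper: the estimate depends on the data only through the N x N Gram matrix, so its cost is governed by the number of samples rather than by layer dimensionality.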
