Adaptive Low-Rank Regularization with Damping Sequences to Restrict Lazy Weights in Deep Networks

06/17/2021
by Mohammad Mahdi Bejani, et al.

Overfitting is one of the critical problems in deep neural networks. Many regularization schemes try to prevent overfitting blindly, and in doing so they slow the convergence of training algorithms. Adaptive regularization schemes can address overfitting more intelligently; they usually do not act on all of the network's weights. This paper detects the subset of weight layers that cause overfitting, where overfitting is recognized through matrix and tensor condition numbers. An adaptive regularization scheme entitled Adaptive Low-Rank (ALR) is proposed that drives a subset of the weight layers toward their Low-Rank Factorization (LRF) by minimizing a new Tikhonov-based loss function. ALR also encourages lazy weights to contribute to the regularization as training progresses: it uses a damping sequence to increase the likelihood of a layer being selected in later epochs. Thus, before the training accuracy drops, ALR constrains the lazy weights and regularizes the network substantially. Experimental results show that ALR regularizes deep networks well, with high training speed and low resource usage.
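As a rough, illustrative sketch of the idea in the abstract (not the authors' implementation), the following PyTorch code selects ill-conditioned weight layers via their condition number, pulls the selected layers toward a rank-r target with a Tikhonov-style penalty, and uses a damping sequence to raise the selection likelihood of the remaining "lazy" layers in later epochs. The rank `rank`, the threshold `kappa_max`, the specific sequence p_t = 1 - 1/(1 + beta*t), and all names here are hypothetical assumptions; the paper's actual loss and selection rule may differ.

```python
import torch

def condition_number(W: torch.Tensor) -> float:
    """Ratio of largest to smallest singular value (weight flattened to 2-D)."""
    s = torch.linalg.svdvals(W.reshape(W.shape[0], -1))
    return (s[0] / s[-1].clamp_min(1e-12)).item()

def low_rank_penalty(W: torch.Tensor, r: int) -> torch.Tensor:
    """Tikhonov-style term pulling W toward its best rank-r approximation."""
    M = W.reshape(W.shape[0], -1)
    U, s, Vh = torch.linalg.svd(M, full_matrices=False)
    r = min(r, s.numel())                       # guard against small layers
    M_r = (U[:, :r] * s[:r]) @ Vh[:r, :]        # Eckart-Young rank-r target
    return torch.linalg.norm(M - M_r.detach()) ** 2  # target held constant

def alr_term(model: torch.nn.Module, epoch: int, rank: int = 8,
             beta: float = 0.05, kappa_max: float = 1e3) -> torch.Tensor:
    """Sum of low-rank penalties over selected layers.

    A layer is selected when it is ill-conditioned (condition number above
    kappa_max) or, with probability p_t = 1 - 1/(1 + beta * epoch), at
    random -- the damping sequence that pulls lazy weights into the
    regularization as epochs grow.
    """
    p_select = 1.0 - 1.0 / (1.0 + beta * epoch)
    reg = torch.zeros(())
    for W in (p for p in model.parameters() if p.dim() >= 2):
        if condition_number(W.detach()) > kappa_max or torch.rand(()).item() < p_select:
            reg = reg + low_rank_penalty(W, rank)
    return reg
```

In training, such a term would simply be added to the task loss, e.g. `loss = criterion(model(x), y) + lam * alr_term(model, epoch)` for some weighting coefficient `lam` (again an assumption, not the paper's notation).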


