Sparse, guided feature connections in an Abstract Deep Network

by Anthony Knittel, et al.

We present a technique for developing a network of re-used features, in which the topology is formed by a coarse learning method and then refined by gradient-descent fine-tuning, known as an Abstract Deep Network (ADN). New features are built from observed co-occurrences, and the network is maintained by a selection process related to evolutionary algorithms. This allows coarse exploration of the problem space, effective for irregular domains, while gradient descent allows precise solutions. Accuracy on standard UCI and Protein-Structure Prediction problems is comparable with benchmark SVM and optimized GBML approaches, and shows scalability for addressing large problems. The discrete implementation is symbolic, allowing interpretability, while the continuous method with fine-tuning shows improved accuracy. The binary multiplexer problem is explored as an irregular domain that does not support gradient-descent learning, with a solution shown for the benchmark 135-bit problem. A convolutional implementation is demonstrated on image classification, showing an error rate of 0.79 with a pre-defined topology. The ADN system provides a method for developing a very sparse, deep feature topology, based on observed relationships between features, that is able to find solutions in irregular domains and to initialize a network prior to gradient-descent learning.
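To make the two core mechanisms of the abstract concrete, the following is a minimal, illustrative sketch of (1) building new conjunctive features from observed co-occurrences of active features, and (2) maintaining the feature population with an evolutionary-style selection step. This is not the authors' implementation; the function names, data representation (samples as sets of active feature ids), and the pair-counting heuristic are all assumptions made for illustration.

```python
import itertools
from collections import Counter

def build_features(samples, n_new=5):
    """Propose new composite features from frequently co-occurring base features.

    `samples` is a list of sets of active feature ids (hypothetical encoding).
    Counts every pair of features active together in a sample and returns the
    most frequent pairs as candidate conjunctive features.
    """
    cooc = Counter()
    for active in samples:
        cooc.update(itertools.combinations(sorted(active), 2))
    return [pair for pair, _ in cooc.most_common(n_new)]

def select(features, fitness, keep=3):
    """Evolutionary-style maintenance: retain only the highest-fitness features."""
    return sorted(features, key=fitness, reverse=True)[:keep]

# Example: features 0 and 1 fire together most often, so (0, 1) is the
# first composite feature proposed.
samples = [{0, 1, 2}, {0, 1}, {0, 1, 3}, {2, 3}]
candidates = build_features(samples)
print(candidates[0])  # → (0, 1)
```

In the full ADN, selection pressure and co-occurrence-driven construction together perform the coarse topology search, after which gradient descent fine-tunes the (continuous) feature weights.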


