Permute to Train: A New Dimension to Training Deep Neural Networks
We show that Deep Neural Networks (DNNs) can be efficiently trained by permuting neuron connections. We introduce a new family of methods to train DNNs called Permute to Train (P2T). Two implementations of P2T are presented: Stochastic Gradient Permutation and Lookahead Permutation. The former computes the permutation from the gradient, while the latter relies on another optimizer to derive it. We empirically show that our proposed method, despite only swapping randomly weighted connections, achieves accuracy comparable to that of Adam on the MNIST, Fashion-MNIST, and CIFAR-10 datasets. This opens up possibilities for new ways to train and regularize DNNs.
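To make the idea of training by permutation concrete, the following is a minimal sketch of one possible lookahead-style step. It is not taken from the paper. The function name `permute_to_match`, the rank-matching rule, and the choice to permute a whole weight matrix at once are all assumptions made for illustration; the abstract states only that another optimizer is used to derive the permutation.

```python
import numpy as np

def permute_to_match(weights, target):
    """Rearrange the entries of `weights` so that their rank order matches
    the rank order of `target`. Values are only moved, never changed.

    Hypothetical helper: the rank-matching rule is an assumption, not the
    paper's stated procedure.
    """
    sorted_w = np.sort(weights.ravel())
    # ranks[i] = position of target[i] in the sorted order of target
    ranks = np.argsort(np.argsort(target.ravel()))
    return sorted_w[ranks].reshape(weights.shape)

rng = np.random.default_rng(0)

# Randomly initialised weights; these values are fixed for the whole run.
w = rng.normal(size=(4, 3))

# Stand-in for the weights proposed by an ordinary optimizer (e.g. a few
# Adam steps). Random noise is used here only so the example is
# self-contained.
target = w - 0.1 * rng.normal(size=w.shape)

w_new = permute_to_match(w, target)
```

Under this assumed rule, the largest initial weight is placed where the optimizer's proposal is largest, the second largest where the proposal is second largest, and so on. By the rearrangement inequality, this assignment is the permutation of the original values closest to the target in squared error.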