Brand New K-FACs: Speeding up K-FAC with Online Decomposition Updates

10/16/2022
by Constantin Octavian Puiu, et al.

K-FAC (arXiv:1503.05671, arXiv:1602.01407) is a tractable implementation of Natural Gradient (NG) for Deep Learning (DL), whose bottleneck is computing the inverses of the so-called "Kronecker-Factors" (K-factors). RS-KFAC (arXiv:2206.15397) is a K-FAC improvement which provides a cheap way of estimating the K-factor inverses. In this paper, we exploit the exponential-average construction paradigm of the K-factors and use online numerical linear algebra techniques to propose an even cheaper (but less accurate) way of estimating the K-factor inverses for fully connected layers. Numerical results show that RS-KFAC's inversion error can be reduced with minimal CPU overhead by adding our proposed update to it. Based on the proposed procedure, a correction to it, and RS-KFAC, we propose three practical algorithms for optimizing generic deep neural networks. Numerical results show that two of these outperform RS-KFAC for any target test accuracy on CIFAR10 classification with a slightly modified version of VGG16_bn. Our proposed algorithms achieve 91% test accuracy faster than SENG (the state-of-the-art implementation of empirical NG for DL; arXiv:2006.05924) but underperform it at higher test accuracies.
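The abstract does not spell out the specific online update, so the following is only an illustrative sketch of the general idea it relies on: because the K-factors are built as exponential moving averages, each step perturbs a factor by a low-rank term, and its inverse can be refreshed without a full O(d^3) inversion. The sketch below uses the Woodbury identity for this (the paper itself works with online decomposition updates, which differ in detail); the function name `woodbury_ema_inverse`, the EMA weight `rho`, and the batch shapes are all assumptions made for the example.

```python
import numpy as np

def woodbury_ema_inverse(A_inv, a_batch, rho=0.95):
    """One online refresh of a K-factor inverse under the EMA construction.

    Illustrative sketch only (not the paper's exact algorithm): with
    A_new = rho * A + (1 - rho) * (a_batch.T @ a_batch) / m,
    A_new is a rank-m perturbation of rho * A, so the Woodbury identity
    gives A_new^{-1} in O(d^2 m + m^3) instead of O(d^3).
    """
    m, d = a_batch.shape
    C = A_inv / rho                           # inverse of rho * A
    U = a_batch.T * np.sqrt((1.0 - rho) / m)  # d x m, so A_new = rho*A + U @ U.T
    CU = C @ U                                # d x m
    S = np.eye(m) + U.T @ CU                  # m x m capacitance matrix
    return C - CU @ np.linalg.solve(S, CU.T)

# Usage: track the activation K-factor A and its inverse over a few steps.
d, m = 512, 32
rng = np.random.default_rng(0)
A, A_inv = np.eye(d), np.eye(d)               # EMA K-factor, initialised to identity
for _ in range(10):
    a = rng.standard_normal((m, d))           # layer-input activations for one batch
    A_inv = woodbury_ema_inverse(A_inv, a, rho=0.95)
    A = 0.95 * A + 0.05 * (a.T @ a) / m       # track the factor itself for checking
print(np.max(np.abs(A_inv @ A - np.eye(d)))) # residual should be near machine precision
```

Note that this exact-update sketch ignores damping and regularization, which real K-FAC variants must handle; the paper's cheaper update trades some accuracy for speed, which the exact Woodbury refresh above does not capture.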
