Differentially Private Optimization on Large Model at Small Cost

09/30/2022
by   Zhiqi Bu, et al.

Differentially private (DP) optimization is the standard paradigm for learning large neural networks that are accurate and privacy-preserving. The computational cost of DP deep learning, however, is notoriously heavy due to per-sample gradient clipping. Existing DP implementations are 2-1000× more costly in time and space complexity than standard (non-private) training. In this work, we develop a novel Book-Keeping (BK) technique that implements existing DP optimizers (thus achieving the same accuracy) with a substantial improvement in computational cost. Specifically, BK makes DP training on large models and high-dimensional data roughly as efficient as standard training, whereas previous DP algorithms can be inefficient or fail to train at all due to memory errors. The computational advantage of BK is supported by complexity analysis as well as extensive experiments on vision and language tasks. Our implementation achieves state-of-the-art (SOTA) accuracy at very small extra cost: on GPT2 and at the same memory cost, BK has 1.0× the time complexity of standard training (0.75× training speed in practice) and 0.6× the time complexity of the most efficient prior DP implementation (1.24× training speed in practice). We will open-source the codebase for the BK algorithm.
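
The per-sample gradient clipping that the abstract identifies as the bottleneck is, in its naive form, one backward pass per example. As a point of reference only (this is not the BK algorithm, whose details the abstract does not give), here is a minimal PyTorch-style sketch of that baseline DP-SGD step; the function name dp_sgd_step and its parameters are illustrative.

```python
import torch

def dp_sgd_step(model, loss_fn, batch_x, batch_y,
                clip_norm=1.0, noise_multiplier=1.0, lr=0.1):
    """Naive DP-SGD update: clip each example's gradient, sum, add noise.

    The per-example backward passes below are the source of the 2-1000x
    time/space overhead quoted in the abstract.
    """
    params = [p for p in model.parameters() if p.requires_grad]
    summed = [torch.zeros_like(p) for p in params]

    for x, y in zip(batch_x, batch_y):
        # One backward pass for a single example
        loss = loss_fn(model(x.unsqueeze(0)), y.unsqueeze(0))
        grads = torch.autograd.grad(loss, params)

        # Clip this example's gradient to norm <= clip_norm
        total_norm = torch.sqrt(sum(g.pow(2).sum() for g in grads))
        scale = (clip_norm / (total_norm + 1e-6)).clamp(max=1.0)
        for s, g in zip(summed, grads):
            s += g * scale

    # Add Gaussian noise to the summed clipped gradient and step
    batch_size = len(batch_x)
    with torch.no_grad():
        for p, s in zip(params, summed):
            noise = torch.normal(0.0, noise_multiplier * clip_norm, size=p.shape)
            p -= lr * (s + noise) / batch_size
```

Instantiating per-example gradients in this way is what drives the overhead; per the abstract, BK implements the same clipped, noised update of existing DP optimizers while avoiding most of that cost.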

Related research

Scalable and Efficient Training of Large Convolutional Neural Networks with Differential Privacy (05/21/2022)
Large convolutional neural networks (CNN) can be difficult to train in t...

Differentially Private Bias-Term only Fine-tuning of Foundation Models (09/30/2022)
We study the problem of differentially private (DP) fine-tuning of large...

Optimal Differentially Private Learning with Public Data (06/26/2023)
Differential Privacy (DP) ensures that training a machine learning model...

Automatic Clipping: Differentially Private Deep Learning Made Easier and Stronger (06/14/2022)
Per-example gradient clipping is a key algorithmic step that enables pra...

Large Language Models Can Be Strong Differentially Private Learners (10/12/2021)
Differentially Private (DP) learning has seen limited success for buildi...

DP-SGD Without Clipping: The Lipschitz Neural Network Way (05/25/2023)
State-of-the-art approaches for training Differentially Private (DP) Dee...

Fast Private Kernel Density Estimation via Locality Sensitive Quantization (07/04/2023)
We study efficient mechanisms for differentially private kernel density ...
