Title: On the accuracy and efficiency of group-wise clipping in differentially private optimization

Oct 30, 2023

Zhiqi Bu, Ruixuan Liu, Yu-Xiang Wang, Sheng Zha, George Karypis



Abstract: Recent advances have substantially improved the accuracy, memory cost, and training speed of differentially private (DP) deep learning, especially on large vision and language models with millions to billions of parameters. In this work, we thoroughly study the per-sample gradient clipping style, a key component in DP optimization. We show that different clipping styles have the same time complexity but instantiate an accuracy-memory trade-off: while all-layer clipping (of coarse granularity) is the most prevalent and usually gives the best accuracy, it incurs a heavier memory cost than other group-wise clipping styles, such as layer-wise clipping (of finer granularity). We formalize this trade-off through our convergence theory and complexity analysis. Importantly, we demonstrate that the accuracy gap between group-wise clipping and all-layer clipping becomes smaller for larger models, while the memory advantage of group-wise clipping remains. Consequently, group-wise clipping allows DP optimization of large models to achieve high accuracy and low peak memory simultaneously.
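To make the two clipping styles in the abstract concrete, here is a minimal pure-Python sketch rather than the authors' implementation. It assumes each per-sample gradient is given as a list of parameter groups (for example, one group per layer). The function names, the toy gradients, and the uniform per-group threshold `R / sqrt(M)` for `M` groups are illustrative assumptions, not details stated on this page. The sketch does not model memory cost or the noise that DP optimization adds after clipping.

```python
import math

def l2_norm(values):
    """Euclidean norm of a flat list of floats."""
    return math.sqrt(sum(v * v for v in values))

def clip_all_layer(per_sample_grads, R):
    """All-layer clipping: one norm per sample, computed over ALL groups.

    per_sample_grads[i][g] is the gradient of sample i for parameter group g.
    """
    clipped = []
    for sample in per_sample_grads:
        norm = l2_norm([v for group in sample for v in group])
        factor = min(1.0, R / norm) if norm > 0 else 1.0
        clipped.append([[v * factor for v in group] for group in sample])
    return clipped

def clip_group_wise(per_sample_grads, R):
    """Group-wise clipping: each group is clipped with its own norm.

    Assumption: the total threshold R is split uniformly, R_g = R / sqrt(M),
    so the clipped per-sample gradient still has total norm at most R.
    """
    clipped = []
    for sample in per_sample_grads:
        R_g = R / math.sqrt(len(sample))
        new_sample = []
        for group in sample:
            norm = l2_norm(group)
            factor = min(1.0, R_g / norm) if norm > 0 else 1.0
            new_sample.append([v * factor for v in group])
        clipped.append(new_sample)
    return clipped

def total_norm(sample):
    """Norm of one sample's gradient across all of its groups."""
    return l2_norm([v for group in sample for v in group])

# Hypothetical toy data: 2 samples, 2 parameter groups each.
grads = [
    [[3.0, 4.0], [0.0, 0.1]],  # large gradient, dominated by group 0
    [[0.1, 0.0], [0.2, 0.0]],  # small gradient, already within the threshold
]
R = 1.0
all_layer = clip_all_layer(grads, R)
group_wise = clip_group_wise(grads, R)
```

In this toy case, all-layer clipping rescales the whole first sample by a single factor, so the relative size of its groups is preserved. Group-wise clipping shrinks only the oversized group and leaves the small one untouched, which changes the gradient's direction. That difference is one intuition for the accuracy gap the abstract describes, while clipping each group independently means no group has to wait for the norms of the others.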
