A major direction in differentially private machine learning is differentially private fine-tuning: pretraining a model on a source of "public data" and transferring the extracted features to downstream tasks. This setting is important because many industry deployments fine-tune publicly available feature extractors on proprietary data for downstream tasks. In this paper, we carefully integrate techniques, both new and from prior work, to solve benchmark tasks in computer vision and natural language processing using differentially private fine-tuning. Our key insight is that by accelerating training through the choice of key hyperparameters, we can quickly drive the model parameters to regions of parameter space where the impact of noise is minimized. We obtain new state-of-the-art performance on CIFAR10, CIFAR100, FashionMNIST, STL10, and PersonaChat, including $99\%$ accuracy on CIFAR10 under $(\varepsilon=1, \delta=10^{-5})$-DP.
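For concreteness, differentially private fine-tuning is typically carried out with DP-SGD: each example's gradient is clipped to a fixed L2 norm and Gaussian noise scaled to that clipping bound is added before the parameter update. The sketch below illustrates one such update for a linear head on frozen features; the data, dimensions, and hyperparameter values are hypothetical, and in practice the noise multiplier would be calibrated to the target $(\varepsilon, \delta)$ with a privacy accountant.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical frozen features from a public pretrained extractor (toy data)
n, d, classes = 64, 16, 3
X = rng.normal(size=(n, d))
y = rng.integers(0, classes, size=n)

W = np.zeros((d, classes))   # linear head being privately fine-tuned
clip_norm = 1.0              # per-example clipping threshold C
noise_mult = 1.1             # sigma; would be set by a privacy accountant
lr = 0.5

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

for step in range(50):
    probs = softmax(X @ W)                 # (n, classes)
    probs[np.arange(n), y] -= 1.0          # dLoss/dlogits, per example
    # Per-example gradients of the linear head: shape (n, d, classes)
    grads = X[:, :, None] * probs[:, None, :]
    # Clip each example's gradient to L2 norm <= clip_norm
    norms = np.sqrt((grads ** 2).sum(axis=(1, 2), keepdims=True))
    grads = grads / np.maximum(1.0, norms / clip_norm)
    # Sum clipped gradients, add Gaussian noise scaled to C, then average
    noisy = grads.sum(axis=0) + rng.normal(
        scale=noise_mult * clip_norm, size=W.shape)
    W -= lr * noisy / n

acc = (softmax(X @ W).argmax(axis=1) == y).mean()
```

The clipping step bounds each example's contribution, which is what makes the Gaussian noise sufficient for a formal DP guarantee; faster convergence (e.g., via large learning rates or momentum, as in the hyperparameter choices discussed above) means fewer noisy steps and hence less accumulated noise.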