A New Linear Scaling Rule for Private Adaptive Hyperparameter Optimization
Main authors: | , , , , |
Format: | Article |
Language: | English |
Subjects: |  |
Online access: | Order full text |
Abstract: | An open problem in differentially private deep learning is hyperparameter
optimization (HPO). DP-SGD introduces new hyperparameters and complicates existing
ones, forcing researchers to painstakingly tune hyperparameters with hundreds of
trials, which in turn makes it impossible to account for the privacy cost of HPO
without destroying the utility. We propose an adaptive HPO method that uses cheap
trials (in terms of privacy cost and runtime) to estimate optimal hyperparameters
and scales them up. We obtain state-of-the-art performance on 22 benchmark tasks,
across computer vision and natural language processing, across pretraining and
finetuning, across architectures and a wide range of $\varepsilon \in [0.01, 8.0]$,
all while accounting for the privacy cost of HPO. |
DOI: | 10.48550/arxiv.2212.04486 |
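
The abstract describes the method only at a high level. As a rough illustration of the "cheap trials, then scale up" pattern it refers to, the sketch below runs a handful of short, low-epsilon DP-SGD trials and extrapolates the winning hyperparameters to a full run. The trainer `train_and_eval`, the hyperparameter grids, and the linear scale-up factor are all assumptions made for illustration; they are not taken from the paper.

```python
# Minimal sketch of a trial-then-scale private HPO loop.
# `train_and_eval` is a hypothetical user-supplied DP-SGD trainer; the linear
# scale-up applied at the end is an illustrative placeholder, not the specific
# scaling rule derived in the paper.
import itertools
import random
from typing import Callable, List


def cheap_trial_search(
    train_and_eval: Callable[[float, int, int, float], float],
    lr_grid: List[float],
    batch_grid: List[int],
    small_epochs: int = 1,
    trial_epsilon: float = 0.1,
    num_trials: int = 10,
) -> dict:
    """Spend a small privacy/runtime budget per trial to locate good hyperparameters."""
    candidates = list(itertools.product(lr_grid, batch_grid))
    random.shuffle(candidates)
    best = {"lr": lr_grid[0], "batch_size": batch_grid[0], "val_acc": float("-inf")}
    for lr, batch_size in candidates[:num_trials]:
        # Each trial is deliberately cheap: few epochs and a small epsilon slice.
        val_acc = train_and_eval(lr, batch_size, small_epochs, trial_epsilon)
        if val_acc > best["val_acc"]:
            best = {"lr": lr, "batch_size": batch_size, "val_acc": val_acc}
    return best


def scale_up(best: dict, small_epochs: int, full_epochs: int) -> dict:
    """Extrapolate the cheap-trial optimum to the full training run.

    The linear factor below is only a stand-in for the paper's scaling rule.
    """
    factor = full_epochs / small_epochs
    return {"lr": best["lr"] * factor, "batch_size": best["batch_size"]}
```

Under basic sequential composition, the search spends roughly `num_trials * trial_epsilon` of privacy budget on top of the final run's budget, which is the kind of end-to-end accounting of HPO cost the abstract refers to.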