On the Convergence and Calibration of Deep Learning with Differential Privacy
Main Authors:
Format: Article
Language: English
Subjects:
Online Access: Order full text
Abstract: Differentially private (DP) training usually preserves data privacy
at the cost of slower convergence (and thus lower accuracy), as well as more
severe miscalibration, than its non-private counterpart. To analyze the
convergence of DP training, we formulate a continuous-time analysis through the
lens of the neural tangent kernel (NTK), which characterizes the per-sample
gradient clipping and the noise addition in DP training for arbitrary network
architectures and loss functions. Interestingly, we show that the noise
addition only affects the privacy risk but not the convergence or calibration,
whereas the per-sample gradient clipping (under both flat and layerwise
clipping styles) only affects the convergence and calibration. Furthermore, we
observe that DP models trained with a small clipping norm usually achieve the
best accuracy but are poorly calibrated and thus unreliable. In sharp contrast,
DP models trained with a large clipping norm enjoy the same privacy guarantee
and similar accuracy, but are significantly more calibrated. Our code can be
found at https://github.com/woodyx218/opacus_global_clipping.
DOI: 10.48550/arxiv.2106.07830
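
The abstract describes DP training as per-sample gradient clipping followed by
Gaussian noise addition, under either a flat or a layerwise clipping style.
The sketch below illustrates one such update step in PyTorch. It is a minimal
sketch, not the paper's or the repository's implementation: it assumes
per-sample gradients have already been materialized (e.g., by Opacus hooks),
and the function name dp_sgd_step, its arguments, and the per-layer use of the
full threshold under layerwise clipping are illustrative assumptions
(implementations differ in how they split the threshold across layers).

import torch

def dp_sgd_step(params, per_sample_grads, clip_norm, noise_multiplier, lr,
                layerwise=False):
    """One illustrative DP-SGD update (hypothetical helper, not from the repo).

    params           : list of parameter tensors
    per_sample_grads : list of tensors, each of shape (batch, *param.shape)
    clip_norm        : clipping threshold R
    noise_multiplier : sigma; the added noise has standard deviation sigma * R
    layerwise        : flat clipping (one global norm) vs. layerwise clipping
    """
    batch_size = per_sample_grads[0].shape[0]
    if layerwise:
        # Layerwise clipping: clip each layer's per-sample gradient norm
        # separately (here each layer uses the full threshold, an assumption).
        clipped = []
        for g in per_sample_grads:
            norms = g.flatten(1).norm(dim=1)                      # (batch,)
            scale = (clip_norm / (norms + 1e-6)).clamp(max=1.0)
            clipped.append(g * scale.view(-1, *([1] * (g.dim() - 1))))
    else:
        # Flat clipping: one norm over the concatenation of all layers,
        # so every layer of a sample is scaled by the same factor.
        flat = torch.cat([g.flatten(1) for g in per_sample_grads], dim=1)
        norms = flat.norm(dim=1)                                  # (batch,)
        scale = (clip_norm / (norms + 1e-6)).clamp(max=1.0)
        clipped = [g * scale.view(-1, *([1] * (g.dim() - 1)))
                   for g in per_sample_grads]
    for p, g in zip(params, clipped):
        # Sum the clipped per-sample gradients, add Gaussian noise calibrated
        # to the clipping norm, then average and take a gradient step.
        noisy = g.sum(dim=0) + noise_multiplier * clip_norm * torch.randn_like(p)
        p.data -= lr * noisy / batch_size

The abstract's decomposition is visible in the last loop: the per-sample scale
factor (clipping) changes the effective gradient and hence convergence and
calibration, while the additive Gaussian term is independent of the data and
enters only the privacy accounting.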