Recycling Scraps: Improving Private Learning by Leveraging Intermediate Checkpoints
Main authors: | , , , , , , , |
---|---|
Format: | Article |
Language: | eng |
Subjects: | |
Summary: | In this work, we focus on improving the accuracy-variance trade-off for
state-of-the-art differentially private machine learning (DP ML) methods.
First, we design a general framework that uses aggregates of intermediate
checkpoints \emph{during training} to increase the accuracy of DP ML
techniques. Specifically, we demonstrate that training over aggregates can
provide significant gains in prediction accuracy over the existing
state-of-the-art for StackOverflow, CIFAR10 and CIFAR100 datasets. For
instance, we improve the state-of-the-art DP StackOverflow accuracies to
22.74\% (+2.06\% relative) for $\epsilon=8.2$, and 23.90\% (+2.09\%) for
$\epsilon=18.9$. Furthermore, these gains grow in settings with periodically
varying training data distributions. We also demonstrate that our methods
achieve relative improvements of 0.54\% in utility and 62.6\% in variance on a
proprietary, production-grade pCVR task. Lastly, we initiate an
exploration into estimating the uncertainty (variance) that DP noise adds in
the predictions of DP ML models. We prove that, under standard assumptions on
the loss function, the sample variance from the last few checkpoints provides a
good approximation of the variance of the final model of a DP run. Empirically,
we show that the last few checkpoints can provide a reasonable lower bound for
the variance of a converged DP model. Crucially, all the methods proposed in
this paper operate on \emph{a single training run} of the DP ML technique, thus
incurring no additional privacy cost. |
---|---|
DOI: | 10.48550/arxiv.2210.01864 |
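
The summary describes two ideas that a small example may make concrete: aggregating intermediate checkpoints from a single training run, and using the sample variance across the last few checkpoints as a variance estimate. The sketch below is illustrative only. The record does not say which aggregate the authors use, so a uniform average over the last `k` checkpoints is assumed here; the names `tail_average`, `predict`, and `tail_prediction_variance`, the flat parameter lists, and the toy linear model are all hypothetical stand-ins, not the paper's implementation.

```python
from statistics import fmean, variance
from typing import Sequence

# Assumption: each checkpoint is a flattened list of model parameters
# saved during one DP training run (hypothetical representation).
Params = Sequence[float]


def tail_average(checkpoints: Sequence[Params], k: int) -> list[float]:
    """Uniformly average the last k checkpoints, coordinate by coordinate.

    This is post-processing of values the DP run already produced,
    so it needs no further access to the training data.
    """
    if not 1 <= k <= len(checkpoints):
        raise ValueError("k must be between 1 and the number of checkpoints")
    tail = checkpoints[-k:]
    return [fmean(coords) for coords in zip(*tail)]


def predict(params: Params, x: Sequence[float]) -> float:
    """Toy linear model standing in for the real network (assumption)."""
    return sum(w * xi for w, xi in zip(params, x))


def tail_prediction_variance(
    checkpoints: Sequence[Params], x: Sequence[float], k: int
) -> float:
    """Sample variance of the predictions made by the last k checkpoints."""
    if not 2 <= k <= len(checkpoints):
        raise ValueError("k must be between 2 and the number of checkpoints")
    preds = [predict(p, x) for p in checkpoints[-k:]]
    return variance(preds)  # unbiased sample variance (divides by k - 1)


if __name__ == "__main__":
    ckpts = [[0.0, 0.0], [1.0, 2.0], [3.0, 4.0]]
    print(tail_average(ckpts, k=2))
    print(tail_prediction_variance(ckpts, x=[1.0, 1.0], k=2))
```

Both functions read only checkpoints that the training run has already written, which is consistent with the summary's point that the methods operate on a single training run and incur no additional privacy cost.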