Stability is Stable: Connections between Replicability, Privacy, and Adaptive Generalization
The notion of replicable algorithms was introduced in Impagliazzo et al. [STOC '22] to describe randomized algorithms that are stable under the resampling of their inputs. More precisely, a replicable algorithm gives the same output with high probability when its randomness is fixed and it is r...
Gespeichert in:
Hauptverfasser: | , , , , , , , |
---|---|
Format: | Artikel |
Sprache: | eng |
Schlagworte: | |
Online-Zugang: | Volltext bestellen |
Tags: |
Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
|
Zusammenfassung: | The notion of replicable algorithms was introduced in Impagliazzo et al.
[STOC '22] to describe randomized algorithms that are stable under the
resampling of their inputs. More precisely, a replicable algorithm gives the
same output with high probability when its randomness is fixed and it is run on
a new i.i.d. sample drawn from the same distribution. Using replicable
algorithms for data analysis can facilitate the verification of published
results by ensuring that the results of an analysis will be the same with high
probability, even when that analysis is performed on a new data set.
In this work, we establish new connections and separations between
replicability and standard notions of algorithmic stability. In particular, we
give sample-efficient algorithmic reductions between perfect generalization,
approximate differential privacy, and replicability for a broad class of
statistical problems. Conversely, we show any such equivalence must break down
computationally: there exist statistical problems that are easy under
differential privacy, but that cannot be solved replicably without breaking
public-key cryptography. Furthermore, these results are tight: our reductions
are statistically optimal, and we show that any computational separation
between DP and replicability must imply the existence of one-way functions.
Our statistical reductions give a new algorithmic framework for translating
between notions of stability, which we instantiate to answer several open
questions in replicability and privacy. This includes giving sample-efficient
replicable algorithms for various PAC learning, distribution estimation, and
distribution testing problems, algorithmic amplification of $\delta$ in
approximate DP, conversions from item-level to user-level privacy, and the
existence of private agnostic-to-realizable learning reductions under
structured distributions. |
---|---|
DOI: | 10.48550/arxiv.2303.12921 |