Sample-efficient private data release for Lipschitz functions under sparsity assumptions

Differential privacy is the de facto standard for protecting privacy in a variety of applications. One of the key challenges is private data release, which is particularly relevant in scenarios where limited information about the desired statistics is available beforehand. Recent work has presented...

Ausführliche Beschreibung

Gespeichert in:
Bibliographische Detailangaben
Veröffentlicht in:arXiv.org 2023-02
Hauptverfasser: Donhauser, Konstantin, Lokna, Johan, Sanyal, Amartya, March Boedihardjo, Hönig, Robert, Yang, Fanny
Format: Artikel
Sprache:eng
Schlagworte:
Online-Zugang:Volltext
Tags: Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
Beschreibung
Zusammenfassung:Differential privacy is the de facto standard for protecting privacy in a variety of applications. One of the key challenges is private data release, which is particularly relevant in scenarios where limited information about the desired statistics is available beforehand. Recent work has presented a differentially private data release algorithm that achieves optimal rates of order \(n^{-1/d}\), with \(n\) being the size of the dataset and \(d\) being the dimension, for the worst-case error over all Lipschitz continuous statistics. This type of guarantee is desirable in many practical applications, as for instance it ensures that clusters present in the data are preserved. However, due to the "slow" rates, it is often infeasible in practice unless the dimension of the data is small. We demonstrate that these rates can be significantly improved to \(n^{-1/s}\) when only guarantees over s-sparse Lipschitz continuous functions are required, or to \(n^{-1/(s+1)}\) when the data lies on an unknown s-dimensional subspace, disregarding logarithmic factors. We therefore obtain practically meaningful rates for moderate constants \(s\) which motivates future work on computationally efficient approximate algorithms for this~problem.
ISSN:2331-8422