Optimal Disclosure Limitation Strategy in Statistical Databases: Deterring Tracker Attacks through Additive Noise

Disclosure limitation methods transform statistical databases to protect confidentiality, a practical concern of statistical agencies. A statistical database responds to queries with aggregate statistics. The database administrator should maximize legitimate data access while keeping the risk of dis...

Ausführliche Beschreibung

Gespeichert in:
Bibliographische Detailangaben
Veröffentlicht in:Journal of the American Statistical Association 2000-09, Vol.95 (451), p.720-729
Hauptverfasser: Duncan, George T., Mukherjee, Sumitra
Format: Artikel
Sprache:eng
Schlagworte:
Online-Zugang:Volltext
Tags: Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
Beschreibung
Zusammenfassung:Disclosure limitation methods transform statistical databases to protect confidentiality, a practical concern of statistical agencies. A statistical database responds to queries with aggregate statistics. The database administrator should maximize legitimate data access while keeping the risk of disclosure below an acceptable level. Legitimate users seek statistical information, generally in aggregate form; malicious users-the data snoopers-attempt to infer confidential information about an individual data subject. Tracker attacks are of special concern for databases accessed online. This article derives optimal disclosure limitation strategies under tracker attacks for the important case of data masking through additive noise. Operational measures of the utility of data access and of disclosure risk are developed. The utility of data access is expressed so that trade-offs can be made between the quantity and the quality of data to be released. Application is made to Ohio data from the 1990 census. The article derives conditions under which an attack by a data snooper is better thwarted by a combination of query restriction and data masking than by either disclosure limitation method separately. Data masking by independent noise addition and data perturbation are considered as extreme cases in the continuum of data masking using positively correlated additive noise. Optimal strategies are established for the data snooper. Circumstances are determined under which adding autocorrelated noise is preferable to using existing methods of either independent noise addition or data perturbation. Both moving average and autoregressive noise addition are considered.
ISSN:0162-1459
1537-274X
DOI:10.1080/01621459.2000.10474260