PrivTDSI: A Local Differentially Private Approach for Truth Discovery via Sampling and Inference
Published in: | IEEE Transactions on Big Data 2023-04, Vol.9 (2), p.471-484 |
---|---|
Main authors: | , , , |
Format: | Article |
Language: | eng |
Abstract: | Truth discovery identifies the aggregated truth for each task from observations supplied by multiple workers of varying reliability. Existing studies do not adequately protect individuals' privacy: they either guarantee only weaker variants of local differential privacy (LDP) or implicitly assume that the tasks are independent. This paper is the first to investigate truth discovery under rigorous LDP for each worker with continuous inputs and without the independence assumption. We present PrivTDSI, a locally differentially private truth discovery approach based on sampling and inference, with solid privacy and utility guarantees. In PrivTDSI, the server first determines, according to a sample proportion, which values of each worker should be sampled and sends the indexes of these values to each worker. Each worker then adds noise to the sampled values for privacy protection and uploads them to the server. After receiving the noisy sampled values from all workers, the server infers the unsampled values and then conducts truth discovery on both the noisy sampled values and the inferred values. To determine the sample proportion, we formulate a constrained nonlinear programming problem and give a closed-form solution. To decide which values of each worker to sample while ensuring that no worker or task is left entirely unsampled, we develop a two-stage sampling method called TOSS. To infer the unsampled values accurately, we design a quality-aware inference method based on matrix factorization called QualityMF. Experimental results on two real-world datasets and a synthetic dataset demonstrate the effectiveness of PrivTDSI. |
ISSN: | 2332-7790, 2372-2096 |
DOI: | 10.1109/TBDATA.2022.3186175 |
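
The sample, perturb, infer, and aggregate pipeline described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, and every name in it is hypothetical. The assumptions are: values are bounded in a known range `[lo, hi]`; each worker splits its privacy budget `epsilon` evenly over its sampled values and applies the Laplace mechanism; `two_stage_sample` only mimics the coverage goal stated for TOSS; a per-task mean stands in for QualityMF; and the final step uses a generic iterative weighted-averaging scheme. The sample proportion is passed in directly rather than derived from the closed-form solution mentioned in the abstract.

```python
import numpy as np


def two_stage_sample(n_workers, n_tasks, proportion, rng):
    """Boolean mask with at least one sampled entry per worker and per task."""
    mask = np.zeros((n_workers, n_tasks), dtype=bool)
    # Stage 1 (coverage): one random task per worker, then one random
    # worker for every task that is still uncovered.
    mask[np.arange(n_workers), rng.integers(n_tasks, size=n_workers)] = True
    for t in np.flatnonzero(~mask.any(axis=0)):
        mask[rng.integers(n_workers), t] = True
    # Stage 2 (fill): add uniformly random entries up to the target count.
    target = int(round(proportion * n_workers * n_tasks))
    extra = target - int(mask.sum())
    if extra > 0:
        free = np.flatnonzero(~mask.ravel())
        mask.flat[rng.choice(free, size=extra, replace=False)] = True
    return mask


def perturb(values, mask, epsilon, lo, hi, rng):
    """Worker side: Laplace noise on sampled values, NaN elsewhere.

    Assumption: a worker with k sampled values spends epsilon / k on each,
    so the Laplace scale is (hi - lo) * k / epsilon.
    """
    noisy = np.full(values.shape, np.nan)
    for w in range(values.shape[0]):
        idx = np.flatnonzero(mask[w])
        scale = (hi - lo) * len(idx) / epsilon
        clipped = np.clip(values[w, idx], lo, hi)
        noisy[w, idx] = clipped + rng.laplace(0.0, scale, size=len(idx))
    return noisy


def infer_missing(noisy, mask):
    """Server side: fill unsampled entries with the per-task mean.

    This is a simple stand-in, not the matrix-factorization method QualityMF.
    """
    task_mean = np.nanmean(noisy, axis=0)  # every task has >= 1 sample
    return np.where(mask, noisy, task_mean)


def truth_discovery(obs, iters=10):
    """Alternate between weighted-mean truths and loss-based worker weights."""
    weights = np.ones(obs.shape[0])
    for _ in range(iters):
        truths = weights @ obs / weights.sum()
        loss = ((obs - truths) ** 2).sum(axis=1) + 1e-12
        weights = -np.log(loss / loss.sum())  # lower loss -> higher weight
    return truths, weights


rng = np.random.default_rng(0)
values = rng.uniform(0.0, 1.0, size=(20, 15))   # synthetic worker reports
mask = two_stage_sample(20, 15, proportion=0.4, rng=rng)
noisy = perturb(values, mask, epsilon=1.0, lo=0.0, hi=1.0, rng=rng)
truths, weights = truth_discovery(infer_missing(noisy, mask))
```

Because each worker's budget is divided among its sampled values, a smaller sample proportion means less noise per reported value but more entries left to inference; that trade-off is what the sample proportion controls in the approach described above.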