Differentially Private Heavy Hitter Detection using Federated Analytics
In this work, we study practical heuristics to improve the performance of prefix-tree based algorithms for differentially private heavy hitter detection. Our model assumes each user has multiple data points and the goal is to learn as many of the most frequent data points as possible across all user...
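Main authors: | Chadha, Karan; Chen, Junye; Duchi, John; Feldman, Vitaly; Hashemi, Hanieh; Javidbakht, Omid; McMillan, Audra; Talwar, Kunal |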
---|---|
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Order full text |
container_end_page | |
---|---|
container_issue | |
container_start_page | |
container_title | |
container_volume | |
creator | Chadha, Karan; Chen, Junye; Duchi, John; Feldman, Vitaly; Hashemi, Hanieh; Javidbakht, Omid; McMillan, Audra; Talwar, Kunal |
description | In this work, we study practical heuristics to improve the performance of
prefix-tree-based algorithms for differentially private heavy hitter detection.
Our model assumes each user has multiple data points, and the goal is to learn
as many of the most frequent data points as possible across all users' data
with aggregate and local differential privacy. We propose an adaptive
hyperparameter tuning algorithm that improves the performance of the algorithm
while satisfying computational, communication, and privacy constraints. We
explore the impact of different data-selection schemes as well as the impact of
introducing deny lists during multiple runs of the algorithm. We test these
improvements using extensive experimentation on the Reddit
dataset \cite{caldas2018leaf} on the task of learning the most frequent words. |
doi_str_mv | 10.48550/arxiv.2307.11749 |
format | Article |
fullrecord | <record><control><sourceid>arxiv_GOX</sourceid><recordid>TN_cdi_arxiv_primary_2307_11749</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>2307_11749</sourcerecordid><originalsourceid>FETCH-LOGICAL-a679-c66790fd0589176b31ee06aedd137959401d0bbfe2bcf1c68f051573dc8a92fb3</originalsourceid><addsrcrecordid>eNotz71uwyAUBWCWDlXaB-gUXsAuGANmjPLnSpGSIbt1gUuE5LgVplb89k3TLucsR0f6CHnjrKwbKdk7pFucykowXXKua_NM9psYAiYccoS-n-kpxQky0hZhmmkbc8ZEN5jR5fg50O8xDhe6Q4_pvvJ0NUA_5-jGF_IUoB_x9b8X5LzbntdtcTjuP9arQwFKm8Kpe7LgmWwM18oKjsgUoPdcaCNNzbhn1gasrAvcqSYwyaUW3jVgqmDFgiz_bh-S7ivFK6S5-xV1D5H4AYorRtg</addsrcrecordid><sourcetype>Open Access Repository</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype></control><display><type>article</type><title>Differentially Private Heavy Hitter Detection using Federated Analytics</title><source>arXiv.org</source><creator>Chadha, Karan ; Chen, Junye ; Duchi, John ; Feldman, Vitaly ; Hashemi, Hanieh ; Javidbakht, Omid ; McMillan, Audra ; Talwar, Kunal</creator><creatorcontrib>Chadha, Karan ; Chen, Junye ; Duchi, John ; Feldman, Vitaly ; Hashemi, Hanieh ; Javidbakht, Omid ; McMillan, Audra ; Talwar, Kunal</creatorcontrib><description>In this work, we study practical heuristics to improve the performance of
prefix-tree based algorithms for differentially private heavy hitter detection.
Our model assumes each user has multiple data points and the goal is to learn
as many of the most frequent data points as possible across all users' data
with aggregate and local differential privacy. We propose an adaptive
hyperparameter tuning algorithm that improves the performance of the algorithm
while satisfying computational, communication and privacy constraints. We
explore the impact of different data-selection schemes as well as the impact of
introducing deny lists during multiple runs of the algorithm. We test these
improvements using extensive experimentation on the Reddit
dataset~\cite{caldas2018leaf} on the task of learning the most frequent words.</description><identifier>DOI: 10.48550/arxiv.2307.11749</identifier><language>eng</language><subject>Computer Science - Cryptography and Security ; Computer Science - Learning</subject><creationdate>2023-07</creationdate><rights>http://arxiv.org/licenses/nonexclusive-distrib/1.0</rights><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><link.rule.ids>228,230,776,881</link.rule.ids><linktorsrc>$$Uhttps://arxiv.org/abs/2307.11749$$EView_record_in_Cornell_University$$FView_record_in_$$GCornell_University$$Hfree_for_read</linktorsrc><backlink>$$Uhttps://doi.org/10.48550/arXiv.2307.11749$$DView paper in arXiv$$Hfree_for_read</backlink></links><search><creatorcontrib>Chadha, Karan</creatorcontrib><creatorcontrib>Chen, Junye</creatorcontrib><creatorcontrib>Duchi, John</creatorcontrib><creatorcontrib>Feldman, Vitaly</creatorcontrib><creatorcontrib>Hashemi, Hanieh</creatorcontrib><creatorcontrib>Javidbakht, Omid</creatorcontrib><creatorcontrib>McMillan, Audra</creatorcontrib><creatorcontrib>Talwar, Kunal</creatorcontrib><title>Differentially Private Heavy Hitter Detection using Federated Analytics</title><description>In this work, we study practical heuristics to improve the performance of
prefix-tree based algorithms for differentially private heavy hitter detection.
Our model assumes each user has multiple data points and the goal is to learn
as many of the most frequent data points as possible across all users' data
with aggregate and local differential privacy. We propose an adaptive
hyperparameter tuning algorithm that improves the performance of the algorithm
while satisfying computational, communication and privacy constraints. We
explore the impact of different data-selection schemes as well as the impact of
introducing deny lists during multiple runs of the algorithm. We test these
improvements using extensive experimentation on the Reddit
dataset~\cite{caldas2018leaf} on the task of learning the most frequent words.</description><subject>Computer Science - Cryptography and Security</subject><subject>Computer Science - Learning</subject><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2023</creationdate><recordtype>article</recordtype><sourceid>GOX</sourceid><recordid>eNotz71uwyAUBWCWDlXaB-gUXsAuGANmjPLnSpGSIbt1gUuE5LgVplb89k3TLucsR0f6CHnjrKwbKdk7pFucykowXXKua_NM9psYAiYccoS-n-kpxQky0hZhmmkbc8ZEN5jR5fg50O8xDhe6Q4_pvvJ0NUA_5-jGF_IUoB_x9b8X5LzbntdtcTjuP9arQwFKm8Kpe7LgmWwM18oKjsgUoPdcaCNNzbhn1gasrAvcqSYwyaUW3jVgqmDFgiz_bh-S7ivFK6S5-xV1D5H4AYorRtg</recordid><startdate>20230721</startdate><enddate>20230721</enddate><creator>Chadha, Karan</creator><creator>Chen, Junye</creator><creator>Duchi, John</creator><creator>Feldman, Vitaly</creator><creator>Hashemi, Hanieh</creator><creator>Javidbakht, Omid</creator><creator>McMillan, Audra</creator><creator>Talwar, Kunal</creator><scope>AKY</scope><scope>GOX</scope></search><sort><creationdate>20230721</creationdate><title>Differentially Private Heavy Hitter Detection using Federated Analytics</title><author>Chadha, Karan ; Chen, Junye ; Duchi, John ; Feldman, Vitaly ; Hashemi, Hanieh ; Javidbakht, Omid ; McMillan, Audra ; Talwar, Kunal</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-a679-c66790fd0589176b31ee06aedd137959401d0bbfe2bcf1c68f051573dc8a92fb3</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2023</creationdate><topic>Computer Science - Cryptography and Security</topic><topic>Computer Science - Learning</topic><toplevel>online_resources</toplevel><creatorcontrib>Chadha, Karan</creatorcontrib><creatorcontrib>Chen, Junye</creatorcontrib><creatorcontrib>Duchi, John</creatorcontrib><creatorcontrib>Feldman, Vitaly</creatorcontrib><creatorcontrib>Hashemi, Hanieh</creatorcontrib><creatorcontrib>Javidbakht, Omid</creatorcontrib><creatorcontrib>McMillan, 
Audra</creatorcontrib><creatorcontrib>Talwar, Kunal</creatorcontrib><collection>arXiv Computer Science</collection><collection>arXiv.org</collection></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>Chadha, Karan</au><au>Chen, Junye</au><au>Duchi, John</au><au>Feldman, Vitaly</au><au>Hashemi, Hanieh</au><au>Javidbakht, Omid</au><au>McMillan, Audra</au><au>Talwar, Kunal</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Differentially Private Heavy Hitter Detection using Federated Analytics</atitle><date>2023-07-21</date><risdate>2023</risdate><abstract>In this work, we study practical heuristics to improve the performance of
prefix-tree based algorithms for differentially private heavy hitter detection.
Our model assumes each user has multiple data points and the goal is to learn
as many of the most frequent data points as possible across all users' data
with aggregate and local differential privacy. We propose an adaptive
hyperparameter tuning algorithm that improves the performance of the algorithm
while satisfying computational, communication and privacy constraints. We
explore the impact of different data-selection schemes as well as the impact of
introducing deny lists during multiple runs of the algorithm. We test these
improvements using extensive experimentation on the Reddit
dataset~\cite{caldas2018leaf} on the task of learning the most frequent words.</abstract><doi>10.48550/arxiv.2307.11749</doi><oa>free_for_read</oa></addata></record> |
fulltext | fulltext_linktorsrc |
identifier | DOI: 10.48550/arxiv.2307.11749 |
ispartof | |
issn | |
language | eng |
recordid | cdi_arxiv_primary_2307_11749 |
source | arXiv.org |
subjects | Computer Science - Cryptography and Security; Computer Science - Learning |
title | Differentially Private Heavy Hitter Detection using Federated Analytics |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-04T17%3A53%3A19IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-arxiv_GOX&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Differentially%20Private%20Heavy%20Hitter%20Detection%20using%20Federated%20Analytics&rft.au=Chadha,%20Karan&rft.date=2023-07-21&rft_id=info:doi/10.48550/arxiv.2307.11749&rft_dat=%3Carxiv_GOX%3E2307_11749%3C/arxiv_GOX%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rfr_iscdi=true |