Differentially Private Heavy Hitter Detection using Federated Analytics

In this work, we study practical heuristics to improve the performance of prefix-tree-based algorithms for differentially private heavy hitter detection. Our model assumes each user has multiple data points and the goal is to learn as many of the most frequent data points as possible across all user...

Bibliographic Details
Main authors:	Chadha, Karan; Chen, Junye; Duchi, John; Feldman, Vitaly; Hashemi, Hanieh; Javidbakht, Omid; McMillan, Audra; Talwar, Kunal
Format:	Article
Language:	eng
Subjects:	Computer Science - Cryptography and Security; Computer Science - Learning
Online access:	Order full text

container_end_page
container_issue
container_start_page
container_title
container_volume
creator	Chadha, Karan; Chen, Junye; Duchi, John; Feldman, Vitaly; Hashemi, Hanieh; Javidbakht, Omid; McMillan, Audra; Talwar, Kunal
description	In this work, we study practical heuristics to improve the performance of prefix-tree-based algorithms for differentially private heavy hitter detection. Our model assumes each user has multiple data points and the goal is to learn as many of the most frequent data points as possible across all users' data with aggregate and local differential privacy. We propose an adaptive hyperparameter tuning algorithm that improves the performance of the algorithm while satisfying computational, communication and privacy constraints. We explore the impact of different data-selection schemes as well as the impact of introducing deny lists during multiple runs of the algorithm. We test these improvements using extensive experimentation on the Reddit dataset (Caldas et al., 2018) on the task of learning the most frequent words.
doi_str_mv	10.48550/arxiv.2307.11749
format	Article
fullrecord	<record><control><sourceid>arxiv_GOX</sourceid><recordid>TN_cdi_arxiv_primary_2307_11749</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>2307_11749</sourcerecordid><originalsourceid>FETCH-LOGICAL-a679-c66790fd0589176b31ee06aedd137959401d0bbfe2bcf1c68f051573dc8a92fb3</originalsourceid><addsrcrecordid>eNotz71uwyAUBWCWDlXaB-gUXsAuGANmjPLnSpGSIbt1gUuE5LgVplb89k3TLucsR0f6CHnjrKwbKdk7pFucykowXXKua_NM9psYAiYccoS-n-kpxQky0hZhmmkbc8ZEN5jR5fg50O8xDhe6Q4_pvvJ0NUA_5-jGF_IUoB_x9b8X5LzbntdtcTjuP9arQwFKm8Kpe7LgmWwM18oKjsgUoPdcaCNNzbhn1gasrAvcqSYwyaUW3jVgqmDFgiz_bh-S7ivFK6S5-xV1D5H4AYorRtg</addsrcrecordid><sourcetype>Open Access Repository</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype></control><display><type>article</type><title>Differentially Private Heavy Hitter Detection using Federated Analytics</title><source>arXiv.org</source><creator>Chadha, Karan ; Chen, Junye ; Duchi, John ; Feldman, Vitaly ; Hashemi, Hanieh ; Javidbakht, Omid ; McMillan, Audra ; Talwar, Kunal</creator><creatorcontrib>Chadha, Karan ; Chen, Junye ; Duchi, John ; Feldman, Vitaly ; Hashemi, Hanieh ; Javidbakht, Omid ; McMillan, Audra ; Talwar, Kunal</creatorcontrib><description>In this work, we study practical heuristics to improve the performance of prefix-tree based algorithms for differentially private heavy hitter detection. Our model assumes each user has multiple data points and the goal is to learn as many of the most frequent data points as possible across all users' data with aggregate and local differential privacy. We propose an adaptive hyperparameter tuning algorithm that improves the performance of the algorithm while satisfying computational, communication and privacy constraints. We explore the impact of different data-selection schemes as well as the impact of introducing deny lists during multiple runs of the algorithm. 
We test these improvements using extensive experimentation on the Reddit dataset~\cite{caldas2018leaf} on the task of learning the most frequent words.</description><identifier>DOI: 10.48550/arxiv.2307.11749</identifier><language>eng</language><subject>Computer Science - Cryptography and Security ; Computer Science - Learning</subject><creationdate>2023-07</creationdate><rights>http://arxiv.org/licenses/nonexclusive-distrib/1.0</rights><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><link.rule.ids>228,230,776,881</link.rule.ids><linktorsrc>$$Uhttps://arxiv.org/abs/2307.11749$$EView_record_in_Cornell_University$$FView_record_in_$$GCornell_University$$Hfree_for_read</linktorsrc><backlink>$$Uhttps://doi.org/10.48550/arXiv.2307.11749$$DView paper in arXiv$$Hfree_for_read</backlink></links><search><creatorcontrib>Chadha, Karan</creatorcontrib><creatorcontrib>Chen, Junye</creatorcontrib><creatorcontrib>Duchi, John</creatorcontrib><creatorcontrib>Feldman, Vitaly</creatorcontrib><creatorcontrib>Hashemi, Hanieh</creatorcontrib><creatorcontrib>Javidbakht, Omid</creatorcontrib><creatorcontrib>McMillan, Audra</creatorcontrib><creatorcontrib>Talwar, Kunal</creatorcontrib><title>Differentially Private Heavy Hitter Detection using Federated Analytics</title><description>In this work, we study practical heuristics to improve the performance of prefix-tree based algorithms for differentially private heavy hitter detection. Our model assumes each user has multiple data points and the goal is to learn as many of the most frequent data points as possible across all users' data with aggregate and local differential privacy. We propose an adaptive hyperparameter tuning algorithm that improves the performance of the algorithm while satisfying computational, communication and privacy constraints. 
We explore the impact of different data-selection schemes as well as the impact of introducing deny lists during multiple runs of the algorithm. We test these improvements using extensive experimentation on the Reddit dataset~\cite{caldas2018leaf} on the task of learning the most frequent words.</description><subject>Computer Science - Cryptography and Security</subject><subject>Computer Science - Learning</subject><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2023</creationdate><recordtype>article</recordtype><sourceid>GOX</sourceid><recordid>eNotz71uwyAUBWCWDlXaB-gUXsAuGANmjPLnSpGSIbt1gUuE5LgVplb89k3TLucsR0f6CHnjrKwbKdk7pFucykowXXKua_NM9psYAiYccoS-n-kpxQky0hZhmmkbc8ZEN5jR5fg50O8xDhe6Q4_pvvJ0NUA_5-jGF_IUoB_x9b8X5LzbntdtcTjuP9arQwFKm8Kpe7LgmWwM18oKjsgUoPdcaCNNzbhn1gasrAvcqSYwyaUW3jVgqmDFgiz_bh-S7ivFK6S5-xV1D5H4AYorRtg</recordid><startdate>20230721</startdate><enddate>20230721</enddate><creator>Chadha, Karan</creator><creator>Chen, Junye</creator><creator>Duchi, John</creator><creator>Feldman, Vitaly</creator><creator>Hashemi, Hanieh</creator><creator>Javidbakht, Omid</creator><creator>McMillan, Audra</creator><creator>Talwar, Kunal</creator><scope>AKY</scope><scope>GOX</scope></search><sort><creationdate>20230721</creationdate><title>Differentially Private Heavy Hitter Detection using Federated Analytics</title><author>Chadha, Karan ; Chen, Junye ; Duchi, John ; Feldman, Vitaly ; Hashemi, Hanieh ; Javidbakht, Omid ; McMillan, Audra ; Talwar, Kunal</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-a679-c66790fd0589176b31ee06aedd137959401d0bbfe2bcf1c68f051573dc8a92fb3</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2023</creationdate><topic>Computer Science - Cryptography and Security</topic><topic>Computer Science - Learning</topic><toplevel>online_resources</toplevel><creatorcontrib>Chadha, Karan</creatorcontrib><creatorcontrib>Chen, 
Junye</creatorcontrib><creatorcontrib>Duchi, John</creatorcontrib><creatorcontrib>Feldman, Vitaly</creatorcontrib><creatorcontrib>Hashemi, Hanieh</creatorcontrib><creatorcontrib>Javidbakht, Omid</creatorcontrib><creatorcontrib>McMillan, Audra</creatorcontrib><creatorcontrib>Talwar, Kunal</creatorcontrib><collection>arXiv Computer Science</collection><collection>arXiv.org</collection></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>Chadha, Karan</au><au>Chen, Junye</au><au>Duchi, John</au><au>Feldman, Vitaly</au><au>Hashemi, Hanieh</au><au>Javidbakht, Omid</au><au>McMillan, Audra</au><au>Talwar, Kunal</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Differentially Private Heavy Hitter Detection using Federated Analytics</atitle><date>2023-07-21</date><risdate>2023</risdate><abstract>In this work, we study practical heuristics to improve the performance of prefix-tree based algorithms for differentially private heavy hitter detection. Our model assumes each user has multiple data points and the goal is to learn as many of the most frequent data points as possible across all users' data with aggregate and local differential privacy. We propose an adaptive hyperparameter tuning algorithm that improves the performance of the algorithm while satisfying computational, communication and privacy constraints. We explore the impact of different data-selection schemes as well as the impact of introducing deny lists during multiple runs of the algorithm. We test these improvements using extensive experimentation on the Reddit dataset~\cite{caldas2018leaf} on the task of learning the most frequent words.</abstract><doi>10.48550/arxiv.2307.11749</doi><oa>free_for_read</oa></addata></record>
fulltext	fulltext_linktorsrc
identifier	DOI: 10.48550/arxiv.2307.11749
ispartof
issn
language	eng
recordid	cdi_arxiv_primary_2307_11749
source	arXiv.org
subjects	Computer Science - Cryptography and Security; Computer Science - Learning
title	Differentially Private Heavy Hitter Detection using Federated Analytics
url	https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-04T17%3A53%3A19IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-arxiv_GOX&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Differentially%20Private%20Heavy%20Hitter%20Detection%20using%20Federated%20Analytics&rft.au=Chadha,%20Karan&rft.date=2023-07-21&rft_id=info:doi/10.48550/arxiv.2307.11749&rft_dat=%3Carxiv_GOX%3E2307_11749%3C/arxiv_GOX%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rfr_iscdi=true