Multi-Resolution Diffusion for Privacy-Sensitive Recommender Systems

While recommender systems have become an integral component of the Web experience, their heavy reliance on user data raises privacy and security concerns. Substituting user data with synthetic data can address these concerns, but accurately replicating these real-world datasets has been a notoriousl...

Bibliographic details
Published in:	IEEE access 2024, Vol.12, p.58275-58287
Main authors:	Lilienthal, Derek; Mello, Paul; Eirinaki, Magdalini; Tiomkin, Stas
Format:	Article
Language:	eng
Subjects:	Data models; Data privacy; Datasets; diffusion models; Diffusion processes; Gaussian distribution; Generative adversarial networks; Generative artificial intelligence; Machine learning; Noise reduction; Privacy; Recommender systems; Synthetic data; Training
Online access:	Full text

container_end_page	58287
container_issue
container_start_page	58275
container_title	IEEE access
container_volume	12
creator	Lilienthal, Derek; Mello, Paul; Eirinaki, Magdalini; Tiomkin, Stas
description	While recommender systems have become an integral component of the Web experience, their heavy reliance on user data raises privacy and security concerns. Substituting user data with synthetic data can address these concerns, but accurately replicating these real-world datasets has been a notoriously challenging problem. Recent advancements in generative AI have demonstrated the impressive capabilities of diffusion models in generating realistic data across various domains. In this work we introduce a Score-based Diffusion Recommendation Module (SDRM), which captures the intricate patterns of real-world datasets required for training highly accurate recommender systems. SDRM allows for the generation of synthetic data that can replace existing datasets to preserve user privacy, or augment existing datasets to address excessive data sparsity. Our method outperforms competing baselines such as generative adversarial networks, variational autoencoders, and recently proposed diffusion models in synthesizing various datasets to replace or augment the original data by an average improvement of 4.30% in Recall@k and 4.65% in NDCG@k.
doi_str_mv	10.1109/ACCESS.2024.3388299
format	Article
fullrecord	<record><control><sourceid>proquest_cross</sourceid><recordid>TN_cdi_proquest_journals_3050303872</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><ieee_id>10497577</ieee_id><doaj_id>oai_doaj_org_article_8cd3581cdfc244a08f541df60f6275ba</doaj_id><sourcerecordid>3050303872</sourcerecordid><originalsourceid>FETCH-LOGICAL-c359t-1dfa2c3faf709063ec4fd4c59b89e3659df6be0749b94f1e3ca102824bb008c63</originalsourceid><addsrcrecordid>eNpNUF1LwzAULaLgmPsF-jDwuTOfbfI4uqmDibLpc0jTG8nYmpm0g_17Oztk9-UeLvd8cJLkHqMJxkg-TYtivl5PCCJsQqkQRMqrZEBwJlPKaXZ9gW-TUYwb1I3oTjwfJLO3dtu4dAXRb9vG-Xo8c9a28YSsD-OP4A7aHNM11NE17gDjFRi_20FdQRivj7GBXbxLbqzeRhid9zD5ep5_Fq_p8v1lUUyXqaFcNimurCaGWm1zJFFGwTBbMcNlKSTQjMvKZiWgnMlSMouBGo0REYSVZRfYZHSYLHrdyuuN2ge30-GovHbq7-DDt9KhcWYLSpiKcoFNZQ1hTCNhOev8M2QzkvNSd1qPvdY--J8WYqM2vg11F19RxBFFVOSk-6L9lwk-xgD23xUjdWpf9e2rU_vq3H7HeuhZDgAuGEzmPM_pLzX8gPU</addsrcrecordid><sourcetype>Open Website</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>3050303872</pqid></control><display><type>article</type><title>Multi-Resolution Diffusion for Privacy-Sensitive Recommender Systems</title><source>IEEE Open Access Journals</source><source>DOAJ Directory of Open Access Journals</source><source>EZB-FREE-00999 freely available EZB journals</source><creator>Lilienthal, Derek ; Mello, Paul ; Eirinaki, Magdalini ; Tiomkin, Stas</creator><creatorcontrib>Lilienthal, Derek ; Mello, Paul ; Eirinaki, Magdalini ; Tiomkin, Stas</creatorcontrib><description><![CDATA[While recommender systems have become an integral component of the Web experience, their heavy reliance on user data raises privacy and security concerns. Substituting user data with synthetic data can address these concerns, but accurately replicating these real-world datasets has been a notoriously challenging problem. Recent advancements in generative AI have demonstrated the impressive capabilities of diffusion models in generating realistic data across various domains. 
In this work we introduce a Score-based Diffusion Recommendation Module (SDRM), which captures the intricate patterns of real-world datasets required for training highly accurate recommender systems. SDRM allows for the generation of synthetic data that can replace existing datasets to preserve user privacy, or augment existing datasets to address excessive data sparsity. Our method outperforms competing baselines such as generative adversarial networks, variational autoencoders, and recently proposed diffusion models in synthesizing various datasets to replace or augment the original data by an average improvement of 4.30% in Recall@<inline-formula> <tex-math notation="LaTeX">k </tex-math></inline-formula> and 4.65% in NDCG@<inline-formula> <tex-math notation="LaTeX">k </tex-math></inline-formula>.]]></description><identifier>ISSN: 2169-3536</identifier><identifier>EISSN: 2169-3536</identifier><identifier>DOI: 10.1109/ACCESS.2024.3388299</identifier><identifier>CODEN: IAECCG</identifier><language>eng</language><publisher>Piscataway: IEEE</publisher><subject>Data models ; Data privacy ; Datasets ; diffusion models ; Diffusion processes ; Gaussian distribution ; Generative adversarial networks ; Generative artificial intelligence ; Machine learning ; Noise reduction ; Privacy ; Recommender systems ; Synthetic data ; Training</subject><ispartof>IEEE access, 2024, Vol.12, p.58275-58287</ispartof><rights>Copyright The Institute of Electrical and Electronics Engineers, Inc. 
(IEEE) 2024</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><cites>FETCH-LOGICAL-c359t-1dfa2c3faf709063ec4fd4c59b89e3659df6be0749b94f1e3ca102824bb008c63</cites><orcidid>0000-0002-4711-3366 ; 0009-0003-0407-6424 ; 0009-0008-8306-9877 ; 0000-0003-3677-6874</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://ieeexplore.ieee.org/document/10497577$$EHTML$$P50$$Gieee$$Hfree_for_read</linktohtml><link.rule.ids>314,776,780,860,2096,4010,27610,27900,27901,27902,54908</link.rule.ids></links><search><creatorcontrib>Lilienthal, Derek</creatorcontrib><creatorcontrib>Mello, Paul</creatorcontrib><creatorcontrib>Eirinaki, Magdalini</creatorcontrib><creatorcontrib>Tiomkin, Stas</creatorcontrib><title>Multi-Resolution Diffusion for Privacy-Sensitive Recommender Systems</title><title>IEEE access</title><addtitle>Access</addtitle><description><![CDATA[While recommender systems have become an integral component of the Web experience, their heavy reliance on user data raises privacy and security concerns. Substituting user data with synthetic data can address these concerns, but accurately replicating these real-world datasets has been a notoriously challenging problem. Recent advancements in generative AI have demonstrated the impressive capabilities of diffusion models in generating realistic data across various domains. In this work we introduce a Score-based Diffusion Recommendation Module (SDRM), which captures the intricate patterns of real-world datasets required for training highly accurate recommender systems. SDRM allows for the generation of synthetic data that can replace existing datasets to preserve user privacy, or augment existing datasets to address excessive data sparsity. 
Our method outperforms competing baselines such as generative adversarial networks, variational autoencoders, and recently proposed diffusion models in synthesizing various datasets to replace or augment the original data by an average improvement of 4.30% in Recall@<inline-formula> <tex-math notation="LaTeX">k </tex-math></inline-formula> and 4.65% in NDCG@<inline-formula> <tex-math notation="LaTeX">k </tex-math></inline-formula>.]]></description><subject>Data models</subject><subject>Data privacy</subject><subject>Datasets</subject><subject>diffusion models</subject><subject>Diffusion processes</subject><subject>Gaussian distribution</subject><subject>Generative adversarial networks</subject><subject>Generative artificial intelligence</subject><subject>Machine learning</subject><subject>Noise reduction</subject><subject>Privacy</subject><subject>Recommender systems</subject><subject>Synthetic data</subject><subject>Training</subject><issn>2169-3536</issn><issn>2169-3536</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2024</creationdate><recordtype>article</recordtype><sourceid>ESBDL</sourceid><sourceid>RIE</sourceid><sourceid>DOA</sourceid><recordid>eNpNUF1LwzAULaLgmPsF-jDwuTOfbfI4uqmDibLpc0jTG8nYmpm0g_17Oztk9-UeLvd8cJLkHqMJxkg-TYtivl5PCCJsQqkQRMqrZEBwJlPKaXZ9gW-TUYwb1I3oTjwfJLO3dtu4dAXRb9vG-Xo8c9a28YSsD-OP4A7aHNM11NE17gDjFRi_20FdQRivj7GBXbxLbqzeRhid9zD5ep5_Fq_p8v1lUUyXqaFcNimurCaGWm1zJFFGwTBbMcNlKSTQjMvKZiWgnMlSMouBGo0REYSVZRfYZHSYLHrdyuuN2ge30-GovHbq7-DDt9KhcWYLSpiKcoFNZQ1hTCNhOev8M2QzkvNSd1qPvdY--J8WYqM2vg11F19RxBFFVOSk-6L9lwk-xgD23xUjdWpf9e2rU_vq3H7HeuhZDgAuGEzmPM_pLzX8gPU</recordid><startdate>2024</startdate><enddate>2024</enddate><creator>Lilienthal, Derek</creator><creator>Mello, Paul</creator><creator>Eirinaki, Magdalini</creator><creator>Tiomkin, Stas</creator><general>IEEE</general><general>The Institute of Electrical and Electronics Engineers, Inc. 
(IEEE)</general><scope>97E</scope><scope>ESBDL</scope><scope>RIA</scope><scope>RIE</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>7SC</scope><scope>7SP</scope><scope>7SR</scope><scope>8BQ</scope><scope>8FD</scope><scope>JG9</scope><scope>JQ2</scope><scope>L7M</scope><scope>L~C</scope><scope>L~D</scope><scope>DOA</scope><orcidid>https://orcid.org/0000-0002-4711-3366</orcidid><orcidid>https://orcid.org/0009-0003-0407-6424</orcidid><orcidid>https://orcid.org/0009-0008-8306-9877</orcidid><orcidid>https://orcid.org/0000-0003-3677-6874</orcidid></search><sort><creationdate>2024</creationdate><title>Multi-Resolution Diffusion for Privacy-Sensitive Recommender Systems</title><author>Lilienthal, Derek ; Mello, Paul ; Eirinaki, Magdalini ; Tiomkin, Stas</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c359t-1dfa2c3faf709063ec4fd4c59b89e3659df6be0749b94f1e3ca102824bb008c63</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2024</creationdate><topic>Data models</topic><topic>Data privacy</topic><topic>Datasets</topic><topic>diffusion models</topic><topic>Diffusion processes</topic><topic>Gaussian distribution</topic><topic>Generative adversarial networks</topic><topic>Generative artificial intelligence</topic><topic>Machine learning</topic><topic>Noise reduction</topic><topic>Privacy</topic><topic>Recommender systems</topic><topic>Synthetic data</topic><topic>Training</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Lilienthal, Derek</creatorcontrib><creatorcontrib>Mello, Paul</creatorcontrib><creatorcontrib>Eirinaki, Magdalini</creatorcontrib><creatorcontrib>Tiomkin, Stas</creatorcontrib><collection>IEEE All-Society Periodicals Package (ASPP) 2005-present</collection><collection>IEEE Open Access Journals</collection><collection>IEEE All-Society Periodicals Package (ASPP) 1998-Present</collection><collection>IEEE Electronic Library 
(IEL)</collection><collection>CrossRef</collection><collection>Computer and Information Systems Abstracts</collection><collection>Electronics &amp; Communications Abstracts</collection><collection>Engineered Materials Abstracts</collection><collection>METADEX</collection><collection>Technology Research Database</collection><collection>Materials Research Database</collection><collection>ProQuest Computer Science Collection</collection><collection>Advanced Technologies Database with Aerospace</collection><collection>Computer and Information Systems Abstracts Academic</collection><collection>Computer and Information Systems Abstracts Professional</collection><collection>DOAJ Directory of Open Access Journals</collection><jtitle>IEEE access</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Lilienthal, Derek</au><au>Mello, Paul</au><au>Eirinaki, Magdalini</au><au>Tiomkin, Stas</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Multi-Resolution Diffusion for Privacy-Sensitive Recommender Systems</atitle><jtitle>IEEE access</jtitle><stitle>Access</stitle><date>2024</date><risdate>2024</risdate><volume>12</volume><spage>58275</spage><epage>58287</epage><pages>58275-58287</pages><issn>2169-3536</issn><eissn>2169-3536</eissn><coden>IAECCG</coden><abstract><![CDATA[While recommender systems have become an integral component of the Web experience, their heavy reliance on user data raises privacy and security concerns. Substituting user data with synthetic data can address these concerns, but accurately replicating these real-world datasets has been a notoriously challenging problem. Recent advancements in generative AI have demonstrated the impressive capabilities of diffusion models in generating realistic data across various domains. 
In this work we introduce a Score-based Diffusion Recommendation Module (SDRM), which captures the intricate patterns of real-world datasets required for training highly accurate recommender systems. SDRM allows for the generation of synthetic data that can replace existing datasets to preserve user privacy, or augment existing datasets to address excessive data sparsity. Our method outperforms competing baselines such as generative adversarial networks, variational autoencoders, and recently proposed diffusion models in synthesizing various datasets to replace or augment the original data by an average improvement of 4.30% in Recall@<inline-formula> <tex-math notation="LaTeX">k </tex-math></inline-formula> and 4.65% in NDCG@<inline-formula> <tex-math notation="LaTeX">k </tex-math></inline-formula>.]]></abstract><cop>Piscataway</cop><pub>IEEE</pub><doi>10.1109/ACCESS.2024.3388299</doi><tpages>13</tpages><orcidid>https://orcid.org/0000-0002-4711-3366</orcidid><orcidid>https://orcid.org/0009-0003-0407-6424</orcidid><orcidid>https://orcid.org/0009-0008-8306-9877</orcidid><orcidid>https://orcid.org/0000-0003-3677-6874</orcidid><oa>free_for_read</oa></addata></record>
fulltext	fulltext
identifier	ISSN: 2169-3536
ispartof	IEEE access, 2024, Vol.12, p.58275-58287
issn	2169-3536 2169-3536
language	eng
recordid	cdi_proquest_journals_3050303872
source	IEEE Open Access Journals; DOAJ Directory of Open Access Journals; EZB-FREE-00999 freely available EZB journals
subjects	Data models; Data privacy; Datasets; diffusion models; Diffusion processes; Gaussian distribution; Generative adversarial networks; Generative artificial intelligence; Machine learning; Noise reduction; Privacy; Recommender systems; Synthetic data; Training
title	Multi-Resolution Diffusion for Privacy-Sensitive Recommender Systems
url	https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-05T07%3A20%3A12IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Multi-Resolution%20Diffusion%20for%20Privacy-Sensitive%20Recommender%20Systems&rft.jtitle=IEEE%20access&rft.au=Lilienthal,%20Derek&rft.date=2024&rft.volume=12&rft.spage=58275&rft.epage=58287&rft.pages=58275-58287&rft.issn=2169-3536&rft.eissn=2169-3536&rft.coden=IAECCG&rft_id=info:doi/10.1109/ACCESS.2024.3388299&rft_dat=%3Cproquest_cross%3E3050303872%3C/proquest_cross%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=3050303872&rft_id=info:pmid/&rft_ieee_id=10497577&rft_doaj_id=oai_doaj_org_article_8cd3581cdfc244a08f541df60f6275ba&rfr_iscdi=true