LD4MRec: Simplifying and Powering Diffusion Model for Multimedia Recommendation

Multimedia recommendation aims to predict users' future behaviors based on historical behavioral data and items' multimodal information. However, noise inherent in behavioral data, arising from unintended user interactions with uninteresting items, detrimentally impacts recommendation perf...

Bibliographic Details
Main authors:	Yu, Penghang, Tan, Zhiyi, Lu, Guanming, Bao, Bing-Kun
Format:	Article
Language:	eng
Subjects:	Computer Science - Information Retrieval
Online access:	Order full text

creator	Yu, Penghang ; Tan, Zhiyi ; Lu, Guanming ; Bao, Bing-Kun
description	Multimedia recommendation aims to predict users' future behaviors based on historical behavioral data and items' multimodal information. However, noise inherent in behavioral data, arising from unintended user interactions with uninteresting items, degrades recommendation performance. Recently, diffusion models have achieved high-quality information generation, in which the reverse process iteratively infers future information from the corrupted state. This meets the needs of predictive tasks under noisy conditions and motivates applying diffusion models to predicting user behaviors. Nonetheless, two challenges must be addressed: 1) Classical diffusion models require excessive computation, which does not meet the efficiency requirements of recommendation systems. 2) Existing reverse processes are mainly designed for continuous data, whereas behavioral information is discrete in nature, so an effective method is needed for generating discrete behavioral information. To tackle these issues, we propose a Light Diffusion model for Multimedia Recommendation (LD4MRec). First, to reduce computational complexity, we simplify the formula of the reverse process, enabling one-step inference instead of multi-step inference. Second, to generate behavioral information effectively, we propose a novel Conditional neural Network. It maps the discrete behavior data into a continuous latent space and generates behaviors under the guidance of collaborative signals and users' multimodal preferences. Additionally, because completely clean behavior data is inaccessible, we introduce a soft behavioral reconstruction constraint during model training, facilitating behavior prediction with noisy data. Empirical studies conducted on three public datasets demonstrate the effectiveness of LD4MRec.
doi_str_mv	10.48550/arxiv.2309.15363
format	Article
fullrecord	<record><control><sourceid>arxiv_GOX</sourceid><recordid>TN_cdi_arxiv_primary_2309_15363</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>2309_15363</sourcerecordid><originalsourceid>FETCH-LOGICAL-a673-aef4168ddbdd8fa5f06ae2f8c4299e1412289593fc88a0e5fbf05f2e77f6fecb3</originalsourceid><addsrcrecordid>eNotj81Kw0AUhWfjQqoP4Mp5gcT5z8SdtFqFhIp2H24y98pAkilpq_bt-6Orw4HzHfgYu5MiN95a8QDTb_zOlRZlLq12-pqtqoWpP7B75J9x2PSRDnH84jAG_p5-cDqXRSTab2MaeZ0C9pzSxOt9v4sDhgj8BKdhwDHA7rS5YVcE_RZv_3PG1i_P6_lrVq2Wb_OnKgNX6AyQjHQ-hDYET2BJOEBFvjOqLFEaqZQvbamp8x4EWmpJWFJYFOQIu1bP2P3f7cWo2UxxgOnQnM2ai5k-Auw8SiE</addsrcrecordid><sourcetype>Open Access Repository</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype></control><display><type>article</type><title>LD4MRec: Simplifying and Powering Diffusion Model for Multimedia Recommendation</title><source>arXiv.org</source><creator>Yu, Penghang ; Tan, Zhiyi ; Lu, Guanming ; Bao, Bing-Kun</creator><creatorcontrib>Yu, Penghang ; Tan, Zhiyi ; Lu, Guanming ; Bao, Bing-Kun</creatorcontrib><description>Multimedia recommendation aims to predict users' future behaviors based on historical behavioral data and item's multimodal information. However, noise inherent in behavioral data, arising from unintended user interactions with uninteresting items, detrimentally impacts recommendation performance. Recently, diffusion models have achieved high-quality information generation, in which the reverse process iteratively infers future information based on the corrupted state. It meets the need of predictive tasks under noisy conditions, and inspires exploring their application to predicting user behaviors. Nonetheless, several challenges must be addressed: 1) Classical diffusion models require excessive computation, which does not meet the efficiency requirements of recommendation systems. 
2) Existing reverse processes are mainly designed for continuous data, whereas behavioral information is discrete in nature. Therefore, an effective method is needed for the generation of discrete behavioral information. To tackle the aforementioned issues, we propose a Light Diffusion model for Multimedia Recommendation. First, to reduce computational complexity, we simplify the formula of the reverse process, enabling one-step inference instead of multi-step inference. Second, to achieve effective behavioral information generation, we propose a novel Conditional neural Network. It maps the discrete behavior data into a continuous latent space, and generates behaviors with the guidance of collaborative signals and user multimodal preference. Additionally, considering that completely clean behavior data is inaccessible, we introduce a soft behavioral reconstruction constraint during model training, facilitating behavior prediction with noisy data. Empirical studies conducted on three public datasets demonstrate the effectiveness of LD4MRec.</description><identifier>DOI: 10.48550/arxiv.2309.15363</identifier><language>eng</language><subject>Computer Science - Information Retrieval</subject><creationdate>2023-09</creationdate><rights>http://arxiv.org/licenses/nonexclusive-distrib/1.0</rights><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><link.rule.ids>228,230,780,885</link.rule.ids><linktorsrc>$$Uhttps://arxiv.org/abs/2309.15363$$EView_record_in_Cornell_University$$FView_record_in_$$GCornell_University$$Hfree_for_read</linktorsrc><backlink>$$Uhttps://doi.org/10.48550/arXiv.2309.15363$$DView paper in arXiv$$Hfree_for_read</backlink></links><search><creatorcontrib>Yu, Penghang</creatorcontrib><creatorcontrib>Tan, Zhiyi</creatorcontrib><creatorcontrib>Lu, 
Guanming</creatorcontrib><creatorcontrib>Bao, Bing-Kun</creatorcontrib><title>LD4MRec: Simplifying and Powering Diffusion Model for Multimedia Recommendation</title><description>Multimedia recommendation aims to predict users' future behaviors based on historical behavioral data and item's multimodal information. However, noise inherent in behavioral data, arising from unintended user interactions with uninteresting items, detrimentally impacts recommendation performance. Recently, diffusion models have achieved high-quality information generation, in which the reverse process iteratively infers future information based on the corrupted state. It meets the need of predictive tasks under noisy conditions, and inspires exploring their application to predicting user behaviors. Nonetheless, several challenges must be addressed: 1) Classical diffusion models require excessive computation, which does not meet the efficiency requirements of recommendation systems. 2) Existing reverse processes are mainly designed for continuous data, whereas behavioral information is discrete in nature. Therefore, an effective method is needed for the generation of discrete behavioral information. To tackle the aforementioned issues, we propose a Light Diffusion model for Multimedia Recommendation. First, to reduce computational complexity, we simplify the formula of the reverse process, enabling one-step inference instead of multi-step inference. Second, to achieve effective behavioral information generation, we propose a novel Conditional neural Network. It maps the discrete behavior data into a continuous latent space, and generates behaviors with the guidance of collaborative signals and user multimodal preference. Additionally, considering that completely clean behavior data is inaccessible, we introduce a soft behavioral reconstruction constraint during model training, facilitating behavior prediction with noisy data. 
Empirical studies conducted on three public datasets demonstrate the effectiveness of LD4MRec.</description><subject>Computer Science - Information Retrieval</subject><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2023</creationdate><recordtype>article</recordtype><sourceid>GOX</sourceid><recordid>eNotj81Kw0AUhWfjQqoP4Mp5gcT5z8SdtFqFhIp2H24y98pAkilpq_bt-6Orw4HzHfgYu5MiN95a8QDTb_zOlRZlLq12-pqtqoWpP7B75J9x2PSRDnH84jAG_p5-cDqXRSTab2MaeZ0C9pzSxOt9v4sDhgj8BKdhwDHA7rS5YVcE_RZv_3PG1i_P6_lrVq2Wb_OnKgNX6AyQjHQ-hDYET2BJOEBFvjOqLFEaqZQvbamp8x4EWmpJWFJYFOQIu1bP2P3f7cWo2UxxgOnQnM2ai5k-Auw8SiE</recordid><startdate>20230926</startdate><enddate>20230926</enddate><creator>Yu, Penghang</creator><creator>Tan, Zhiyi</creator><creator>Lu, Guanming</creator><creator>Bao, Bing-Kun</creator><scope>AKY</scope><scope>GOX</scope></search><sort><creationdate>20230926</creationdate><title>LD4MRec: Simplifying and Powering Diffusion Model for Multimedia Recommendation</title><author>Yu, Penghang ; Tan, Zhiyi ; Lu, Guanming ; Bao, Bing-Kun</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-a673-aef4168ddbdd8fa5f06ae2f8c4299e1412289593fc88a0e5fbf05f2e77f6fecb3</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2023</creationdate><topic>Computer Science - Information Retrieval</topic><toplevel>online_resources</toplevel><creatorcontrib>Yu, Penghang</creatorcontrib><creatorcontrib>Tan, Zhiyi</creatorcontrib><creatorcontrib>Lu, Guanming</creatorcontrib><creatorcontrib>Bao, Bing-Kun</creatorcontrib><collection>arXiv Computer Science</collection><collection>arXiv.org</collection></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>Yu, Penghang</au><au>Tan, Zhiyi</au><au>Lu, Guanming</au><au>Bao, Bing-Kun</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>LD4MRec: Simplifying and Powering 
Diffusion Model for Multimedia Recommendation</atitle><date>2023-09-26</date><risdate>2023</risdate><abstract>Multimedia recommendation aims to predict users' future behaviors based on historical behavioral data and item's multimodal information. However, noise inherent in behavioral data, arising from unintended user interactions with uninteresting items, detrimentally impacts recommendation performance. Recently, diffusion models have achieved high-quality information generation, in which the reverse process iteratively infers future information based on the corrupted state. It meets the need of predictive tasks under noisy conditions, and inspires exploring their application to predicting user behaviors. Nonetheless, several challenges must be addressed: 1) Classical diffusion models require excessive computation, which does not meet the efficiency requirements of recommendation systems. 2) Existing reverse processes are mainly designed for continuous data, whereas behavioral information is discrete in nature. Therefore, an effective method is needed for the generation of discrete behavioral information. To tackle the aforementioned issues, we propose a Light Diffusion model for Multimedia Recommendation. First, to reduce computational complexity, we simplify the formula of the reverse process, enabling one-step inference instead of multi-step inference. Second, to achieve effective behavioral information generation, we propose a novel Conditional neural Network. It maps the discrete behavior data into a continuous latent space, and generates behaviors with the guidance of collaborative signals and user multimodal preference. Additionally, considering that completely clean behavior data is inaccessible, we introduce a soft behavioral reconstruction constraint during model training, facilitating behavior prediction with noisy data. 
Empirical studies conducted on three public datasets demonstrate the effectiveness of LD4MRec.</abstract><doi>10.48550/arxiv.2309.15363</doi><oa>free_for_read</oa></addata></record>
fulltext	fulltext_linktorsrc
identifier	DOI: 10.48550/arxiv.2309.15363
language	eng
recordid	cdi_arxiv_primary_2309_15363
source	arXiv.org
subjects	Computer Science - Information Retrieval
title	LD4MRec: Simplifying and Powering Diffusion Model for Multimedia Recommendation
url	https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-28T11%3A59%3A23IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-arxiv_GOX&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=LD4MRec:%20Simplifying%20and%20Powering%20Diffusion%20Model%20for%20Multimedia%20Recommendation&rft.au=Yu,%20Penghang&rft.date=2023-09-26&rft_id=info:doi/10.48550/arxiv.2309.15363&rft_dat=%3Carxiv_GOX%3E2309_15363%3C/arxiv_GOX%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rfr_iscdi=true