Understanding Biases in ChatGPT-based Recommender Systems: Provider Fairness, Temporal Stability, and Recency

This paper explores the biases inherent in ChatGPT-based recommender systems, focusing on provider fairness (item-side fairness). Through extensive experiments and over a thousand API calls, we investigate the impact of prompt design strategies—including structure, system role, and intent—on evaluation metrics such as provider fairness, catalog coverage, temporal stability, and recency. The first experiment examines these strategies in classical top-K recommendations, while the second evaluates sequential in-context learning (ICL). In the first experiment, we assess seven distinct prompt scenarios on top-K recommendation accuracy and fairness. Accuracy-oriented prompts, like Simple and Chain-of-Thought (COT), outperform diversification prompts, which, despite enhancing temporal freshness, reduce accuracy by up to 50%. Embedding fairness into system roles, such as “act as a fair recommender”, proved more effective than fairness directives within prompts. We also found that diversification prompts led to recommending newer movies, offering broader genre distribution compared to traditional collaborative filtering (CF) models. The system showed high consistency across multiple runs. The second experiment explores sequential ICL, comparing zero-shot and few-shot learning scenarios. Results indicate that including user demographic information in prompts affects model biases and stereotypes. However, ICL did not consistently improve item fairness and catalog coverage over zero-shot learning. Zero-shot learning achieved higher NDCG and coverage, while ICL-2 showed slight improvements in hit rate (HR) when age-group context was included. Overall, our study provides insights into biases of RecLLMs, particularly in provider fairness and catalog coverage. By examining prompt design, learning strategies, and system roles, we highlight the potential and challenges of integrating large language models into recommendation systems, paving the way for future research. Further details can be found at https://github.com/yasdel/Benchmark_RecLLM_Fairness.
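
To make the prompt-design comparison concrete, here is a minimal Python sketch of the two fairness-injection points contrasted above: a fairness instruction embedded in the system role versus one appended to the user prompt. It assumes the OpenAI chat-completions API; the prompt wording, model name, and the `recommend` helper are illustrative assumptions, not the paper's actual prompts (those are in the linked repository).

```python
# Sketch only: wording, model, and helper are assumptions, not the paper's code.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment


def recommend(user_history: list[str], fair_system_role: bool = True, top_k: int = 10) -> str:
    """Ask the model for a top-K movie list, varying where the fairness cue is placed."""
    if fair_system_role:
        # Fairness embedded in the system role (the strategy the abstract found more effective).
        system_msg = "Act as a fair recommender that also surfaces less popular providers."
        user_msg = f"I have watched: {', '.join(user_history)}. Recommend {top_k} movies."
    else:
        # Fairness stated as a directive inside the user prompt.
        system_msg = "Act as a movie recommender."
        user_msg = (f"I have watched: {', '.join(user_history)}. "
                    f"Recommend {top_k} movies and be fair to less popular items.")
    resp = client.chat.completions.create(
        model="gpt-3.5-turbo",  # assumption; the abstract only says "ChatGPT-based"
        messages=[
            {"role": "system", "content": system_msg},
            {"role": "user", "content": user_msg},
        ],
        temperature=0,  # near-deterministic output aids consistency (temporal stability) checks
    )
    return resp.choices[0].message.content
```

Running both variants over the same user histories and scoring the returned lists is the shape of the comparison; a diversification or Chain-of-Thought variant would differ only in the prompt text.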

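The metrics named above have standard forms. Below is a small sketch, assuming binary relevance and textbook definitions of catalog coverage, hit rate (HR@K), and NDCG@K; the paper's exact cutoffs and normalizations may differ, and these function names are hypothetical.

```python
# Textbook metric definitions; not the paper's evaluation code.
import math


def catalog_coverage(rec_lists: list[list[str]], catalog_size: int) -> float:
    """Fraction of the full catalog that appears in at least one user's recommendations."""
    recommended = {item for recs in rec_lists for item in recs}
    return len(recommended) / catalog_size


def hit_rate_at_k(rec_lists: list[list[str]], held_out: list[str], k: int = 10) -> float:
    """Share of users whose held-out item occurs in their top-K list."""
    hits = sum(1 for recs, target in zip(rec_lists, held_out) if target in recs[:k])
    return hits / len(held_out)


def ndcg_at_k(recs: list[str], relevant: set[str], k: int = 10) -> float:
    """Binary-relevance NDCG@K for a single user."""
    dcg = sum(1.0 / math.log2(i + 2) for i, item in enumerate(recs[:k]) if item in relevant)
    idcg = sum(1.0 / math.log2(i + 2) for i in range(min(len(relevant), k)))
    return dcg / idcg if idcg > 0 else 0.0
```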

Bibliographic Details
Published in: ACM Transactions on Recommender Systems, 2024-08
Author: Deldjoo, Yashar
Format: Article
Language: English
Subjects: see list below
Online access: Full text
DOI: 10.1145/3690655
Publisher: ACM, New York, NY
Publication date: 2024-08-28
Rights: Copyright held by the owner/author(s); publication rights licensed to ACM (open access).
Author ORCID: https://orcid.org/0000-0002-6767-358X
ISSN: 2770-6699
EISSN: 2770-6699
Source: ACM Digital Library

Subjects:
Computing methodologies
Computing methodologies / Machine learning
Computing methodologies / Machine learning / Learning paradigms
Computing methodologies / Machine learning / Learning paradigms / Supervised learning
Computing methodologies / Machine learning / Learning paradigms / Supervised learning / Learning to rank
Computing methodologies / Machine learning / Learning settings
Computing methodologies / Machine learning / Learning settings / Learning from implicit feedback
Human-centered computing
Human-centered computing / Collaborative and social computing
Human-centered computing / Collaborative and social computing / Collaborative and social computing theory, concepts and paradigms
Human-centered computing / Collaborative and social computing / Collaborative and social computing theory, concepts and paradigms / Collaborative filtering
Human-centered computing / Collaborative and social computing / Collaborative and social computing theory, concepts and paradigms / Social recommendation
Information systems
Information systems / Information retrieval
Information systems / Information retrieval / Retrieval tasks and goals
Information systems / Information retrieval / Retrieval tasks and goals / Recommender systems
Information systems / Information systems applications
Information systems / Information systems applications / Data mining
Information systems / Information systems applications / Data mining / Collaborative filtering
Information systems / World Wide Web
Information systems / World Wide Web / Web searching and information discovery
Information systems / World Wide Web / Web searching and information discovery / Collaborative filtering
Information systems / World Wide Web / Web searching and information discovery / Social recommendation