Hierarchical Exploration for Accelerating Contextual Bandits

Contextual bandit learning is an increasingly popular approach to optimizing recommender systems via user feedback, but can be slow to converge in practice due to the need for exploring a large feature space. In this paper, we propose a coarse-to-fine hierarchical approach for encoding prior knowledge that drastically reduces the amount of exploration required. Intuitively, user preferences can be reasonably embedded in a coarse low-dimensional feature space that can be explored efficiently, requiring exploration in the high-dimensional space only as necessary. We introduce a bandit algorithm that explores within this coarse-to-fine spectrum, and prove performance guarantees that depend on how well the coarse space captures the user's preferences. We demonstrate substantial improvement over conventional bandit algorithms through extensive simulation as well as a live user study in the setting of personalized news recommendation.
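As a rough illustration of the coarse-to-fine idea described above, the sketch below runs a LinUCB-style optimistic rule in a fixed low-dimensional projection of the full feature space, so exploration cost scales with the coarse dimension rather than the full one. The class name `CoarseLinUCB`, the projection matrix `U`, and the parameter `alpha` are all invented for this illustration; this is a minimal sketch of the general principle, not the paper's exact algorithm.

```python
# Toy sketch of coarse-to-fine contextual bandit exploration (hypothetical
# names throughout): a LinUCB-style learner that estimates rewards and
# exploration bonuses only in a 2-dimensional "coarse" projection z = U x.

def mat_vec(M, v):
    """Multiply matrix M (list of rows) by vector v."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def inv2(M):
    """Invert a 2x2 matrix; the coarse space is 2-dimensional here."""
    (a, b), (c, d) = M
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

class CoarseLinUCB:
    """LinUCB-style learner acting on z = U x, a coarse projection of x."""

    def __init__(self, U, alpha=1.0):
        self.U = U            # 2 x D projection matrix (hand-specified here)
        self.alpha = alpha    # width of the exploration bonus
        self.A = [[1.0, 0.0], [0.0, 1.0]]  # ridge-regularized Gram matrix
        self.b = [0.0, 0.0]                # reward-weighted feature sums

    def ucb(self, x):
        """Optimistic score: estimated reward plus confidence bonus."""
        z = mat_vec(self.U, x)
        A_inv = inv2(self.A)
        theta = mat_vec(A_inv, self.b)     # ridge estimate of preferences
        bonus = self.alpha * dot(z, mat_vec(A_inv, z)) ** 0.5
        return dot(theta, z) + bonus

    def update(self, x, reward):
        """Fold the observed reward for context/arm features x into A and b."""
        z = mat_vec(self.U, x)
        for i in range(2):
            for j in range(2):
                self.A[i][j] += z[i] * z[j]
            self.b[i] += reward * z[i]
```

In the paper's setting the learner would additionally fall back to exploring the full high-dimensional space when the coarse subspace fails to capture the user's preferences; this sketch keeps only the coarse level to show why exploration there is cheap.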

Detailed description

Saved in:
Bibliographic details
Main authors: Yue, Yisong; Hong, Sue Ann; Guestrin, Carlos
Format: Article
Language: eng
Subjects: Computer Science - Learning; Statistics - Machine Learning
Online access: Order full text
creator Yue, Yisong; Hong, Sue Ann; Guestrin, Carlos
description Contextual bandit learning is an increasingly popular approach to optimizing recommender systems via user feedback, but can be slow to converge in practice due to the need for exploring a large feature space. In this paper, we propose a coarse-to-fine hierarchical approach for encoding prior knowledge that drastically reduces the amount of exploration required. Intuitively, user preferences can be reasonably embedded in a coarse low-dimensional feature space that can be explored efficiently, requiring exploration in the high-dimensional space only as necessary. We introduce a bandit algorithm that explores within this coarse-to-fine spectrum, and prove performance guarantees that depend on how well the coarse space captures the user's preferences. We demonstrate substantial improvement over conventional bandit algorithms through extensive simulation as well as a live user study in the setting of personalized news recommendation.
doi_str_mv 10.48550/arxiv.1206.6454
format Article
identifier DOI: 10.48550/arxiv.1206.6454
language eng
recordid cdi_arxiv_primary_1206_6454
source arXiv.org
subjects Computer Science - Learning
Statistics - Machine Learning
title Hierarchical Exploration for Accelerating Contextual Bandits