Dynamic Online Learning via Frank-Wolfe Algorithm

Online convex optimization (OCO) encapsulates supervised learning when training sets are large-scale or dynamic, and has grown essential as data has proliferated. OCO decomposes learning into a sequence of sub-problems, each of which must be solved with limited information. To ensure safe model adaptation or to avoid overfitting, constraints are often imposed, and these are typically handled with projections or Lagrangian relaxation. To avoid that added complexity, we propose to study Frank-Wolfe (FW), which updates along directions collinear with the gradient while remaining feasible by construction. We focus on its use in non-stationary settings, motivated by the fact that its iterates exhibit structured sparsity that can be employed as a distribution-free change-point detector. We establish performance in terms of dynamic regret, which quantifies accumulated cost relative to the instantaneous optimum at each time slot. Specifically, for convex losses, we establish \mathcal{O}(T^{1/2}) dynamic regret up to metrics of non-stationarity. We then relax the algorithm's required information to noisy gradient estimates only, i.e., partial feedback. We also consider a mini-batched 'Meta-Frank-Wolfe' and characterize its dynamic regret. Experiments on a matrix completion problem and on background separation in video demonstrate favorable performance of the proposed scheme. Moreover, the structured sparsity of FW is experimentally observed to yield the sharpest tracker of change points among alternative approaches to non-stationary online convex optimization.
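
For reference, the dynamic regret the abstract refers to is conventionally defined as below; the notation (per-slot loss f_t, iterate x_t, feasible set \mathcal{X}) is an assumption for illustration and may differ from the paper's.

\[
  \mathrm{Reg}^{d}_{T} \;=\; \sum_{t=1}^{T} f_t(x_t) \;-\; \sum_{t=1}^{T} \min_{x \in \mathcal{X}} f_t(x),
\]

so the comparator is the instantaneous minimizer at every time slot rather than a single fixed point, which is what distinguishes dynamic from static regret.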

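As a rough illustration of the projection-free mechanism the abstract describes, the sketch below performs online Frank-Wolfe steps over an l1-ball constraint. The constraint set, the quadratic loss, and the 2/(t+2) step-size schedule are assumptions made for this example, not the paper's exact setup, and the helper names are hypothetical.

import numpy as np


def lmo_l1_ball(gradient, radius):
    """Linear minimization oracle: argmin over ||s||_1 <= radius of <gradient, s>.

    The minimizer is a signed, scaled standard basis vector, which is the
    source of the structured (sparse) iterates mentioned in the abstract.
    """
    s = np.zeros_like(gradient)
    i = int(np.argmax(np.abs(gradient)))
    s[i] = -radius * np.sign(gradient[i])
    return s


def online_fw_step(x, grad, t, radius):
    """One projection-free update: a convex combination of the current iterate
    and the LMO vertex, so the new point stays inside the feasible set."""
    s = lmo_l1_ball(grad, radius)
    gamma = 2.0 / (t + 2.0)  # decaying step size (an assumption for this sketch)
    return (1.0 - gamma) * x + gamma * s


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    d, radius, T = 10, 1.0, 200
    x = np.zeros(d)
    for t in range(T):
        target = rng.normal(size=d)   # stand-in for a drifting optimum
        grad = x - target             # gradient of the loss 0.5 * ||x - target||^2
        x = online_fw_step(x, grad, t, radius)
    print("final iterate:", np.round(x, 3))

Because each update is a convex combination of the current iterate and a vertex returned by the linear minimization oracle, every iterate is feasible without a projection step, and the vertices are sparse, which is the structured sparsity the abstract exploits for change-point detection.
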
Bibliographic Details
Published in: IEEE Transactions on Signal Processing, 2021, Vol. 69, pp. 932-947
Main Authors: Kalhan, Deepak S.; Singh Bedi, Amrit; Koppel, Alec; Rajawat, Ketan; Hassani, Hamed; Gupta, Abhishek K.; Banerjee, Adrish
Format: Article
Language: English
DOI: 10.1109/TSP.2021.3051871
ISSN: 1053-587X
EISSN: 1941-0476
Source: IEEE Electronic Library (IEL)
Subjects: Algorithms; Complexity theory; Computational geometry; Constraint modelling; Convergence; Convex analysis; Convex functions; convex optimization; Convexity; Distance learning; Frank-Wolfe algorithm; gradient descent; Heuristic algorithms; Machine learning; Online learning; Optimization; Sparsity; Stochastic processes; Training