Dynamic Online Learning via Frank-Wolfe Algorithm
Online convex optimization (OCO) encapsulates supervised learning when training sets are large-scale or dynamic, and has grown essential as data has proliferated. OCO decomposes learning into a sequence of sub-problems, each of which must be solved with limited information. To ensure safe model adap...
Saved in:
Published in: | IEEE Transactions on Signal Processing, 2021, Vol. 69, p. 932-947 |
---|---|
Main authors: | Kalhan, Deepak S.; Singh Bedi, Amrit; Koppel, Alec; Rajawat, Ketan; Hassani, Hamed; Gupta, Abhishek K.; Banerjee, Adrish |
Format: | Article |
Language: | English |
Subjects: | Algorithms; Complexity theory; Computational geometry; Constraint modelling; Convergence; Convex analysis; Convex functions; convex optimization; Convexity; Distance learning; Frank-Wolfe algorithm; gradient descent; Heuristic algorithms; Machine learning; Online learning; Optimization; Sparsity; Stochastic processes; Training |
Online access: | Order full text |
container_end_page | 947 |
---|---|
container_issue | |
container_start_page | 932 |
container_title | IEEE transactions on signal processing |
container_volume | 69 |
creator | Kalhan, Deepak S.; Singh Bedi, Amrit; Koppel, Alec; Rajawat, Ketan; Hassani, Hamed; Gupta, Abhishek K.; Banerjee, Adrish |
description | Online convex optimization (OCO) encapsulates supervised learning when training sets are large-scale or dynamic, and has grown essential as data has proliferated. OCO decomposes learning into a sequence of sub-problems, each of which must be solved with limited information. To ensure safe model adaptation or to avoid overfitting, constraints are often imposed and are typically handled with projections or Lagrangian relaxation. To avoid this complexity incursion, we propose to study Frank-Wolfe (FW), which updates along directions collinear with the gradient while remaining feasible by construction. We specifically focus on its use in non-stationary settings, motivated by the fact that its iterates have structured sparsity that may be employed as a distribution-free change-point detector. We establish performance in terms of dynamic regret, which quantifies cost accumulation as compared with the optimum at each individual time slot. Specifically, for convex losses, we establish \mathcal{O}(T^{1/2}) dynamic regret up to metrics of non-stationarity. We relax the algorithm's required information to only noisy gradient estimates, i.e., partial feedback. We also consider a mini-batching 'Meta-Frank Wolfe', and characterize its dynamic regret. Experiments on a matrix completion problem and background separation in video demonstrate favorable performance of the proposed scheme. Moreover, the structured sparsity of FW is experimentally observed to yield the sharpest tracker of change points among alternative approaches to non-stationary online convex optimization. (A minimal illustrative sketch of a projection-free online update of this kind appears after the record fields below.) |
doi_str_mv | 10.1109/TSP.2021.3051871 |
format | Article |
fulltext | fulltext_linktorsrc |
identifier | ISSN: 1053-587X |
ispartof | IEEE transactions on signal processing, 2021, Vol.69, p.932-947 |
issn | 1053-587X; 1941-0476 |
language | eng |
recordid | cdi_ieee_primary_9325943 |
source | IEEE Electronic Library (IEL) |
subjects | Algorithms; Complexity theory; Computational geometry; Constraint modelling; Convergence; Convex analysis; Convex functions; convex optimization; Convexity; Distance learning; Frank-Wolfe algorithm; gradient descent; Heuristic algorithms; Machine learning; Online learning; Optimization; Sparsity; Stochastic processes; Training |
title | Dynamic Online Learning via Frank-Wolfe Algorithm |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-17T22%3A11%3A59IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_RIE&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Dynamic%20Online%20Learning%20via%20Frank-Wolfe%20Algorithm&rft.jtitle=IEEE%20transactions%20on%20signal%20processing&rft.au=Kalhan,%20Deepak%20S.&rft.date=2021&rft.volume=69&rft.spage=932&rft.epage=947&rft.pages=932-947&rft.issn=1053-587X&rft.eissn=1941-0476&rft.coden=ITPRED&rft_id=info:doi/10.1109/TSP.2021.3051871&rft_dat=%3Cproquest_RIE%3E2487441418%3C/proquest_RIE%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2487441418&rft_id=info:pmid/&rft_ieee_id=9325943&rfr_iscdi=true |
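
To make the projection-free update described in the abstract concrete, the following is a minimal sketch, not the authors' exact method or analysis. It assumes an l1-ball feasible set, a hypothetical linear-minimization oracle `lmo_l1_ball`, the standard diminishing step size 2/(t+2), and a toy drifting least-squares loss with noisy gradients to mimic non-stationarity and partial feedback; all names and parameters are illustrative. Feasibility at every time slot follows because each iterate is a convex combination of feasible points, and each iterate is supported on at most t oracle vertices, which is the structured sparsity the abstract refers to.

```python
import numpy as np

def lmo_l1_ball(grad, radius=1.0):
    """Linear minimization oracle for the l1-ball: returns the signed, scaled
    basis vector s minimizing <grad, s> over ||s||_1 <= radius."""
    i = int(np.argmax(np.abs(grad)))
    s = np.zeros_like(grad)
    s[i] = -radius * np.sign(grad[i])
    return s

def online_frank_wolfe(grad_fn, dim, horizon, radius=1.0):
    """One projection-free update per time slot:
    x_{t+1} = (1 - eta_t) * x_t + eta_t * s_t, with s_t from the LMO,
    so every iterate stays inside the feasible set by convexity."""
    x = np.zeros(dim)  # feasible starting point (inside the l1-ball)
    iterates = []
    for t in range(1, horizon + 1):
        g = grad_fn(x, t)            # (possibly noisy) gradient revealed at slot t
        s = lmo_l1_ball(g, radius)   # cheap linear step toward a vertex
        eta = 2.0 / (t + 2)          # hypothetical diminishing step size
        x = (1.0 - eta) * x + eta * s  # convex combination -> remains feasible
        iterates.append(x.copy())
    return iterates

# Hypothetical usage: track a slowly drifting least-squares target.
rng = np.random.default_rng(0)
theta = rng.normal(size=5)

def grad_fn(x, t):
    drift = 0.01 * np.sin(0.05 * t)  # mild non-stationarity in the target
    noise = 0.1 * rng.normal(size=x.size)  # noisy gradient (partial feedback)
    return 2.0 * (x - (theta + drift)) + noise

trajectory = online_frank_wolfe(grad_fn, dim=5, horizon=200)
```

Because each update adds at most one new oracle vertex to the iterate's support, an abrupt change in the active vertices can serve as a simple, distribution-free indicator of a change point, in the spirit of the detector discussed in the abstract.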