Worst-case quadratic loss bounds for prediction using linear functions and gradient descent

Studies the performance of gradient descent (GD) when applied to the problem of online linear prediction in arbitrary inner product spaces. We prove worst-case bounds on the sum of the squared prediction errors under various assumptions concerning the amount of a priori information about the sequenc...

Detailed description

Bibliographic details
Published in: IEEE transactions on neural networks 1996-05, Vol.7 (3), p.604-619
Main authors: Cesa-Bianchi, N., Long, P.M., Warmuth, M.K.
Format: Article
Language: eng
container_end_page 619
container_issue 3
container_start_page 604
container_title IEEE transactions on neural networks
container_volume 7
creator Cesa-Bianchi, N.
Long, P.M.
Warmuth, M.K.
description Studies the performance of gradient descent (GD) when applied to the problem of online linear prediction in arbitrary inner product spaces. We prove worst-case bounds on the sum of the squared prediction errors under various assumptions concerning the amount of a priori information about the sequence to predict. The algorithms we use are variants and extensions of online GD. Whereas our algorithms always predict using linear functions as hypotheses, none of our results requires the data to be linearly related. In fact, the bounds proved on the total prediction loss are typically expressed as a function of the total loss of the best fixed linear predictor with bounded norm. All the upper bounds are tight to within constants. Matching lower bounds are provided in some cases. Finally, we apply our results to the problem of online prediction for classes of smooth functions.
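The online protocol described in the abstract can be illustrated with a minimal sketch. It assumes the simplest form of online GD on squared loss, with a fixed learning rate and finite-dimensional real vectors; the function names, the learning rate `eta`, and the toy data are illustrative assumptions and do not reproduce the tuned variants analyzed in the paper.

```python
from typing import List, Sequence, Tuple

Example = Tuple[Sequence[float], float]  # (instance x_t, outcome y_t)


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean inner product of two equal-length vectors."""
    return sum(ai * bi for ai, bi in zip(a, b))


def online_gd(examples: Sequence[Example], eta: float) -> Tuple[float, List[float]]:
    """Run online gradient descent on squared loss.

    On each trial the learner predicts w . x_t, then sees y_t, suffers
    (w . x_t - y_t)^2, and takes one gradient step.
    Returns (total squared loss, final weight vector).
    """
    w = [0.0] * len(examples[0][0])  # start from the zero hypothesis
    total_loss = 0.0
    for x, y in examples:
        y_hat = dot(w, x)                      # predict before y is revealed
        total_loss += (y_hat - y) ** 2         # loss charged for this trial
        step = 2.0 * eta * (y_hat - y)         # gradient of the loss is 2 (y_hat - y) x
        w = [wi - step * xi for wi, xi in zip(w, x)]
    return total_loss, w


def fixed_predictor_loss(examples: Sequence[Example], u: Sequence[float]) -> float:
    """Total squared loss of one fixed linear predictor u (the comparator)."""
    return sum((dot(u, x) - y) ** 2 for x, y in examples)


if __name__ == "__main__":
    # Toy sequence (assumed for illustration): outcomes generated by u = (1, -2).
    data = [((1.0, 0.0), 1.0), ((0.0, 1.0), -2.0)] * 2
    loss, w = online_gd(data, eta=0.25)
    print("online GD loss:", loss, "final weights:", w)
    print("comparator loss:", fixed_predictor_loss(data, (1.0, -2.0)))
```

The comparator function mirrors the way the abstract states its bounds: the learner's total loss is measured against the total loss of the best fixed linear predictor, so the data need not be linearly related for the comparison to make sense.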
doi_str_mv 10.1109/72.501719
format Article
fullrecord Record ID: TN_cdi_proquest_miscellaneous_28722731 (source: proquest_RIE; source record ID: 28722731; source type: Aggregation Database)
IEEE document ID: 501719
Publisher: New York, NY: IEEE
EISSN: 1941-0093
PMID: 18263458
CODEN: ITNNEP
Abbreviated journal title: IEEE Trans Neural Netw (TNN)
Publication date: 1996-05-01
Total pages: 16
Rights: 1996 INIST-CNRS
Links: IEEE Xplore https://ieeexplore.ieee.org/document/501719 ; Pascal Francis http://pascal-francis.inist.fr/vibad/index.php?action=getRecordDetail&idt=3158801 ; MEDLINE/PubMed https://www.ncbi.nlm.nih.gov/pubmed/18263458
Collections: Pascal-Francis; PubMed; CrossRef; Computer and Information Systems Abstracts; Technology Research Database; ProQuest Computer Science Collection; Advanced Technologies Database with Aerospace; Computer and Information Systems Abstracts – Academic; Computer and Information Systems Abstracts Professional; MEDLINE - Academic
fulltext fulltext_linktorsrc
identifier ISSN: 1045-9227
ispartof IEEE transactions on neural networks, 1996-05, Vol.7 (3), p.604-619
issn 1045-9227
1941-0093
language eng
recordid cdi_proquest_miscellaneous_28722731
source IEEE Electronic Library (IEL)
subjects Algorithm design and analysis
Algorithmics. Computability. Computer arithmetics
Applied sciences
Computer science
Computer science; control theory; systems
Exact sciences and technology
Prediction algorithms
Predictive models
Theoretical computing
Upper bound
title Worst-case quadratic loss bounds for prediction using linear functions and gradient descent
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-09T03%3A50%3A44IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_RIE&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Worst-case%20quadratic%20loss%20bounds%20for%20prediction%20using%20linear%20functions%20and%20gradient%20descent&rft.jtitle=IEEE%20transactions%20on%20neural%20networks&rft.au=Cesa-Bianchi,%20N.&rft.date=1996-05-01&rft.volume=7&rft.issue=3&rft.spage=604&rft.epage=619&rft.pages=604-619&rft.issn=1045-9227&rft.eissn=1941-0093&rft.coden=ITNNEP&rft_id=info:doi/10.1109/72.501719&rft_dat=%3Cproquest_RIE%3E28722731%3C/proquest_RIE%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=28722731&rft_id=info:pmid/18263458&rft_ieee_id=501719&rfr_iscdi=true