Generalization Analysis for Contrastive Representation Learning
Recently, contrastive learning has found impressive success in advancing the state of the art in solving various machine learning tasks. However, the existing generalization analysis is very limited or even not meaningful. In particular, the existing generalization error bounds depend linearly on the number \(k\) of negative examples while it was widely shown in practice that choosing a large \(k\) is necessary to guarantee good generalization of contrastive learning in downstream tasks. In this paper, we establish novel generalization bounds for contrastive learning which do not depend on \(k\), up to logarithmic terms. Our analysis uses structural results on empirical covering numbers and Rademacher complexities to exploit the Lipschitz continuity of loss functions. For self-bounding Lipschitz loss functions, we further improve our results by developing optimistic bounds which imply fast rates in a low noise condition. We apply our results to learning with both linear representation and nonlinear representation by deep neural networks, for both of which we derive Rademacher complexity bounds to get improved generalization bounds.
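For orientation, the following is a hedged sketch of the \(k\)-negatives contrastive setting the abstract refers to, written in the style of a common theoretical framework for contrastive learning; the paper's exact loss class, normalization, and assumptions may differ. Each training example consists of an anchor \(x\), a positive \(x^{+}\), and \(k\) negatives \(x_{1}^{-},\dots,x_{k}^{-}\), and a representation \(f\) is scored by a logistic loss over the \(k\) similarity gaps:

\[
\ell\big(f;\,x,\,x^{+},\,\{x_{i}^{-}\}_{i=1}^{k}\big)
  = \log\Big(1 + \sum_{i=1}^{k} \exp\big(f(x)^{\top} f(x_{i}^{-}) - f(x)^{\top} f(x^{+})\big)\Big),
\qquad
\widehat{L}_{n}(f) = \frac{1}{n} \sum_{j=1}^{n} \ell\big(f;\,x_{j},\,x_{j}^{+},\,\{x_{j,i}^{-}\}_{i=1}^{k}\big).
\]

Earlier analyses control the gap between \(\widehat{L}_{n}(f)\) and its expectation with terms that grow linearly in \(k\); the bounds summarized above instead exploit the Lipschitz continuity of \(\ell\) through empirical covering numbers and Rademacher complexities, so that \(k\) enters only through logarithmic factors.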
Saved in:
Published in: | arXiv.org 2023-02 |
Main Authors: | Yunwen Lei, Tianbao Yang, Yiming Ying, Ding-Xuan Zhou |
Format: | Article |
Language: | eng |
Subjects: | Artificial neural networks; Cognitive tasks; Continuity (mathematics); Empirical analysis; Low noise; Machine learning; Representations |
Online Access: | Full text |
container_title | arXiv.org |
creator | Yunwen Lei; Yang, Tianbao; Ying, Yiming; Ding-Xuan, Zhou |
description | Recently, contrastive learning has found impressive success in advancing the state of the art in solving various machine learning tasks. However, the existing generalization analysis is very limited or even not meaningful. In particular, the existing generalization error bounds depend linearly on the number \(k\) of negative examples while it was widely shown in practice that choosing a large \(k\) is necessary to guarantee good generalization of contrastive learning in downstream tasks. In this paper, we establish novel generalization bounds for contrastive learning which do not depend on \(k\), up to logarithmic terms. Our analysis uses structural results on empirical covering numbers and Rademacher complexities to exploit the Lipschitz continuity of loss functions. For self-bounding Lipschitz loss functions, we further improve our results by developing optimistic bounds which imply fast rates in a low noise condition. We apply our results to learning with both linear representation and nonlinear representation by deep neural networks, for both of which we derive Rademacher complexity bounds to get improved generalization bounds. |
format | Article |
fulltext | fulltext |
identifier | EISSN: 2331-8422 |
ispartof | arXiv.org, 2023-02 |
issn | 2331-8422 |
language | eng |
recordid | cdi_proquest_journals_2780249881 |
source | Freely Accessible Journals |
subjects | Artificial neural networks; Cognitive tasks; Continuity (mathematics); Empirical analysis; Low noise; Machine learning; Representations |
title | Generalization Analysis for Contrastive Representation Learning |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-10T19%3A29%3A32IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest&rft_val_fmt=info:ofi/fmt:kev:mtx:book&rft.genre=document&rft.atitle=Generalization%20Analysis%20for%20Contrastive%20Representation%20Learning&rft.jtitle=arXiv.org&rft.au=Yunwen%20Lei&rft.date=2023-02-28&rft.eissn=2331-8422&rft_id=info:doi/&rft_dat=%3Cproquest%3E2780249881%3C/proquest%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2780249881&rft_id=info:pmid/&rfr_iscdi=true |