Generalization Analysis for Contrastive Representation Learning

Recently, contrastive learning has achieved impressive success in advancing the state of the art on various machine learning tasks. However, existing generalization analyses are very limited or even not meaningful. In particular, existing generalization error bounds depend linearly on the number \(k\) of negative examples, while it has been widely shown in practice that choosing a large \(k\) is necessary to guarantee good generalization of contrastive learning on downstream tasks. In this paper, we establish novel generalization bounds for contrastive learning that do not depend on \(k\), up to logarithmic terms. Our analysis uses structural results on empirical covering numbers and Rademacher complexities to exploit the Lipschitz continuity of loss functions. For self-bounding Lipschitz loss functions, we further improve our results by developing optimistic bounds that imply fast rates under a low-noise condition. We apply our results to learning with both linear representations and nonlinear representations by deep neural networks, and for both we derive Rademacher complexity bounds to obtain improved generalization bounds.
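As context for the \(k\) negative examples discussed in the abstract, the sketch below implements the standard logistic contrastive loss over one anchor, one positive, and \(k\) negatives that is common in this line of work, \(\log\big(1 + \sum_{i=1}^{k} \exp(f(x)^\top f(x_i^-) - f(x)^\top f(x^+))\big)\). This is a generic illustration of that loss form, not code from the paper; the function name and array shapes are illustrative assumptions.

```python
import numpy as np

def contrastive_logistic_loss(f_x, f_pos, f_negs):
    """Logistic contrastive loss for one anchor embedding f_x, one positive
    embedding f_pos, and k negative embeddings f_negs (shape (k, d)):
        log(1 + sum_i exp(f_x . f_neg_i - f_x . f_pos)).
    Illustrative sketch; shapes and naming are assumptions, not the paper's API.
    """
    pos_score = f_x @ f_pos           # similarity to the positive example
    neg_scores = f_negs @ f_x         # shape (k,): similarities to the k negatives
    diffs = neg_scores - pos_score    # margins by which negatives beat the positive
    # log(1 + sum exp(d_i)) computed with a max-shift for numerical stability
    m = max(0.0, float(np.max(diffs)))
    return m + np.log(np.exp(-m) + np.sum(np.exp(diffs - m)))
```

Note that when all \(k\) negatives score exactly as high as the positive, the loss is \(\log(1+k)\), which is the kind of \(k\)-dependence the paper's bounds aim to remove (up to logarithmic terms).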

Detailed description

Saved in:
Bibliographic details
Published in: arXiv.org 2023-02
Main authors: Lei, Yunwen; Yang, Tianbao; Ying, Yiming; Zhou, Ding-Xuan
Format: Article
Language: English
Online access: Full text
EISSN: 2331-8422
Subjects: Artificial neural networks; Cognitive tasks; Continuity (mathematics); Empirical analysis; Low noise; Machine learning; Representations