Generalization Analysis for Contrastive Representation Learning
Recently, contrastive learning has found impressive success in advancing the state of the art in solving various machine learning tasks. However, the existing generalization analysis is very limited or even not meaningful. In particular, the existing generalization error bounds depend linearly on the number \(k\) of negative examples while it was widely shown in practice that choosing a large \(k\) is necessary to guarantee good generalization of contrastive learning in downstream tasks. In this paper, we establish novel generalization bounds for contrastive learning which do not depend on \(k\), up to logarithmic terms. Our analysis uses structural results on empirical covering numbers and Rademacher complexities to exploit the Lipschitz continuity of loss functions. For self-bounding Lipschitz loss functions, we further improve our results by developing optimistic bounds which imply fast rates in a low noise condition. We apply our results to learning with both linear representation and nonlinear representation by deep neural networks, for both of which we derive Rademacher complexity bounds to get improved generalization bounds.
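For orientation, the following is a hedged sketch of the \(k\)-negatives contrastive setting the abstract refers to, written in the style of a common theoretical framework for contrastive learning; the paper's exact loss class, normalization, and assumptions may differ. Each training example consists of an anchor \(x\), a positive \(x^{+}\), and \(k\) negatives \(x_{1}^{-},\dots,x_{k}^{-}\), and a representation \(f\) is scored by a logistic loss over the \(k\) similarity gaps:

\[
\ell\big(f;\,x,\,x^{+},\,\{x_{i}^{-}\}_{i=1}^{k}\big)
  = \log\Big(1 + \sum_{i=1}^{k} \exp\big(f(x)^{\top} f(x_{i}^{-}) - f(x)^{\top} f(x^{+})\big)\Big),
\qquad
\widehat{L}_{n}(f) = \frac{1}{n} \sum_{j=1}^{n} \ell\big(f;\,x_{j},\,x_{j}^{+},\,\{x_{j,i}^{-}\}_{i=1}^{k}\big).
\]

Earlier analyses control the gap between \(\widehat{L}_{n}(f)\) and its expectation with terms that grow linearly in \(k\); the bounds summarized above instead exploit the Lipschitz continuity of \(\ell\) through empirical covering numbers and Rademacher complexities, so that \(k\) enters only through logarithmic factors.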
Saved in:
Published in: | arXiv.org 2023-02 |
Main Authors: | Yunwen Lei, Tianbao Yang, Yiming Ying, Ding-Xuan Zhou |
Format: | Article |
Language: | eng |
Subjects: | Artificial neural networks; Cognitive tasks; Continuity (mathematics); Empirical analysis; Low noise; Machine learning; Representations |
Online Access: | Full text |
container_title | arXiv.org |
creator | Yunwen Lei; Yang, Tianbao; Ying, Yiming; Ding-Xuan, Zhou |
description | Recently, contrastive learning has found impressive success in advancing the state of the art in solving various machine learning tasks. However, the existing generalization analysis is very limited or even not meaningful. In particular, the existing generalization error bounds depend linearly on the number \(k\) of negative examples while it was widely shown in practice that choosing a large \(k\) is necessary to guarantee good generalization of contrastive learning in downstream tasks. In this paper, we establish novel generalization bounds for contrastive learning which do not depend on \(k\), up to logarithmic terms. Our analysis uses structural results on empirical covering numbers and Rademacher complexities to exploit the Lipschitz continuity of loss functions. For self-bounding Lipschitz loss functions, we further improve our results by developing optimistic bounds which imply fast rates in a low noise condition. We apply our results to learning with both linear representation and nonlinear representation by deep neural networks, for both of which we derive Rademacher complexity bounds to get improved generalization bounds. |
format | Article |
fulltext | fulltext |
identifier | EISSN: 2331-8422 |
ispartof | arXiv.org, 2023-02 |
issn | 2331-8422 |
language | eng |
recordid | cdi_proquest_journals_2780249881 |
source | Freely Accessible Journals |
subjects | Artificial neural networks; Cognitive tasks; Continuity (mathematics); Empirical analysis; Low noise; Machine learning; Representations |
title | Generalization Analysis for Contrastive Representation Learning |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-10T19%3A29%3A32IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest&rft_val_fmt=info:ofi/fmt:kev:mtx:book&rft.genre=document&rft.atitle=Generalization%20Analysis%20for%20Contrastive%20Representation%20Learning&rft.jtitle=arXiv.org&rft.au=Yunwen%20Lei&rft.date=2023-02-28&rft.eissn=2331-8422&rft_id=info:doi/&rft_dat=%3Cproquest%3E2780249881%3C/proquest%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2780249881&rft_id=info:pmid/&rfr_iscdi=true |