Weighted Ensemble Self-Supervised Learning
Ensembling has proven to be a powerful technique for boosting model performance, uncertainty estimation, and robustness in supervised learning. Advances in self-supervised learning (SSL) enable leveraging large unlabeled corpora for state-of-the-art few-shot and supervised learning performance. In this paper, we explore how ensemble methods can improve recent SSL techniques by developing a framework that permits data-dependent weighted cross-entropy losses. We refrain from ensembling the representation backbone; this choice yields an efficient ensemble method that incurs a small training cost and requires no architectural changes or computational overhead to downstream evaluation. The effectiveness of our method is demonstrated with two state-of-the-art SSL methods, DINO (Caron et al., 2021) and MSN (Assran et al., 2022). Our method outperforms both in multiple evaluation metrics on ImageNet-1K, particularly in the few-shot setting. We explore several weighting schemes and find that those which increase the diversity of ensemble heads lead to better downstream evaluation results. Thorough experiments yield improved prior art baselines which our method still surpasses; e.g., our overall improvement with MSN ViT-B/16 is 3.9 p.p. for 1-shot learning.
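The abstract describes the mechanism concretely enough to sketch: DINO- and MSN-style objectives compare a student's predicted distribution against a teacher's via cross-entropy, and this method ensembles only the projection heads on top of a single shared backbone, combining the per-head losses with data-dependent weights. Below is a minimal sketch of that idea in PyTorch; every name in it (`HeadEnsemble`, `weighted_ensemble_loss`, the toy backbone, the random weighting) is an illustrative assumption, not the authors' code.

```python
# Minimal sketch, assuming the high-level recipe from the abstract:
# one shared backbone, K lightweight projection heads, and a per-example,
# per-head weighted cross-entropy. All names are hypothetical.
import torch
import torch.nn as nn
import torch.nn.functional as F


class HeadEnsemble(nn.Module):
    """Shared backbone with K projection heads; only the heads are ensembled."""

    def __init__(self, backbone: nn.Module, feat_dim: int, out_dim: int, num_heads: int = 4):
        super().__init__()
        self.backbone = backbone  # e.g. a ViT; one forward pass shared by all heads
        self.heads = nn.ModuleList(
            nn.Linear(feat_dim, out_dim) for _ in range(num_heads)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        z = self.backbone(x)  # (B, feat_dim), computed once
        return torch.stack([head(z) for head in self.heads])  # (K, B, out_dim)


def weighted_ensemble_loss(
    student_logits: torch.Tensor,  # (K, B, D) logits from the K student heads
    teacher_probs: torch.Tensor,   # (K, B, D) targets, e.g. from EMA teacher heads
    log_weights: torch.Tensor,     # (K, B) unnormalized data-dependent log-weights
) -> torch.Tensor:
    """Cross-entropy per head and example, weighted and averaged over the batch."""
    ce = -(teacher_probs * F.log_softmax(student_logits, dim=-1)).sum(dim=-1)  # (K, B)
    w = F.softmax(log_weights, dim=0)  # normalize weights across heads per example
    return (w * ce).sum(dim=0).mean()


# Toy usage with a stand-in MLP backbone on random data:
backbone = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 256), nn.GELU())
model = HeadEnsemble(backbone, feat_dim=256, out_dim=64, num_heads=4)
x = torch.randn(8, 3, 32, 32)
logits = model(x)  # (4, 8, 64)
teacher = F.softmax(torch.randn_like(logits), dim=-1)
log_w = torch.randn(4, 8)  # placeholder for a real data-dependent weighting scheme
loss = weighted_ensemble_loss(logits, teacher, log_w)
loss.backward()
```

Uniform log-weights recover the plain head-averaged loss; per the abstract, weighting schemes that push the heads toward more diverse solutions evaluated best downstream, and because the backbone runs once per batch, the extra heads add little training cost and nothing to downstream evaluation.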
Saved in:
Published in: | arXiv.org 2023-04 |
---|---|
Main authors: | Ruan, Yangjun; Singh, Saurabh; Morningstar, Warren; Alemi, Alexander A; Ioffe, Sergey; Fischer, Ian; Dillon, Joshua V |
Format: | Article |
Language: | eng |
Subjects: | Downstream effects; Self-supervised learning; Supervised learning |
Online access: | Full text |
container_end_page | |
---|---|
container_issue | |
container_start_page | |
container_title | arXiv.org |
container_volume | |
creator | Ruan, Yangjun; Singh, Saurabh; Morningstar, Warren; Alemi, Alexander A; Ioffe, Sergey; Fischer, Ian; Dillon, Joshua V |
description | Ensembling has proven to be a powerful technique for boosting model performance, uncertainty estimation, and robustness in supervised learning. Advances in self-supervised learning (SSL) enable leveraging large unlabeled corpora for state-of-the-art few-shot and supervised learning performance. In this paper, we explore how ensemble methods can improve recent SSL techniques by developing a framework that permits data-dependent weighted cross-entropy losses. We refrain from ensembling the representation backbone; this choice yields an efficient ensemble method that incurs a small training cost and requires no architectural changes or computational overhead to downstream evaluation. The effectiveness of our method is demonstrated with two state-of-the-art SSL methods, DINO (Caron et al., 2021) and MSN (Assran et al., 2022). Our method outperforms both in multiple evaluation metrics on ImageNet-1K, particularly in the few-shot setting. We explore several weighting schemes and find that those which increase the diversity of ensemble heads lead to better downstream evaluation results. Thorough experiments yield improved prior art baselines which our method still surpasses; e.g., our overall improvement with MSN ViT-B/16 is 3.9 p.p. for 1-shot learning. |
format | Article |
fulltext | fulltext |
identifier | EISSN: 2331-8422 |
ispartof | arXiv.org, 2023-04 |
issn | 2331-8422 |
language | eng |
recordid | cdi_proquest_journals_2738301407 |
source | Freely Accessible Journals |
subjects | Downstream effects; Self-supervised learning; Supervised learning |
title | Weighted Ensemble Self-Supervised Learning |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-04T13%3A15%3A20IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest&rft_val_fmt=info:ofi/fmt:kev:mtx:book&rft.genre=document&rft.atitle=Weighted%20Ensemble%20Self-Supervised%20Learning&rft.jtitle=arXiv.org&rft.au=Ruan,%20Yangjun&rft.date=2023-04-09&rft.eissn=2331-8422&rft_id=info:doi/&rft_dat=%3Cproquest%3E2738301407%3C/proquest%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2738301407&rft_id=info:pmid/&rfr_iscdi=true |