Dataset Awareness is not Enough: Implementing Sample-level Tail Encouragement in Long-tailed Self-supervised Learning

Self-supervised learning (SSL) has shown remarkable data representation capabilities across a wide range of datasets. However, when applied to real-world datasets with long-tailed distributions, performance on multiple downstream tasks degrades significantly. Recently, the community has begun to focus more on self-supervised long-tailed learning. Some works attempt to transfer temperature mechanisms to self-supervised learning or use category-space uniformity constraints to balance the representation of different categories in the embedding space to fight against long-tailed distributions. However, most of these approaches focus on the joint optimization of all samples in the dataset or on constraining the category distribution, paying little attention to whether each individual sample is optimally guided during training. To address this issue, we propose Temperature Auxiliary Sample-level Encouragement (TASE). We introduce pseudo-labels into self-supervised long-tailed learning, using the pseudo-label information to drive a dynamic temperature and re-weighting strategy. Specifically, we assign an optimal temperature parameter to each sample. Additionally, we analyze the temperature parameter's lack of quantity awareness and use re-weighting to compensate for this deficiency, thereby achieving optimal training patterns at the sample level. Comprehensive experimental results on six benchmarks across three datasets demonstrate that our method achieves outstanding performance in improving long-tail recognition, while also exhibiting high robustness.
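
The abstract describes the mechanism only in prose, so the following is a minimal sketch of what a sample-level dynamic-temperature, re-weighted contrastive objective could look like. It assumes an InfoNCE-style loss over two augmented views, pseudo-labels obtained from some clustering step, a linear map from pseudo-class frequency to temperature, and inverse-frequency weights; the function and parameter names (`tase_style_loss`, `tau_min`, `tau_max`) are illustrative and not taken from the paper.

```python
# Hedged sketch only: TASE's exact formulation is not given in this record.
# Assumed here: InfoNCE over two views, pseudo-labels from a clustering step,
# a linear frequency-to-temperature map, and inverse-frequency re-weighting.
import torch
import torch.nn.functional as F

def tase_style_loss(z_a, z_b, pseudo_labels, tau_min=0.1, tau_max=0.3):
    """InfoNCE with a per-sample temperature and frequency re-weighting.

    z_a, z_b: (N, D) L2-normalized embeddings of two augmented views.
    pseudo_labels: (N,) integer cluster assignments acting as class labels.
    """
    n = z_a.size(0)
    # Estimate each sample's pseudo-class frequency within the batch.
    counts = torch.bincount(pseudo_labels, minlength=int(pseudo_labels.max()) + 1)
    freq = counts[pseudo_labels].float() / n                # in (0, 1]
    # Assumption: frequent (head) samples get a larger temperature, rare (tail)
    # samples a smaller one, so tail anchors receive sharper gradients.
    tau = tau_min + (tau_max - tau_min) * freq              # (N,)
    # Inverse-frequency weights compensate for the temperature's lack of
    # quantity awareness, as the abstract describes; normalized to mean 1.
    w = 1.0 / freq
    w = w * (n / w.sum())
    # Cosine-similarity logits; the matching index is the positive pair.
    logits = (z_a @ z_b.t()) / tau.unsqueeze(1)             # row-wise temperature
    targets = torch.arange(n, device=z_a.device)
    per_sample = F.cross_entropy(logits, targets, reduction="none")
    return (w * per_sample).mean()

# Toy usage with random embeddings and 10 hypothetical pseudo-classes.
z_a = F.normalize(torch.randn(256, 128), dim=1)
z_b = F.normalize(torch.randn(256, 128), dim=1)
pseudo = torch.randint(0, 10, (256,))
print(tase_style_loss(z_a, z_b, pseudo))
```

The intuition mirrors the abstract: the per-sample temperature shapes how sharply each anchor's softmax penalizes its negatives, while the weight term restores the quantity awareness that the temperature alone lacks.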

Bibliographic details
Published in: arXiv.org, 2024-11
Main authors: Xiao, Haowen; Liu, Guanghui; Gao, Xinyi; Yang, Li; Lv, Fengmao; Chu, Jielei
Publisher: Cornell University Library, arXiv.org (Ithaca)
Format: Article
Language: English (eng)
EISSN: 2331-8422
Subject terms: Computer vision; Datasets; Performance enhancement; Self-supervised learning
Rights: 2024; published under http://creativecommons.org/licenses/by/4.0/
Online access: Full text