Dataset Awareness is not Enough: Implementing Sample-level Tail Encouragement in Long-tailed Self-supervised Learning

Self-supervised learning (SSL) has shown remarkable data representation capabilities across a wide range of datasets. However, when applied to real-world datasets with long-tailed distributions, performance on multiple downstream tasks degrades significantly. Recently, the community has begun to focus more on self-supervised long-tailed learning. Some works attempt to transfer temperature mechanisms to self-supervised learning or use category-space uniformity constraints to balance the representation of different categories in the embedding space to fight against long-tailed distributions. However, most of these approaches focus on the joint optimization of all samples in the dataset or on constraining the category distribution, paying little attention to whether each individual sample is optimally guided during training. To address this issue, we propose Temperature Auxiliary Sample-level Encouragement (TASE). We introduce pseudo-labels into self-supervised long-tailed learning, using the pseudo-label information to drive a dynamic temperature and re-weighting strategy. Specifically, we assign an optimal temperature parameter to each sample. Additionally, we analyze the temperature parameter's lack of quantity awareness and use re-weighting to compensate for this deficiency, thereby achieving optimal training patterns at the sample level. Comprehensive experimental results on six benchmarks across three datasets demonstrate that our method achieves outstanding performance in improving long-tail recognition, while also exhibiting high robustness.
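
The abstract describes the mechanism only in prose, so the following is a minimal sketch of what a sample-level dynamic-temperature, re-weighted contrastive objective could look like. It assumes an InfoNCE-style loss over two augmented views, pseudo-labels obtained from some clustering step, a linear map from pseudo-class frequency to temperature, and inverse-frequency weights; the function and parameter names (`tase_style_loss`, `tau_min`, `tau_max`) are illustrative and not taken from the paper.

```python
# Hedged sketch only: TASE's exact formulation is not given in this record.
# Assumed here: InfoNCE over two views, pseudo-labels from a clustering step,
# a linear frequency-to-temperature map, and inverse-frequency re-weighting.
import torch
import torch.nn.functional as F

def tase_style_loss(z_a, z_b, pseudo_labels, tau_min=0.1, tau_max=0.3):
    """InfoNCE with a per-sample temperature and frequency re-weighting.

    z_a, z_b: (N, D) L2-normalized embeddings of two augmented views.
    pseudo_labels: (N,) integer cluster assignments acting as class labels.
    """
    n = z_a.size(0)
    # Estimate each sample's pseudo-class frequency within the batch.
    counts = torch.bincount(pseudo_labels, minlength=int(pseudo_labels.max()) + 1)
    freq = counts[pseudo_labels].float() / n                # in (0, 1]
    # Assumption: frequent (head) samples get a larger temperature, rare (tail)
    # samples a smaller one, so tail anchors receive sharper gradients.
    tau = tau_min + (tau_max - tau_min) * freq              # (N,)
    # Inverse-frequency weights compensate for the temperature's lack of
    # quantity awareness, as the abstract describes; normalized to mean 1.
    w = 1.0 / freq
    w = w * (n / w.sum())
    # Cosine-similarity logits; the matching index is the positive pair.
    logits = (z_a @ z_b.t()) / tau.unsqueeze(1)             # row-wise temperature
    targets = torch.arange(n, device=z_a.device)
    per_sample = F.cross_entropy(logits, targets, reduction="none")
    return (w * per_sample).mean()

# Toy usage with random embeddings and 10 hypothetical pseudo-classes.
z_a = F.normalize(torch.randn(256, 128), dim=1)
z_b = F.normalize(torch.randn(256, 128), dim=1)
pseudo = torch.randint(0, 10, (256,))
print(tase_style_loss(z_a, z_b, pseudo))
```

The intuition mirrors the abstract: the per-sample temperature shapes how sharply each anchor's softmax penalizes its negatives, while the weight term restores the quantity awareness that the temperature alone lacks.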

Bibliographic details
Published in: arXiv.org, 2024-11
Main authors: Xiao, Haowen; Liu, Guanghui; Gao, Xinyi; Yang, Li; Lv, Fengmao; Chu, Jielei
Publisher: Cornell University Library, arXiv.org (Ithaca)
Format: Article
Language: English (eng)
EISSN: 2331-8422
Subject terms: Computer vision; Datasets; Performance enhancement; Self-supervised learning
Rights: 2024; published under http://creativecommons.org/licenses/by/4.0/
Online access: Full text