Dataset Awareness is not Enough: Implementing Sample-level Tail Encouragement in Long-tailed Self-supervised Learning
Self-supervised learning (SSL) has shown remarkable data representation capabilities across a wide range of datasets. However, when applied to real-world datasets with long-tailed distributions, performance on multiple downstream tasks degrades significantly. Recently, the community has begun to focus more on self-supervised long-tailed learning. Some works attempt to transfer temperature mechanisms to self-supervised learning or use category-space uniformity constraints to balance the representation of different categories in the embedding space to fight against long-tail distributions. However, most of these approaches focus on the joint optimization of all samples in the dataset or on constraining the category distribution, with little attention given to whether each individual sample is optimally guided during training. To address this issue, we propose Temperature Auxiliary Sample-level Encouragement (TASE). We introduce pseudo-labels into self-supervised long-tailed learning, utilizing pseudo-label information to drive a dynamic temperature and re-weighting strategy. Specifically, we assign an optimal temperature parameter to each sample. Additionally, we analyze the lack of quantity awareness in the temperature parameter and use re-weighting to compensate for this deficiency, thereby achieving optimal training patterns at the sample level. Comprehensive experimental results on six benchmarks across three datasets demonstrate that our method achieves outstanding performance in improving long-tail recognition, while also exhibiting high robustness.
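The abstract gives no equations or code, but the mechanism it describes — a contrastive loss where each sample carries its own temperature and a pseudo-label-driven weight — can be sketched. Below is a minimal PyTorch illustration; the function name `tase_style_loss`, the inverse-pseudo-class-frequency weighting, and the fixed temperature in the demo are assumptions for exposition, not the authors' implementation.

```python
import torch
import torch.nn.functional as F

def tase_style_loss(z_anchor, z_positive, pseudo_labels, tau_per_sample, class_counts):
    """InfoNCE-style loss with a per-sample temperature and pseudo-label-driven
    re-weighting. The temperature values and the weighting formula here are
    illustrative placeholders, not the schedule proposed in the paper."""
    z_anchor = F.normalize(z_anchor, dim=1)      # (N, D) embeddings, view 1
    z_positive = F.normalize(z_positive, dim=1)  # (N, D) embeddings, view 2

    # Cosine similarity of each anchor against every positive in the batch.
    logits = z_anchor @ z_positive.t()           # (N, N)

    # Per-sample temperature: each anchor's row is scaled by its own tau.
    logits = logits / tau_per_sample.unsqueeze(1)

    # The matching positive for anchor i sits on the diagonal (index i).
    targets = torch.arange(z_anchor.size(0), device=z_anchor.device)
    per_sample_loss = F.cross_entropy(logits, targets, reduction="none")

    # Re-weighting: inverse pseudo-class frequency, so samples from rare
    # (tail) pseudo-classes receive larger gradients. Illustrative choice.
    weights = 1.0 / class_counts[pseudo_labels].float()
    weights = weights / weights.sum() * len(weights)  # normalize to mean 1

    return (weights * per_sample_loss).mean()

if __name__ == "__main__":
    # Toy usage: 8 samples, 128-dim embeddings, 4 pseudo-classes.
    N, D, C = 8, 128, 4
    z1, z2 = torch.randn(N, D), torch.randn(N, D)
    labels = torch.randint(0, C, (N,))
    counts = torch.bincount(labels, minlength=C).clamp(min=1)
    tau = torch.full((N,), 0.2)  # TASE would set this per sample instead
    print(tase_style_loss(z1, z2, labels, tau, counts))
```

The point of the sketch is the interaction the abstract highlights: the temperature controls how sharply each sample attends to hard negatives, while the weight injects the quantity awareness the temperature alone lacks.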
Published in: | arXiv.org, 2024-11 |
---|---|
Main authors: | Xiao, Haowen; Liu, Guanghui; Gao, Xinyi; Yang, Li; Lv, Fengmao; Chu, Jielei |
Format: | Article |
Language: | English |
Subjects: | Computer vision; Datasets; Performance enhancement; Self-supervised learning |
EISSN: | 2331-8422 |
Online Access: | Full text |