Misclassification-guided loss under the weighted cross-entropy loss framework

Bibliographic Details
Published in: Knowledge and Information Systems, 2024-08, Vol. 66 (8), pp. 4685-4720
Authors: Wu, Yan-Xue; Du, Kai; Wang, Xian-Jie; Min, Fan
Format: Article
Language: English
Online access: Full text
Abstract: As deep neural networks for visual recognition gain momentum, many studies have modified the loss function to improve classification performance on long-tailed data. Typical and effective improvement strategies assign different weights to different classes or samples, yielding a series of cost-sensitive re-weighting cross-entropy losses. However, most of these strategies focus only on properties of the training data, such as the data distribution and the samples' distinguishability. This paper integrates these strategies into a weighted cross-entropy loss framework with a simple product form (WCEL∏), which takes the different features of different losses into account. A new loss function, the misclassification-guided loss (MGL), is also proposed; it generalizes the class-wise difficulty-balanced loss and uses the misclassification rate on validation data to update class weights during training. For MGL, a series of weighting functions with different relative preferences is introduced, and both softmax MGL and sigmoid MGL are derived to address multi-class and multi-label classification problems, respectively. Experiments are conducted on four public datasets (MNIST-LT, CIFAR-10-LT, CIFAR-100-LT, and ImageNet-LT) and on a self-built dataset of 4 main classes, 44 sub-classes, and 57,944 images in total. On the self-built dataset, the exponential weighting function achieves higher balanced accuracy than the polynomial function. Ablation studies further show that MGL performs better in combination with most other state-of-the-art loss functions under the WCEL∏ framework.
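
Since the record describes the method only at a high level, the following is a minimal sketch of the MGL idea as it could be implemented in PyTorch: per-class misclassification rates are measured on validation data and mapped through a weighting function into class weights for a weighted softmax cross-entropy. The exponential form exp(alpha * e_c), the polynomial form (1 + e_c)^alpha, the hyperparameter alpha, and all helper names are illustrative assumptions, not the paper's exact formulas.

```python
# Minimal sketch (not the authors' code) of the misclassification-guided
# loss (MGL) idea: class weights come from per-class error rates measured
# on a validation set and are refreshed periodically during training.
# The weighting functions below (exponential / polynomial, controlled by
# `alpha`) are illustrative assumptions, not the paper's exact formulas.
import torch
import torch.nn.functional as F


def misclassification_rates(model, val_loader, num_classes, device="cpu"):
    """Per-class error rate e_c on the validation set."""
    errors = torch.zeros(num_classes)
    counts = torch.zeros(num_classes)
    model.eval()
    with torch.no_grad():
        for x, y in val_loader:
            preds = model(x.to(device)).argmax(dim=1).cpu()
            for c in range(num_classes):
                mask = y == c
                counts[c] += mask.sum()
                errors[c] += (preds[mask] != c).sum()
    return errors / counts.clamp(min=1)


def mgl_class_weights(err, alpha=2.0, kind="exp"):
    """Map error rates to class weights; harder classes get larger weights."""
    if kind == "exp":
        w = torch.exp(alpha * err)        # exponential weighting function
    else:
        w = (1.0 + err) ** alpha          # polynomial weighting function
    return w * len(w) / w.sum()           # normalize to mean weight 1


# In the training loop (softmax MGL for multi-class classification):
#   w = mgl_class_weights(misclassification_rates(model, val_loader, C))
#   loss = F.cross_entropy(logits, targets, weight=w.to(device))
# A sigmoid variant for multi-label problems could instead use
# F.binary_cross_entropy_with_logits(logits, targets, pos_weight=w).
```

Under the WCEL∏ framework described in the abstract, such a per-class weight would presumably be multiplied with the weights contributed by other re-weighting losses to form the final weight of each sample.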
DOI: 10.1007/s10115-024-02123-5
ISSN: 0219-1377
EISSN: 0219-3116
Source: SpringerLink Journals
Subjects: Ablation; Artificial neural networks; Classification; Computer Science; Data Mining and Knowledge Discovery; Database Management; Datasets; Entropy; Information Storage and Retrieval; Information Systems and Communication Service; Information Systems Applications (incl. Internet); IT in Business; Polynomials; Regular Paper; State-of-the-art reviews; Weighting functions