Misclassification-guided loss under the weighted cross-entropy loss framework
Saved in:
Published in: | Knowledge and information systems 2024-08, Vol.66 (8), p.4685-4720 |
---|---|
Main authors: | Wu, Yan-Xue; Du, Kai; Wang, Xian-Jie; Min, Fan |
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Full text |
container_end_page | 4720 |
---|---|
container_issue | 8 |
container_start_page | 4685 |
container_title | Knowledge and information systems |
container_volume | 66 |
creator | Wu, Yan-Xue; Du, Kai; Wang, Xian-Jie; Min, Fan
description | As deep neural networks for visual recognition gain momentum, many studies have modified the loss function to improve classification performance on long-tailed data. Typical and effective strategies assign different weights to different classes or samples, yielding a series of cost-sensitive re-weighting cross-entropy losses. However, most of these strategies focus only on properties of the training data, such as the data distribution and the distinguishability of samples. This paper unifies these strategies in a weighted cross-entropy loss framework with a simple product form (WCEL∏), which takes the distinct features of different losses into account. It also proposes a new loss function, the misclassification-guided loss (MGL), which generalizes the class-wise difficulty-balanced loss and uses the misclassification rate on validation data to update class weights during training. For MGL, a series of weighting functions with different relative preferences is introduced, and both softmax MGL and sigmoid MGL are derived to address multi-class and multi-label classification problems, respectively. Experiments are conducted on four public datasets (MNIST-LT, CIFAR-10-LT, CIFAR-100-LT, and ImageNet-LT) and a self-built dataset of 4 main classes, 44 sub-classes, and 57,944 images in total. On the self-built dataset, the exponential weighting function achieves higher balanced accuracy than the polynomial function. Ablation studies also show that MGL performs better in combination with most other state-of-the-art loss functions under the WCEL∏ framework. |
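The description above outlines a concrete mechanism: derive per-class weights from validation misclassification rates, then plug them into a weighted cross-entropy loss. Below is a minimal PyTorch sketch of that idea. All names (`class_misclassification_rates`, `mgl_class_weights`, `alpha`) and both weighting forms are illustrative assumptions standing in for the paper's exponential and polynomial weighting functions, not its actual formulation.

```python
import torch
import torch.nn.functional as F

def class_misclassification_rates(model, val_loader, num_classes, device="cpu"):
    """Per-class misclassification rate r_c measured on a validation set."""
    errors = torch.zeros(num_classes)
    counts = torch.zeros(num_classes)
    model.eval()
    with torch.no_grad():
        for x, y in val_loader:
            preds = model(x.to(device)).argmax(dim=1).cpu()
            for c in range(num_classes):
                mask = y == c
                counts[c] += mask.sum()
                errors[c] += (preds[mask] != y[mask]).sum()
    return errors / counts.clamp(min=1)  # avoid division by zero for empty classes

def mgl_class_weights(rates, kind="exp", alpha=2.0):
    """Map misclassification rates to class weights.

    Both forms are assumptions standing in for the exponential and
    polynomial weighting functions the abstract mentions:
      exp:  w_c = exp(alpha * r_c)    -- sharper preference for hard classes
      poly: w_c = (1 + r_c) ** alpha  -- gentler polynomial preference
    """
    w = torch.exp(alpha * rates) if kind == "exp" else (1.0 + rates) ** alpha
    return w * (len(w) / w.sum())  # normalize so weights average to 1

# Sketch of one training epoch's weight refresh:
# rates = class_misclassification_rates(model, val_loader, num_classes)
# weights = mgl_class_weights(rates, kind="exp")
# loss = F.cross_entropy(logits, targets, weight=weights)
```

In a full training loop the weights would be refreshed from validation errors each epoch, and the WCEL∏ product form described above would multiply weight factors from several such strategies before applying them to the cross-entropy term.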
doi_str_mv | 10.1007/s10115-024-02123-5 |
format | Article |
publisher | London: Springer London |
fulltext | fulltext |
identifier | ISSN: 0219-1377 |
ispartof | Knowledge and information systems, 2024-08, Vol.66 (8), p.4685-4720 |
issn | 0219-1377 (print); 0219-3116 (electronic) |
language | eng |
recordid | cdi_proquest_journals_3082428171 |
source | SpringerLink Journals - AutoHoldings |
subjects | Ablation; Artificial neural networks; Classification; Computer Science; Data Mining and Knowledge Discovery; Database Management; Datasets; Entropy; Information Storage and Retrieval; Information Systems and Communication Service; Information Systems Applications (incl.Internet); IT in Business; Polynomials; Regular Paper; State-of-the-art reviews; Weighting functions |
title | Misclassification-guided loss under the weighted cross-entropy loss framework |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-09T15%3A47%3A43IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Misclassification-guided%20loss%20under%20the%20weighted%20cross-entropy%20loss%20framework&rft.jtitle=Knowledge%20and%20information%20systems&rft.au=Wu,%20Yan-Xue&rft.date=2024-08-01&rft.volume=66&rft.issue=8&rft.spage=4685&rft.epage=4720&rft.pages=4685-4720&rft.issn=0219-1377&rft.eissn=0219-3116&rft_id=info:doi/10.1007/s10115-024-02123-5&rft_dat=%3Cproquest_cross%3E3082428171%3C/proquest_cross%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=3082428171&rft_id=info:pmid/&rfr_iscdi=true |