Mitigating the Impact of Labeling Errors on Training via Rockafellian Relaxation

Labeling errors in datasets are common, if not systematic, in practice. They arise naturally in a variety of contexts: human labeling, noisy labeling, and weak labeling (e.g., image classification). This places a persistent and pervasive stress on machine learning practice. In particular, neural network (NN) architectures can withstand minor amounts of dataset imperfection with traditional countermeasures such as regularization, data augmentation, and batch normalization; major dataset imperfections, however, often prove insurmountable. We propose and study the implementation of Rockafellian Relaxation (RR), a new loss-reweighting, architecture-independent methodology, for neural network training. Experiments indicate that RR can enhance standard neural network methods to achieve robust performance across classification tasks in computer vision and natural language processing (sentiment analysis). We find that RR can mitigate the effects of dataset corruption due to both (heavy) labeling error and adversarial perturbation, demonstrating effectiveness across a variety of data domains and machine learning tasks.

Full description

Saved in:
Bibliographic details
Published in: arXiv.org 2024-05
Main authors: Chen, Louis L, Chern, Bobbie, Eckstrand, Eric, Mahapatra, Amogh, Royset, Johannes O
Format: Article
Language: eng
Subjects:
Online access: Full text
container_title arXiv.org
creator Chen, Louis L
Chern, Bobbie
Eckstrand, Eric
Mahapatra, Amogh
Royset, Johannes O
description Labeling errors in datasets are common, if not systematic, in practice. They arise naturally in a variety of contexts: human labeling, noisy labeling, and weak labeling (e.g., image classification). This places a persistent and pervasive stress on machine learning practice. In particular, neural network (NN) architectures can withstand minor amounts of dataset imperfection with traditional countermeasures such as regularization, data augmentation, and batch normalization; major dataset imperfections, however, often prove insurmountable. We propose and study the implementation of Rockafellian Relaxation (RR), a new loss-reweighting, architecture-independent methodology, for neural network training. Experiments indicate that RR can enhance standard neural network methods to achieve robust performance across classification tasks in computer vision and natural language processing (sentiment analysis). We find that RR can mitigate the effects of dataset corruption due to both (heavy) labeling error and adversarial perturbation, demonstrating effectiveness across a variety of data domains and machine learning tasks.
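The abstract describes RR as an architecture-independent loss-reweighting method that downweights suspect training examples. As a rough, hypothetical illustration of the general loss-reweighting idea only (this is not the paper's Rockafellian formulation; the function name, schedule, and hyperparameters below are invented for the sketch), one can fit a logistic regression while periodically zeroing the weights of the highest-loss samples, which tend to be the ones with flipped labels:

```python
import numpy as np

def reweighted_logistic_fit(X, y, steps=300, lr=0.5, trim=0.2):
    """Gradient descent on weighted logistic loss; every 50 steps,
    the `trim` fraction of samples with the highest loss (suspected
    label errors) get weight 0. Illustrative sketch, not the paper's RR."""
    w = np.zeros(X.shape[1])
    weights = np.ones(len(y))  # per-sample weights, start uniform
    for t in range(steps):
        p = 1.0 / (1.0 + np.exp(-X @ w))           # predicted probabilities
        grad = X.T @ (weights * (p - y)) / weights.sum()
        w -= lr * grad
        if t % 50 == 49:                            # refresh weights from losses
            loss = -(y * np.log(p + 1e-12) + (1 - y) * np.log(1 - p + 1e-12))
            cutoff = np.quantile(loss, 1.0 - trim)
            weights = np.where(loss > cutoff, 0.0, 1.0)
    return w

# Tiny demo: linearly separable data with 20% of labels flipped.
rng = np.random.default_rng(1)
X = rng.normal(size=(400, 2))
y_true = (X[:, 0] + X[:, 1] > 0).astype(float)
y_noisy = y_true.copy()
flip = rng.choice(400, size=80, replace=False)
y_noisy[flip] = 1 - y_noisy[flip]

w = reweighted_logistic_fit(X, y_noisy)
acc = ((X @ w > 0).astype(float) == y_true).mean()  # accuracy vs. clean labels
```

Despite training on corrupted labels, the periodic downweighting lets the fit recover a decision boundary close to the clean one; the paper's contribution is a principled, optimization-theoretic version of this reweighting that applies to general neural network training.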
format Article
fulltext fulltext
identifier EISSN: 2331-8422
ispartof arXiv.org, 2024-05
issn 2331-8422
language eng
recordid cdi_proquest_journals_3063930184
source Free E-Journals
subjects Cognitive tasks
Computer vision
Data augmentation
Data mining
Datasets
Defects
Errors
Image classification
Labeling
Machine learning
Natural language processing
Neural networks
Regularization
Sentiment analysis
title Mitigating the Impact of Labeling Errors on Training via Rockafellian Relaxation
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-01T07%3A16%3A48IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest&rft_val_fmt=info:ofi/fmt:kev:mtx:book&rft.genre=document&rft.atitle=Mitigating%20the%20Impact%20of%20Labeling%20Errors%20on%20Training%20via%20Rockafellian%20Relaxation&rft.jtitle=arXiv.org&rft.au=Chen,%20Louis%20L&rft.date=2024-05-30&rft.eissn=2331-8422&rft_id=info:doi/&rft_dat=%3Cproquest%3E3063930184%3C/proquest%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=3063930184&rft_id=info:pmid/&rfr_iscdi=true