Benchmarking Popular Classification Models' Robustness to Random and Targeted Corruptions

Text classification models, especially neural-network-based models, have reached very high accuracy on many popular benchmark datasets. Yet such models, when deployed in real-world applications, tend to perform badly. The primary reason is that these models are not tested against sufficient real wo...

Bibliographic details
Published in:	arXiv.org 2020-01
Main authors:	Desai, Utkarsh; Tamilselvam, Srikanth; Kaur, Jassimran; Mani, Senthil; Khare, Shreya
Format:	Article
Language:	eng
Subjects:	Benchmarks; Classification; Datasets; Model accuracy; Model testing; Neural networks
Online access:	Full text

container_end_page
container_issue
container_start_page
container_title	arXiv.org
container_volume
creator	Desai, Utkarsh; Tamilselvam, Srikanth; Kaur, Jassimran; Mani, Senthil; Khare, Shreya
description	Text classification models, especially neural-network-based models, have reached very high accuracy on many popular benchmark datasets. Yet such models, when deployed in real-world applications, tend to perform badly. The primary reason is that these models are not tested against sufficient real-world natural data. Depending on the application's users, the vocabulary and style of the model's input may vary greatly. This emphasizes the need for a model-agnostic test dataset consisting of various corruptions that occur naturally in the wild. Models trained and tested on such benchmark datasets will be more robust against real-world data. However, such datasets are not easily available. In this work, we address this problem by extending the benchmark datasets along naturally occurring corruptions such as Spelling Errors, Text Noise and Synonyms, and making them publicly available. Through extensive experiments, we compare random and targeted corruption strategies using Local Interpretable Model-Agnostic Explanations (LIME). We report the vulnerabilities in two popular text classification models along these corruptions and also find that targeted corruptions can expose vulnerabilities of a model better than random choices in most cases.
format	Article
fullrecord	<record><control><sourceid>proquest</sourceid><recordid>TN_cdi_proquest_journals_2350657003</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>2350657003</sourcerecordid><originalsourceid>FETCH-proquest_journals_23506570033</originalsourceid><addsrcrecordid>eNqNi8EKgkAURYcgSMp_eNCilTDNpLZOijZBSJtWMuloms6zeTP_n0Ef0OaexTl3xgIh5Tba74RYsJCo45yLJBVxLAN2P2hTPgdlX61p4Iqj75WFrFdEbd2WyrVo4IKV7mkDOT48OaOJwCHkylQ4wLRwU7bRTleQobV-_J5oxea16kmHPy7Z-nS8ZedotPj2mlzRobdmUoWQMU_ilHMp_6s-HB1Cpg</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2350657003</pqid></control><display><type>article</type><title>Benchmarking Popular Classification Models' Robustness to Random and Targeted Corruptions</title><source>Free E- Journals</source><creator>Desai, Utkarsh ; Tamilselvam, Srikanth ; Kaur, Jassimran ; Mani, Senthil ; Khare, Shreya</creator><creatorcontrib>Desai, Utkarsh ; Tamilselvam, Srikanth ; Kaur, Jassimran ; Mani, Senthil ; Khare, Shreya</creatorcontrib><description>Text classification models, especially neural networks based models, have reached very high accuracy on many popular benchmark datasets. Yet, such models when deployed in real world applications, tend to perform badly. The primary reason is that these models are not tested against sufficient real world natural data. Based on the application users, the vocabulary and the style of the model's input may greatly vary. This emphasizes the need for a model agnostic test dataset, which consists of various corruptions that are natural to appear in the wild. Models trained and tested on such benchmark datasets, will be more robust against real world data. However, such data sets are not easily available. In this work, we address this problem, by extending the benchmark datasets along naturally occurring corruptions such as Spelling Errors, Text Noise and Synonyms and making them publicly available. 
Through extensive experiments, we compare random and targeted corruption strategies using Local Interpretable Model-Agnostic Explanations(LIME). We report the vulnerabilities in two popular text classification models along these corruptions and also find that targeted corruptions can expose vulnerabilities of a model better than random choices in most cases.</description><identifier>EISSN: 2331-8422</identifier><language>eng</language><publisher>Ithaca: Cornell University Library, arXiv.org</publisher><subject>Benchmarks ; Classification ; Datasets ; Model accuracy ; Model testing ; Neural networks</subject><ispartof>arXiv.org, 2020-01</ispartof><rights>2020. This work is published under http://arxiv.org/licenses/nonexclusive-distrib/1.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.</rights><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><link.rule.ids>780,784</link.rule.ids></links><search><creatorcontrib>Desai, Utkarsh</creatorcontrib><creatorcontrib>Tamilselvam, Srikanth</creatorcontrib><creatorcontrib>Kaur, Jassimran</creatorcontrib><creatorcontrib>Mani, Senthil</creatorcontrib><creatorcontrib>Khare, Shreya</creatorcontrib><title>Benchmarking Popular Classification Models' Robustness to Random and Targeted Corruptions</title><title>arXiv.org</title><description>Text classification models, especially neural networks based models, have reached very high accuracy on many popular benchmark datasets. Yet, such models when deployed in real world applications, tend to perform badly. The primary reason is that these models are not tested against sufficient real world natural data. Based on the application users, the vocabulary and the style of the model's input may greatly vary. 
This emphasizes the need for a model agnostic test dataset, which consists of various corruptions that are natural to appear in the wild. Models trained and tested on such benchmark datasets, will be more robust against real world data. However, such data sets are not easily available. In this work, we address this problem, by extending the benchmark datasets along naturally occurring corruptions such as Spelling Errors, Text Noise and Synonyms and making them publicly available. Through extensive experiments, we compare random and targeted corruption strategies using Local Interpretable Model-Agnostic Explanations(LIME). We report the vulnerabilities in two popular text classification models along these corruptions and also find that targeted corruptions can expose vulnerabilities of a model better than random choices in most cases.</description><subject>Benchmarks</subject><subject>Classification</subject><subject>Datasets</subject><subject>Model accuracy</subject><subject>Model testing</subject><subject>Neural networks</subject><issn>2331-8422</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2020</creationdate><recordtype>article</recordtype><sourceid>ABUWG</sourceid><sourceid>AFKRA</sourceid><sourceid>AZQEC</sourceid><sourceid>BENPR</sourceid><sourceid>CCPQU</sourceid><sourceid>DWQXO</sourceid><recordid>eNqNi8EKgkAURYcgSMp_eNCilTDNpLZOijZBSJtWMuloms6zeTP_n0Ef0OaexTl3xgIh5Tba74RYsJCo45yLJBVxLAN2P2hTPgdlX61p4Iqj75WFrFdEbd2WyrVo4IKV7mkDOT48OaOJwCHkylQ4wLRwU7bRTleQobV-_J5oxea16kmHPy7Z-nS8ZedotPj2mlzRobdmUoWQMU_ilHMp_6s-HB1Cpg</recordid><startdate>20200131</startdate><enddate>20200131</enddate><creator>Desai, Utkarsh</creator><creator>Tamilselvam, Srikanth</creator><creator>Kaur, Jassimran</creator><creator>Mani, Senthil</creator><creator>Khare, Shreya</creator><general>Cornell University Library, 
arXiv.org</general><scope>8FE</scope><scope>8FG</scope><scope>ABJCF</scope><scope>ABUWG</scope><scope>AFKRA</scope><scope>AZQEC</scope><scope>BENPR</scope><scope>BGLVJ</scope><scope>CCPQU</scope><scope>DWQXO</scope><scope>HCIFZ</scope><scope>L6V</scope><scope>M7S</scope><scope>PIMPY</scope><scope>PQEST</scope><scope>PQQKQ</scope><scope>PQUKI</scope><scope>PRINS</scope><scope>PTHSS</scope></search><sort><creationdate>20200131</creationdate><title>Benchmarking Popular Classification Models' Robustness to Random and Targeted Corruptions</title><author>Desai, Utkarsh ; Tamilselvam, Srikanth ; Kaur, Jassimran ; Mani, Senthil ; Khare, Shreya</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-proquest_journals_23506570033</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2020</creationdate><topic>Benchmarks</topic><topic>Classification</topic><topic>Datasets</topic><topic>Model accuracy</topic><topic>Model testing</topic><topic>Neural networks</topic><toplevel>online_resources</toplevel><creatorcontrib>Desai, Utkarsh</creatorcontrib><creatorcontrib>Tamilselvam, Srikanth</creatorcontrib><creatorcontrib>Kaur, Jassimran</creatorcontrib><creatorcontrib>Mani, Senthil</creatorcontrib><creatorcontrib>Khare, Shreya</creatorcontrib><collection>ProQuest SciTech Collection</collection><collection>ProQuest Technology Collection</collection><collection>Materials Science & Engineering Collection</collection><collection>ProQuest Central (Alumni Edition)</collection><collection>ProQuest Central UK/Ireland</collection><collection>ProQuest Central Essentials</collection><collection>ProQuest Central</collection><collection>Technology Collection</collection><collection>ProQuest One Community College</collection><collection>ProQuest Central Korea</collection><collection>SciTech Premium Collection</collection><collection>ProQuest Engineering Collection</collection><collection>Engineering 
Database</collection><collection>Publicly Available Content Database</collection><collection>ProQuest One Academic Eastern Edition (DO NOT USE)</collection><collection>ProQuest One Academic</collection><collection>ProQuest One Academic UKI Edition</collection><collection>ProQuest Central China</collection><collection>Engineering Collection</collection></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Desai, Utkarsh</au><au>Tamilselvam, Srikanth</au><au>Kaur, Jassimran</au><au>Mani, Senthil</au><au>Khare, Shreya</au><format>book</format><genre>document</genre><ristype>GEN</ristype><atitle>Benchmarking Popular Classification Models' Robustness to Random and Targeted Corruptions</atitle><jtitle>arXiv.org</jtitle><date>2020-01-31</date><risdate>2020</risdate><eissn>2331-8422</eissn><abstract>Text classification models, especially neural networks based models, have reached very high accuracy on many popular benchmark datasets. Yet, such models when deployed in real world applications, tend to perform badly. The primary reason is that these models are not tested against sufficient real world natural data. Based on the application users, the vocabulary and the style of the model's input may greatly vary. This emphasizes the need for a model agnostic test dataset, which consists of various corruptions that are natural to appear in the wild. Models trained and tested on such benchmark datasets, will be more robust against real world data. However, such data sets are not easily available. In this work, we address this problem, by extending the benchmark datasets along naturally occurring corruptions such as Spelling Errors, Text Noise and Synonyms and making them publicly available. Through extensive experiments, we compare random and targeted corruption strategies using Local Interpretable Model-Agnostic Explanations(LIME). 
We report the vulnerabilities in two popular text classification models along these corruptions and also find that targeted corruptions can expose vulnerabilities of a model better than random choices in most cases.</abstract><cop>Ithaca</cop><pub>Cornell University Library, arXiv.org</pub><oa>free_for_read</oa></addata></record>
fulltext	fulltext
identifier	EISSN: 2331-8422
ispartof	arXiv.org, 2020-01
issn	2331-8422
language	eng
recordid	cdi_proquest_journals_2350657003
source	Free E- Journals
subjects	Benchmarks; Classification; Datasets; Model accuracy; Model testing; Neural networks
title	Benchmarking Popular Classification Models' Robustness to Random and Targeted Corruptions
url	https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-19T11%3A53%3A33IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest&rft_val_fmt=info:ofi/fmt:kev:mtx:book&rft.genre=document&rft.atitle=Benchmarking%20Popular%20Classification%20Models'%20Robustness%20to%20Random%20and%20Targeted%20Corruptions&rft.jtitle=arXiv.org&rft.au=Desai,%20Utkarsh&rft.date=2020-01-31&rft.eissn=2331-8422&rft_id=info:doi/&rft_dat=%3Cproquest%3E2350657003%3C/proquest%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2350657003&rft_id=info:pmid/&rfr_iscdi=true