Assessing Classifiers from Two Independent Data Sets Using ROC Analysis: A Nonparametric Approach

This paper considers binary classification. We assess a classifier in terms of the area under the ROC curve (AUC). We estimate three important parameters, the conditional AUC (conditional on a particular training set) and the mean and variance of this AUC. We derive, as well, a closed form expressio...
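
For orientation, the nonparametric AUC estimate that the abstract refers to can be sketched as follows. This is a minimal illustration that assumes the standard two-sample Mann-Whitney form of the U-statistic; the record does not confirm that this is the authors' exact estimator. The function name `mann_whitney_auc` and its parameters are hypothetical.

```python
from typing import Sequence


def mann_whitney_auc(neg_scores: Sequence[float], pos_scores: Sequence[float]) -> float:
    """Estimate the AUC from classifier scores on a testing set.

    Assumption (not taken from the record): the estimate is the
    Mann-Whitney two-sample U-statistic, i.e. the fraction of
    (negative, positive) score pairs in which the positive case
    scores higher, with ties counted as one half.
    """
    if len(neg_scores) == 0 or len(pos_scores) == 0:
        raise ValueError("both score lists must be non-empty")

    total = 0.0
    for x in neg_scores:          # scores of cases from the negative class
        for y in pos_scores:      # scores of cases from the positive class
            if y > x:
                total += 1.0      # correctly ordered pair
            elif y == x:
                total += 0.5      # tie contributes one half
    return total / (len(neg_scores) * len(pos_scores))


if __name__ == "__main__":
    # Hypothetical scores produced by one trained classifier on a testing set.
    print(mann_whitney_auc([0.1, 0.4], [0.3, 0.2]))
```

Evaluated for one fixed training set, such an estimate targets the conditional AUC named in the abstract; the variance decomposition the paper derives is not reproduced here.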

Bibliographic details
Published in:	IEEE transactions on pattern analysis and machine intelligence 2006-11, Vol.28 (11), p.1809-1817
Main authors:	Yousef, W.A. ; Wagner, R.F. ; Loew, M.H.
Format:	Article
Language:	eng
Subjects:	Algorithms ; Applied sciences ; Artificial Intelligence ; Classification ; Classifiers ; Cluster Analysis ; Computer science; control theory; systems ; Connectionism. Neural networks ; Databases, Factual ; Decision theory ; Estimates ; Estimators ; Exact sciences and technology ; Image Enhancement - methods ; Image Interpretation, Computer-Assisted - methods ; Information Storage and Retrieval - methods ; Mathematical analysis ; Mathematical models ; Medical diagnosis ; nonparametric statistics ; Parameter estimation ; Pattern Recognition, Automated - methods ; Probability density function ; Random variables ; ROC analysis ; ROC Curve ; Statistical analysis ; Statistical distributions ; Testing ; Training ; Training data ; Uncertainty ; Variance

container_end_page	1817
container_issue	11
container_start_page	1809
container_title	IEEE transactions on pattern analysis and machine intelligence
container_volume	28
creator	Yousef, W.A. ; Wagner, R.F. ; Loew, M.H.
description	This paper considers binary classification. We assess a classifier in terms of the area under the ROC curve (AUC). We estimate three important parameters, the conditional AUC (conditional on a particular training set) and the mean and variance of this AUC. We derive, as well, a closed-form expression of the variance of the estimator of the AUC. This expression exhibits several components of variance that facilitate an understanding of the sources of uncertainty of that estimate. In addition, we estimate this variance, i.e., the variance of the conditional AUC estimator. Our approach is nonparametric and based on general methods from U-statistics; it addresses the case where the data distribution is neither known nor modeled and where there are only two available data sets, the training and testing sets. Finally, we illustrate some simulation results for these estimators.
doi_str_mv	10.1109/TPAMI.2006.218
format	Article
fullrecord	<record><control><sourceid>proquest_RIE</sourceid><recordid>TN_cdi_proquest_journals_865711871</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><ieee_id>1704836</ieee_id><sourcerecordid>896193090</sourcerecordid><originalsourceid>FETCH-LOGICAL-c402t-aeb03d758065f79966d62850ea071193616ba7513a86a09ea8df2310442e8db83</originalsourceid><addsrcrecordid>eNp9kU1v1DAQhi1ERZfSay9IyEKCnrKM48SxuUULbVcqFLXbczSbTCBVvvBkVfXf19tdqagHLrYPz7zjV48QJwrmSoH7svqV_1jOYwAzj5V9JWbKaRfpVLvXYgbKxJG1sT0Ub5nvAFSSgn4jDlUGRhubzgTmzMTc9L_losXwqBvyLGs_dHJ1P8hlX9FI4egn-Q0nlDc0sbx9Gri-Wsi8x_aBG_4qc_lz6Ef02NHkm1Lm4-gHLP-8Ewc1tkzH-_tI3J59Xy0uosur8-Uiv4zKBOIpQlqDrrLUgknrzDljKhPbFAghU6GUUWaNWao0WoPgCG1Vx1pBksRkq7XVR-J0lxvW_t0QT0XXcEltiz0NGy6sMyEGHATy839JY90WTgL48QV4N2x8aBzSTBq-ZTMVoPkOKv3A7KkuRt906B8KBcXWUfHkqNg6KoKjMPBhn7pZd1Q943spAfi0B5BLbGuPfdnwM2dVFopsN7_fcQ0R_RuTWG30Ixibn-A</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>865711871</pqid></control><display><type>article</type><title>Assessing Classifiers from Two Independent Data Sets Using ROC Analysis: A Nonparametric Approach</title><source>IEEE Electronic Library (IEL)</source><creator>Yousef, W.A. ; Wagner, R.F. ; Loew, M.H.</creator><creatorcontrib>Yousef, W.A. ; Wagner, R.F. ; Loew, M.H.</creatorcontrib><description>This paper considers binary classification. We assess a classifier in terms of the area under the ROC curve (AUC). We estimate three important parameters, the conditional AUC (conditional on a particular training set) and the mean and variance of this AUC. We derive, as well, a closed form expression of the variance of the estimator of the AUG. This expression exhibits several components of variance that facilitate an understanding for the sources of uncertainty of that estimate. In addition, we estimate this variance, i.e., the variance of the conditional AUC estimator. 
Our approach is nonparametric and based on general methods from U-statistics; it addresses the case where the data distribution is neither known nor modeled and where there are only two available data sets, the training and testing sets. Finally, we illustrate some simulation results for these estimators</description><identifier>ISSN: 0162-8828</identifier><identifier>EISSN: 1939-3539</identifier><identifier>EISSN: 2160-9292</identifier><identifier>DOI: 10.1109/TPAMI.2006.218</identifier><identifier>PMID: 17063685</identifier><identifier>CODEN: ITPIDJ</identifier><language>eng</language><publisher>Los Alamitos, CA: IEEE</publisher><subject>Algorithms ; Applied sciences ; Artificial Intelligence ; Classification ; Classifiers ; Cluster Analysis ; Computer science; control theory; systems ; Connectionism. Neural networks ; Databases, Factual ; Decision theory ; Estimates ; Estimators ; Exact sciences and technology ; Image Enhancement - methods ; Image Interpretation, Computer-Assisted - methods ; Information Storage and Retrieval - methods ; Mathematical analysis ; Mathematical models ; Medical diagnosis ; nonparametric statistics ; Parameter estimation ; Pattern Recognition, Automated - methods ; Probability density function ; Random variables ; ROC analysis ; ROC Curve ; Statistical analysis ; Statistical distributions ; Testing ; Training ; Training data ; Uncertainty ; Variance</subject><ispartof>IEEE transactions on pattern analysis and machine intelligence, 2006-11, Vol.28 (11), p.1809-1817</ispartof><rights>2006 INIST-CNRS</rights><rights>Copyright The Institute of Electrical and Electronics Engineers, Inc. 
(IEEE) 2006</rights><lds50>peer_reviewed</lds50><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c402t-aeb03d758065f79966d62850ea071193616ba7513a86a09ea8df2310442e8db83</citedby><cites>FETCH-LOGICAL-c402t-aeb03d758065f79966d62850ea071193616ba7513a86a09ea8df2310442e8db83</cites></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://ieeexplore.ieee.org/document/1704836$$EHTML$$P50$$Gieee$$H</linktohtml><link.rule.ids>314,776,780,792,27901,27902,54733</link.rule.ids><linktorsrc>$$Uhttps://ieeexplore.ieee.org/document/1704836$$EView_record_in_IEEE$$FView_record_in_$$GIEEE</linktorsrc><backlink>$$Uhttp://pascal-francis.inist.fr/vibad/index.php?action=getRecordDetail&idt=18179301$$DView record in Pascal Francis$$Hfree_for_read</backlink><backlink>$$Uhttps://www.ncbi.nlm.nih.gov/pubmed/17063685$$D View this record in MEDLINE/PubMed$$Hfree_for_read</backlink></links><search><creatorcontrib>Yousef, W.A.</creatorcontrib><creatorcontrib>Wagner, R.F.</creatorcontrib><creatorcontrib>Loew, M.H.</creatorcontrib><title>Assessing Classifiers from Two Independent Data Sets Using ROC Analysis: A Nonparametric Approach</title><title>IEEE transactions on pattern analysis and machine intelligence</title><addtitle>TPAMI</addtitle><addtitle>IEEE Trans Pattern Anal Mach Intell</addtitle><description>This paper considers binary classification. We assess a classifier in terms of the area under the ROC curve (AUC). We estimate three important parameters, the conditional AUC (conditional on a particular training set) and the mean and variance of this AUC. We derive, as well, a closed form expression of the variance of the estimator of the AUG. This expression exhibits several components of variance that facilitate an understanding for the sources of uncertainty of that estimate. 
In addition, we estimate this variance, i.e., the variance of the conditional AUC estimator. Our approach is nonparametric and based on general methods from U-statistics; it addresses the case where the data distribution is neither known nor modeled and where there are only two available data sets, the training and testing sets. Finally, we illustrate some simulation results for these estimators</description><subject>Algorithms</subject><subject>Applied sciences</subject><subject>Artificial Intelligence</subject><subject>Classification</subject><subject>Classifiers</subject><subject>Cluster Analysis</subject><subject>Computer science; control theory; systems</subject><subject>Connectionism. Neural networks</subject><subject>Databases, Factual</subject><subject>Decision theory</subject><subject>Estimates</subject><subject>Estimators</subject><subject>Exact sciences and technology</subject><subject>Image Enhancement - methods</subject><subject>Image Interpretation, Computer-Assisted - methods</subject><subject>Information Storage and Retrieval - methods</subject><subject>Mathematical analysis</subject><subject>Mathematical models</subject><subject>Medical diagnosis</subject><subject>nonparametric statistics</subject><subject>Parameter estimation</subject><subject>Pattern Recognition, Automated - methods</subject><subject>Probability density function</subject><subject>Random variables</subject><subject>ROC analysis</subject><subject>ROC Curve</subject><subject>Statistical analysis</subject><subject>Statistical distributions</subject><subject>Testing</subject><subject>Training</subject><subject>Training 
data</subject><subject>Uncertainty</subject><subject>Variance</subject><issn>0162-8828</issn><issn>1939-3539</issn><issn>2160-9292</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2006</creationdate><recordtype>article</recordtype><sourceid>RIE</sourceid><sourceid>EIF</sourceid><recordid>eNp9kU1v1DAQhi1ERZfSay9IyEKCnrKM48SxuUULbVcqFLXbczSbTCBVvvBkVfXf19tdqagHLrYPz7zjV48QJwrmSoH7svqV_1jOYwAzj5V9JWbKaRfpVLvXYgbKxJG1sT0Ub5nvAFSSgn4jDlUGRhubzgTmzMTc9L_losXwqBvyLGs_dHJ1P8hlX9FI4egn-Q0nlDc0sbx9Gri-Wsi8x_aBG_4qc_lz6Ef02NHkm1Lm4-gHLP-8Ewc1tkzH-_tI3J59Xy0uosur8-Uiv4zKBOIpQlqDrrLUgknrzDljKhPbFAghU6GUUWaNWao0WoPgCG1Vx1pBksRkq7XVR-J0lxvW_t0QT0XXcEltiz0NGy6sMyEGHATy839JY90WTgL48QV4N2x8aBzSTBq-ZTMVoPkOKv3A7KkuRt906B8KBcXWUfHkqNg6KoKjMPBhn7pZd1Q943spAfi0B5BLbGuPfdnwM2dVFopsN7_fcQ0R_RuTWG30Ixibn-A</recordid><startdate>20061101</startdate><enddate>20061101</enddate><creator>Yousef, W.A.</creator><creator>Wagner, R.F.</creator><creator>Loew, M.H.</creator><general>IEEE</general><general>IEEE Computer Society</general><general>The Institute of Electrical and Electronics Engineers, Inc. (IEEE)</general><scope>97E</scope><scope>RIA</scope><scope>RIE</scope><scope>IQODW</scope><scope>CGR</scope><scope>CUY</scope><scope>CVF</scope><scope>ECM</scope><scope>EIF</scope><scope>NPM</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>7SC</scope><scope>7SP</scope><scope>8FD</scope><scope>JQ2</scope><scope>L7M</scope><scope>L~C</scope><scope>L~D</scope><scope>7X8</scope><scope>F28</scope><scope>FR3</scope></search><sort><creationdate>20061101</creationdate><title>Assessing Classifiers from Two Independent Data Sets Using ROC Analysis: A Nonparametric Approach</title><author>Yousef, W.A. ; Wagner, R.F. 
; Loew, M.H.</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c402t-aeb03d758065f79966d62850ea071193616ba7513a86a09ea8df2310442e8db83</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2006</creationdate><topic>Algorithms</topic><topic>Applied sciences</topic><topic>Artificial Intelligence</topic><topic>Classification</topic><topic>Classifiers</topic><topic>Cluster Analysis</topic><topic>Computer science; control theory; systems</topic><topic>Connectionism. Neural networks</topic><topic>Databases, Factual</topic><topic>Decision theory</topic><topic>Estimates</topic><topic>Estimators</topic><topic>Exact sciences and technology</topic><topic>Image Enhancement - methods</topic><topic>Image Interpretation, Computer-Assisted - methods</topic><topic>Information Storage and Retrieval - methods</topic><topic>Mathematical analysis</topic><topic>Mathematical models</topic><topic>Medical diagnosis</topic><topic>nonparametric statistics</topic><topic>Parameter estimation</topic><topic>Pattern Recognition, Automated - methods</topic><topic>Probability density function</topic><topic>Random variables</topic><topic>ROC analysis</topic><topic>ROC Curve</topic><topic>Statistical analysis</topic><topic>Statistical distributions</topic><topic>Testing</topic><topic>Training</topic><topic>Training data</topic><topic>Uncertainty</topic><topic>Variance</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Yousef, W.A.</creatorcontrib><creatorcontrib>Wagner, R.F.</creatorcontrib><creatorcontrib>Loew, M.H.</creatorcontrib><collection>IEEE All-Society Periodicals Package (ASPP) 2005-present</collection><collection>IEEE All-Society Periodicals Package (ASPP) 1998-Present</collection><collection>IEEE Electronic Library (IEL)</collection><collection>Pascal-Francis</collection><collection>Medline</collection><collection>MEDLINE</collection><collection>MEDLINE 
(Ovid)</collection><collection>MEDLINE</collection><collection>MEDLINE</collection><collection>PubMed</collection><collection>CrossRef</collection><collection>Computer and Information Systems Abstracts</collection><collection>Electronics & Communications Abstracts</collection><collection>Technology Research Database</collection><collection>ProQuest Computer Science Collection</collection><collection>Advanced Technologies Database with Aerospace</collection><collection>Computer and Information Systems Abstracts Academic</collection><collection>Computer and Information Systems Abstracts Professional</collection><collection>MEDLINE - Academic</collection><collection>ANTE: Abstracts in New Technology & Engineering</collection><collection>Engineering Research Database</collection><jtitle>IEEE transactions on pattern analysis and machine intelligence</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>Yousef, W.A.</au><au>Wagner, R.F.</au><au>Loew, M.H.</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Assessing Classifiers from Two Independent Data Sets Using ROC Analysis: A Nonparametric Approach</atitle><jtitle>IEEE transactions on pattern analysis and machine intelligence</jtitle><stitle>TPAMI</stitle><addtitle>IEEE Trans Pattern Anal Mach Intell</addtitle><date>2006-11-01</date><risdate>2006</risdate><volume>28</volume><issue>11</issue><spage>1809</spage><epage>1817</epage><pages>1809-1817</pages><issn>0162-8828</issn><eissn>1939-3539</eissn><eissn>2160-9292</eissn><coden>ITPIDJ</coden><abstract>This paper considers binary classification. We assess a classifier in terms of the area under the ROC curve (AUC). We estimate three important parameters, the conditional AUC (conditional on a particular training set) and the mean and variance of this AUC. We derive, as well, a closed form expression of the variance of the estimator of the AUG. 
This expression exhibits several components of variance that facilitate an understanding for the sources of uncertainty of that estimate. In addition, we estimate this variance, i.e., the variance of the conditional AUC estimator. Our approach is nonparametric and based on general methods from U-statistics; it addresses the case where the data distribution is neither known nor modeled and where there are only two available data sets, the training and testing sets. Finally, we illustrate some simulation results for these estimators</abstract><cop>Los Alamitos, CA</cop><pub>IEEE</pub><pmid>17063685</pmid><doi>10.1109/TPAMI.2006.218</doi><tpages>9</tpages></addata></record>
fulltext	fulltext_linktorsrc
identifier	ISSN: 0162-8828
ispartof	IEEE transactions on pattern analysis and machine intelligence, 2006-11, Vol.28 (11), p.1809-1817
issn	0162-8828 1939-3539 2160-9292
language	eng
recordid	cdi_proquest_journals_865711871
source	IEEE Electronic Library (IEL)
subjects	Algorithms ; Applied sciences ; Artificial Intelligence ; Classification ; Classifiers ; Cluster Analysis ; Computer science; control theory; systems ; Connectionism. Neural networks ; Databases, Factual ; Decision theory ; Estimates ; Estimators ; Exact sciences and technology ; Image Enhancement - methods ; Image Interpretation, Computer-Assisted - methods ; Information Storage and Retrieval - methods ; Mathematical analysis ; Mathematical models ; Medical diagnosis ; nonparametric statistics ; Parameter estimation ; Pattern Recognition, Automated - methods ; Probability density function ; Random variables ; ROC analysis ; ROC Curve ; Statistical analysis ; Statistical distributions ; Testing ; Training ; Training data ; Uncertainty ; Variance
title	Assessing Classifiers from Two Independent Data Sets Using ROC Analysis: A Nonparametric Approach
url	https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-21T18%3A40%3A43IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_RIE&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Assessing%20Classifiers%20from%20Two%20Independent%20Data%20Sets%20Using%20ROC%20Analysis:%20A%20Nonparametric%20Approach&rft.jtitle=IEEE%20transactions%20on%20pattern%20analysis%20and%20machine%20intelligence&rft.au=Yousef,%20W.A.&rft.date=2006-11-01&rft.volume=28&rft.issue=11&rft.spage=1809&rft.epage=1817&rft.pages=1809-1817&rft.issn=0162-8828&rft.eissn=1939-3539&rft.coden=ITPIDJ&rft_id=info:doi/10.1109/TPAMI.2006.218&rft_dat=%3Cproquest_RIE%3E896193090%3C/proquest_RIE%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=865711871&rft_id=info:pmid/17063685&rft_ieee_id=1704836&rfr_iscdi=true