Assessing Classifiers from Two Independent Data Sets Using ROC Analysis: A Nonparametric Approach

This paper considers binary classification. We assess a classifier in terms of the area under the ROC curve (AUC). We estimate three important parameters, the conditional AUC (conditional on a particular training set) and the mean and variance of this AUC. We derive, as well, a closed form expressio...
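
As a rough illustration of the quantity the abstract refers to, the sketch below computes the standard nonparametric (Mann-Whitney, two-sample U-statistic) estimate of the AUC from the test-set scores of a classifier that has already been trained. It is not taken from the paper and does not reproduce the authors' variance estimator; the function name `auc_mann_whitney` and the example scores are hypothetical.

```python
def auc_mann_whitney(neg_scores, pos_scores):
    """Nonparametric AUC estimate from classifier scores on a test set.

    Assumption for this sketch: higher scores indicate the positive class.
    The estimate is the average of the kernel psi(x, y) over all
    (negative, positive) score pairs, where psi is 1 if the positive score
    is larger, 0.5 on a tie, and 0 otherwise.
    """
    if not neg_scores or not pos_scores:
        raise ValueError("both score lists must be non-empty")

    total = 0.0
    for x in neg_scores:          # scores of test cases from the negative class
        for y in pos_scores:      # scores of test cases from the positive class
            if y > x:
                total += 1.0
            elif y == x:
                total += 0.5
    return total / (len(neg_scores) * len(pos_scores))


if __name__ == "__main__":
    # Hypothetical test-set scores, for illustration only.
    negatives = [0.1, 0.4, 0.35]
    positives = [0.8, 0.4, 0.9]
    print(auc_mann_whitney(negatives, positives))
```

The value returned is conditional on the one training set that produced the classifier, which appears to correspond to the "conditional AUC" named in the abstract; estimating the mean and variance over training sets requires the additional machinery the paper describes.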

Bibliographic details
Published in: IEEE transactions on pattern analysis and machine intelligence 2006-11, Vol.28 (11), p.1809-1817
Main authors: Yousef, W.A., Wagner, R.F., Loew, M.H.
Format: Article
Language: eng
container_end_page 1817
container_issue 11
container_start_page 1809
container_title IEEE transactions on pattern analysis and machine intelligence
container_volume 28
creator Yousef, W.A.
Wagner, R.F.
Loew, M.H.
description This paper considers binary classification. We assess a classifier in terms of the area under the ROC curve (AUC). We estimate three important parameters: the conditional AUC (conditional on a particular training set) and the mean and variance of this AUC. We also derive a closed-form expression for the variance of the estimator of the AUC. This expression exhibits several components of variance that facilitate an understanding of the sources of uncertainty of that estimate. In addition, we estimate this variance, i.e., the variance of the conditional AUC estimator. Our approach is nonparametric and based on general methods from U-statistics; it addresses the case where the data distribution is neither known nor modeled and where there are only two available data sets, the training and testing sets. Finally, we illustrate some simulation results for these estimators.
doi_str_mv 10.1109/TPAMI.2006.218
format Article
publisher Los Alamitos, CA: IEEE
rights 2006 INIST-CNRS
Copyright The Institute of Electrical and Electronics Engineers, Inc. (IEEE) 2006
addtitle TPAMI
IEEE Trans Pattern Anal Mach Intell
date 2006-11-01
tpages 9
pmid 17063685
coden ITPIDJ
ieee_id 1704836
lds50 peer_reviewed
linktohtml https://ieeexplore.ieee.org/document/1704836
backlink http://pascal-francis.inist.fr/vibad/index.php?action=getRecordDetail&idt=18179301
https://www.ncbi.nlm.nih.gov/pubmed/17063685
collection IEEE All-Society Periodicals Package (ASPP) 2005-present
IEEE All-Society Periodicals Package (ASPP) 1998-Present
IEEE Electronic Library (IEL)
Pascal-Francis
Medline
MEDLINE
MEDLINE (Ovid)
PubMed
CrossRef
Computer and Information Systems Abstracts
Electronics & Communications Abstracts
Technology Research Database
ProQuest Computer Science Collection
Advanced Technologies Database with Aerospace
Computer and Information Systems Abstracts – Academic
Computer and Information Systems Abstracts Professional
MEDLINE - Academic
ANTE: Abstracts in New Technology & Engineering
Engineering Research Database
fulltext fulltext_linktorsrc
identifier ISSN: 0162-8828
ispartof IEEE transactions on pattern analysis and machine intelligence, 2006-11, Vol.28 (11), p.1809-1817
issn 0162-8828
1939-3539
2160-9292
language eng
recordid cdi_proquest_journals_865711871
source IEEE Electronic Library (IEL)
subjects Algorithms
Applied sciences
Artificial Intelligence
Classification
Classifiers
Cluster Analysis
Computer science; control theory; systems
Connectionism. Neural networks
Databases, Factual
Decision theory
Estimates
Estimators
Exact sciences and technology
Image Enhancement - methods
Image Interpretation, Computer-Assisted - methods
Information Storage and Retrieval - methods
Mathematical analysis
Mathematical models
Medical diagnosis
nonparametric statistics
Parameter estimation
Pattern Recognition, Automated - methods
Probability density function
Random variables
ROC analysis
ROC Curve
Statistical analysis
Statistical distributions
Testing
Training
Training data
Uncertainty
Variance
title Assessing Classifiers from Two Independent Data Sets Using ROC Analysis: A Nonparametric Approach
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-21T18%3A40%3A43IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_RIE&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Assessing%20Classifiers%20from%20Two%20Independent%20Data%20Sets%20Using%20ROC%20Analysis:%20A%20Nonparametric%20Approach&rft.jtitle=IEEE%20transactions%20on%20pattern%20analysis%20and%20machine%20intelligence&rft.au=Yousef,%20W.A.&rft.date=2006-11-01&rft.volume=28&rft.issue=11&rft.spage=1809&rft.epage=1817&rft.pages=1809-1817&rft.issn=0162-8828&rft.eissn=1939-3539&rft.coden=ITPIDJ&rft_id=info:doi/10.1109/TPAMI.2006.218&rft_dat=%3Cproquest_RIE%3E896193090%3C/proquest_RIE%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=865711871&rft_id=info:pmid/17063685&rft_ieee_id=1704836&rfr_iscdi=true