Assessing Classifiers from Two Independent Data Sets Using ROC Analysis: A Nonparametric Approach
This paper considers binary classification. We assess a classifier in terms of the area under the ROC curve (AUC). We estimate three important parameters, the conditional AUC (conditional on a particular training set) and the mean and variance of this AUC. We derive, as well, a closed form expressio...
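The abstract's central quantity, the AUC of an already-trained classifier measured on an independent test set, can be sketched in a few lines. The block below is a minimal illustration, not the authors' code. It assumes the classifier has already produced one score per test case and that higher scores indicate the positive class. The function names (`kernel`, `auc_estimate`, `delong_variance`) are hypothetical. The variance function uses the standard DeLong structural-components estimate, which is swapped in here as a familiar stand-in and is not the estimator derived in the paper; it reflects test-set variability only, for a fixed training set.

```python
from itertools import product
from statistics import variance


def kernel(neg_score, pos_score):
    """Mann-Whitney kernel: 1 if the positive case outscores the negative, 0.5 on a tie, else 0."""
    if pos_score > neg_score:
        return 1.0
    if pos_score == neg_score:
        return 0.5
    return 0.0


def auc_estimate(neg_scores, pos_scores):
    """Two-sample U-statistic estimate of the AUC for a fixed (already trained) classifier."""
    n0, n1 = len(neg_scores), len(pos_scores)
    if n0 == 0 or n1 == 0:
        raise ValueError("need at least one score from each class")
    total = sum(kernel(x, y) for x, y in product(neg_scores, pos_scores))
    return total / (n0 * n1)


def delong_variance(neg_scores, pos_scores):
    """DeLong-style variance of the AUC estimate (assumption: not the paper's estimator).

    Accounts for the finite test set only; the training set is treated as fixed.
    """
    n0, n1 = len(neg_scores), len(pos_scores)
    if n0 < 2 or n1 < 2:
        raise ValueError("need at least two scores from each class")
    # Structural components: average kernel value for each positive / each negative case.
    v_pos = [sum(kernel(x, y) for x in neg_scores) / n0 for y in pos_scores]
    v_neg = [sum(kernel(x, y) for y in pos_scores) / n1 for x in neg_scores]
    return variance(v_pos) / n1 + variance(v_neg) / n0


# Illustrative scores only (made-up numbers, not data from the paper).
neg = [0.1, 0.4, 0.35]
pos = [0.8, 0.4, 0.9]
print(auc_estimate(neg, pos), delong_variance(neg, pos))
```

The estimate is the fraction of (negative, positive) test pairs that the classifier ranks correctly, which is why it needs no model of the score distributions. Capturing the additional variability due to the finite training set, which the abstract also addresses, would require the paper's own derivation.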
Published in: | IEEE transactions on pattern analysis and machine intelligence 2006-11, Vol.28 (11), p.1809-1817 |
---|---|
Main authors: | Yousef, W.A. ; Wagner, R.F. ; Loew, M.H. |
Format: | Article |
Language: | eng |
container_end_page | 1817 |
---|---|
container_issue | 11 |
container_start_page | 1809 |
container_title | IEEE transactions on pattern analysis and machine intelligence |
container_volume | 28 |
creator | Yousef, W.A. ; Wagner, R.F. ; Loew, M.H. |
description | This paper considers binary classification. We assess a classifier in terms of the area under the ROC curve (AUC). We estimate three important parameters: the conditional AUC (conditional on a particular training set) and the mean and variance of this AUC. We also derive a closed-form expression for the variance of the estimator of the AUC. This expression exhibits several components of variance that facilitate an understanding of the sources of uncertainty in that estimate. In addition, we estimate this variance, i.e., the variance of the conditional AUC estimator. Our approach is nonparametric and based on general methods from U-statistics; it addresses the case where the data distribution is neither known nor modeled and where there are only two available data sets, the training and testing sets. Finally, we illustrate some simulation results for these estimators. |
doi_str_mv | 10.1109/TPAMI.2006.218 |
format | Article |
publisher | Los Alamitos, CA: IEEE |
date | 2006-11-01 |
pmid | 17063685 |
coden | ITPIDJ |
ieee_id | 1704836 |
tpages | 9 |
peer_reviewed | true |
rights | 2006 INIST-CNRS ; Copyright The Institute of Electrical and Electronics Engineers, Inc. (IEEE) 2006 |
linktohtml | https://ieeexplore.ieee.org/document/1704836 |
backlink | http://pascal-francis.inist.fr/vibad/index.php?action=getRecordDetail&idt=18179301 ; https://www.ncbi.nlm.nih.gov/pubmed/17063685 |
fulltext | fulltext_linktorsrc |
identifier | ISSN: 0162-8828 |
ispartof | IEEE transactions on pattern analysis and machine intelligence, 2006-11, Vol.28 (11), p.1809-1817 |
issn | 0162-8828 ; 1939-3539 ; 2160-9292 |
language | eng |
recordid | cdi_proquest_journals_865711871 |
source | IEEE Electronic Library (IEL) |
subjects | Algorithms ; Applied sciences ; Artificial Intelligence ; Classification ; Classifiers ; Cluster Analysis ; Computer science; control theory; systems ; Connectionism. Neural networks ; Databases, Factual ; Decision theory ; Estimates ; Estimators ; Exact sciences and technology ; Image Enhancement - methods ; Image Interpretation, Computer-Assisted - methods ; Information Storage and Retrieval - methods ; Mathematical analysis ; Mathematical models ; Medical diagnosis ; nonparametric statistics ; Parameter estimation ; Pattern Recognition, Automated - methods ; Probability density function ; Random variables ; ROC analysis ; ROC Curve ; Statistical analysis ; Statistical distributions ; Testing ; Training ; Training data ; Uncertainty ; Variance |
title | Assessing Classifiers from Two Independent Data Sets Using ROC Analysis: A Nonparametric Approach |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-21T18%3A40%3A43IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_RIE&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Assessing%20Classifiers%20from%20Two%20Independent%20Data%20Sets%20Using%20ROC%20Analysis:%20A%20Nonparametric%20Approach&rft.jtitle=IEEE%20transactions%20on%20pattern%20analysis%20and%20machine%20intelligence&rft.au=Yousef,%20W.A.&rft.date=2006-11-01&rft.volume=28&rft.issue=11&rft.spage=1809&rft.epage=1817&rft.pages=1809-1817&rft.issn=0162-8828&rft.eissn=1939-3539&rft.coden=ITPIDJ&rft_id=info:doi/10.1109/TPAMI.2006.218&rft_dat=%3Cproquest_RIE%3E896193090%3C/proquest_RIE%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=865711871&rft_id=info:pmid/17063685&rft_ieee_id=1704836&rfr_iscdi=true |