Data Fusion for Outlier Detection through Pseudo-ROC Curves and Rank Distributions

This paper proposes a novel method of fusing models for classification of unbalanced data. The unbalanced data contains a majority of healthy (negative) instances, and a minority of unhealthy (positive) instances. The applicability of this type of classification problem with security applications in...

Bibliographic details
Main authors: Evangelista, P.F., Embrechts, M.J., Szymanski, B.K.
Format: Conference proceeding
Language: eng
container_end_page 2173
container_issue
container_start_page 2166
container_title
container_volume
creator Evangelista, P.F.
Embrechts, M.J.
Szymanski, B.K.
description This paper proposes a novel method of fusing models for classification of unbalanced data. The unbalanced data contains a majority of healthy (negative) instances, and a minority of unhealthy (positive) instances. The applicability of this type of classification problem with security applications inspired the naming of such problems as security classification problems (SCP). The area under the ROC curve (AUC) is the metric utilized to measure classifier performance, and in order to better understand AUC and ROC behavior, pseudo-ROC curves created from simulated data are introduced. ROC curves depend entirely upon the rankings created by classifiers. The rank distributions discussed in this paper display classifier performance in a novel form, and the behavior of these rank distributions provides insight into classifier fusion for the SCP. Rank distributions, which illustrate the probability of a particular rank containing a positive or negative instance, will be introduced and used to explain why synergistic classifier fusion occurs.
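The statement in the description that ROC curves depend entirely upon classifier rankings can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `auc_from_scores` and `simulate_unbalanced`, the Gaussian score model, and the 950/50 class split are all assumptions chosen for illustration. The sketch computes AUC from ranks alone using the standard rank-sum (Mann-Whitney) identity, on simulated unbalanced data.

```python
import random


def auc_from_scores(scores, labels):
    """AUC from ranks only (rank-sum identity); tied scores share an average rank."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        # Find the run of tied scores starting at sorted position i.
        j = i
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        avg_rank = (i + j) / 2.0 + 1.0  # 1-based average rank of the tied run
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    pos_rank_sum = sum(r for r, y in zip(ranks, labels) if y == 1)
    # Fraction of (positive, negative) pairs where the positive is ranked higher.
    return (pos_rank_sum - n_pos * (n_pos + 1) / 2.0) / (n_pos * n_neg)


def simulate_unbalanced(n_neg=950, n_pos=50, shift=1.5, seed=0):
    """Hypothetical simulated scores: many negatives, few shifted positives."""
    rng = random.Random(seed)
    scores = [rng.gauss(0.0, 1.0) for _ in range(n_neg)]
    scores += [rng.gauss(shift, 1.0) for _ in range(n_pos)]
    labels = [0] * n_neg + [1] * n_pos
    return scores, labels


if __name__ == "__main__":
    scores, labels = simulate_unbalanced()
    print("AUC on simulated data:", auc_from_scores(scores, labels))
```

Because only the ordering of the scores enters the computation, any strictly increasing transformation of the scores leaves the AUC unchanged, which is one way to read the claim about rankings.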
doi_str_mv 10.1109/IJCNN.2006.246989
format Conference Proceeding
fullrecord <record><control><sourceid>ieee_6IE</sourceid><recordid>TN_cdi_ieee_primary_1716379</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><ieee_id>1716379</ieee_id><sourcerecordid>1716379</sourcerecordid><originalsourceid>FETCH-LOGICAL-i175t-df34ff523df4dd42e442b3e1185095441cbbea7e247b55d80b957f2c673b63563</originalsourceid><addsrcrecordid>eNo1jNtKAzEYhIMHsNY-gHiTF9g152wuZWu1Ulopel2SzR8brV1JsoJvr0WFgYFvPgahS0pqSom5nj-0y2XNCFE1E8o05giNGFW0EoLoYzQxuiE_4UYYwk7-N274GTrP-ZUQxo3hI7Se2mLxbMix3-PQJ7wayi5CwlMo0JUDLdvUDy9b_Jhh8H21XrW4HdInZGz3Hq_t_g1PYy4puuHg5wt0Guwuw-Svx-h5dvvU3leL1d28vVlUkWpZKh-4CEEy7oPwXjAQgjkOlDaSGCkE7ZwDq4EJ7aT0DXFG6sA6pblTXCo-Rle_vxEANh8pvtv0taGaKq4N_wYn8FF6</addsrcrecordid><sourcetype>Publisher</sourcetype><iscdi>true</iscdi><recordtype>conference_proceeding</recordtype></control><display><type>conference_proceeding</type><title>Data Fusion for Outlier Detection through Pseudo-ROC Curves and Rank Distributions</title><source>IEEE Electronic Library (IEL) Conference Proceedings</source><creator>Evangelista, P.F. ; Embrechts, M.J. ; Szymanski, B.K.</creator><creatorcontrib>Evangelista, P.F. ; Embrechts, M.J. ; Szymanski, B.K.</creatorcontrib><description>This paper proposes a novel method of fusing models for classification of unbalanced data. The unbalanced data contains a majority of healthy (negative) instances, and a minority of unhealthy (positive) instances. The applicability of this type of classification problem with security applications inspired the naming of such problems as security classification problems (SCP). The area under the ROC curve (AUC) is the metric utilized to measure classifier performance, and in order to better understand AUC and ROC behavior, pseudo-ROC curves created from simulated data are introduced. ROC curves depend entirely upon the rankings created by classifiers. 
The rank distributions discussed in this paper display classifier performance in a novel form, and the behavior of these rank distributions provides insight into classifier fusion for the SCP. Rank distributions, which illustrate the probability of a particular rank containing a positive or negative instance, will be introduced and used to explain why synergistic classifier fusion occurs.</description><identifier>ISSN: 2161-4393</identifier><identifier>ISBN: 9780780394902</identifier><identifier>ISBN: 0780394909</identifier><identifier>EISSN: 2161-4407</identifier><identifier>DOI: 10.1109/IJCNN.2006.246989</identifier><language>eng</language><publisher>IEEE</publisher><subject>Area measurement ; Computer science ; Data engineering ; Data security ; Displays ; Electronic mail ; Military computing ; Modeling ; Robustness ; Systems engineering and theory</subject><ispartof>The 2006 IEEE International Joint Conference on Neural Network Proceedings, 2006, p.2166-2173</ispartof><woscitedreferencessubscribed>false</woscitedreferencessubscribed></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://ieeexplore.ieee.org/document/1716379$$EHTML$$P50$$Gieee$$H</linktohtml><link.rule.ids>309,310,777,781,786,787,2052,4036,4037,27906,54901</link.rule.ids><linktorsrc>$$Uhttps://ieeexplore.ieee.org/document/1716379$$EView_record_in_IEEE$$FView_record_in_$$GIEEE</linktorsrc></links><search><creatorcontrib>Evangelista, P.F.</creatorcontrib><creatorcontrib>Embrechts, M.J.</creatorcontrib><creatorcontrib>Szymanski, B.K.</creatorcontrib><title>Data Fusion for Outlier Detection through Pseudo-ROC Curves and Rank Distributions</title><title>The 2006 IEEE International Joint Conference on Neural Network Proceedings</title><addtitle>IJCNN</addtitle><description>This paper proposes a novel method of fusing models for classification of unbalanced data. 
The unbalanced data contains a majority of healthy (negative) instances, and a minority of unhealthy (positive) instances. The applicability of this type of classification problem with security applications inspired the naming of such problems as security classification problems (SCP). The area under the ROC curve (AUC) is the metric utilized to measure classifier performance, and in order to better understand AUC and ROC behavior, pseudo-ROC curves created from simulated data are introduced. ROC curves depend entirely upon the rankings created by classifiers. The rank distributions discussed in this paper display classifier performance in a novel form, and the behavior of these rank distributions provides insight into classifier fusion for the SCP. Rank distributions, which illustrate the probability of a particular rank containing a positive or negative instance, will be introduced and used to explain why synergistic classifier fusion occurs.</description><subject>Area measurement</subject><subject>Computer science</subject><subject>Data engineering</subject><subject>Data security</subject><subject>Displays</subject><subject>Electronic mail</subject><subject>Military computing</subject><subject>Modeling</subject><subject>Robustness</subject><subject>Systems engineering and 
theory</subject><issn>2161-4393</issn><issn>2161-4407</issn><isbn>9780780394902</isbn><isbn>0780394909</isbn><fulltext>true</fulltext><rsrctype>conference_proceeding</rsrctype><creationdate>2006</creationdate><recordtype>conference_proceeding</recordtype><sourceid>6IE</sourceid><sourceid>RIE</sourceid><recordid>eNo1jNtKAzEYhIMHsNY-gHiTF9g152wuZWu1Ulopel2SzR8brV1JsoJvr0WFgYFvPgahS0pqSom5nj-0y2XNCFE1E8o05giNGFW0EoLoYzQxuiE_4UYYwk7-N274GTrP-ZUQxo3hI7Se2mLxbMix3-PQJ7wayi5CwlMo0JUDLdvUDy9b_Jhh8H21XrW4HdInZGz3Hq_t_g1PYy4puuHg5wt0Guwuw-Svx-h5dvvU3leL1d28vVlUkWpZKh-4CEEy7oPwXjAQgjkOlDaSGCkE7ZwDq4EJ7aT0DXFG6sA6pblTXCo-Rle_vxEANh8pvtv0taGaKq4N_wYn8FF6</recordid><startdate>2006</startdate><enddate>2006</enddate><creator>Evangelista, P.F.</creator><creator>Embrechts, M.J.</creator><creator>Szymanski, B.K.</creator><general>IEEE</general><scope>6IE</scope><scope>6IH</scope><scope>CBEJK</scope><scope>RIE</scope><scope>RIO</scope></search><sort><creationdate>2006</creationdate><title>Data Fusion for Outlier Detection through Pseudo-ROC Curves and Rank Distributions</title><author>Evangelista, P.F. ; Embrechts, M.J. 
; Szymanski, B.K.</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-i175t-df34ff523df4dd42e442b3e1185095441cbbea7e247b55d80b957f2c673b63563</frbrgroupid><rsrctype>conference_proceedings</rsrctype><prefilter>conference_proceedings</prefilter><language>eng</language><creationdate>2006</creationdate><topic>Area measurement</topic><topic>Computer science</topic><topic>Data engineering</topic><topic>Data security</topic><topic>Displays</topic><topic>Electronic mail</topic><topic>Military computing</topic><topic>Modeling</topic><topic>Robustness</topic><topic>Systems engineering and theory</topic><toplevel>online_resources</toplevel><creatorcontrib>Evangelista, P.F.</creatorcontrib><creatorcontrib>Embrechts, M.J.</creatorcontrib><creatorcontrib>Szymanski, B.K.</creatorcontrib><collection>IEEE Electronic Library (IEL) Conference Proceedings</collection><collection>IEEE Proceedings Order Plan (POP) 1998-present by volume</collection><collection>IEEE Xplore All Conference Proceedings</collection><collection>IEEE Electronic Library (IEL)</collection><collection>IEEE Proceedings Order Plans (POP) 1998-present</collection></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>Evangelista, P.F.</au><au>Embrechts, M.J.</au><au>Szymanski, B.K.</au><format>book</format><genre>proceeding</genre><ristype>CONF</ristype><atitle>Data Fusion for Outlier Detection through Pseudo-ROC Curves and Rank Distributions</atitle><btitle>The 2006 IEEE International Joint Conference on Neural Network Proceedings</btitle><stitle>IJCNN</stitle><date>2006</date><risdate>2006</risdate><spage>2166</spage><epage>2173</epage><pages>2166-2173</pages><issn>2161-4393</issn><eissn>2161-4407</eissn><isbn>9780780394902</isbn><isbn>0780394909</isbn><abstract>This paper proposes a novel method of fusing models for classification of unbalanced data. 
The unbalanced data contains a majority of healthy (negative) instances, and a minority of unhealthy (positive) instances. The applicability of this type of classification problem with security applications inspired the naming of such problems as security classification problems (SCP). The area under the ROC curve (AUC) is the metric utilized to measure classifier performance, and in order to better understand AUC and ROC behavior, pseudo-ROC curves created from simulated data are introduced. ROC curves depend entirely upon the rankings created by classifiers. The rank distributions discussed in this paper display classifier performance in a novel form, and the behavior of these rank distributions provides insight into classifier fusion for the SCP. Rank distributions, which illustrate the probability of a particular rank containing a positive or negative instance, will be introduced and used to explain why synergistic classifier fusion occurs.</abstract><pub>IEEE</pub><doi>10.1109/IJCNN.2006.246989</doi><tpages>8</tpages></addata></record>
fulltext fulltext_linktorsrc
identifier ISSN: 2161-4393
ispartof The 2006 IEEE International Joint Conference on Neural Network Proceedings, 2006, p.2166-2173
issn 2161-4393
2161-4407
language eng
recordid cdi_ieee_primary_1716379
source IEEE Electronic Library (IEL) Conference Proceedings
subjects Area measurement
Computer science
Data engineering
Data security
Displays
Electronic mail
Military computing
Modeling
Robustness
Systems engineering and theory
title Data Fusion for Outlier Detection through Pseudo-ROC Curves and Rank Distributions
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-19T10%3A36%3A06IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-ieee_6IE&rft_val_fmt=info:ofi/fmt:kev:mtx:book&rft.genre=proceeding&rft.atitle=Data%20Fusion%20for%20Outlier%20Detection%20through%20Pseudo-ROC%20Curves%20and%20Rank%20Distributions&rft.btitle=The%202006%20IEEE%20International%20Joint%20Conference%20on%20Neural%20Network%20Proceedings&rft.au=Evangelista,%20P.F.&rft.date=2006&rft.spage=2166&rft.epage=2173&rft.pages=2166-2173&rft.issn=2161-4393&rft.eissn=2161-4407&rft.isbn=9780780394902&rft.isbn_list=0780394909&rft_id=info:doi/10.1109/IJCNN.2006.246989&rft_dat=%3Cieee_6IE%3E1716379%3C/ieee_6IE%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rft_ieee_id=1716379&rfr_iscdi=true