A Novel Class Imbalance Learning using Ordering Points Clustering

In data mining and knowledge discovery, hidden and valuable knowledge is discovered from data sources. Traditional knowledge discovery algorithms are bottlenecked by the wide range of available data sources. Class imbalance is one of the problems that arises due to data sources which...

Full description

Bibliographic details
Published in: International journal of computer applications 2012-01, Vol.51 (16), p.33-42
Main authors: Rao, K Nageswara, Rao, T Venkateswara, Lakshmi, D Rajya
Format: Article
Language: eng
Online access: Full text
container_end_page 42
container_issue 16
container_start_page 33
container_title International journal of computer applications
container_volume 51
creator Rao, K Nageswara
Rao, T Venkateswara
Lakshmi, D Rajya
description In data mining and knowledge discovery, hidden and valuable knowledge is discovered from data sources. Traditional knowledge discovery algorithms are bottlenecked by the wide range of available data sources. Class imbalance is one of the problems that arise when data sources provide unequal classes, i.e., examples of one class in a training data set vastly outnumber examples of the other class(es). This paper proposes an undersampling method that uses OPTICS, one of the best visualization clustering techniques, to handle the class imbalance problem. In the proposed approach, new data are then classified using the C4.5 algorithm as the base algorithm. The method is optimized by selecting the most suitable clusters of the majority class for deletion, based on visualization algorithms. The experimental analysis covers a wide range of highly imbalanced data sets and uses the statistical tests suggested in the specialized literature. The results show that our novel proposal outperforms other classic and recent models in terms of area under the ROC curve, F-measure, precision, TP rate and TN rate.
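The evaluation measures named in the description can be illustrated with a short sketch. The Python below is not taken from the paper: the function name `imbalance_metrics`, the 0/1 label encoding (1 = minority class) and the toy data are assumptions made for illustration. It computes TP rate, TN rate, precision and F-measure from true and predicted labels using their standard definitions, and shows why plain accuracy can mislead when one class vastly outnumbers the other.

```python
def imbalance_metrics(y_true, y_pred, positive=1):
    """Return accuracy, TP rate, TN rate, precision and F-measure (binary task)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)

    def ratio(num, den):
        # Convention assumed here: an undefined ratio (zero denominator) is 0.0.
        return num / den if den else 0.0

    tp_rate = ratio(tp, tp + fn)      # recall on the minority (positive) class
    tn_rate = ratio(tn, tn + fp)      # recall on the majority (negative) class
    precision = ratio(tp, tp + fp)
    f_measure = ratio(2 * precision * tp_rate, precision + tp_rate)
    accuracy = ratio(tp + tn, len(pairs))
    return {
        "accuracy": accuracy,
        "tp_rate": tp_rate,
        "tn_rate": tn_rate,
        "precision": precision,
        "f_measure": f_measure,
    }


# Hypothetical toy data: 95 majority (0) and 5 minority (1) examples.
y_true = [0] * 95 + [1] * 5

# A classifier that always predicts the majority class.
always_majority = [0] * 100
baseline = imbalance_metrics(y_true, always_majority)

# A classifier that finds 4 of the 5 minority examples at the cost of
# 10 false alarms on the majority class.
detector = [0] * 85 + [1] * 10 + [1] * 4 + [0]
detected = imbalance_metrics(y_true, detector)

print(baseline)
print(detected)
```

On this toy data the always-majority classifier reaches an accuracy of 0.95 with a TP rate of 0.0, while the second classifier has lower accuracy but a TP rate of 0.8. This is why class-sensitive measures such as those listed above are preferred when evaluating methods on imbalanced data.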
doi_str_mv 10.5120/8128-1863
format Article
fullrecord publisher: New York: Foundation of Computer Science
rights: Copyright Foundation of Computer Science 2012
eissn: 0975-8887
date: 2012-01-01
tpages: 10
oa: free_for_read
sourcetype: Aggregation Database
collection: CrossRef; Computer and Information Systems Abstracts; Technology Research Database; ProQuest Computer Science Collection; Advanced Technologies Database with Aerospace; Computer and Information Systems Abstracts – Academic; Computer and Information Systems Abstracts Professional
fulltext fulltext
identifier ISSN: 0975-8887
ispartof International journal of computer applications, 2012-01, Vol.51 (16), p.33-42
issn 0975-8887
0975-8887
language eng
recordid cdi_proquest_miscellaneous_1671368460
source EZB-FREE-00999 freely available EZB journals
subjects Algorithms
Clustering
Data sources
Materials handling
Mathematical models
Order disorder
Proposals
Visualization
title A Novel Class Imbalance Learning using Ordering Points Clustering
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-14T02%3A47%3A33IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=A%20Novel%20Class%20Imbalance%20Learning%20using%20Ordering%20Points%20Clustering&rft.jtitle=International%20journal%20of%20computer%20applications&rft.au=Rao,%20K%20Nageswara&rft.date=2012-01-01&rft.volume=51&rft.issue=16&rft.spage=33&rft.epage=42&rft.pages=33-42&rft.issn=0975-8887&rft.eissn=0975-8887&rft_id=info:doi/10.5120/8128-1863&rft_dat=%3Cproquest_cross%3E1671368460%3C/proquest_cross%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=1095315584&rft_id=info:pmid/&rfr_iscdi=true