SMOTE_EASY: UM ALGORITMO PARA TRATAR O PROBLEMA DE CLASSIFICACAO EM BASES DE DADOS REAIS/SMOTE_EASY: AN ALGORITHM TO TREAT THE CLASSIFICATION ISSUE IN REAL DATABASES
Most classification tools assume that the class distribution is balanced or that misclassification costs are similar. In practice, however, databases with unbalanced classes are commonplace, for example in disease diagnosis, where confirmed cases are u...
Published in: | Revista de gestão da tecnologia e sistemas de informação 2016-01, Vol.13 (1), p.61-61 |
---|---|
Main authors: | Rufino, Hugo Leonardo Pereira; Veiga, Antonio Claudio Paschoarelli; Nakamoto, Paula Teixeira |
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Full text |
container_end_page | 61 |
---|---|
container_issue | 1 |
container_start_page | 61 |
container_title | Revista de gestão da tecnologia e sistemas de informação |
container_volume | 13 |
creator | Rufino, Hugo Leonardo Pereira; Veiga, Antonio Claudio Paschoarelli; Nakamoto, Paula Teixeira |
description | Most classification tools assume that the class distribution is balanced or that misclassification costs are similar. In practice, however, databases with unbalanced classes are commonplace, for example in disease diagnosis, where confirmed cases are usually rare compared with the healthy population. Other examples are the detection of fraudulent calls and the detection of system intruders. In these cases, misclassifying a minority-class instance (for instance, diagnosing a person with cancer as healthy) may have more serious consequences than misclassifying a majority-class instance. It is therefore important to treat databases in which unbalanced classes occur. This paper presents the SMOTE_Easy algorithm, which can classify data even when the classes are highly unbalanced. To demonstrate its efficiency, it was compared with the main algorithms for classification on unbalanced data. The algorithm was successful on nearly all tested databases. |
doi_str_mv | 10.4301/S1807-17752016000100004 |
format | Article |
fulltext | fulltext |
identifier | ISSN: 1809-2640 |
ispartof | Revista de gestão da tecnologia e sistemas de informação, 2016-01, Vol.13 (1), p.61-61 |
issn | 1809-2640, 1807-1775 |
language | eng |
recordid | cdi_proquest_miscellaneous_1816075577 |
source | EZB-FREE-00999 freely available EZB journals |
subjects | Algorithms; Cancer; Classification; Diagnosis; Information systems; Level (quantity); Management |
title | SMOTE_EASY: UM ALGORITMO PARA TRATAR O PROBLEMA DE CLASSIFICACAO EM BASES DE DADOS REAIS/SMOTE_EASY: AN ALGORITHM TO TREAT THE CLASSIFICATION ISSUE IN REAL DATABASES |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-28T03%3A57%3A38IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=SMOTE_EASY:%20UM%20ALGORITMO%20PARA%20TRATAR%20O%20PROBLEMA%20DE%20CLASSIFICACAO%20EM%20BASES%20DE%20DADOS%20REAIS/SOMOTE_EASY:%20AN%20ALGORITHM%20TO%20TREAT%20THE%20CLASSIFICATION%20ISSUE%20IN%20REAL%20DATABASES&rft.jtitle=Revista%20de%20gest%C3%A3o%20da%20tecnologia%20e%20sistemas%20de%20informa%C3%A7%C3%A3o&rft.au=Rufino,%20Hugo%20Leonardo%20Pereira&rft.date=2016-01-01&rft.volume=13&rft.issue=1&rft.spage=61&rft.epage=61&rft.pages=61-61&rft.issn=1809-2640&rft.eissn=1807-1775&rft_id=info:doi/10.4301/S1807-17752016000100004&rft_dat=%3Cproquest%3E1816075577%3C/proquest%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=1816075577&rft_id=info:pmid/&rfr_iscdi=true |