SMOTE_EASY: UM ALGORITMO PARA TRATAR O PROBLEMA DE CLASSIFICACAO EM BASES DE DADOS REAIS / SMOTE_EASY: AN ALGORITHM TO TREAT THE CLASSIFICATION ISSUE IN REAL DATABASES
Published in: | Revista de gestão da tecnologia e sistemas de informação 2016-01, Vol.13 (1), p.61-61 |
---|---|
Main authors: | , , |
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Full text |
Abstract: | Most classification tools assume that the data distribution is balanced or that all misclassifications carry similar costs. In practice, however, databases with unbalanced classes are commonplace, as in the diagnosis of diseases, where confirmed cases are usually rare compared with the healthy population. Other examples are the detection of fraudulent calls and the detection of system intruders. In these cases, misclassifying a minority-class instance (for instance, diagnosing a person with cancer as healthy) may have more serious consequences than misclassifying a majority-class instance. It is therefore important to treat databases in which unbalanced classes occur. This paper presents the SMOTE_Easy algorithm, which can classify data even when there is a high level of imbalance between classes. To demonstrate its efficiency, it was compared with the main algorithms for classification on unbalanced data. The approach was successful in nearly all tested databases. |
---|---|
ISSN: | 1809-2640 1807-1775 |
DOI: | 10.4301/S1807-17752016000100004 |