SGBBA: An Efficient Method for Prediction System in Machine Learning using Imbalance Dataset

A real world big dataset with disproportionate classification is called imbalance dataset which badly impacts the predictive result of machine learning classification algorithms. Most of the datasets faces the class imbalance problem in machine learning. Most of the algorithms in machine learning wo...

Ausführliche Beschreibung

Gespeichert in:

Bibliographische Detailangaben
Veröffentlicht in:	International journal of advanced computer science & applications 2021, Vol.12 (3)
Hauptverfasser:	Islam, Saiful, Sara, Umme, Kawsar, Abu, Rahman, Anichur, Kundu, Dipanjali, Das, Diganta, Rezaul, A.N.M., Hasan, Mahedi
Format:	Artikel
Sprache:	eng
Schlagworte:	Algorithms Classification Datasets Impact prediction Machine learning Resampling
Online-Zugang:	Volltext
Tags:	Tag hinzufügen Keine Tags, Fügen Sie den ersten Tag hinzu!

Beschreibung
Zusammenfassung:	A real world big dataset with disproportionate classification is called imbalance dataset which badly impacts the predictive result of machine learning classification algorithms. Most of the datasets faces the class imbalance problem in machine learning. Most of the algorithms in machine learning work perfectly with about equal samples counts for every class. A variety of solutions have been suggested in the past time by the different researchers and applied to deal with the imbalance dataset. The performance of these methods is lower than the satisfactory level. It is very difficult to design an efficient method using machine learning algorithms without making the imbalance dataset to balance dataset. In this paper we have designed an method named SGBBA: an efficient method for prediction system in machine learning using Imbalance dataset. The method that is addressed in this paper increases the performance to the maximum in terms of accuracy and confusion matrix. The proposed method is consisted of two modules such as designing the method and method based prediction. The experiments with two benchmark datasets and one highly imbalanced credit card datasets are performed and the performances are compared with the performance of SMOTE resampling method. F-score, specificity, precision and recall are used as the evaluation matrices to test the performance of the proposed method in terms of any kind of imbalance dataset. According to the comparison of the result of the proposed method computationally attains the effective and robust performance than the existing methods.
ISSN:	2158-107X 2156-5570
DOI:	10.14569/IJACSA.2021.0120351