Improving the quality of big data using machine learning and artificial intelligence techniques

Bibliographic details
Main authors: Muhammad, Roaa Yahya; Hamad, Murtadha M.
Format: Conference proceeding
Language: English
Online access: Full text
Description
Summary: The volume of data is growing rapidly and continuously, and systems and applications have been developed to handle the storage, management, and processing of big data. Traditional methods are poorly suited to data of this size and complexity, which costs time and reduces the efficiency and performance of the techniques used. In this paper, we propose several algorithms and techniques to improve the performance and quality of big data, covering data collection and different ways of processing the data. The processing stage uses techniques such as flaw detection, undersampling, and data cleaning, among others. A hybrid model is used in which features are selected from the best solution found by a genetic algorithm (GA) and then passed to either a Decision Tree (DT) or an artificial neural network (ANN) classifier. The models are evaluated by accuracy, precision, recall, F1-score, and support. The system uses a data set of European cardholders with 122 features. Finally, the results of the proposed algorithms are compared on these evaluation measures to determine the best model. The comparison shows that the GA model with DT outperformed the other model, reaching an accuracy of 0.98 against 0.97.
ISSN: 0094-243X, 1551-7616
DOI: 10.1063/5.0191266
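
The summary describes feature selection driven by a genetic algorithm's best solution. The record does not give the authors' implementation, so the following is only a minimal sketch of the general idea, using the Python standard library. The function names, population size, crossover and mutation rates, tournament selection, and the toy fitness function are all assumptions made for illustration. In the setting the summary describes, the fitness of a feature mask would presumably be the validation accuracy of a Decision Tree or ANN trained on the selected features.

```python
import random


def tournament(scored, rng, k=3):
    """Pick the fittest of k randomly drawn (score, mask) pairs."""
    return max(rng.sample(scored, k), key=lambda pair: pair[0])[1]


def ga_select(n_features, fitness, pop_size=20, generations=30,
              crossover_rate=0.8, mutation_rate=0.05, seed=0):
    """Search over 0/1 feature masks; return (best_mask, best_score, history).

    All hyperparameters here are illustrative assumptions, not values
    taken from the paper.
    """
    rng = random.Random(seed)
    # Each individual is a list of 0/1 flags, one flag per feature.
    population = [[rng.randint(0, 1) for _ in range(n_features)]
                  for _ in range(pop_size)]
    best_mask, best_score, history = None, float("-inf"), []

    for _ in range(generations):
        # Score every mask and rank the population, best first.
        scored = [(fitness(ind), ind) for ind in population]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        if scored[0][0] > best_score:
            best_score, best_mask = scored[0][0], scored[0][1][:]
        history.append(best_score)

        # Elitism: the current best mask survives unchanged.
        next_population = [scored[0][1][:]]
        while len(next_population) < pop_size:
            parent_a = tournament(scored, rng)
            parent_b = tournament(scored, rng)
            # Single-point crossover, applied with probability crossover_rate.
            if n_features > 1 and rng.random() < crossover_rate:
                point = rng.randint(1, n_features - 1)
                child = parent_a[:point] + parent_b[point:]
            else:
                child = parent_a[:]
            # Bit-flip mutation, applied independently to each flag.
            child = [1 - bit if rng.random() < mutation_rate else bit
                     for bit in child]
            next_population.append(child)
        population = next_population

    return best_mask, best_score, history


# Hypothetical stand-in for a classifier's validation accuracy:
# features 0, 2 and 5 are "informative", every other feature costs 0.1.
INFORMATIVE = {0, 2, 5}


def toy_fitness(mask):
    hits = sum(1 for i, bit in enumerate(mask) if bit and i in INFORMATIVE)
    extras = sum(1 for i, bit in enumerate(mask) if bit and i not in INFORMATIVE)
    return hits - 0.1 * extras


mask, score, history = ga_select(10, toy_fitness)
selected = [i for i, bit in enumerate(mask) if bit]
print("selected features:", selected, "fitness:", round(score, 2))
```

To compare two hybrids in the way the summary reports, one would run the same search twice with different fitness functions (one wrapping a Decision Tree, one wrapping an ANN) and compare the resulting models on held-out data.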