Data Preprocessing: A preliminary step for web data mining

Bibliographic details
Published in: 3C tecnología 2019-05, Vol.8 (1), p.206-221
Main authors: Jamshed, Huma; Khan, M. Sadiq Ali; Khurram, Muhammad; Inayatullah, Syed; Athar, Sameen
Format: Article
Language: English
Description
Abstract: Recent years have seen immense growth in data, i.e. big data, which promises a brighter and more optimized future. Big data demands large computational infrastructure with high-performance processing capabilities. Preparing big data for mining and analysis is a challenging task and requires the data to be preprocessed to improve the quality of the raw data. The representation and quality of data instances are of foremost importance. Data preprocessing is a preliminary data mining practice in which raw data is transformed into a format suitable for a subsequent processing procedure. Data preprocessing improves data quality by cleaning, normalizing, transforming, and extracting relevant features from raw data. It significantly improves the performance of machine learning algorithms, which in turn leads to more accurate data mining. Knowledge discovery from noisy, irrelevant, and redundant data is a difficult task; precisely identifying extreme values and outliers and filling in missing values therefore pose challenges. This paper discusses various big data preprocessing techniques used to prepare data for mining and analysis tasks.
ISSN: 2254-4143
DOI: 10.17993/3ctecno.2019.specialissue2.206-221