Big data aggregation in the case of heterogeneity: a feasibility study for digital health

Bibliographic details
Published in: International journal of machine learning and cybernetics, 2019-10, Vol. 10 (10), p. 2643-2655
Main authors: Obinikpo, Alex Adim; Kantarci, Burak
Format: Article
Language: English
Online access: Full text
Description
Summary: In big data applications, missing data is an important factor that may affect the value of the acquired data. It arises when data is lost either during acquisition or during storage. The former can result from faulty acquisition devices or non-responsive sensors, whereas the latter can result from hardware failures at the storage units. In this paper, we consider human activity recognition as a case study of a typical machine learning application on big datasets. We conduct a comprehensive feasibility study on the fusion of sensory data acquired from heterogeneous sources. We present insights on the aggregation of heterogeneous datasets with minimal missing data values for future use. Our experiments on the accuracy, F1 score, and positive predictive value (PPV) of various key machine learning algorithms show that sensory data acquired by wearables are less vulnerable to missing data and smaller training sets, whereas smart portable devices require larger training sets to reduce the impact of possibly missing data.
ISSN: 1868-8071, 1868-808X
DOI: 10.1007/s13042-018-00904-3
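
Note on the evaluation metrics: the summary reports accuracy, F1 score, and PPV. As a reminder of how these three quantities relate, the following is a minimal sketch for the binary case only. The function name `binary_metrics` and the label vectors are hypothetical and made up for illustration; they are not taken from the article, which deals with human activity recognition and may compute the metrics differently (for example, averaged over several activity classes).

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, PPV (precision) and F1 for binary labels.

    Hypothetical helper for illustration only; not the article's code.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)

    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    # PPV (positive predictive value) is another name for precision.
    ppv = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of PPV and recall.
    f1 = 2 * ppv * recall / (ppv + recall) if (ppv + recall) else 0.0
    return {"accuracy": accuracy, "ppv": ppv, "f1": f1}


# Made-up labels: 1 = activity detected, 0 = not detected.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]
metrics = binary_metrics(y_true, y_pred)
print(metrics)
```

With these made-up labels there are 3 true positives, 1 false positive, 1 false negative, and 5 true negatives, so accuracy is 0.8 while PPV and F1 are both 0.75.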