Big data aggregation in the case of heterogeneity: a feasibility study for digital health
In big data applications, an important factor that may affect the value of the acquired data is the missing data, which arises when data is lost either during acquisition or during storage. The former can be a result of faulty acquisition devices or non responsive sensors whereas the latter can occu...
Gespeichert in:
Veröffentlicht in: | International journal of machine learning and cybernetics 2019-10, Vol.10 (10), p.2643-2655 |
---|---|
Hauptverfasser: | , |
Format: | Artikel |
Sprache: | eng |
Schlagworte: | |
Online-Zugang: | Volltext |
Tags: |
Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
|
Zusammenfassung: | In big data applications, an important factor that may affect the value of the acquired data is the missing data, which arises when data is lost either during acquisition or during storage. The former can be a result of faulty acquisition devices or non responsive sensors whereas the latter can occur as a result of hardware failures at the storage units. In this paper, we consider human activity recognition as a case study of a typical machine learning application on big datasets. We conduct a comprehensive feasibility study on the fusion of sensory data that is acquired from heterogeneous sources. We present insights on the aggregation of heterogeneous datasets with minimal missing data values for future use. Our experiments on the accuracy, F-1 score, and PPV of various key machine learning algorithms show that sensory data acquired by wearables are less vulnerable to missing data and smaller training sets whereas smart portable devices require larger training sets to reduce the impacts of possibly missing data. |
---|---|
ISSN: | 1868-8071 1868-808X |
DOI: | 10.1007/s13042-018-00904-3 |