Outlier aware data aggregation in distributed wireless sensor network using robust principal component analysis
Main authors: | , , , |
---|---|
Format: | Conference proceedings |
Language: | eng |
Subjects: | |
Summary: | To address outlier detection in wireless sensor networks, this paper proposes a technique based on robust principal component analysis (PCA) that detects anomalous or faulty sensor data in a distributed wireless sensor network, with a focus on data integrity and accuracy. Its key features are that it exploits the correlation among sensor data to reveal anomalies that span several neighboring sensors, it does not require error-free data to construct the PCA model, and it operates in a distributed fashion. The proposed algorithm has two steps. First, an accurate estimate of the correlation of the sensor data is computed to build a robust PCA model that can then be used for fault detection. This locally developed, correlation-based robust PCA model tends to accentuate the contribution of close observations relative to distant ones and imposes no constraints on model design. Second, the Mahalanobis distance, a multivariate distance metric, is used to measure the similarity between the current sensor readings and the sensor data model. Combined with principal component analysis, the Mahalanobis distance is extended to test whether a sensor node is an outlier with respect to the model defined by the principal components. The algorithm's performance was evaluated in simulation with synthetic and real sensor data streams. The results show that the approach outperforms existing methods in accuracy, even when processing corrupted data. |
---|---|
DOI: | 10.1109/ICCCNT.2010.5591850 |
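
The two steps named in the summary can be illustrated with a minimal Python sketch. It is not the authors' implementation: the record does not specify the robust estimator, so the median/MAD-based weighting below is an assumption chosen only because it down-weights distant observations. The function names `fit_robust_pca` and `pc_mahalanobis`, the three-sensor synthetic data, and the injected faults are likewise hypothetical.

```python
import numpy as np


def fit_robust_pca(readings, n_components):
    """Fit a PCA model from a distance-weighted covariance estimate.

    The weighting scheme is an illustrative assumption, not the paper's estimator.
    """
    X = np.asarray(readings, dtype=float)
    # Robust center and scale per sensor: median and scaled MAD.
    center = np.median(X, axis=0)
    mad = 1.4826 * np.median(np.abs(X - center), axis=0)
    mad = np.where(mad > 0, mad, 1.0)
    # Observations far from the robust center receive weights near zero.
    d2 = np.sum(((X - center) / mad) ** 2, axis=1)
    weights = np.exp(-0.5 * d2 / X.shape[1])
    mean = np.average(X, axis=0, weights=weights)
    centered = X - mean
    cov = (centered * weights[:, None]).T @ centered / weights.sum()
    # eigh returns eigenvalues in ascending order; keep the largest ones.
    eigvals, eigvecs = np.linalg.eigh(cov)
    order = np.argsort(eigvals)[::-1][:n_components]
    return mean, eigvecs[:, order], eigvals[order]


def pc_mahalanobis(reading, mean, components, variances):
    """Mahalanobis distance of one reading, measured in principal-component space."""
    scores = (np.asarray(reading, dtype=float) - mean) @ components
    return float(np.sqrt(np.sum(scores ** 2 / variances)))


# Synthetic example: three neighboring sensors observing one shared signal.
rng = np.random.default_rng(0)
signal = rng.normal(25.0, 2.0, size=(200, 1))
train = signal + rng.normal(0.0, 0.1, size=(200, 3))
train[:5, 2] = 80.0  # a few corrupted rows, so the training data is not error free

mean, components, variances = fit_robust_pca(train, n_components=3)
d_normal = pc_mahalanobis([25.1, 24.9, 25.0], mean, components, variances)
d_fault = pc_mahalanobis([25.0, 25.0, 40.0], mean, components, variances)
print(d_normal, d_fault)
```

In this sketch a reading would be flagged when its distance exceeds a chosen threshold. The threshold is left open here because the record does not state how the paper sets it. All three components are retained in the example so that a fault breaking the correlation between sensors remains visible in the distance.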