Data Anomaly Detection through Semisupervised Learning Aided by Customised Data Augmentation Techniques

Bibliographic details
Published in:	Structural control and health monitoring 2023-07, Vol.2023, p.1-14
Main authors:	Wang, Xiaoyou; Du, Yao; Zhou, Xiaoqing; Xia, Yong
Format:	Article
Language:	eng
Online access:	Full text
Description
Abstract:	Structural health monitoring (SHM) systems may suffer from multiple patterns of data anomalies. Anomaly detection is an essential preprocessing step before monitoring data are used for structural condition assessment or other decision making. Deep learning techniques have been extensively used for automatic category classification by training the network with labelled data. However, because SHM data are usually large in quantity, manually labelling the abnormal data is time consuming and labour intensive. This study develops a semisupervised learning-based data anomaly detection method that uses a small set of labelled data and a massive amount of unlabelled data. The MixMatch technique, which mixes labelled and unlabelled data using MixUp, is adopted to enhance the generalisation and robustness of the model. A unified loss function is defined to combine information from labelled and unlabelled data by incorporating consistency regularisation, entropy minimisation, and standard model regularisation terms. In addition, customised data augmentation strategies for time series are investigated to further improve the model performance. The proposed method is applied to the SHM data from a real bridge for anomaly detection. Results demonstrate the superior performance of the developed method with very limited labelled data, greatly reducing the time and cost of labelling compared with traditional supervised learning methods.
ISSN:	1545-2255 1545-2263
DOI:	10.1155/2023/2430011
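
Illustrative note: the abstract names three MixMatch ingredients (MixUp mixing, entropy minimisation, and a unified labelled/unlabelled loss). The sketch below shows these pieces as they appear in the general MixMatch formulation, not the authors' implementation. The function names, the temperature `T`, the Beta parameter `alpha`, and the weight `lambda_u` are assumptions chosen for illustration; the article's own settings and its time-series augmentations are not given in this record.

```python
import numpy as np

def sharpen(p, T=0.5):
    """Entropy minimisation step: raise class probabilities to 1/T and renormalise.
    p has shape (batch, classes) with rows summing to 1; T < 1 lowers entropy."""
    p_t = p ** (1.0 / T)
    return p_t / p_t.sum(axis=1, keepdims=True)

def mixup(x1, y1, x2, y2, alpha=0.75, rng=None):
    """MixUp as used in MixMatch: convex combination of two samples and their labels.
    Taking max(lam, 1 - lam) keeps the result closer to the first sample."""
    rng = np.random.default_rng() if rng is None else rng
    lam = rng.beta(alpha, alpha)
    lam = max(lam, 1.0 - lam)
    x = lam * x1 + (1.0 - lam) * x2
    y = lam * y1 + (1.0 - lam) * y2
    return x, y

def mixmatch_loss(p_lab, y_lab, p_unl, q_unl, lambda_u=1.0, eps=1e-12):
    """Unified loss: cross-entropy on labelled data plus a weighted
    squared-error consistency term on unlabelled data.
    p_* are model predictions, y_lab are (possibly mixed) labels,
    q_unl are the sharpened guessed labels for the unlabelled batch."""
    ce = -np.mean(np.sum(y_lab * np.log(p_lab + eps), axis=1))
    consistency = np.mean((p_unl - q_unl) ** 2)
    return ce + lambda_u * consistency
```

In this sketch, weight decay (the model regularisation term mentioned in the abstract) would be handled by the optimiser rather than inside `mixmatch_loss`; that split is a design assumption, not something stated in the record.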