Enhancement Methods of Hydropower Unit Monitoring Data Quality Based on the Hierarchical Density-Based Spatial Clustering of Applications with a Noise-Wasserstein Slim Generative Adversarial Imputation Network with a Gradient Penalty

Bibliographic details
Published in: Sensors (Basel, Switzerland), 2023-12, Vol. 24 (1), p. 118
Main authors: Zhang, Fangqing; Guo, Jiang; Yuan, Fang; Qiu, Yuanfeng; Wang, Pei; Cheng, Fangjuan; Gu, Yifeng
Format: Article
Language: English
Description
Summary: To address low-quality problems such as anomalies and missing values in the condition monitoring data of hydropower units, this paper proposes a monitoring data quality enhancement method based on HDBSCAN-WSGAIN-GP, which improves the quality and usability of the data by combining the advantages of density clustering and a generative adversarial network. First, the monitoring data are grouped by density level using the HDBSCAN clustering method in combination with the working conditions, and the anomalies in the dataset are adaptively detected, identified, and cleaned. Then, exploiting the strengths of the WSGAIN-GP model in data filling, the missing values in the cleaned data are generated automatically through unsupervised learning of the features and distribution of the real monitoring data. The method is validated on an online monitoring dataset from units in actual operation. Comparison experiments show that the silhouette coefficient (SCI) of the HDBSCAN-based anomaly detection model reaches 0.4935, higher than that of the other models compared, indicating that the proposed model is better at distinguishing valid samples from anomalous ones. The probability density distribution of the data filled by the WSGAIN-GP model is similar to that of the measured data, and the KL divergence, JS divergence, and Hellinger distance between the distributions of the filled data and the original data are close to 0. Compared with filling methods such as SGAIN, GAIN, and KNN across different missing rates, WSGAIN-GP has the lowest RMSE, which shows that the proposed filling model has good accuracy and generalization. The results provide a high-quality data basis for subsequent trend prediction and state warning.
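The evaluation measures named in the summary can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the histogram binning, the `eps` guard, and the mask convention (1 = observed, 0 = missing) are all assumptions made for illustration. It computes the KL divergence, JS divergence, and Hellinger distance between two binned distributions, and the RMSE restricted to the entries that were missing before filling.

```python
import math


def histogram(values, edges):
    """Normalized bin frequencies over shared bin edges.

    Bins are half-open [lo, hi) except the last, which also includes its
    upper edge. Values outside the edges are ignored (an assumption).
    """
    counts = [0] * (len(edges) - 1)
    for v in values:
        for i in range(len(counts)):
            is_last = i == len(counts) - 1
            if edges[i] <= v < edges[i + 1] or (is_last and v == edges[-1]):
                counts[i] += 1
                break
    total = sum(counts)
    return [c / total for c in counts]


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) in nats; eps guards against empty bins in q."""
    return sum(pi * math.log(pi / max(qi, eps)) for pi, qi in zip(p, q) if pi > 0)


def js_divergence(p, q):
    """Jensen-Shannon divergence: symmetric, bounded above by ln 2."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)


def hellinger(p, q):
    """Hellinger distance, ranging from 0 (identical) to 1 (disjoint)."""
    s = sum((math.sqrt(pi) - math.sqrt(qi)) ** 2 for pi, qi in zip(p, q))
    return math.sqrt(0.5 * s)


def rmse_on_missing(truth, filled, mask):
    """RMSE over the entries where mask == 0, i.e. the filled-in values."""
    errs = [(t - f) ** 2 for t, f, m in zip(truth, filled, mask) if m == 0]
    return math.sqrt(sum(errs) / len(errs))


# Toy data, invented for illustration only.
original = [1.0, 1.2, 2.1, 2.9, 3.0, 3.4]
filled = [1.0, 1.3, 2.0, 2.9, 3.1, 3.4]
mask = [1, 0, 0, 1, 0, 1]  # 0 marks values that were missing and then filled
edges = [1.0, 2.0, 3.0, 4.0]

p = histogram(original, edges)
q = histogram(filled, edges)
print(kl_divergence(p, q), js_divergence(p, q), hellinger(p, q))
print(rmse_on_missing(original, filled, mask))
```

All three distribution measures equal 0 when the two histograms coincide, which is the sense in which values close to 0 indicate that the filled data follow the distribution of the measured data. The binned estimates depend on the choice of bin edges, so comparisons between filling methods should use the same edges throughout.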
ISSN: 1424-8220
DOI: 10.3390/s24010118