Interpretable prediction, classification and regulation of water quality: A case study of Poyang Lake, China

Bibliographic details
Published in: The Science of the total environment 2024-11, Vol.951, p.175407, Article 175407
Main authors: Yao, Zhiyuan; Wang, Zhaocai; Huang, Jinghan; Xu, Nannan; Cui, Xuefei; Wu, Tunhua
Format: Article
Language: English
Description
Abstract: Effective identification and regulation of water quality impact factors is essential for water resource management and environmental protection. However, the complex coupling of water quality systems poses a significant challenge to this task. This study proposes a coherent model for water quality prediction, classification and regulation based on interpretable machine learning. A decomposition-reconstruction module transforms non-stationary water quality series into stationary series while effectively reducing the feature dimensions. Spatiotemporal multi-source data are introduced, with the Maximal Information Coefficient (MIC) used for feature selection. A Temporal Convolutional Network (TCN) extracts the temporal features of the different variables, and an External Attention (EA) mechanism then models the relationships among these features. Finally, the target water quality sequence is simulated using a Gated Recurrent Unit (GRU). The proposed model was applied to Poyang Lake in China to predict six water quality indicators: ammonia nitrogen (NH3-N), dissolved oxygen (DO), pH, total nitrogen (TN), total phosphorus (TP) and water temperature (WT). The water quality was then classified from the prediction results using the XGBoost algorithm. The findings indicate that the proposed model's Nash-Sutcliffe Efficiency (NSE) ranges from 0.88 to 0.99, surpassing that of the benchmark model, and that the model demonstrates strong interval prediction performance. The results highlight the superior performance of the XGBoost algorithm (accuracy of 0.89) on water quality classification, particularly in cases of class imbalance. Subsequently, interpretability analysis using the SHapley Additive exPlanation (SHAP) method revealed that the model is capable of learning relationships between different variables and may be able to learn the underlying physical laws. Ultimately, this study proposes a water quality regulation mechanism that improves TN and DO levels by changing the water temperature stepwise, yielding significant improvement even when data are limited. In conclusion, this study presents, for the first time, an overall framework integrating water quality prediction, classification and improvement, forming a complete set of water quality early warning and improvement management strategies. This framework provides new ideas and approaches for lake water quality management.
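The abstract reports model skill as Nash-Sutcliffe Efficiency (NSE). As a reading aid, the following is a minimal sketch of the standard NSE definition, one minus the ratio of the squared prediction error to the variance of the observations about their mean. The function name and the sample dissolved oxygen values are hypothetical and chosen only for illustration; they are not taken from the study or its data.

```python
def nash_sutcliffe_efficiency(observed, simulated):
    """Standard NSE: 1 - sum((obs - sim)^2) / sum((obs - mean(obs))^2)."""
    if len(observed) != len(simulated):
        raise ValueError("observed and simulated must have the same length")
    mean_observed = sum(observed) / len(observed)
    # Sum of squared errors between observations and model output
    residual_ss = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    # Sum of squared deviations of the observations from their own mean
    variance_ss = sum((o - mean_observed) ** 2 for o in observed)
    return 1.0 - residual_ss / variance_ss


# Hypothetical dissolved oxygen readings (mg/L), illustration only
observed_do = [8.0, 7.5, 7.0, 6.5, 7.0]
simulated_do = [7.9, 7.6, 7.1, 6.4, 7.1]
nse = nash_sutcliffe_efficiency(observed_do, simulated_do)
print(f"NSE = {nse:.3f}")
```

An NSE of 1 corresponds to a perfect match, while 0 means the model does no better than always predicting the observed mean, which is why values in the reported 0.88 to 0.99 range indicate a close fit.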
ISSN:0048-9697
1879-1026
DOI:10.1016/j.scitotenv.2024.175407