Improving Water Quality Index Prediction Using Regression Learning Models

Rivers are the main sources of freshwater supply for the world population. However, many economic activities contribute to river water pollution. River water quality can be monitored using various parameters, such as the pH level, dissolved oxygen, total suspended solids, and the chemical properties...

Ausführliche Beschreibung

Gespeichert in:
Bibliographische Detailangaben
Veröffentlicht in:International journal of environmental research and public health 2022-10, Vol.19 (20), p.13702
Hauptverfasser: Mohd Zebaral Hoque, Jesmeen, Ab Aziz, Nor Azlina, Alelyani, Salem, Mohana, Mohamed, Hosain, Maruf
Format: Artikel
Sprache:eng
Schlagworte:
Online-Zugang:Volltext
Tags: Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
Beschreibung
Zusammenfassung:Rivers are the main sources of freshwater supply for the world population. However, many economic activities contribute to river water pollution. River water quality can be monitored using various parameters, such as the pH level, dissolved oxygen, total suspended solids, and the chemical properties. Analyzing the trend and pattern of these parameters enables the prediction of the water quality so that proactive measures can be made by relevant authorities to prevent water pollution and predict the effectiveness of water restoration measures. Machine learning regression algorithms can be applied for this purpose. Here, eight machine learning regression techniques, including decision tree regression, linear regression, ridge, Lasso, support vector regression, random forest regression, extra tree regression, and the artificial neural network, are applied for the purpose of water quality index prediction. Historical data from Indian rivers are adopted for this study. The data refer to six water parameters. Twelve other features are then derived from the original six parameters. The performances of the models using different algorithms and sets of features are compared. The derived water quality rating scale features are identified to contribute toward the development of better regression models, while the linear regression and ridge offer the best performance. The best mean square error achieved is 0 and the correlation coefficient is 1.
ISSN:1660-4601
1661-7827
1660-4601
DOI:10.3390/ijerph192013702