Hybridization of hierarchical clustering with persistent homology in assessing haze episodes between air quality monitoring stations

Bibliographic details
Published in: Journal of environmental management 2022-03, Vol.306, p.114434-114434, Article 114434
Main authors: Zulkepli, Nur Fariha Syaqina; Noorani, Mohd Salmi Md; Razak, Fatimah Abdul; Ismail, Munira; Alias, Mohd Almie
Format: Article
Language: English
Description
Summary: Haze has been a major issue afflicting Southeast Asian countries, including Malaysia, for the past few decades. Hierarchical agglomerative cluster analysis (HACA) is commonly used to evaluate the spatial behavior between areas in which pollutants interact. Typically, in HACA, the Euclidean distance acts as the dissimilarity measure and air quality monitoring stations are grouped according to this measure, thus revealing the most polluted areas. In this study, a framework for the hybridization of the HACA technique is proposed that considers the topological similarity (Wasserstein distance) between stations to evaluate the spatial patterns of the areas affected by haze episodes. For this, a tool from topological data analysis (TDA), namely persistent homology, is used to extract essential topological features hidden in the dataset. The performance of the proposed method is compared with that of traditional HACA and evaluated based on its ability to categorize areas according to the exceedance level of the particulate matter (PM10). Results show that including the additional topological features yields better accuracy than the case that does not consider topological features. The cluster validity indices are computed to verify the results, and the proposed method outperforms the traditional method, suggesting a practical alternative approach for assessing the similarity in air pollution behaviors based on topological characterizations.
• A framework of hybrid hierarchical clustering (HACA) with persistent homology is proposed.
• Hybrid HACA outperformed traditional HACA in categorizing stations according to haze severity.
• The superiority of hybrid HACA is verified by cluster validity indices.
• The most affected areas could be identified through hybrid HACA.
• Better air quality management can be achieved by using the new approach to categorize affected areas.
ISSN: 0301-4797, 1095-8630
DOI: 10.1016/j.jenvman.2022.114434
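
The summary describes replacing the Euclidean dissimilarity in HACA with a Wasserstein distance between persistent-homology summaries of each station. The following is a minimal sketch of that idea, not the authors' implementation. Several choices are assumptions made for illustration and are not stated in this record: each station's series is turned into a point cloud by a two-dimensional delay embedding, only 0-dimensional persistence (connected components) is computed, the 1-Wasserstein distance uses an L-infinity ground metric, and average linkage is used for the agglomerative step. All function names are hypothetical, and the data are synthetic stand-ins for PM10 readings.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.optimize import linear_sum_assignment
from scipy.spatial.distance import squareform


def delay_embed(series, dim=2, delay=1):
    """Turn a 1-D series into a point cloud of delay vectors (assumed preprocessing)."""
    series = np.asarray(series, dtype=float)
    n = len(series) - (dim - 1) * delay
    return np.column_stack([series[i * delay : i * delay + n] for i in range(dim)])


def h0_diagram(points):
    """Finite H0 persistence pairs of a Vietoris-Rips filtration.

    Every connected component is born at scale 0 and dies at a
    single-linkage merge height, so the merge heights are the death times.
    """
    deaths = linkage(points, method="single")[:, 2]
    return np.column_stack([np.zeros_like(deaths), deaths])


def wasserstein_1(dgm_a, dgm_b):
    """1-Wasserstein distance between two diagrams (L-infinity ground metric)."""
    n, m = len(dgm_a), len(dgm_b)
    cost = np.zeros((n + m, n + m))
    # Cost of matching a point of A to a point of B.
    cost[:n, :m] = np.abs(dgm_a[:, None, :] - dgm_b[None, :, :]).max(axis=2)
    # Cost of matching a point to the diagonal: half its persistence.
    cost[:n, m:] = ((dgm_a[:, 1] - dgm_a[:, 0]) / 2)[:, None]
    cost[n:, :m] = ((dgm_b[:, 1] - dgm_b[:, 0]) / 2)[None, :]
    # The remaining block (diagonal to diagonal) stays at zero cost.
    rows, cols = linear_sum_assignment(cost)
    return cost[rows, cols].sum()


def hybrid_haca(station_series, n_clusters):
    """Cluster stations by the Wasserstein distance between their H0 diagrams."""
    diagrams = [h0_diagram(delay_embed(s)) for s in station_series]
    k = len(diagrams)
    dist = np.zeros((k, k))
    for i in range(k):
        for j in range(i + 1, k):
            dist[i, j] = dist[j, i] = wasserstein_1(diagrams[i], diagrams[j])
    # Average linkage on the precomputed (condensed) distance matrix.
    tree = linkage(squareform(dist), method="average")
    return fcluster(tree, t=n_clusters, criterion="maxclust"), dist


# Synthetic stand-ins for PM10 series: three calm and three volatile stations.
rng = np.random.default_rng(0)
series = [rng.normal(50, 1, 60) for _ in range(3)]
series += [rng.normal(50, 10, 60) for _ in range(3)]
labels, dist = hybrid_haca(series, n_clusters=2)
print(labels)
```

Swapping `wasserstein_1` for a plain Euclidean distance between the raw series would give the traditional HACA baseline that the summary compares against. A dedicated persistent-homology library would be needed to include higher-dimensional features such as loops.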