A freshwater algae classification system based on machine learning with StyleGAN2-ADA augmentation for limited and imbalanced datasets
Published in: | Water research (Oxford) 2023-09, Vol.243, p.120409-120409, Article 120409 |
---|---|
Main authors: | , , , |
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Full text |
Abstract: | • Overcame the challenge of limited and imbalanced data in automated algae classification.
• Image augmentation with StyleGAN2-ADA notably improved algae classification models.
• Accuracy improved by 16% and F1-score by 21% for rare algae classification.
• Generated algal images were used to train models effectively and efficiently.
Automated algae classification using machine learning is more efficient and effective than manual classification, which can be tedious and time-consuming. However, the practical application of such an approach is restricted by the scarcity of labeled freshwater algae datasets, especially for rarer algae. To overcome this challenge, this study proposes generating artificial algal images with StyleGAN2-ADA and using both the generated and real images to train machine-learning-driven algae classification models. This approach significantly enhances the performance of the classification models, particularly their ability to identify rare algae. Overall, the proposed approach improves the F1-score of lightweight MobileNetV3 classification models covering all 20 freshwater algae considered in this research from 88.4% to 96.2%, while for the models that cover only the rarer algae, the experiments show an improvement in F1-score from 80% to 96.5%. The results show that the approach enables the trained algae classification systems to effectively cover algae with limited image data. |
---|---|
ISSN: | 0043-1354, 1879-2448 |
DOI: | 10.1016/j.watres.2023.120409 |
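
The abstract describes topping up scarce classes with generated images and judging the result by F1-score. The Python sketch below illustrates those two ideas only; it is not the authors' pipeline. The class names, image counts, and target size are hypothetical, and macro-averaged F1 is an assumption, since the record does not state which F1 variant the paper reports.

```python
def synthetic_quota(real_counts, target_per_class):
    """Return how many generated images each class needs to reach the target.

    real_counts: dict mapping class name -> number of real images.
    Classes already at or above the target get a quota of 0.
    """
    return {cls: max(0, target_per_class - n) for cls, n in real_counts.items()}


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores (assumed averaging scheme)."""
    classes = sorted(set(y_true) | set(y_pred))
    scores = []
    for c in classes:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        denom = 2 * tp + fp + fn
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0 when the class never appears
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical counts: one common and one rare class.
real_counts = {"common_alga": 500, "rare_alga": 40}
quota = synthetic_quota(real_counts, target_per_class=300)

# Hypothetical predictions on four test images.
score = macro_f1(
    ["common_alga", "common_alga", "rare_alga", "rare_alga"],
    ["common_alga", "common_alga", "rare_alga", "common_alga"],
)
```

Under these assumptions, only the rare class receives a quota of generated images, and the macro average gives the rare class the same weight as the common one, which is why a metric of this kind is sensitive to improvements on rare algae.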