Data selection in frog chorusing recognition with acoustic indices
Published in: Ecological Informatics, 2020-11, Vol. 60, p. 101160, Article 101160
Main authors:
Format: Article
Language: English
Subjects:
Online access: Full text
Abstract: This research explores the data selection problem in acoustic recognition of two co-existing sibling frog species from long-duration field recordings, covering both instance selection and acoustic index feature selection for machine-learning recognisers. The Wallum Sedgefrog (Litoria olongburensis) is the most threatened acid frog species, facing habitat loss and degradation across much of its distribution, in addition to further pressures associated with anecdotally recognised competition from its sibling species, the Eastern Sedgefrog (Litoria fallax). Monitoring the calling behaviours of these two species is essential for informing L. olongburensis management and protection, and for obtaining ecological information about the process and implications of their competition. Recognising them in recordings requires automated recognisers rather than manual surveys because of the overwhelmingly large volume of acoustic data. However, annotating acoustic data takes considerable time and effort, which results in a lack of labelled data for training recognisers. In addition, the composition of field audio recordings is complex and varies with weather conditions, seasonal changes and animal activities; in particular, the chorusing behaviour of these two frog species can be greatly affected by weather, for example rain. Since only a limited amount of data can be selected from a very diverse data pool for annotation, it is important to explore how the selection of instances and features affects recogniser performance. In this paper, we selected two 24-h audio datasets recorded under different weather conditions. We first give a detailed visual comparison of them using false-colour spectrograms. We then use datasets from individual days, their combination and a synthetic set constructed from them to evaluate the impact of different training instances on recognition performance, and we analyse the effectiveness of acoustic index features. The experimental results show that models trained on data from a vocalisation-intense day or a normal day do not transfer well to each other. However, extra data from the normal-day dataset does help recognition on a vocalisation-intense day, especially when the composition of the training set is consistent with the real-life weather-condition ratio. Generally, including more data in tr…
ISSN: 1574-9541
DOI: 10.1016/j.ecoinf.2020.101160
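The data-selection experiment summarised in the abstract (training recognisers on a vocalisation-intense day, a normal day, their combination, and a mixture matching an assumed real-life weather-condition ratio, with acoustic indices as per-minute features) can be outlined in code. The sketch below is not from the paper: the random-forest classifier, the macro-F1 metric, the 1:3 mixing ratio and all variable names are assumptions used only to illustrate the evaluation pattern.

```python
# Illustrative sketch only (not the authors' implementation): cross-day
# evaluation of a frog-call recogniser trained on per-minute acoustic-index
# features. Classifier choice, mixing ratio and names are assumptions.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import f1_score


def cross_day_score(X_train, y_train, X_test, y_test):
    """Train on one day's labelled minutes and test on another day's."""
    clf = RandomForestClassifier(n_estimators=200, random_state=0)
    clf.fit(X_train, y_train)
    return f1_score(y_test, clf.predict(X_test), average="macro")


def mix_by_ratio(X_intense, y_intense, X_normal, y_normal,
                 normal_per_intense=3, seed=0):
    """Build a training set whose day composition follows an assumed
    real-life weather-condition ratio (here 1 intense : 3 normal)."""
    rng = np.random.default_rng(seed)
    n_normal = min(len(X_normal), normal_per_intense * len(X_intense))
    idx = rng.choice(len(X_normal), size=n_normal, replace=False)
    return (np.vstack([X_intense, X_normal[idx]]),
            np.concatenate([y_intense, y_normal[idx]]))


# X_intense / X_normal: (n_minutes, n_acoustic_indices) feature matrices for a
# vocalisation-intense day and a normal day; y_*: per-minute calling labels.
# score_a = cross_day_score(X_intense, y_intense, X_normal, y_normal)
# score_b = cross_day_score(X_normal, y_normal, X_intense, y_intense)
# X_mix, y_mix = mix_by_ratio(X_intense, y_intense, X_normal, y_normal)
# score_mix = cross_day_score(X_mix, y_mix, X_normal, y_normal)
```

Any classifier over per-minute acoustic-index features would fit the same pattern; the essential point is that train/test splits respect day boundaries so that generalisation across weather conditions is actually measured.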