A review of deep learning techniques in audio event recognition (AER) applications
Published in: | Multimedia Tools and Applications 2024, Vol. 83 (3), pp. 8129-8143 |
---|---|
Main authors: | , , |
Format: | Article |
Language: | English |
Subjects: | |
Online access: | Full text |
Abstract: | In day-to-day life, observing human and social actions is highly important for public protection and security. Identifying suspicious activity is also essential in critical environments such as industry, smart homes, nursing homes, and old age homes. In most audio-based applications, the Audio Event Recognition (AER) task plays a vital role in recognizing audio events. Although many approaches focus on the effective implementation of audio-based applications, major research problems remain, such as overlapping events, the presence of background noise, and the lack of benchmark datasets. The main objective of this survey is to identify effective feature extraction methods, robust classifiers, and benchmark datasets. To achieve this, we present a detailed survey of the features, deep learning classifiers, and datasets used in AER applications. We also summarise the various methods involved in AER applications such as audio spoofing, audio surveillance, and audio fingerprinting. Future directions include setting up a benchmark dataset, identifying semantic features, and exploring transfer learning-based classifiers. |
---|---|
ISSN: | 1380-7501 1573-7721 |
DOI: | 10.1007/s11042-023-15891-z |