A review of deep learning techniques in audio event recognition (AER) applications

Bibliographic details
Published in:	Multimedia Tools and Applications, 2024, Vol. 83 (3), pp. 8129-8143
Main authors:	Prashanth, Arjun; Jayalakshmi, S. L.; Vedhapriyavadhana, R.
Format:	Article
Language:	eng
Subjects:	Background noise; Benchmarks; Classifiers; Computer Communication Networks; Computer Science; Data Structures and Information Theory; Datasets; Deep learning; Feature extraction; Fingerprinting; Machine learning; Multimedia Information Systems; Nursing homes; Recognition; Smart buildings; Special Purpose and Application-Based Systems; Spoofing
Online access:	Full text

Description
Abstract:	In day-to-day life, observing human and social actions is highly important for public protection and security. Identifying suspicious activity is also essential in critical environments such as industry, smart homes, nursing homes, and old age homes. In most audio-based applications, the Audio Event Recognition (AER) task plays a vital role in recognizing audio events. Although many approaches focus on the effective implementation of audio-based applications, major research problems remain, such as overlapping events, the presence of background noise, and the lack of benchmark datasets. The main objective of this survey is to identify effective feature extraction methods, robust classifiers, and benchmark datasets. To this end, we present a detailed survey of the features, deep learning classifiers, and datasets used in AER applications. We also summarise the various methods involved in AER applications such as audio spoofing, audio surveillance, and audio fingerprinting. Future directions include setting up a benchmark dataset, identifying semantic features, and exploring transfer learning-based classifiers.
ISSN:	1380-7501, 1573-7721
DOI:	10.1007/s11042-023-15891-z