Sound Events Recognition and Retrieval Using Multi-Convolutional-Channel Sparse Coding Convolutional Neural Networks

Bibliographic Details
Published in:	IEEE/ACM Transactions on Audio, Speech, and Language Processing, 2020, Vol.28, p.1875-1887
Main Authors:	Wang, Chien-Yao; Tai, Tzu-Chiang; Wang, Jia-Ching; Santoso, Andri; Mathulaprangsan, Seksan; Chiang, Chin-Chin; Wu, Chung-Hsien
Format:	Article
Language:	English
Subjects:	Artificial neural networks; Coding; Convolutional neural networks; deep learning; Dictionaries; Hidden Markov models; Neural networks; Recognition; Retrieval; Sound; Sound event recognition; sound event retrieval; sparse coding convolutional neural network; Spectrogram; Speech coding; Speech processing; Speech recognition; Voice recognition

Description
Abstract:	This article proposes two novel deep convolutional neural networks (CNNs), called the sparse coding convolutional neural network (SC-CNN) and the multi-convolutional-channel SC-CNN (MSC-CNN), to address the sound event recognition and retrieval problem. Unlike the general framework of a CNN, in which the feature learning process is performed hierarchically, the proposed framework models the whole memorization process in the human brain, including encoding, storage, and recollection. In particular, the MSC-CNN is designed to recognize multiple sound events that occur simultaneously. The experimental results indicate that the proposed SC-CNN and MSC-CNN outperform state-of-the-art systems in sound event recognition and retrieval.
ISSN:	2329-9290, 2329-9304
DOI:	10.1109/TASLP.2020.2964959