Sound Events Recognition and Retrieval Using Multi-Convolutional-Channel Sparse Coding Convolutional Neural Networks

This article proposes two novel deep convolutional neural networks (CNN), which are called the sparse coding convolutional neural network (SC-CNN) and the multi-convolutional-channel SC-CNN (MSC-CNN), to address the sound event recognition and retrieval problem. Unlike the general framework of a CNN...

Ausführliche Beschreibung

Gespeichert in:
Bibliographische Detailangaben
Veröffentlicht in:IEEE/ACM transactions on audio, speech, and language processing speech, and language processing, 2020, Vol.28, p.1875-1887
Hauptverfasser: Wang, Chien-Yao, Tai, Tzu-Chiang, Wang, Jia-Ching, Santoso, Andri, Mathulaprangsan, Seksan, Chiang, Chin-Chin, Wu, Chung-Hsien
Format: Artikel
Sprache:eng
Schlagworte:
Online-Zugang:Volltext bestellen
Tags: Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
Beschreibung
Zusammenfassung:This article proposes two novel deep convolutional neural networks (CNN), which are called the sparse coding convolutional neural network (SC-CNN) and the multi-convolutional-channel SC-CNN (MSC-CNN), to address the sound event recognition and retrieval problem. Unlike the general framework of a CNN, in which the feature learning process is performed hierarchically, the proposed framework models the whole memorization process in the human brain, including encoding, storage, and recollection. In particular, the MSC-CNN is designed to recognize multiple sound events that occur simultaneously. The experimental results indicate that the proposed SC-CNN and MSC-CNN outperforms the state-of-the-art systems in sound event recognition and retrieval.
ISSN:2329-9290
2329-9304
DOI:10.1109/TASLP.2020.2964959