RQNet: Residual Quaternion CNN for Performance Enhancement in Low Complexity and Device Robust Acoustic Scene Classification
Published in: | IEEE Transactions on Multimedia 2023-01, Vol.25, p.1-13 |
---|---|
Main authors: | , |
Format: | Article |
Language: | eng |
Keywords: | |
Summary: | Acoustic Scene Classification (ASC) aims to recognize the unique acoustic characteristics of an environment. Recently, Convolutional Neural Networks (CNNs) have boosted the accuracy of ASC algorithms. However, the focus of ASC system designers has shifted from improving accuracy to incorporating real-world considerations such as device robustness and model complexity. In this paper, we address the problem of developing a low-complexity ASC system that can generalize across multiple recording devices. We propose to employ residual quaternion CNNs for low-complexity, device-robust ASC. The proposed model, RQNet, uses quaternion encoding to increase accuracy with fewer parameters. To further enhance the performance of RQNet, we employ a variant of the log-mel spectrogram called the multi-scale mel spectrogram (ms2) to represent the acoustic signal. Experiments on two benchmark ASC datasets indicate that RQNet outperforms a log-mel spectrum-based baseline by more than twofold. In addition, it has a good measure of separability between the individual classes, as indicated by AUC (Area Under the ROC Curve) scores of 0.906 and 0.994. Furthermore, it reduces the model size by 82.19% and floating-point operations by 23.25%. Consequently, RQNet is suitable for deployment in context-aware devices. |
---|---|
ISSN: | 1520-9210 1941-0077 |
DOI: | 10.1109/TMM.2023.3241553 |
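
The summary states that quaternion encoding yields higher accuracy with fewer parameters. The record does not describe RQNet's layers, so the following is only a minimal sketch of the general idea behind quaternion-valued layers, not the authors' implementation. It assumes a dense (fully connected) layer rather than a convolution, and the names `hamilton_block`, `quaternion_linear`, and `param_counts` are hypothetical. One quaternion weight is shared across four real channels through the Hamilton product, so a layer with the same input and output width stores a quarter of the weights of its real-valued counterpart.

```python
import numpy as np


def hamilton_block(r, i, j, k):
    """Build the real matrix that applies left Hamilton multiplication.

    r, i, j, k are the four component matrices of a quaternion weight,
    each of shape (out_q, in_q). The result has shape (4*out_q, 4*in_q)
    but contains only four independent blocks.
    """
    return np.block([
        [r, -i, -j, -k],
        [i,  r, -k,  j],
        [j,  k,  r, -i],
        [k, -j,  i,  r],
    ])


def quaternion_linear(x, r, i, j, k):
    """Apply a quaternion dense layer (no bias, no activation).

    x has shape (batch, 4*in_q), laid out as
    [real parts | i parts | j parts | k parts].
    """
    return x @ hamilton_block(r, i, j, k).T


def param_counts(in_features, out_features):
    """Weights in a real dense layer vs. a quaternion one of equal width.

    Assumes both widths are divisible by 4 (illustrative helper only).
    """
    real = in_features * out_features
    quat = 4 * (in_features // 4) * (out_features // 4)
    return real, quat


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    in_q, out_q = 16, 32  # quaternion units, i.e. 64 -> 128 real channels
    weights = [rng.standard_normal((out_q, in_q)) for _ in range(4)]
    x = rng.standard_normal((8, 4 * in_q))
    print(quaternion_linear(x, *weights).shape)
    print(param_counts(4 * in_q, 4 * out_q))
```

The fourfold saving shown here applies to a single layer's weights under these assumptions. The 82.19% model-size reduction reported in the summary refers to the full RQNet model against its baseline and is not reproduced by this sketch.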