Hybrid Deep Learning Model for Singing Voice Separation

Bibliographic Details
Published in:	Mendel (Brno (Czech Republic)) 2021-12, Vol.27 (2)
Main Authors:	Rusul Amer, Ahmed Al Tmeme
Format:	Article
Language:	English
Subjects:	Convolution Neural Network; Dense Neural Network; Hybrid Deep Learning; Monaural Source Separation; Recurrent Neural Network; Time Frequency Masking
Online Access:	Full text

Description
Summary:	Monaural source separation is challenging because only a single channel is available, yet the range of possible solutions is unlimited. This paper presents a monaural source separation model based on a hybrid deep learning architecture consisting of a convolutional neural network (CNN), a dense neural network (DNN), and a recurrent neural network (RNN). A trial-and-error method is used to optimize the number of layers in the proposed model. The effects of the learning rate, the optimization algorithm, and the number of epochs on separation performance are also explored. The model was evaluated on the MIR-1K dataset for singing voice separation. The proposed approach achieves gains of 4.81 dB GNSDR, 7.28 dB GSIR, and 3.39 dB GSAR in comparison to current approaches.
ISSN:	1803-3814 2571-3701
DOI:	10.13164/mendel.2021.2.044