A Tandem Algorithm for Singing Pitch Extraction and Voice Separation From Music Accompaniment

Bibliographic details
Published in:	IEEE Transactions on Audio, Speech, and Language Processing, 2012-07, Vol. 20 (5), p. 1482-1491
Main authors:	Chao-Ling Hsu, DeLiang Wang, Jyh-Shing Roger Jang, Ke Hu
Format:	Article
Language:	English
Keywords:	Algorithms; Applied sciences; Computational auditory scene analysis (CASA); Detection, estimation, filtering, equalization, prediction; Estimation; Exact sciences and technology; Extraction; Harmonic analysis; Hidden Markov models; Information, signal and communications theory; Instruments; Iterative procedure; Miscellaneous; Music; Musical instruments; Musical recordings; Pitch extraction; Separation; Signal and communications theory; Signal processing; Signal representation. Spectral analysis; Signal, noise; Singing; Singing voice separation; Spectrogram; Speech; Speech processing; Tandem algorithm; Telecommunications and information theory; Time frequency analysis; Trends; Voice

Description
Abstract:	Singing pitch estimation and singing voice separation are challenging due to the presence of music accompaniments that are often nonstationary and harmonic. Inspired by computational auditory scene analysis (CASA), this paper investigates a tandem algorithm that estimates the singing pitch and separates the singing voice jointly and iteratively. Rough pitches are first estimated and then used to separate the target singer by considering harmonicity and temporal continuity. The separated singing voice and estimated pitches are used to improve each other iteratively. To enhance the performance of the tandem algorithm for dealing with musical recordings, we propose a trend estimation algorithm to detect the pitch ranges of a singing voice in each time frame. The detected trend substantially reduces the difficulty of singing pitch detection by removing a large number of wrong pitch candidates produced either by musical instruments or by the overtones of the singing voice. Systematic evaluation shows that the tandem algorithm outperforms previous systems for pitch extraction and singing voice separation.
ISSN:	1558-7916 2329-9290 1558-7924 2329-9304
DOI:	10.1109/TASL.2011.2182510