Audio-Visual Underdetermined Blind Source Separation Algorithm Based on Gaussian Potential Function


Detailed Description

Bibliographic Details
Published in: China Communications 2014-06, Vol. 11 (6), p. 71-80
Main authors: Ye, Zhang; Kang, Cao; Kangrui, Wu; Tenglong, Yu; Nanrun, Zhou
Format: Article
Language: English
Description
Abstract: Most existing algorithms for the underdetermined blind source separation (UBSS) problem are two-stage algorithms, i.e., mixing-parameter estimation followed by source estimation. In the mixing-parameter estimation stage, the traditional clustering algorithms proposed previously are sensitive to the initialization of the mixing parameters. To reduce this sensitivity, we propose a new algorithm for the UBSS problem on anechoic speech mixtures that employs visual information, i.e., the interaural time difference (ITD) and the interaural level difference (ILD), to initialize the mixing parameters. In our algorithm, the video signals are used to estimate the distances between the microphones and the sources, from which estimates of the ITD and ILD are obtained. Under the assumption of sparsity in the time-frequency domain, the Gaussian potential function algorithm then estimates the mixing parameters, using the ITDs and ILDs as initializations. Finally, time-frequency masking recovers the sources by evaluating the various ITDs and ILDs. Experimental results demonstrate the competitive performance of the proposed algorithm compared with the baseline algorithms.
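The clustering step described in the abstract can be illustrated with a minimal sketch: per-bin delay features (standing in for ITD estimates) are scored against candidate parameter values with a Gaussian potential function, the local maxima of the potential give the estimated mixing parameters, and a binary time-frequency mask assigns each bin to its nearest estimate. All numbers here (the cluster centres 0.2 and 0.7, the kernel width, the grid) are hypothetical illustration values, not taken from the paper; the paper's visual initialization would start the search near the video-derived ITD/ILD estimates rather than scanning a blind grid.

```python
import numpy as np

def gaussian_potential(candidates, features, sigma=0.1):
    # Potential of each candidate value = sum of Gaussian kernels centred
    # on the observed per-bin features; maxima mark cluster centres.
    d = candidates[:, None] - features[None, :]
    return np.exp(-d**2 / (2 * sigma**2)).sum(axis=1)

# Synthetic 1-D features: two clusters of per-bin delay estimates around
# hypothetical "true" ITDs of 0.2 and 0.7 (arbitrary units).
rng = np.random.default_rng(0)
features = np.concatenate([0.2 + 0.02 * rng.standard_normal(200),
                           0.7 + 0.02 * rng.standard_normal(300)])

# Candidate grid; a visually-informed variant would centre this on the
# video-derived ITD estimates instead of sweeping the whole range.
grid = np.linspace(0.0, 1.0, 201)
pot = gaussian_potential(grid, features)

# Interior local maxima of the potential = estimated mixing parameters.
peaks = np.array([grid[i] for i in range(1, len(grid) - 1)
                  if pot[i] > pot[i - 1] and pot[i] > pot[i + 1]])

# Binary time-frequency mask: each bin goes to its nearest peak.
labels = np.argmin(np.abs(features[:, None] - peaks[None, :]), axis=1)
print(peaks)
```

With well-separated clusters the potential has one clear maximum per source, which is why a good initialization (here, the video-derived ITD/ILD) matters mainly when clusters overlap or the kernel width is poorly chosen.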
ISSN:1673-5447
DOI:10.1109/CC.2014.6879005