Speech extraction based on ICA and audio-visual coherence
Main authors: | , , , |
---|---|
Format: | Conference proceedings |
Language: | eng |
Subjects: | |
Summary: | We present a new approach to the source separation problem for multiple speech signals. Using the extra visual information of the speaker's face, the method aims to extract an acoustic speech signal from other acoustic signals by exploiting its coherence with the speaker's lip movements. We define a statistical model of the joint probability of visual and spectral audio input for quantifying the audio-visual coherence. Then, separation can be achieved by maximising this joint probability. Experiments on additive mixtures of 2, 3 and 5 sources show that the algorithm performs well, and systematically better than the classical BSS algorithm JADE. |
DOI: | 10.1109/ISSPA.2003.1224816 |