Associating faces with voices for speaker diarization within videos

Bibliographic details
Main authors: Chaudhuri, Sourish; Hoover, Kenneth
Format: Patent
Language: English
Description
Summary: A computer-implemented method for speech diarization is described. The method comprises determining temporal positions of separate faces in a video using face detection and clustering. Voice features are detected in the speech sections of the video. The method further includes generating a correlation between the determined separate faces and separate voices based at least on the temporal positions of the separate faces and the separate voices in the video. This correlation is stored in a content store with the video.
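
The correlation step in the summary can be illustrated with a minimal Python sketch. It is not the patented method: it assumes that face clustering and voice clustering have already produced time intervals per cluster, and it links each voice to the face it shares the most screen time with. The function names, the tuple layout, and the sample data are all hypothetical.

```python
from collections import defaultdict


def overlap(a_start, a_end, b_start, b_end):
    """Length in seconds of the intersection of two intervals (0.0 if disjoint)."""
    return max(0.0, min(a_end, b_end) - max(a_start, b_start))


def correlate_faces_and_voices(face_intervals, voice_intervals):
    """Map each voice cluster to the face cluster it co-occurs with longest.

    Both arguments are lists of (cluster_id, start_sec, end_sec) tuples.
    This overlap heuristic is an assumption made for illustration only.
    """
    totals = defaultdict(float)  # (voice_id, face_id) -> seconds of co-occurrence
    for voice_id, v_start, v_end in voice_intervals:
        for face_id, f_start, f_end in face_intervals:
            shared = overlap(v_start, v_end, f_start, f_end)
            if shared > 0:
                totals[(voice_id, face_id)] += shared

    best = {}  # voice_id -> (face_id, seconds)
    for (voice_id, face_id), seconds in totals.items():
        if voice_id not in best or seconds > best[voice_id][1]:
            best[voice_id] = (face_id, seconds)
    return {voice_id: face_id for voice_id, (face_id, _) in best.items()}


# Hypothetical sample data: two face clusters and two voice clusters.
faces = [("face_A", 0.0, 10.0), ("face_B", 8.0, 20.0)]
voices = [("voice_1", 1.0, 7.0), ("voice_2", 11.0, 18.0), ("voice_1", 9.0, 10.0)]
mapping = correlate_faces_and_voices(faces, voices)
print(mapping)
```

In this sketch a voice that never overlaps any face is simply left out of the mapping. The resulting dictionary stands in for the correlation that the summary says is stored in a content store with the video.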