Associating faces with voices for speaker diarization within videos



Bibliographic details
Main authors:	Chaudhuri, Sourish, Hoover, Kenneth
Format:	Patent
Language:	English
Subjects:	ACOUSTICS; CALCULATING; COMPUTING; COUNTING; ELECTRIC COMMUNICATION TECHNIQUE; ELECTRICITY; HANDLING RECORD CARRIERS; INFORMATION STORAGE; INFORMATION STORAGE BASED ON RELATIVE MOVEMENT BETWEEN RECORD CARRIER AND TRANSDUCER; MUSICAL INSTRUMENTS; PHYSICS; PICTORIAL COMMUNICATION, e.g. TELEVISION; PRESENTATION OF DATA; RECOGNITION OF DATA; RECORD CARRIERS; SPEECH ANALYSIS OR SYNTHESIS; SPEECH OR AUDIO CODING OR DECODING; SPEECH OR VOICE PROCESSING; SPEECH RECOGNITION

Description
Abstract:	A computer-implemented method for speech diarization is described. The method comprises determining temporal positions of separate faces in a video using face detection and clustering. Voice features are detected in the speech sections of the video. The method further includes generating a correlation between the determined separate faces and separate voices based at least on the temporal positions of the separate faces and the separate voices in the video. This correlation is stored in a content store with the video.
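
The correlation step in the abstract can be illustrated with a minimal sketch. It is not the patented method: it assumes face clustering and voice clustering have already produced, for each face and each voice, a list of time intervals in seconds, and it scores each face-voice pair by total temporal overlap only. The names `overlap`, `correlate`, and `assign_voices` are hypothetical and chosen for illustration.

```python
from typing import Dict, List, Optional, Tuple

Interval = Tuple[float, float]  # (start_seconds, end_seconds)
Tracks = Dict[str, List[Interval]]  # cluster id -> intervals where it appears


def overlap(a: Interval, b: Interval) -> float:
    """Length of the intersection of two intervals (0.0 if they are disjoint)."""
    return max(0.0, min(a[1], b[1]) - max(a[0], b[0]))


def correlate(faces: Tracks, voices: Tracks) -> Dict[Tuple[str, str], float]:
    """Score every (face, voice) pair by total seconds of temporal overlap.

    Assumption: co-occurrence in time is the only signal used here.
    """
    scores: Dict[Tuple[str, str], float] = {}
    for face_id, face_intervals in faces.items():
        for voice_id, voice_intervals in voices.items():
            scores[(face_id, voice_id)] = sum(
                overlap(f, v) for f in face_intervals for v in voice_intervals
            )
    return scores


def assign_voices(faces: Tracks, voices: Tracks) -> Dict[str, Optional[str]]:
    """Map each voice to the face it overlaps most, or None if it overlaps none."""
    scores = correlate(faces, voices)
    assignment: Dict[str, Optional[str]] = {}
    for voice_id in voices:
        best = max(faces, key=lambda f: scores[(f, voice_id)], default=None)
        if best is not None and scores[(best, voice_id)] > 0.0:
            assignment[voice_id] = best
        else:
            assignment[voice_id] = None
    return assignment


# Hypothetical example data: two face tracks and two voice tracks.
faces = {"face_A": [(0.0, 10.0), (20.0, 30.0)], "face_B": [(10.0, 20.0)]}
voices = {"voice_1": [(1.0, 9.0), (21.0, 28.0)], "voice_2": [(11.0, 19.0)]}
print(assign_voices(faces, voices))
# {'voice_1': 'face_A', 'voice_2': 'face_B'}
```

The resulting mapping is the kind of face-to-voice correlation that could be stored alongside the video. A greedy best-overlap assignment is used here for brevity, and it breaks down when several faces are on screen while one person speaks.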