Speaker diarization in meeting audio

Bibliographic details
Main authors: Nwe, T.L.; Hanwu Sun; Haizhou Li; Rahardja, S.
Format: Conference proceeding
Language: eng
Description
Summary: This paper describes a speaker diarization system evaluated on the NIST Rich Transcription 2007 (RT-07) meeting recognition evaluation data set for the multiple distant microphone (MDM) task. Our implementation includes three components: initial clustering, non-speech removal, and cluster purification. Initial clusters are generated using direction of arrival (DOA) information and bootstrap clustering. Multiple Gaussian mixture model (GMM) modeling for speech/non-speech classification is employed in the non-speech removal component. In addition, a novel system fusion strategy using information from the receiver operating characteristic (ROC) curve is proposed for the non-speech removal component. Finally, a consensus clustering approach together with an iterative GMM clustering method is employed for speaker cluster purification. The system achieves an overall diarization error rate (DER) of 10.81%.
ISSN: 1520-6149, 2379-190X
DOI: 10.1109/ICASSP.2009.4960523
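
The abstract above reports its result as an overall DER. As a rough illustration of what that metric measures, the following is a minimal frame-level sketch, not the official NIST scoring procedure and not the authors' code. It assumes one label per frame, with `None` meaning non-speech, and it ignores overlapped speech and any scoring tolerance around segment boundaries. The function name `diarization_error_rate` and the toy labels are hypothetical. Because hypothesis cluster names are arbitrary, the sketch tries every one-to-one mapping between hypothesis and reference speakers and keeps the best one, which is only practical for a handful of speakers.

```python
from itertools import permutations


def diarization_error_rate(reference, hypothesis):
    """Frame-level DER sketch: (missed + false alarm + confusion) / reference speech.

    Each sequence holds one label per frame; None marks non-speech.
    Overlapped speech and boundary tolerances are not modeled here.
    """
    if len(reference) != len(hypothesis):
        raise ValueError("reference and hypothesis must have the same length")

    total_speech = sum(r is not None for r in reference)
    if total_speech == 0:
        raise ValueError("reference contains no speech frames")

    pairs = list(zip(reference, hypothesis))
    # Speech in the reference that the system labeled as non-speech.
    missed = sum(r is not None and h is None for r, h in pairs)
    # Non-speech in the reference that the system labeled as speech.
    false_alarm = sum(r is None and h is not None for r, h in pairs)
    # Frames where both sides agree that someone is speaking.
    both = [(r, h) for r, h in pairs if r is not None and h is not None]

    ref_speakers = sorted({r for r in reference if r is not None})
    hyp_speakers = sorted({h for h in hypothesis if h is not None})

    # Pad the reference list so every hypothesis speaker can be assigned,
    # possibly to "no reference speaker" (None).
    size = max(len(ref_speakers), len(hyp_speakers))
    padded_ref = ref_speakers + [None] * (size - len(ref_speakers))

    # Brute-force search for the one-to-one mapping with the most matching frames.
    best_correct = 0
    for perm in permutations(padded_ref, len(hyp_speakers)):
        mapping = dict(zip(hyp_speakers, perm))
        correct = sum(mapping[h] == r for r, h in both)
        best_correct = max(best_correct, correct)

    confusion = len(both) - best_correct
    return (missed + false_alarm + confusion) / total_speech


# Toy example with hypothetical labels (8 frames).
reference = ["A", "A", "A", "B", "B", None, None, "B"]
hypothesis = ["x", "x", "y", "y", "y", "y", None, None]
der = diarization_error_rate(reference, hypothesis)
print(f"DER = {der:.2%}")
```

In the toy example there are six reference speech frames, with one missed frame, one false-alarm frame, and one confused frame under the best mapping, so the sketch yields a DER of 50%.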