A study of semi-supervised speaker diarization system using GAN mixture model
Main authors: | , , , |
---|---|
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Order full text |
Summary: | We propose a new speaker diarization system based on a recently introduced
unsupervised clustering technique, namely the generative adversarial network
mixture model (GANMM). The proposed system uses x-vectors as the front-end
representation. Spectral embedding is used for dimensionality reduction,
followed by k-means initialization during GANMM pre-training. GANMM performs
unsupervised speaker clustering by efficiently capturing complex data
distributions. Experimental results on the AMI meeting corpus show that the
proposed semi-supervised diarization system matches or exceeds the performance
of competitive baselines. On an evaluation set containing fifty sessions with
varying durations, the best average diarization error rate (DER) achieved is
17.11%, a relative improvement of 33% over the information bottleneck baseline
and comparable to the x-vector baseline. |
---|---|
DOI: | 10.48550/arxiv.1910.11416 |
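
Note on the reported figures: the summary gives the system DER (17.11%) and the relative improvement (33%), but not the DER of the information bottleneck baseline. A minimal sketch of the arithmetic follows. It assumes the usual definition of relative improvement, (baseline - system) / baseline, which this record does not confirm. The function names are illustrative only, and because 33% is a rounded figure, the implied baseline is approximate.

```python
def relative_improvement(baseline_der, system_der):
    """Fraction by which system_der undercuts baseline_der (assumed definition)."""
    return (baseline_der - system_der) / baseline_der


def implied_baseline(system_der, rel_improvement):
    """Invert the formula above to estimate the baseline DER."""
    return system_der / (1.0 - rel_improvement)


system_der = 17.11  # percent, as stated in the summary
rel = 0.33          # rounded relative improvement, as stated in the summary

baseline = implied_baseline(system_der, rel)
print(f"implied baseline DER: {baseline:.1f}%")  # prints "implied baseline DER: 25.5%"
```

The result is an estimate derived from the two stated numbers, not a value reported in this record.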