A novel approach for speaker diarization system using TMFCC parameterization and Lion optimization

In audio stream containing multiple speakers, speaker diarization aids in ascertaining “who speak when”. This is an unsupervised task as there is no prior information about the speakers. It labels the speech signal conforming to the identity of the speaker, namely, input audio stream is partitioned...

Ausführliche Beschreibung

Gespeichert in:

Bibliographische Detailangaben
Veröffentlicht in:	Journal of Central South University 2017-11, Vol.24 (11), p.2649-2663
Hauptverfasser:	Subba Ramaiah, V., Rajeswara Rao, R.
Format:	Artikel
Sprache:	eng
Schlagworte:	algorithm Audio equipment cepstral Clustering coefficient diarization Energy consumption Engineering extraction frequency i-vector Lion Mel Metallic Materials Parameterization speaker Tracking Voice recognition
Online-Zugang:	Volltext
Tags:	Tag hinzufügen Keine Tags, Fügen Sie den ersten Tag hinzu!

Beschreibung
Zusammenfassung:	In audio stream containing multiple speakers, speaker diarization aids in ascertaining “who speak when”. This is an unsupervised task as there is no prior information about the speakers. It labels the speech signal conforming to the identity of the speaker, namely, input audio stream is partitioned into homogeneous segments. In this work, we present a novel speaker diarization system using the Tangent weighted Mel frequency cepstral coefficient (TMFCC) as the feature parameter and Lion algorithm for the clustering of the voice activity detected audio streams into particular speaker groups. Thus the two main tasks of the speaker indexing,i.e., speaker segmentation and speaker clustering, are improved. The TMFCC makes use of the low energy frame as well as the high energy frame with more effect, improving the performance of the proposed system. The experiments using the audio signal from the ELSDSR corpus datasets having three speakers, four speakers and five speakers are analyzed for the proposed system. The evaluation of the proposed speaker diarization system based on the tracking distance, tracking time as the evaluation metrics is done and the experimental results show that the speaker diarization system with the TMFCC parameterization and Lion based clustering is found to be superior over existing diarization systems with 95% tracking accuracy.
ISSN:	2095-2899 2227-5223
DOI:	10.1007/s11771-017-3678-3