Attention based gender and nationality information exploration for speaker identification
Published in: | Digital signal processing 2022-04, Vol.123, p.103449, Article 103449 |
---|---|
Main authors: | , , , , , , , , , |
Format: | Article |
Language: | English |
Subjects: | |
Online access: | Full text |
Abstract: | Gender and nationality information has not been exploited in large-scale speaker recognition, even though the popular VoxCeleb1 dataset provides it. This paper explores methods that combine high-level features extracted from gender and nationality information with low-level acoustic features for speaker identification. To our knowledge, this is the first time that the gender and nationality information provided in VoxCeleb1 has been used for speaker identification. Specifically, we propose a Gender-Guided Spectrogram-Attention network and a Nationality-Guided Spectrogram-Attention network that embed gender and nationality information, respectively, into the spectrogram features. The resulting gender and nationality embeddings are then used together with the spectrogram features for classification. Experimental results show that the proposed methods successfully capture the gender and nationality information of the speakers and improve speaker identification accuracy. |
---|---|
ISSN: | 1051-2004 1095-4333 |
DOI: | 10.1016/j.dsp.2022.103449 |
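
The abstract describes attention networks that derive gender and nationality embeddings from spectrogram features and then use them together with those features for classification. The record gives no architectural details, so the following is only a rough sketch of that general idea and not the authors' networks. It assumes scaled dot-product attention over time frames, mean pooling of the spectrogram features, fusion by concatenation, and a linear softmax classifier. All names, sizes, and the random vectors standing in for learned parameters are hypothetical.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(frames, query):
    """Weight each frame by its scaled dot product with `query`, then sum."""
    scale = math.sqrt(len(query))
    scores = [sum(f * q for f, q in zip(frame, query)) / scale for frame in frames]
    weights = softmax(scores)
    dim = len(frames[0])
    pooled = [sum(w * frame[d] for w, frame in zip(weights, frames))
              for d in range(dim)]
    return pooled, weights


def mean_pool(frames):
    """Average the frames over time."""
    n = len(frames)
    return [sum(frame[d] for frame in frames) / n for d in range(len(frames[0]))]


def classify(features, weight_rows, biases):
    """Linear layer followed by a softmax over speaker classes."""
    logits = [sum(w * x for w, x in zip(row, features)) + b
              for row, b in zip(weight_rows, biases)]
    return softmax(logits)


rng = random.Random(0)
T, D, N_SPEAKERS = 6, 4, 3  # hypothetical: frames, feature size, speakers

# Random stand-in for spectrogram features: T frames of D values each.
frames = [[rng.gauss(0, 1) for _ in range(D)] for _ in range(T)]

# Random stand-ins for learned attribute queries (assumption, not from the paper).
gender_query = [rng.gauss(0, 1) for _ in range(D)]
nationality_query = [rng.gauss(0, 1) for _ in range(D)]

gender_emb, gender_w = attention_pool(frames, gender_query)
nat_emb, nat_w = attention_pool(frames, nationality_query)

# Fuse by list concatenation: acoustic summary plus both attribute embeddings.
fused = mean_pool(frames) + gender_emb + nat_emb

# Untrained linear classifier over the fused vector.
W = [[rng.gauss(0, 0.1) for _ in range(3 * D)] for _ in range(N_SPEAKERS)]
b = [0.0] * N_SPEAKERS
probs = classify(fused, W, b)
```

In a trained system the two query vectors and the classifier weights would be learned, with gender and nationality labels presumably supervising the attention branches. Concatenation is just one simple way to let the classifier see the acoustic and attribute-guided representations together.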