Using probabilistic characteristic vector based on both phonetic and prosodic features for language identification
Main authors: | , |
---|---|
Format: | Conference proceedings |
Language: | eng |
Subjects: | |
Online access: | Order full text |
Summary: | Language identification (LID) is an important task in the indexing of audio signals. This paper introduces an LID system with a generative frontend based on both phonetic and prosodic features. The generative frontend is built upon an ensemble of Gaussian densities. Half of these Gaussian densities are trained to represent elementary speech sound units, and the other half are trained to represent prosodic properties; together they characterize a wide variety of languages. Shifted Delta Cepstral (SDC) and Pitch Contour Polynomial Approximation (PCPA) coefficients are used as features. The backend classifier is a Support Vector Machine (SVM). Several language identification experiments were conducted, and the proposed improvements were evaluated on the OGI-MLTS corpus. SVMs with the Generalized Linear Discriminant Sequence (GLDS) kernel and the Probabilistic Sequence Kernel (PSK) outperform a GMM when all systems are based on PCPA, improving LID performance by about 2.1% and 5.9%, respectively. Furthermore, combining phonetic and prosodic features yielded an improvement of roughly 4% in the four-language identification experiments. |
---|---|
DOI: | 10.1109/ISTEL.2010.5734122 |
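
The summary names Shifted Delta Cepstral (SDC) features but does not state how they were configured. As a minimal sketch of the general technique, the Python code below stacks `k` delta blocks per frame using the usual N-d-P-k parameterization. The defaults `d=1`, `P=3`, `k=7`, the function name `shifted_delta_cepstra`, and the edge padding at utterance boundaries are illustrative assumptions and are not taken from this record.

```python
import numpy as np


def shifted_delta_cepstra(cepstra, d=1, P=3, k=7):
    """Return SDC features of shape (frames, N * k).

    cepstra: array of shape (frames, N) holding N cepstral
             coefficients per frame.
    d:       half-width of the delta window, in frames (assumed default).
    P:       shift between consecutive delta blocks (assumed default).
    k:       number of delta blocks stacked per frame (assumed default).
    """
    cepstra = np.asarray(cepstra, dtype=float)
    n_frames, _ = cepstra.shape

    # Repeat the first and last frames so that every frame has the
    # context it needs (an assumed boundary convention).
    pad_left = d
    pad_right = (k - 1) * P + d
    padded = np.pad(cepstra, ((pad_left, pad_right), (0, 0)), mode="edge")

    blocks = []
    for i in range(k):
        # delta_i(t) = c(t + i*P + d) - c(t + i*P - d)
        start_ahead = pad_left + i * P + d
        start_behind = pad_left + i * P - d
        ahead = padded[start_ahead:start_ahead + n_frames]
        behind = padded[start_behind:start_behind + n_frames]
        blocks.append(ahead - behind)

    # Concatenate the k blocks along the feature axis.
    return np.hstack(blocks)
```

With 7 cepstral coefficients per frame and the defaults above, each frame is mapped to a 49-dimensional vector, so the feature covers a longer time span than a single delta computed around the frame.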