Acoustic Feature Analysis and Discriminative Modeling for Language Identification of Closely Related South-Asian Languages
Published in: | Circuits, systems, and signal processing, 2018-08, Vol.37 (8), p.3589-3604 |
---|---|
Main authors: | , |
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Full text |
Summary: | With advances in technology, communication between people from different linguistic backgrounds around the world is steadily increasing, creating a need for language identification services. Language identification techniques extract distinguishing information from speech corpora as features of a language in order to differentiate one language from another. Without publicly available speech corpora, comparisons between different techniques are not very reliable. This paper investigates state-of-the-art features and techniques for language identification of under-resourced and closely related languages, namely Pashto, Punjabi, Sindhi, and Urdu. For this purpose, a speech corpus is designed and collected for these languages. The dataset consists of read speech collected over the telephone network (mobile and landline) from different regions of Pakistan. The speech corpus is annotated at the sentence level using X-SAMPA, its orthographic transcription is also provided, and the verified data are divided into training and evaluation sets. Mel-frequency cepstral coefficients and their shifted delta cepstral features are used to develop a language identification system for the target languages. Gaussian mixture model with universal background model (GMM-UBM)-based and i-vector-based language identification approaches are investigated. The results show that GMM-UBM is more effective than the i-vector approach for language identification of short-duration test utterances. |
---|---|
ISSN: | 0278-081X 1531-5878 |
DOI: | 10.1007/s00034-017-0724-1 |
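
The summary names shifted delta cepstral (SDC) features computed from Mel-frequency cepstral coefficients but does not give their configuration. The following is a minimal sketch of the standard N-d-P-k SDC construction, not the authors' implementation. The function name `shifted_delta_cepstra`, the default parameters 7-1-3-7 (a configuration commonly used in language identification), and the edge handling by clamping frame indices are all assumptions made for illustration.

```python
from typing import List, Sequence


def shifted_delta_cepstra(
    cepstra: Sequence[Sequence[float]],
    n: int = 7,  # cepstral coefficients used per frame (assumed default)
    d: int = 1,  # half-width of the delta window, in frames (assumed default)
    p: int = 3,  # shift between consecutive delta blocks (assumed default)
    k: int = 7,  # number of delta blocks stacked per frame (assumed default)
) -> List[List[float]]:
    """Stack k delta vectors per frame, each taken p frames further ahead.

    Block i of frame t is c[t + i*p + d] - c[t + i*p - d], restricted to the
    first n coefficients. Indices outside the utterance are clamped to the
    first or last frame (an assumption; other padding schemes are possible).
    """
    if not cepstra:
        return []
    if any(len(frame) < n for frame in cepstra):
        raise ValueError("every frame needs at least n coefficients")

    last = len(cepstra) - 1
    features: List[List[float]] = []
    for t in range(len(cepstra)):
        stacked: List[float] = []
        for i in range(k):
            ahead = cepstra[min(last, t + i * p + d)]
            behind = cepstra[max(0, min(last, t + i * p - d))]
            stacked.extend(ahead[j] - behind[j] for j in range(n))
        features.append(stacked)
    return features


if __name__ == "__main__":
    # Synthetic "cepstra": 30 frames whose coefficients all equal the frame index.
    ramp = [[float(t)] * 7 for t in range(30)]
    sdc = shifted_delta_cepstra(ramp)
    print(len(sdc), len(sdc[0]))
```

With these defaults each frame yields n × k = 49 SDC values. In a typical pipeline they would be appended to the static MFCCs before training the GMM-UBM or i-vector back end.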