Reviewing automatic language identification

The Oregon Graduate Institute Multi-language Telephone Speech Corpus (OGI-TS) was designed specifically for language identification research. It currently consists of spontaneous and fixed-vocabulary utterances in 11 languages: English, Farsi, French, German, Hindi, Japanese, Korean, Mandarin, Spani...

Ausführliche Beschreibung

Gespeichert in:
Bibliographische Detailangaben
Veröffentlicht in:IEEE signal processing magazine 1994-10, Vol.11 (4), p.33-41
Hauptverfasser: Muthusamy, Y.K., Barnard, E., Cole, R.A.
Format: Magazinearticle
Sprache:eng
Schlagworte:
Online-Zugang:Volltext bestellen
Tags: Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
Beschreibung
Zusammenfassung:The Oregon Graduate Institute Multi-language Telephone Speech Corpus (OGI-TS) was designed specifically for language identification research. It currently consists of spontaneous and fixed-vocabulary utterances in 11 languages: English, Farsi, French, German, Hindi, Japanese, Korean, Mandarin, Spanish, Tamil, and Vietnamese. These utterances were produced by 90 native speakers in each language over real telephone lines. Language identification is related to speaker-independent speech recognition and speaker identification in several interesting ways. It is therefore not surprising that many of the recent developments in language identification can be related to developments in those two fields. We review some of the more important recent approaches to language identification against the background of successes in speaker and speech recognition. In particular, we demonstrate how approaches to language identification based on acoustic modeling and language modeling, respectively, are similar to algorithms used in speaker-independent continuous speech recognition. Thereafter, prosodic and duration-based information sources are studied. We then review an approach to language identification that draws heavily on speaker identification. Finally, the performance of some representative algorithms is reported.< >
ISSN:1053-5888
1558-0792
DOI:10.1109/79.317925