Method and system for deriving a large-span semantic language model for large-vocabulary recognition systems

Bibliographic details
Main authors: BELLEGARDA, JEROME R; CHOW, YEN-LU
Format: Patent
Language: English
Description
Summary: A system and method for deriving a large-span semantic language model for a large-vocabulary recognition system are disclosed. The method and system map words from a vocabulary into a vector space, where each word is represented by a vector. After the words are mapped into the space, the vectors are clustered into a set of clusters, where each cluster represents a semantic event. After clustering, the probability that a first word will occur given a history of prior words is computed by (i) calculating a probability that the vector representing the first word belongs to each of the clusters; (ii) calculating a probability of each cluster occurring in a history of prior words; and (iii) weighting (i) by (ii) to provide the probability.
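
The three steps in the summary can be illustrated with a minimal Python sketch. It is not the patented implementation: the toy word vectors, the two fixed cluster centroids, the softmax-over-distance membership rule, the averaging over history words, and the final normalization over the vocabulary are all assumptions made only to show how the weighting in (i) to (iii) fits together.

```python
import math

def membership(vec, centroids):
    """Step (i): soft probability that a vector belongs to each cluster.
    Assumption: softmax of the negative squared distance to each centroid."""
    scores = [-sum((a - b) ** 2 for a, b in zip(vec, c)) for c in centroids]
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def cluster_probs_in_history(history, vectors, centroids):
    """Step (ii): probability of each cluster occurring in the history.
    Assumption: the mean cluster membership of the history words."""
    k = len(centroids)
    if not history:
        return [1.0 / k] * k  # no evidence yet: uniform over clusters
    sums = [0.0] * k
    for word in history:
        for i, p in enumerate(membership(vectors[word], centroids)):
            sums[i] += p
    return [s / len(history) for s in sums]

def word_probs_given_history(history, vectors, centroids):
    """Step (iii): weight (i) by (ii) for every vocabulary word.
    Assumption: scores are normalized so they sum to 1 over the vocabulary."""
    hist = cluster_probs_in_history(history, vectors, centroids)
    raw = {
        word: sum(p * h for p, h in zip(membership(vec, centroids), hist))
        for word, vec in vectors.items()
    }
    norm = sum(raw.values())
    return {word: score / norm for word, score in raw.items()}

# Hypothetical toy data: 2-D word vectors and two "semantic event" centroids.
VECTORS = {
    "stock": (1.0, 0.1), "bond": (0.9, 0.2), "market": (0.8, 0.0),
    "goal": (0.1, 1.0), "referee": (0.0, 0.9), "match": (0.2, 0.8),
}
CENTROIDS = [(0.9, 0.1), (0.1, 0.9)]

probs = word_probs_given_history(["stock", "market"], VECTORS, CENTROIDS)
for word, p in sorted(probs.items(), key=lambda item: -item[1]):
    print(f"{word}: {p:.3f}")
```

With a finance-flavoured history such as "stock", "market", words near the first centroid (for example "bond") receive a higher probability than words near the second (for example "goal"), which is the large-span effect the summary describes: the prediction depends on the semantic content of the whole history rather than on the immediately preceding words.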