Language model with structured penalty

A penalized loss is optimized using a corpus of language samples respective to a set of parameters of a language model. The penalized loss includes a function measuring predictive accuracy of the language model respective to the corpus of language samples and a penalty comprising a tree-structured n...

Ausführliche Beschreibung

Gespeichert in:

Bibliographische Detailangaben
Hauptverfasser:	Nelakanti Anil Kumar, Bach Francis, Bouchard Guillaume M, Archambeau Cedric, Mairal Julien
Format:	Patent
Sprache:	eng
Schlagworte:	ACOUSTICS CALCULATING COMPUTING COUNTING ELECTRIC DIGITAL DATA PROCESSING MUSICAL INSTRUMENTS PHYSICS SPEECH ANALYSIS OR SYNTHESIS SPEECH OR AUDIO CODING OR DECODING SPEECH OR VOICE PROCESSING SPEECH RECOGNITION
Online-Zugang:	Volltext bestellen
Tags:	Tag hinzufügen Keine Tags, Fügen Sie den ersten Tag hinzu!

Beschreibung
Zusammenfassung:	A penalized loss is optimized using a corpus of language samples respective to a set of parameters of a language model. The penalized loss includes a function measuring predictive accuracy of the language model respective to the corpus of language samples and a penalty comprising a tree-structured norm. The trained language model with optimized values for the parameters generated by the optimizing is applied to predict a symbol following sequence of symbols of the language modeled by the language model. In some embodiments the penalty comprises a tree-structured lp-norm, such as a tree-structured l2-norm or a tree-structured l∞-norm. In some embodiments a tree-structured l∞-norm operates on a collapsed suffix trie in which any series of suffixes of increasing lengths which are always observed in the same context are collapsed into a single node. The optimizing may be performed using a proximal step algorithm.