Risk-Based Semi-Supervised Discriminative Language Modeling for Broadcast Transcription

Bibliographic details
Published in:	IEICE Transactions on Information and Systems 2012/11/01, Vol.E95.D(11), pp.2674-2681
Main authors:	KOBAYASHI, Akio, OKU, Takahiro, IMAI, Toru, NAKAGAWA, Seiichi
Format:	Article
Language:	English
Subjects:	Applied sciences; Artificial intelligence; Bayes risk minimization; Broadcasting; Broadcasting. Videocommunications. Audiovisual; Computer science; control theory; systems; discriminative training; Errors; Exact sciences and technology; Information, signal and communications theory; language modeling; Lattices; Linguistics; Mathematical models; Miscellaneous; Optimization; Programming; Robustness; semi-supervised training; Signal processing; Speech and sound recognition and synthesis. Linguistics; Speech processing; Telecommunications; Telecommunications and information theory
Online access:	Full text

Description
Abstract:	This paper describes a new method for semi-supervised discriminative language modeling, designed to improve the robustness of a discriminative language model (LM) obtained from manually transcribed (labeled) data. The discriminative LM is implemented as a log-linear model that employs a set of linguistic features derived from word or phoneme sequences. The proposed semi-supervised discriminative modeling is formulated as a multi-objective programming (MOP) problem, which consists of two objective functions defined on labeled lattices and on automatic speech recognition (ASR) lattices used as unlabeled data. The objectives are coherently designed based on the expected risks that reflect information about word errors for the training data. The model is trained in a discriminative manner and obtained as a solution to the MOP. In transcribing Japanese broadcast programs, the proposed method achieved a 6.3% relative reduction in word error rate compared with a conventional trigram LM.
ISSN:	0916-8532 1745-1361
DOI:	10.1587/transinf.E95.D.2674
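
Illustrative sketch (not the authors' implementation): the abstract describes a log-linear discriminative LM trained on expected-risk objectives over labeled and unlabeled data. The Python below shows one way such objectives could look under several assumptions that are not taken from the paper: N-best lists stand in for lattices, the feature map is plain unigram and bigram counts, the risk on unlabeled data is the expected pairwise word edit distance between hypotheses, and the two objectives are combined by a simple weighted sum with a hypothetical parameter `alpha`. How the paper actually solves the MOP is not stated in the abstract.

```python
import math


def edit_distance(ref, hyp):
    """Word-level Levenshtein distance between two token lists."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        cur = [i]
        for j, h in enumerate(hyp, start=1):
            cur.append(min(prev[j] + 1,              # deletion
                           cur[j - 1] + 1,           # insertion
                           prev[j - 1] + (r != h)))  # substitution or match
        prev = cur
    return prev[-1]


def features(words):
    """Hypothetical feature map: unigram and bigram counts."""
    feats = {}
    for w in words:
        feats[("uni", w)] = feats.get(("uni", w), 0) + 1
    for a, b in zip(words, words[1:]):
        feats[("bi", a, b)] = feats.get(("bi", a, b), 0) + 1
    return feats


def posteriors(nbest, weights):
    """Log-linear posterior over hypotheses.

    nbest is a list of (words, baseline_log_score) pairs; the score of each
    hypothesis is its baseline score plus the weighted feature sum.
    """
    scores = [
        base + sum(weights.get(k, 0.0) * v for k, v in features(words).items())
        for words, base in nbest
    ]
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def supervised_risk(nbest, reference, weights):
    """Expected word errors against a manual transcript (labeled data)."""
    post = posteriors(nbest, weights)
    return sum(p * edit_distance(reference, words)
               for p, (words, _) in zip(post, nbest))


def unsupervised_risk(nbest, weights):
    """Assumed risk for unlabeled data: expected pairwise edit distance."""
    post = posteriors(nbest, weights)
    hyps = [words for words, _ in nbest]
    return sum(pi * pj * edit_distance(hi, hj)
               for pi, hi in zip(post, hyps)
               for pj, hj in zip(post, hyps))


def combined_objective(labeled, unlabeled, weights, alpha=0.5):
    """Weighted-sum scalarization of the two objectives (an assumption)."""
    sup = sum(supervised_risk(nb, ref, weights) for nb, ref in labeled)
    unsup = sum(unsupervised_risk(nb, weights) for nb in unlabeled)
    return alpha * sup + (1.0 - alpha) * unsup
```

In this sketch, raising the weight of a feature that occurs only in the correct hypothesis shifts posterior mass toward it and lowers the supervised risk; training would search for weights that reduce `combined_objective`.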