A Novel Similarity Measure to Induce Semantic Classes and Its Application for Language Model Adaptation in a Dialogue System

Detailed description

Bibliographic details
Published in:	Journal of Computer Science and Technology, 2012-03, Vol. 27 (2), p. 443-450
Authors:	李亚丽; 徐为群; 颜永红
Format:	Article
Language:	eng
Subjects:	Adaptation; Analysis; Artificial Intelligence; Character recognition; Clustering; Computer Science; Computer simulation; Data Structures and Information Theory; Divergence; Experiments; Information Systems Applications (incl. Internet); Interactive computer systems; Kullback-Leibler距离 (Kullback-Leibler distance); Language; Mathematical models; Probability; Programming languages; Recall; Regular Paper; Semantics; Similarity; Similarity measures; Software Engineering; Studies; Theory of Computation; 对话系统 (dialogue system); 应用 (application); 相似性度量 (similarity measure); 相似性测度 (similarity measure); 语义 (semantics); 语言模型 (language model); 诱导 (induction)
Online access:	Full text

Description
Abstract:	In this paper, we propose a novel similarity measure based on co-occurrence probabilities for inducing semantic classes. Clustering with the new similarity measure outperforms the widely used distance based on Kullback-Leibler divergence in precision, recall, and F1. In our experiments, we induced semantic classes from an unannotated in-domain corpus and then used the induced classes and structures to generate a large in-domain corpus, which was then used for language model adaptation. The character recognition rate improved from 85.2% to 91%. We apply the new measure to address the lack of domain data for a dialogue system by first inducing classes and then generating data.
ISSN:	1000-9000 1860-4749
DOI:	10.1007/s11390-012-1233-0
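
Note on the baseline named in the abstract: this record does not give the formulas used in the paper, so the following is only a minimal sketch of a generic Kullback-Leibler-based distance between the context distributions of two words, which is the kind of baseline the abstract compares against. It does not reproduce the proposed co-occurrence measure. The function names, the add-alpha smoothing, the use of right-hand neighbours only, and the toy corpus are all assumptions made for illustration.

```python
import math
from collections import Counter


def context_distribution(word, sentences, vocab, alpha=1.0):
    """Add-alpha smoothed distribution over the words that directly follow `word`.

    Hypothetical helper: smoothing keeps every probability non-zero so that
    the KL divergence below is always finite.
    """
    counts = Counter()
    for sent in sentences:
        for left, right in zip(sent, sent[1:]):
            if left == word:
                counts[right] += 1
    total = sum(counts.values()) + alpha * len(vocab)
    return {v: (counts[v] + alpha) / total for v in vocab}


def kl(p, q):
    """Kullback-Leibler divergence D(p || q) for distributions over the same keys."""
    return sum(p[v] * math.log(p[v] / q[v]) for v in p)


def symmetric_kl(p, q):
    """Symmetrised KL divergence, usable as a distance-like score for clustering."""
    return kl(p, q) + kl(q, p)


# Toy in-domain corpus, invented for this example.
sentences = [
    "i want to fly to boston".split(),
    "i want to fly to denver".split(),
    "show flights to boston please".split(),
    "show flights to denver please".split(),
    "i want a ticket".split(),
]
vocab = sorted({w for s in sentences for w in s})
dist = {w: context_distribution(w, sentences, vocab)
        for w in ("boston", "denver", "want")}

d_city = symmetric_kl(dist["boston"], dist["denver"])   # words with the same contexts
d_other = symmetric_kl(dist["boston"], dist["want"])    # words with different contexts
print(d_city < d_other)
```

In this toy corpus the two city names occur in the same contexts, so their distance is smaller than the distance between a city name and an unrelated word; a clustering step would therefore merge the city names first.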