Assessing the influence of personal preferences on the choice of vocabulary for natural language generation

Bibliographic details
Published in:	Information processing & management 2013-07, Vol.49 (4), p.817-832
Main authors:	Hervás, Raquel; Francisco, Virginia; Gervás, Pablo
Format:	Article
Language:	English
Subjects:	Automation; Corpus approach; Data processing; Exact sciences and technology; Heuristic; Heuristics; Information and communication sciences; Information processing; Information processing and retrieval; Information retrieval systems. Information and document management system; Information retrieval. Man machine relationship; Information science. Documentation; Language; Lexicalization; Linguistics; Management; Natural language; Natural language generation; Natural language processing; Personalization; Preferences; Referring expression generation; Research process. Evaluation; Sciences and techniques of general use; Similarity; Speech recognition; Studies; Text; Texts; Transforms; Tunas; Vocabularies & taxonomies
Online access:	Full text

Description
Abstract:	► Most NLG systems try to find a general way of generating natural language. ► We examine the influence of personal preference in the choice of vocabulary for NLG. ► We use a corpus annotated by several people to test our hypothesis. ► The results show a decrease of 50% in error when personal preferences are considered. Referring expression generation is the part of natural language generation that decides how to refer to the entities appearing in an automatically generated text. Lexicalization is the part of this process that chooses the vocabulary or expressions used to turn the conceptual content of a referring expression into the corresponding natural language text. The task becomes challenging when the available knowledge allows more than one alternative. In those cases, heuristics are needed to decide which alternatives are most appropriate in a given situation. Whereas most work on natural language generation has focused on a generic way of generating language, in this paper we explore personal preferences as a type of heuristic that has not been properly addressed. We empirically analyze the TUNA corpus, a corpus of referring expression lexicalizations, to investigate the influence of language preferences on how people lexicalize new referring expressions in different situations. We then present two corpus-based approaches to solve the problem of referring expression lexicalization, one that takes preferences into account and one that does not. The results show a decrease of 50% in the similarity error against the reference corpus when personal preferences are used to generate the final referring expression.
ISSN:	0306-4573 1873-5371
DOI:	10.1016/j.ipm.2013.01.006