An automatic noun compound extraction from Arabic corpus

Bibliographic Details
Main authors:	Saif, A. M., Aziz, M. J. A.
Format:	Conference proceedings
Language:	English
Subjects:	Arabic noun compound; Association measures; Compounds; hybrid method; lemmatization; Magnetic heads; morphological variations; Mutual information; n-best evaluation method; Pragmatics; Semantics; Syntactics; Tagging

Description
Summary:	Identifying noun compounds as multi-word lexical units is an important task in natural language processing applications that require some degree of semantic interpretation, such as machine translation, information retrieval, and text summarization. In this paper, we use a hybrid method, based on linguistic knowledge and statistical measures, to extract noun compounds from an Arabic corpus. For candidate identification, we use linguistic analysis tools such as lemmatization and POS tagging to filter the candidates and determine their variations. Association measures are then computed for each candidate in order to rank the candidates. We evaluate the association measures using the n-best evaluation method and report the precision of each measure on each n-best list. The experimental results show that the log-likelihood ratio is the best association measure, achieving the highest precision.
ISSN:	2166-0697
DOI:	10.1109/STAIR.2011.5995793
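
The ranking and evaluation steps named in the summary can be illustrated with a minimal sketch. It is not the authors' implementation: the function names are hypothetical, candidates are plain adjacent word pairs rather than lemmatized, POS-filtered noun compounds, and the log-likelihood ratio is computed in the usual contingency-table form (Dunning's G²), which the record itself does not spell out.

```python
import math
from collections import Counter


def log_likelihood_ratio(c12, c1, c2, n):
    """G^2 score for a bigram (w1, w2).

    c12: count of the bigram, c1: count of w1 in first position,
    c2: count of w2 in second position, n: total number of bigrams.
    """
    # Observed cells of the 2x2 contingency table.
    observed = [c12, c1 - c12, c2 - c12, n - c1 - c2 + c12]
    # Row and column marginals matching each observed cell.
    rows = [c1, c1, n - c1, n - c1]
    cols = [c2, n - c2, c2, n - c2]
    total = 0.0
    for o, r, c in zip(observed, rows, cols):
        if o > 0:  # an empty cell contributes nothing to the sum
            expected = r * c / n
            total += o * math.log(o / expected)
    return 2.0 * total


def rank_bigrams(tokens):
    """Rank adjacent word pairs by log-likelihood ratio, best first."""
    bigrams = list(zip(tokens, tokens[1:]))
    n = len(bigrams)
    pair_counts = Counter(bigrams)
    first_counts = Counter(a for a, _ in bigrams)
    second_counts = Counter(b for _, b in bigrams)
    scores = {
        bg: log_likelihood_ratio(c, first_counts[bg[0]], second_counts[bg[1]], n)
        for bg, c in pair_counts.items()
    }
    return sorted(scores, key=scores.get, reverse=True)


def precision_at_n(ranked, gold, n):
    """Share of the top-n ranked candidates that appear in the gold set."""
    top = ranked[:n]
    if not top:
        return 0.0
    # Divides by the number of candidates actually available in the top n.
    return sum(1 for cand in top if cand in gold) / len(top)
```

Calling `precision_at_n(rank_bigrams(tokens), gold, n)` for several values of `n` gives one precision value per n-best list, which is the kind of figure the summary reports for each association measure.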