A HYBRID MODEL FOR PHRASE CHUNKING EMPLOYING ARTIFICIAL IMMUNITY SYSTEM AND RULE BASED METHODS

Bibliographic details
Published in:	International journal of artificial intelligence & applications 2011-10, Vol.2 (4), p.95-95
Main authors:	Bindu, M S; Idicula, Sumam Mary
Format:	Article
Language:	English
Subjects:	Ambiguity; Artificial intelligence; Expert systems; Immunity; Mathematical models; Natural language (computers); Sentences; Tags
Online access:	Full text

Description
Abstract:	Natural Language Understanding (NLU), an important field of Artificial Intelligence (AI), is concerned with speech and language understanding between humans and computers. Understanding language means knowing what concept a word or phrase stands for and how to link these concepts to form a meaningful sentence. Identification of phrases, or phrase chunking, is an important step in NLU. A chunker identifies and divides sentences into syntactically correlated word groups. Question Answering (QA) systems, another important application of AI, mostly require retrieval of nouns or noun phrases as answers to the questions raised by users. Chunking is also an important preprocessing step in full parsing. Due to the high ambiguity of natural language, exact parsing of text may become very complex. This ambiguity may be partially resolved by using chunking as an intermediate step. To the best of our knowledge, no known work or tag set is available for phrase chunking in Malayalam. To separate the chunks in a document, it must first be labeled with part-of-speech (POS) tags. POS tagging is a difficult task in Malayalam, as it is a complex and compounding language. In this paper we describe the application of an artificial immunity system (AIS) to chunking; the implemented system achieved 96% precision and 93% recall. The system was tested on corpora collected from reputed newspapers and magazines. These corpora contained documents from five domains (sports, health, agriculture, science and politics), and each document contained simple, compound and complex sentences of various levels of complexity. A POS tag set with 52 tags was developed for preparing the tagged corpus for Malayalam. The phrase tag set contains 20 phrase tags.
ISSN:	0976-2191 0975-900X
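
Note on the reported figures: the abstract gives 96% precision and 93% recall but does not state how chunks were scored. The sketch below shows one common way such figures are computed, assuming exact-match scoring in which a predicted chunk counts as correct only if its label and token span both agree with the reference. The function name, the (label, start, end) representation and the NP/VP/PP labels are illustrative assumptions, not the paper's 20-tag phrase set or its evaluation code.

```python
from typing import Set, Tuple

# A chunk is (phrase label, start token index, end token index exclusive).
# This representation is an assumption made for illustration.
Chunk = Tuple[str, int, int]


def chunk_precision_recall(gold: Set[Chunk], predicted: Set[Chunk]) -> Tuple[float, float]:
    """Exact-match chunk scoring: a prediction is correct only if
    its label and both boundaries match a reference chunk."""
    correct = len(gold & predicted)
    precision = correct / len(predicted) if predicted else 0.0
    recall = correct / len(gold) if gold else 0.0
    return precision, recall


# Hypothetical reference and system output for one 7-token sentence.
gold = {("NP", 0, 2), ("VP", 2, 3), ("NP", 3, 5), ("PP", 5, 7)}
predicted = {("NP", 0, 2), ("VP", 2, 3), ("NP", 3, 4)}

p, r = chunk_precision_recall(gold, predicted)
print(f"precision={p:.2f} recall={r:.2f}")  # precision=0.67 recall=0.50
```

Here two of the three predicted chunks are correct (precision 2/3), and two of the four reference chunks are found (recall 2/4); the third prediction has the right label but the wrong right boundary, so it earns no credit under exact matching.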