An automatic noun compound extraction from Arabic corpus

The identification of noun compounds as multi-word lexical units is a very important task in natural language processing applications that require some degree of semantic interpretation, such as machine translation, information retrieval, and text summarization. In this paper, we used the hybrid method...

Bibliographic details
Main authors: Saif, A. M., Aziz, M. J. A.
Format: Conference proceeding
Language: eng
container_end_page 230
container_issue
container_start_page 224
container_title
container_volume
creator Saif, A. M.
Aziz, M. J. A.
description The identification of noun compounds as multi-word lexical units is a very important task in natural language processing applications that require some degree of semantic interpretation, such as machine translation, information retrieval, and text summarization. In this paper, we used a hybrid method, based on linguistic knowledge and statistical measures, to extract noun compounds from an Arabic corpus. For candidate identification, we used linguistic analysis tools such as lemmatization and POS tagging to filter the candidates and determine their variations. Association measures were then computed for each candidate in order to rank the candidates. We evaluated the association measures using the n-best evaluation method and reported the precision of each association measure for each n-best list. The experimental results showed that the log-likelihood ratio is the best association measure, achieving the highest precision.
doi_str_mv 10.1109/STAIR.2011.5995793
format Conference Proceeding
fullrecord <record><control><sourceid>ieee_6IE</sourceid><recordid>TN_cdi_ieee_primary_5995793</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><ieee_id>5995793</ieee_id><sourcerecordid>5995793</sourcerecordid><originalsourceid>FETCH-LOGICAL-i90t-5e52d29a8145c6474e25bda54f4e16870ef676378c548ca0a405eaef979fb4243</originalsourceid><addsrcrecordid>eNpFj9tKxDAURSMqOI7zA_qSH2jN5eT2WAYvAwOC9n04TROo2KakLejfW3DA_bJYsNmwCbnnrOScucePujq8l4JxXirnlHHygtxyzYUFqSS7_BdwV2QjuNYF087ckN00fbI1ejVlN8RWA8VlTj3OnadDWgbqUz-ubGn4njP6uUsDjTn1tMrYrCWf8rhMd-Q64tcUdmduSf38VO9fi-Pby2FfHYvOsblQQYlWOLQclNdgIAjVtKggQuDaGhaiNloa6xVYjwyBqYAhOuNiAwLkljz8zXYhhNOYux7zz-l8Wv4Cl-5Ixg</addsrcrecordid><sourcetype>Publisher</sourcetype><iscdi>true</iscdi><recordtype>conference_proceeding</recordtype></control><display><type>conference_proceeding</type><title>An automatic noun compound extraction from Arabic corpus</title><source>IEEE Electronic Library (IEL) Conference Proceedings</source><creator>Saif, A. M. ; Aziz, M. J. A.</creator><creatorcontrib>Saif, A. M. ; Aziz, M. J. A.</creatorcontrib><description>The identification of noun compound as multi-word lexical units is very important task in natural language processing applications that require some degree of semantic interpretation such as, machine translation, information retrieval and text summarization. In this paper, we used the hybrid method for extracting the noun compound from Arabic corpus that is based on linguistic knowledge and statistical measures. For the candidate identification, we have used some linguistic analysis tools such as lemmatization and POS in order to filter the candidates and determine the variations. The association measures have been computed for each candidate to rank the candidates. After that, we have evaluated the association measures by using the n-best evaluation method. We reported the precision values for each association measure in each n-best list. 
The experimental results showed that the log-likelihood ratio is the best association measure that achieved highest precision.</description><identifier>ISSN: 2166-0697</identifier><identifier>ISBN: 1612843549</identifier><identifier>ISBN: 9781612843544</identifier><identifier>EISBN: 1612843530</identifier><identifier>EISBN: 9781612843537</identifier><identifier>DOI: 10.1109/STAIR.2011.5995793</identifier><language>eng</language><publisher>IEEE</publisher><subject>Arabic noun compund ; Association measures ; Compounds ; hybrid method ; lemmatization ; Magnetic heads ; morphological variations ; Mutual information ; n-best evaluation method ; Pragmatics ; Semantics ; Syntactics ; Tagging</subject><ispartof>2011 International Conference on Semantic Technology and Information Retrieval, 2011, p.224-230</ispartof><woscitedreferencessubscribed>false</woscitedreferencessubscribed></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://ieeexplore.ieee.org/document/5995793$$EHTML$$P50$$Gieee$$H</linktohtml><link.rule.ids>309,310,780,784,789,790,2058,27925,54920</link.rule.ids><linktorsrc>$$Uhttps://ieeexplore.ieee.org/document/5995793$$EView_record_in_IEEE$$FView_record_in_$$GIEEE</linktorsrc></links><search><creatorcontrib>Saif, A. M.</creatorcontrib><creatorcontrib>Aziz, M. J. A.</creatorcontrib><title>An automatic noun compound extraction from Arabic corpus</title><title>2011 International Conference on Semantic Technology and Information Retrieval</title><addtitle>STAIR</addtitle><description>The identification of noun compound as multi-word lexical units is very important task in natural language processing applications that require some degree of semantic interpretation such as, machine translation, information retrieval and text summarization. 
In this paper, we used the hybrid method for extracting the noun compound from Arabic corpus that is based on linguistic knowledge and statistical measures. For the candidate identification, we have used some linguistic analysis tools such as lemmatization and POS in order to filter the candidates and determine the variations. The association measures have been computed for each candidate to rank the candidates. After that, we have evaluated the association measures by using the n-best evaluation method. We reported the precision values for each association measure in each n-best list. The experimental results showed that the log-likelihood ratio is the best association measure that achieved highest precision.</description><subject>Arabic noun compund</subject><subject>Association measures</subject><subject>Compounds</subject><subject>hybrid method</subject><subject>lemmatization</subject><subject>Magnetic heads</subject><subject>morphological variations</subject><subject>Mutual information</subject><subject>n-best evaluation method</subject><subject>Pragmatics</subject><subject>Semantics</subject><subject>Syntactics</subject><subject>Tagging</subject><issn>2166-0697</issn><isbn>1612843549</isbn><isbn>9781612843544</isbn><isbn>1612843530</isbn><isbn>9781612843537</isbn><fulltext>true</fulltext><rsrctype>conference_proceeding</rsrctype><creationdate>2011</creationdate><recordtype>conference_proceeding</recordtype><sourceid>6IE</sourceid><sourceid>RIE</sourceid><recordid>eNpFj9tKxDAURSMqOI7zA_qSH2jN5eT2WAYvAwOC9n04TROo2KakLejfW3DA_bJYsNmwCbnnrOScucePujq8l4JxXirnlHHygtxyzYUFqSS7_BdwV2QjuNYF087ckN00fbI1ejVlN8RWA8VlTj3OnadDWgbqUz-ubGn4njP6uUsDjTn1tMrYrCWf8rhMd-Q64tcUdmduSf38VO9fi-Pby2FfHYvOsblQQYlWOLQclNdgIAjVtKggQuDaGhaiNloa6xVYjwyBqYAhOuNiAwLkljz8zXYhhNOYux7zz-l8Wv4Cl-5Ixg</recordid><startdate>201106</startdate><enddate>201106</enddate><creator>Saif, A. M.</creator><creator>Aziz, M. J. 
A.</creator><general>IEEE</general><scope>6IE</scope><scope>6IL</scope><scope>CBEJK</scope><scope>RIE</scope><scope>RIL</scope></search><sort><creationdate>201106</creationdate><title>An automatic noun compound extraction from Arabic corpus</title><author>Saif, A. M. ; Aziz, M. J. A.</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-i90t-5e52d29a8145c6474e25bda54f4e16870ef676378c548ca0a405eaef979fb4243</frbrgroupid><rsrctype>conference_proceedings</rsrctype><prefilter>conference_proceedings</prefilter><language>eng</language><creationdate>2011</creationdate><topic>Arabic noun compund</topic><topic>Association measures</topic><topic>Compounds</topic><topic>hybrid method</topic><topic>lemmatization</topic><topic>Magnetic heads</topic><topic>morphological variations</topic><topic>Mutual information</topic><topic>n-best evaluation method</topic><topic>Pragmatics</topic><topic>Semantics</topic><topic>Syntactics</topic><topic>Tagging</topic><toplevel>online_resources</toplevel><creatorcontrib>Saif, A. M.</creatorcontrib><creatorcontrib>Aziz, M. J. A.</creatorcontrib><collection>IEEE Electronic Library (IEL) Conference Proceedings</collection><collection>IEEE Proceedings Order Plan All Online (POP All Online) 1998-present by volume</collection><collection>IEEE Xplore All Conference Proceedings</collection><collection>IEEE Electronic Library (IEL)</collection><collection>IEEE Proceedings Order Plans (POP All) 1998-Present</collection></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>Saif, A. M.</au><au>Aziz, M. J. 
A.</au><format>book</format><genre>proceeding</genre><ristype>CONF</ristype><atitle>An automatic noun compound extraction from Arabic corpus</atitle><btitle>2011 International Conference on Semantic Technology and Information Retrieval</btitle><stitle>STAIR</stitle><date>2011-06</date><risdate>2011</risdate><spage>224</spage><epage>230</epage><pages>224-230</pages><issn>2166-0697</issn><isbn>1612843549</isbn><isbn>9781612843544</isbn><eisbn>1612843530</eisbn><eisbn>9781612843537</eisbn><abstract>The identification of noun compound as multi-word lexical units is very important task in natural language processing applications that require some degree of semantic interpretation such as, machine translation, information retrieval and text summarization. In this paper, we used the hybrid method for extracting the noun compound from Arabic corpus that is based on linguistic knowledge and statistical measures. For the candidate identification, we have used some linguistic analysis tools such as lemmatization and POS in order to filter the candidates and determine the variations. The association measures have been computed for each candidate to rank the candidates. After that, we have evaluated the association measures by using the n-best evaluation method. We reported the precision values for each association measure in each n-best list. The experimental results showed that the log-likelihood ratio is the best association measure that achieved highest precision.</abstract><pub>IEEE</pub><doi>10.1109/STAIR.2011.5995793</doi><tpages>7</tpages></addata></record>
fulltext fulltext_linktorsrc
identifier ISSN: 2166-0697
ispartof 2011 International Conference on Semantic Technology and Information Retrieval, 2011, p.224-230
issn 2166-0697
language eng
recordid cdi_ieee_primary_5995793
source IEEE Electronic Library (IEL) Conference Proceedings
subjects Arabic noun compound
Association measures
Compounds
hybrid method
lemmatization
Magnetic heads
morphological variations
Mutual information
n-best evaluation method
Pragmatics
Semantics
Syntactics
Tagging
title An automatic noun compound extraction from Arabic corpus
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-04T12%3A59%3A04IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-ieee_6IE&rft_val_fmt=info:ofi/fmt:kev:mtx:book&rft.genre=proceeding&rft.atitle=An%20automatic%20noun%20compound%20extraction%20from%20Arabic%20corpus&rft.btitle=2011%20International%20Conference%20on%20Semantic%20Technology%20and%20Information%20Retrieval&rft.au=Saif,%20A.%20M.&rft.date=2011-06&rft.spage=224&rft.epage=230&rft.pages=224-230&rft.issn=2166-0697&rft.isbn=1612843549&rft.isbn_list=9781612843544&rft_id=info:doi/10.1109/STAIR.2011.5995793&rft_dat=%3Cieee_6IE%3E5995793%3C/ieee_6IE%3E%3Curl%3E%3C/url%3E&rft.eisbn=1612843530&rft.eisbn_list=9781612843537&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rft_ieee_id=5995793&rfr_iscdi=true