Applying Suffix Rules to Organization Name Recognition

This paper presents a method for boosting the performance of organization name recognition, a subtask of named entity recognition (NER). Although gazetteers (lists of NEs) are known to be one of the effective features for supervised machine learning approaches to the NER task, the p...

Bibliographic Details
Published in:	Transactions of the Japanese Society for Artificial Intelligence 2009, Vol.24(6), pp.469-479
Main authors:	INUI, Takashi; MURAKAMI, Koji; HASHIMOTO, Taiichi; UTSUMI, Kazuo; ISHIKAWA, Masamichi
Format:	Article
Language:	English
Subjects:	named entity; organization name; suffix rules
Online access:	Full text

container_end_page	479
container_issue	6
container_start_page	469
container_title	Transactions of the Japanese Society for Artificial Intelligence
container_volume	24
creator	INUI, Takashi; MURAKAMI, Koji; HASHIMOTO, Taiichi; UTSUMI, Kazuo; ISHIKAWA, Masamichi
description	This paper presents a method for boosting the performance of organization name recognition, a subtask of named entity recognition (NER). Although gazetteers (lists of NEs) are known to be one of the effective features for supervised machine learning approaches to the NER task, previous methods applied gazetteers to NER in a very simple way: they were used only to search for exact matches between the input text and the NEs they contain. The proposed method generates regular expression rules from gazetteers and, with these rules, realizes high-coverage search based on looser matches between the input text and NEs. To generate these rules, we focus on two well-known characteristics of NE expressions: 1) most NE expressions can be divided into two parts, a class-reference part and an instance-reference part; 2) in most NE expressions, the class-reference part is located at the suffix position. A pattern mining algorithm runs on the set of NEs in the gazetteers and finds frequent word sequences from which NEs are constructed. We then adopt as suffix rules only those word sequences that have the class-reference part at the suffix position. Experimental results showed that the proposed method improved the performance of organization name recognition and achieved an F-value of 84.58 on the evaluation data.
doi_str_mv	10.1527/tjsai.24.469
format	Article
fullrecord	<record><control><sourceid>proquest_cross</sourceid><recordid>TN_cdi_proquest_journals_1476944577</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>3180063851</sourcerecordid><originalsourceid>FETCH-LOGICAL-c1619-9b60b6cc9f82e6f6d20036947331670b9feee0d615e6978bcd5b6e2b1dede2753</originalsourceid><addsrcrecordid>eNpFkFtLw0AQhRdRsGjf_AEBX03dW3a7j6V4g2Kh6vOyu5nELWkSdxOw_npTU-rLzHDmmzNwELoheEYyKu-7bTR-RvmMC3WGJoRxkc4xw-fHGUvCL9E0Rm8xJpRxgrMJEou2rfa-LpO3vij8d7LpK4hJ1yTrUJra_5jON3XyanaQbMA1Ze0PwjW6KEwVYXrsV-jj8eF9-Zyu1k8vy8UqdUQQlSorsBXOqWJOQRQipxgzobhkjAiJrSoAAOeCZCCUnFuXZ1YAtSSHHKjM2BW6HX3b0Hz1EDu9bfpQDy814XJw4pmUA3U3Ui40MQYodBv8zoS9JlgfwtF_4WjK9RDOgC9GfBs7U8IJNqHzroJ_WIxluDnt3KcJGmr2C0VUbzI</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>1476944577</pqid></control><display><type>article</type><title>Applying Suffix Rules to Organization Name Recognition</title><source>J-STAGE Free</source><source>EZB-FREE-00999 freely available EZB journals</source><creator>INUI, Takashi ; MURAKAMI, Koji ; HASHIMOTO, Taiichi ; UTSUMI, Kazuo ; ISHIKAWA, Masamichi</creator><creatorcontrib>INUI, Takashi ; MURAKAMI, Koji ; HASHIMOTO, Taiichi ; UTSUMI, Kazuo ; ISHIKAWA, Masamichi</creatorcontrib><description>This paper presents a method for boosting the performance of the organization name recognition, which is a part of named entity recognition (NER). Although gazetteers (lists of the NEs) have been known as one of the effective features for supervised machine learning approaches on the NER task, the previous methods which have applied the gazetteers to the NER were very simple. The gazetteers have been used just for searching the exact matches between input text and NEs included in them. 
The proposed method generates regular expression rules from gazetteers, and, with these rules, it can realize a high-coverage searches based on looser matches between input text and NEs. To generate these rules, we focus on the two well-known characteristics of NE expressions; 1) most of NE expressions can be divided into two parts, class-reference part and instance-reference part, 2) for most of NE expressions the class-reference parts are located at the suffix position of them. A pattern mining algorithm runs on the set of NEs in the gazetteers, and some frequent word sequences from which NEs are constructed are found. Then, we employ only word sequences which have the class-reference part at the suffix position as suffix rules. Experimental results showed that our proposed method improved the performance of the organization name recognition, and achieved the 84.58 F-value for evaluation data.</description><identifier>ISSN: 1346-0714</identifier><identifier>EISSN: 1346-8030</identifier><identifier>DOI: 10.1527/tjsai.24.469</identifier><language>eng</language><publisher>Tokyo: The Japanese Society for Artificial Intelligence</publisher><subject>named entity ; organization name ; suffix rules</subject><ispartof>Transactions of the Japanese Society for Artificial Intelligence, 2009, Vol.24(6), pp.469-479</ispartof><rights>2009 JSAI (The Japanese Society for Artificial Intelligence)</rights><rights>Copyright Japan Science and Technology Agency 2009</rights><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><cites>FETCH-LOGICAL-c1619-9b60b6cc9f82e6f6d20036947331670b9feee0d615e6978bcd5b6e2b1dede2753</cites></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><link.rule.ids>314,780,784,1881,4022,27922,27923,27924</link.rule.ids></links><search><creatorcontrib>INUI, Takashi</creatorcontrib><creatorcontrib>MURAKAMI, 
Koji</creatorcontrib><creatorcontrib>HASHIMOTO, Taiichi</creatorcontrib><creatorcontrib>UTSUMI, Kazuo</creatorcontrib><creatorcontrib>ISHIKAWA, Masamichi</creatorcontrib><title>Applying Suffix Rules to Organization Name Recognition</title><title>Transactions of the Japanese Society for Artificial Intelligence</title><description>This paper presents a method for boosting the performance of the organization name recognition, which is a part of named entity recognition (NER). Although gazetteers (lists of the NEs) have been known as one of the effective features for supervised machine learning approaches on the NER task, the previous methods which have applied the gazetteers to the NER were very simple. The gazetteers have been used just for searching the exact matches between input text and NEs included in them. The proposed method generates regular expression rules from gazetteers, and, with these rules, it can realize a high-coverage searches based on looser matches between input text and NEs. To generate these rules, we focus on the two well-known characteristics of NE expressions; 1) most of NE expressions can be divided into two parts, class-reference part and instance-reference part, 2) for most of NE expressions the class-reference parts are located at the suffix position of them. A pattern mining algorithm runs on the set of NEs in the gazetteers, and some frequent word sequences from which NEs are constructed are found. Then, we employ only word sequences which have the class-reference part at the suffix position as suffix rules. 
Experimental results showed that our proposed method improved the performance of the organization name recognition, and achieved the 84.58 F-value for evaluation data.</description><subject>named entity</subject><subject>organization name</subject><subject>suffix rules</subject><issn>1346-0714</issn><issn>1346-8030</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2009</creationdate><recordtype>article</recordtype><recordid>eNpFkFtLw0AQhRdRsGjf_AEBX03dW3a7j6V4g2Kh6vOyu5nELWkSdxOw_npTU-rLzHDmmzNwELoheEYyKu-7bTR-RvmMC3WGJoRxkc4xw-fHGUvCL9E0Rm8xJpRxgrMJEou2rfa-LpO3vij8d7LpK4hJ1yTrUJra_5jON3XyanaQbMA1Ze0PwjW6KEwVYXrsV-jj8eF9-Zyu1k8vy8UqdUQQlSorsBXOqWJOQRQipxgzobhkjAiJrSoAAOeCZCCUnFuXZ1YAtSSHHKjM2BW6HX3b0Hz1EDu9bfpQDy814XJw4pmUA3U3Ui40MQYodBv8zoS9JlgfwtF_4WjK9RDOgC9GfBs7U8IJNqHzroJ_WIxluDnt3KcJGmr2C0VUbzI</recordid><startdate>2009</startdate><enddate>2009</enddate><creator>INUI, Takashi</creator><creator>MURAKAMI, Koji</creator><creator>HASHIMOTO, Taiichi</creator><creator>UTSUMI, Kazuo</creator><creator>ISHIKAWA, Masamichi</creator><general>The Japanese Society for Artificial Intelligence</general><general>Japan Science and Technology Agency</general><scope>AAYXX</scope><scope>CITATION</scope><scope>7SC</scope><scope>8FD</scope><scope>JQ2</scope><scope>L7M</scope><scope>L~C</scope><scope>L~D</scope></search><sort><creationdate>2009</creationdate><title>Applying Suffix Rules to Organization Name Recognition</title><author>INUI, Takashi ; MURAKAMI, Koji ; HASHIMOTO, Taiichi ; UTSUMI, Kazuo ; ISHIKAWA, Masamichi</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c1619-9b60b6cc9f82e6f6d20036947331670b9feee0d615e6978bcd5b6e2b1dede2753</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2009</creationdate><topic>named entity</topic><topic>organization name</topic><topic>suffix rules</topic><toplevel>online_resources</toplevel><creatorcontrib>INUI, 
Takashi</creatorcontrib><creatorcontrib>MURAKAMI, Koji</creatorcontrib><creatorcontrib>HASHIMOTO, Taiichi</creatorcontrib><creatorcontrib>UTSUMI, Kazuo</creatorcontrib><creatorcontrib>ISHIKAWA, Masamichi</creatorcontrib><collection>CrossRef</collection><collection>Computer and Information Systems Abstracts</collection><collection>Technology Research Database</collection><collection>ProQuest Computer Science Collection</collection><collection>Advanced Technologies Database with Aerospace</collection><collection>Computer and Information Systems Abstracts Academic</collection><collection>Computer and Information Systems Abstracts Professional</collection><jtitle>Transactions of the Japanese Society for Artificial Intelligence</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>INUI, Takashi</au><au>MURAKAMI, Koji</au><au>HASHIMOTO, Taiichi</au><au>UTSUMI, Kazuo</au><au>ISHIKAWA, Masamichi</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Applying Suffix Rules to Organization Name Recognition</atitle><jtitle>Transactions of the Japanese Society for Artificial Intelligence</jtitle><date>2009</date><risdate>2009</risdate><volume>24</volume><issue>6</issue><spage>469</spage><epage>479</epage><pages>469-479</pages><issn>1346-0714</issn><eissn>1346-8030</eissn><abstract>This paper presents a method for boosting the performance of the organization name recognition, which is a part of named entity recognition (NER). Although gazetteers (lists of the NEs) have been known as one of the effective features for supervised machine learning approaches on the NER task, the previous methods which have applied the gazetteers to the NER were very simple. The gazetteers have been used just for searching the exact matches between input text and NEs included in them. 
The proposed method generates regular expression rules from gazetteers, and, with these rules, it can realize a high-coverage searches based on looser matches between input text and NEs. To generate these rules, we focus on the two well-known characteristics of NE expressions; 1) most of NE expressions can be divided into two parts, class-reference part and instance-reference part, 2) for most of NE expressions the class-reference parts are located at the suffix position of them. A pattern mining algorithm runs on the set of NEs in the gazetteers, and some frequent word sequences from which NEs are constructed are found. Then, we employ only word sequences which have the class-reference part at the suffix position as suffix rules. Experimental results showed that our proposed method improved the performance of the organization name recognition, and achieved the 84.58 F-value for evaluation data.</abstract><cop>Tokyo</cop><pub>The Japanese Society for Artificial Intelligence</pub><doi>10.1527/tjsai.24.469</doi><tpages>11</tpages><oa>free_for_read</oa></addata></record>
fulltext	fulltext
identifier	ISSN: 1346-0714
ispartof	Transactions of the Japanese Society for Artificial Intelligence, 2009, Vol.24(6), pp.469-479
issn	1346-0714 1346-8030
language	eng
recordid	cdi_proquest_journals_1476944577
source	J-STAGE Free; EZB-FREE-00999 freely available EZB journals
subjects	named entity; organization name; suffix rules
title	Applying Suffix Rules to Organization Name Recognition
url	https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-12T14%3A55%3A50IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Applying%20Suffix%20Rules%20to%20Organization%20Name%20Recognition&rft.jtitle=Transactions%20of%20the%20Japanese%20Society%20for%20Artificial%20Intelligence&rft.au=INUI,%20Takashi&rft.date=2009&rft.volume=24&rft.issue=6&rft.spage=469&rft.epage=479&rft.pages=469-479&rft.issn=1346-0714&rft.eissn=1346-8030&rft_id=info:doi/10.1527/tjsai.24.469&rft_dat=%3Cproquest_cross%3E3180063851%3C/proquest_cross%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=1476944577&rft_id=info:pmid/&rfr_iscdi=true
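
Illustrative sketch (not part of the catalog record): the abstract describes deriving regular expression suffix rules from a gazetteer and using them for looser matching than exact lookup. The Python below is a minimal toy version of that idea under loud assumptions. The gazetteer entries, whitespace tokenization, the support threshold, and the capitalized-word pattern for the instance-reference part are all hypothetical, and a plain frequency count stands in for the pattern mining algorithm, which the abstract does not specify. It should not be read as the authors' implementation.

```python
import re
from collections import Counter


def mine_suffix_rules(gazetteer, min_support=2, max_len=2):
    """Count word sequences at the suffix position of gazetteer names and
    keep the frequent ones (a simple stand-in for pattern mining)."""
    counts = Counter()
    for name in gazetteer:
        tokens = name.split()
        # Leave at least one token for the instance-reference part.
        for n in range(1, min(max_len, len(tokens) - 1) + 1):
            counts[tuple(tokens[-n:])] += 1
    return sorted(s for s, c in counts.items() if c >= min_support)


def compile_rule(suffix):
    """Turn one suffix into a regex: one to three capitalized words
    (assumed instance-reference part) followed by the suffix
    (assumed class-reference part)."""
    instance = r"\b(?:[A-Z][\w&-]*\s+){1,3}"
    klass = r"\s+".join(re.escape(tok) for tok in suffix)
    return re.compile(instance + klass + r"\b")


# Hypothetical toy gazetteer of organization names.
GAZETTEER = [
    "Tokyo Electric Company",
    "Osaka Gas Company",
    "Kyoto University",
    "Waseda University",
    "Nagoya Bank",
]

rules = mine_suffix_rules(GAZETTEER)
patterns = [compile_rule(suffix) for suffix in rules]


def find_candidates(text):
    """Return organization-name candidates matched by any suffix rule."""
    found = set()
    for pattern in patterns:
        found.update(m.group(0) for m in pattern.finditer(text))
    return sorted(found)


if __name__ == "__main__":
    print(rules)
    print(find_candidates(
        "She joined Hokkaido Railway Company after leaving Keio University."
    ))
```

In this toy run neither name in the sample sentence appears in the gazetteer, so an exact-match lookup would find nothing, while the mined "Company" and "University" rules still propose both as candidates. In the setting the abstract describes, such matches would serve as features for a supervised learner rather than as final decisions.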