A Practical Chunker for Unrestricted Text
In this paper we present a practical approach to text chunking for unrestricted Modern Greek text that is based on multiple-pass parsing. Two versions of this chunker are proposed: one based on a large lexicon and one based on minimal resources. In the latter case the morphological analysis is perfo...
Main authors: | Stamatatos, E. ; Fakotakis, N. ; Kokkinakis, G. |
---|---|
Format: | Book chapter |
Language: | eng |
Subjects: | |
Online access: | Full text |
container_end_page | 150 |
---|---|
container_issue | |
container_start_page | 139 |
container_title | |
container_volume | 1835 |
creator | Stamatatos, E. ; Fakotakis, N. ; Kokkinakis, G. |
description | In this paper we present a practical approach to text chunking for unrestricted Modern Greek text that is based on multiple-pass parsing. Two versions of this chunker are proposed: one based on a large lexicon and one based on minimal resources. In the latter case the morphological analysis is performed using exclusively two small lexicons containing closed-class words and common suffixes of the Modern Greek words. We give comparative performance results on the basis of a corpus of unrestricted text and show that very good results can be obtained by omitting the large and complicated resources. Moreover, the considerable time cost introduced by the use of the large lexicon indicates that the minimal-resources chunker is the best solution regarding a practical application that requires rapid response and less than perfect parsing results. |
doi_str_mv | 10.1007/3-540-45154-4_13 |
format | Book Chapter |
fullrecord | <record><control><sourceid>proquest_pasca</sourceid><recordid>TN_cdi_pascalfrancis_primary_1376601</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>EBC3072215_19_148</sourcerecordid><originalsourceid>FETCH-LOGICAL-p267t-c8fb353c1549fedc2b7afba61f7783efbc28d9e2eced673e79a4c3be5941399a3</originalsourceid><addsrcrecordid>eNotkDlPAzEQhc0plpCecgsaCgfb42NdoohLigRFUlter02WhN1gbyT49zjHNCPNvPc08yF0S8mEEqIeAAtOMBdUcMwNhRN0DXmyH_BTVFBJKQbg-uywkEoSUZ2jggBhWCsOl6jQohKMKVpdoXFKXyQXMJG1Bbp_LD-idUPr7LqcLrfdyscy9LFcdNGnIbZu8E0597_DDboIdp38-NhHaPH8NJ--4tn7y9v0cYY3TKoBuyrUIMDl-3TwjWO1sqG2kgalKvChdqxqtGfe-UYq8Epb7qD2QnMKWlsYobtD7samfFSItnNtMpvYftv4lxEoKQnNsslBlvKm-_TR1H2_SoYSswNnwGQaZs_J7MBlAxxzY_-zzb8Zv3M43w3Rrt3SbgYfkwGiGKPCUG0or-AfP91qtQ</addsrcrecordid><sourcetype>Index Database</sourcetype><iscdi>true</iscdi><recordtype>book_chapter</recordtype><pqid>EBC3072215_19_148</pqid></control><display><type>book_chapter</type><title>A Practical Chunker for Unrestricted Text</title><source>Springer Books</source><creator>Stamatatos, E. ; Fakotakis, N. ; Kokkinakis, G.</creator><contributor>Christodoulakis, Dimitris N ; Christodoulakis, Dimitris N.</contributor><creatorcontrib>Stamatatos, E. ; Fakotakis, N. ; Kokkinakis, G. ; Christodoulakis, Dimitris N ; Christodoulakis, Dimitris N.</creatorcontrib><description>In this paper we present a practical approach to text chunking for unrestricted Modern Greek text that is based on multiple-pass parsing. Two versions of this chunker are proposed: one based on a large lexicon and one based on minimal resources. In the latter case the morphological analysis is performed using exclusively two small lexicons containing closed-class words and common suffixes of the Modern Greek words. We give comparative performance results on the basis of a corpus of unrestricted text and show that very good results can be obtained by omitting the large and complicate resources. 
Moreover, the considerable time cost introduced by the use of the large lexicon indicates that the minimal-resources chunker is the best solution regarding a practical application that requires rapid response and less than perfect parsing results.</description><identifier>ISSN: 0302-9743</identifier><identifier>ISBN: 3540676058</identifier><identifier>ISBN: 9783540676058</identifier><identifier>EISSN: 1611-3349</identifier><identifier>EISBN: 3540451544</identifier><identifier>EISBN: 9783540451549</identifier><identifier>DOI: 10.1007/3-540-45154-4_13</identifier><identifier>OCLC: 958522718</identifier><identifier>LCCallNum: QA76.9.N38</identifier><language>eng</language><publisher>Germany: Springer Berlin / Heidelberg</publisher><subject>Applied sciences ; Artificial intelligence ; Computer science; control theory; systems ; Exact sciences and technology ; Morphological Description ; Noun Phrase ; Prepositional Phrase ; Speech and sound recognition and synthesis. Linguistics ; Total Word ; Unknown Word</subject><ispartof>Lecture notes in computer science, 2000, Vol.1835, p.139-150</ispartof><rights>Springer-Verlag Berlin Heidelberg 2000</rights><rights>2000 INIST-CNRS</rights><lds50>peer_reviewed</lds50><woscitedreferencessubscribed>false</woscitedreferencessubscribed><relation>Lecture Notes in Computer Science</relation></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Uhttps://ebookcentral.proquest.com/covers/3072215-l.jpg</thumbnail><linktopdf>$$Uhttps://link.springer.com/content/pdf/10.1007/3-540-45154-4_13$$EPDF$$P50$$Gspringer$$H</linktopdf><linktohtml>$$Uhttps://link.springer.com/10.1007/3-540-45154-4_13$$EHTML$$P50$$Gspringer$$H</linktohtml><link.rule.ids>309,310,777,778,782,787,788,791,4038,4039,27908,38238,41425,42494</link.rule.ids><backlink>$$Uhttp://pascal-francis.inist.fr/vibad/index.php?action=getRecordDetail&idt=1376601$$DView record in Pascal 
Francis$$Hfree_for_read</backlink></links><search><contributor>Christodoulakis, Dimitris N</contributor><contributor>Christodoulakis, Dimitris N.</contributor><creatorcontrib>Stamatatos, E.</creatorcontrib><creatorcontrib>Fakotakis, N.</creatorcontrib><creatorcontrib>Kokkinakis, G.</creatorcontrib><title>A Practical Chunker for Unrestricted Text</title><title>Lecture notes in computer science</title><description>In this paper we present a practical approach to text chunking for unrestricted Modern Greek text that is based on multiple-pass parsing. Two versions of this chunker are proposed: one based on a large lexicon and one based on minimal resources. In the latter case the morphological analysis is performed using exclusively two small lexicons containing closed-class words and common suffixes of the Modern Greek words. We give comparative performance results on the basis of a corpus of unrestricted text and show that very good results can be obtained by omitting the large and complicate resources. Moreover, the considerable time cost introduced by the use of the large lexicon indicates that the minimal-resources chunker is the best solution regarding a practical application that requires rapid response and less than perfect parsing results.</description><subject>Applied sciences</subject><subject>Artificial intelligence</subject><subject>Computer science; control theory; systems</subject><subject>Exact sciences and technology</subject><subject>Morphological Description</subject><subject>Noun Phrase</subject><subject>Prepositional Phrase</subject><subject>Speech and sound recognition and synthesis. 
Linguistics</subject><subject>Total Word</subject><subject>Unknown Word</subject><issn>0302-9743</issn><issn>1611-3349</issn><isbn>3540676058</isbn><isbn>9783540676058</isbn><isbn>3540451544</isbn><isbn>9783540451549</isbn><fulltext>true</fulltext><rsrctype>book_chapter</rsrctype><creationdate>2000</creationdate><recordtype>book_chapter</recordtype><recordid>eNotkDlPAzEQhc0plpCecgsaCgfb42NdoohLigRFUlter02WhN1gbyT49zjHNCPNvPc08yF0S8mEEqIeAAtOMBdUcMwNhRN0DXmyH_BTVFBJKQbg-uywkEoSUZ2jggBhWCsOl6jQohKMKVpdoXFKXyQXMJG1Bbp_LD-idUPr7LqcLrfdyscy9LFcdNGnIbZu8E0597_DDboIdp38-NhHaPH8NJ--4tn7y9v0cYY3TKoBuyrUIMDl-3TwjWO1sqG2kgalKvChdqxqtGfe-UYq8Epb7qD2QnMKWlsYobtD7samfFSItnNtMpvYftv4lxEoKQnNsslBlvKm-_TR1H2_SoYSswNnwGQaZs_J7MBlAxxzY_-zzb8Zv3M43w3Rrt3SbgYfkwGiGKPCUG0or-AfP91qtQ</recordid><startdate>2000</startdate><enddate>2000</enddate><creator>Stamatatos, E.</creator><creator>Fakotakis, N.</creator><creator>Kokkinakis, G.</creator><general>Springer Berlin / Heidelberg</general><general>Springer Berlin Heidelberg</general><general>Springer</general><scope>FFUUA</scope><scope>IQODW</scope></search><sort><creationdate>2000</creationdate><title>A Practical Chunker for Unrestricted Text</title><author>Stamatatos, E. ; Fakotakis, N. ; Kokkinakis, G.</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-p267t-c8fb353c1549fedc2b7afba61f7783efbc28d9e2eced673e79a4c3be5941399a3</frbrgroupid><rsrctype>book_chapters</rsrctype><prefilter>book_chapters</prefilter><language>eng</language><creationdate>2000</creationdate><topic>Applied sciences</topic><topic>Artificial intelligence</topic><topic>Computer science; control theory; systems</topic><topic>Exact sciences and technology</topic><topic>Morphological Description</topic><topic>Noun Phrase</topic><topic>Prepositional Phrase</topic><topic>Speech and sound recognition and synthesis. 
Linguistics</topic><topic>Total Word</topic><topic>Unknown Word</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Stamatatos, E.</creatorcontrib><creatorcontrib>Fakotakis, N.</creatorcontrib><creatorcontrib>Kokkinakis, G.</creatorcontrib><collection>ProQuest Ebook Central - Book Chapters - Demo use only</collection><collection>Pascal-Francis</collection></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Stamatatos, E.</au><au>Fakotakis, N.</au><au>Kokkinakis, G.</au><au>Christodoulakis, Dimitris N</au><au>Christodoulakis, Dimitris N.</au><format>book</format><genre>bookitem</genre><ristype>CHAP</ristype><atitle>A Practical Chunker for Unrestricted Text</atitle><btitle>Lecture notes in computer science</btitle><seriestitle>Lecture Notes in Computer Science</seriestitle><date>2000</date><risdate>2000</risdate><volume>1835</volume><spage>139</spage><epage>150</epage><pages>139-150</pages><issn>0302-9743</issn><eissn>1611-3349</eissn><isbn>3540676058</isbn><isbn>9783540676058</isbn><eisbn>3540451544</eisbn><eisbn>9783540451549</eisbn><abstract>In this paper we present a practical approach to text chunking for unrestricted Modern Greek text that is based on multiple-pass parsing. Two versions of this chunker are proposed: one based on a large lexicon and one based on minimal resources. In the latter case the morphological analysis is performed using exclusively two small lexicons containing closed-class words and common suffixes of the Modern Greek words. We give comparative performance results on the basis of a corpus of unrestricted text and show that very good results can be obtained by omitting the large and complicate resources. 
Moreover, the considerable time cost introduced by the use of the large lexicon indicates that the minimal-resources chunker is the best solution regarding a practical application that requires rapid response and less than perfect parsing results.</abstract><cop>Germany</cop><pub>Springer Berlin / Heidelberg</pub><doi>10.1007/3-540-45154-4_13</doi><oclcid>958522718</oclcid><tpages>12</tpages></addata></record> |
fulltext | fulltext |
identifier | ISSN: 0302-9743 |
ispartof | Lecture notes in computer science, 2000, Vol.1835, p.139-150 |
issn | 0302-9743 1611-3349 |
language | eng |
recordid | cdi_pascalfrancis_primary_1376601 |
source | Springer Books |
subjects | Applied sciences ; Artificial intelligence ; Computer science; control theory; systems ; Exact sciences and technology ; Morphological Description ; Noun Phrase ; Prepositional Phrase ; Speech and sound recognition and synthesis. Linguistics ; Total Word ; Unknown Word |
title | A Practical Chunker for Unrestricted Text |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-16T15%3A42%3A37IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_pasca&rft_val_fmt=info:ofi/fmt:kev:mtx:book&rft.genre=bookitem&rft.atitle=A%20Practical%20Chunker%20for%20Unrestricted%20Text&rft.btitle=Lecture%20notes%20in%20computer%20science&rft.au=Stamatatos,%20E.&rft.date=2000&rft.volume=1835&rft.spage=139&rft.epage=150&rft.pages=139-150&rft.issn=0302-9743&rft.eissn=1611-3349&rft.isbn=3540676058&rft.isbn_list=9783540676058&rft_id=info:doi/10.1007/3-540-45154-4_13&rft_dat=%3Cproquest_pasca%3EEBC3072215_19_148%3C/proquest_pasca%3E%3Curl%3E%3C/url%3E&rft.eisbn=3540451544&rft.eisbn_list=9783540451549&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=EBC3072215_19_148&rft_id=info:pmid/&rfr_iscdi=true |