Exploring the Statistical Derivation of Transformational Rule Sequences for Part-of-Speech Tagging

ACL Balancing Act Workshop proceedings, July 94, pp. 86-95. Eric Brill has recently proposed a simple and powerful corpus-based language modeling approach that can be applied to various tasks including part-of-speech tagging and building phrase structure trees. The method learns a series of symbolic...
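The learning scheme summarized in the abstract (count rule-template matches against a tagged corpus, keep the best rule, apply it, repeat) can be illustrated with a small sketch. This is not the authors' implementation. It assumes a single hypothetical rule template, "change tag A to B when the previous tag is C", a toy lexicon for the initial tagging, and rule application that reads context from the tags as they stood before the rule fired. The function names `initial_tags`, `apply_rule`, `learn_rules`, and `tag` are illustrative only.

```python
from collections import Counter

def initial_tags(words, lexicon, default="NN"):
    """Baseline tagging: look each word up in a lexicon (assumed given)."""
    return [lexicon.get(w, default) for w in words]

def apply_rule(tags, rule):
    """Apply one rule (from_tag, to_tag, prev_tag) across the sequence.

    Context is read from the input tags, not from tags already changed
    by this same rule (an assumption of this sketch).
    """
    from_tag, to_tag, prev_tag = rule
    out = list(tags)
    for i in range(1, len(tags)):
        if tags[i] == from_tag and tags[i - 1] == prev_tag:
            out[i] = to_tag
    return out

def learn_rules(words, gold, lexicon, max_rules=10):
    """Greedily learn an ordered rule sequence by counting template matches."""
    tags = initial_tags(words, lexicon)
    rules = []
    for _ in range(max_rules):
        scores = Counter()
        # Count the errors each candidate rule would fix.
        for i in range(1, len(tags)):
            if tags[i] != gold[i]:
                scores[(tags[i], gold[i], tags[i - 1])] += 1
        # Subtract the correct tags each candidate rule would break.
        for i in range(1, len(tags)):
            if tags[i] == gold[i]:
                for rule in list(scores):
                    if rule[0] == tags[i] and rule[2] == tags[i - 1]:
                        scores[rule] -= 1
        if not scores:
            break
        best, gain = max(scores.items(), key=lambda kv: kv[1])
        if gain <= 0:
            break  # no remaining rule yields a net improvement
        rules.append(best)
        tags = apply_rule(tags, best)
    return rules

def tag(words, lexicon, rules):
    """Tag new text: baseline tags, then every learned rule in order."""
    tags = initial_tags(words, lexicon)
    for rule in rules:
        tags = apply_rule(tags, rule)
    return tags

# Toy data (invented for illustration, not from the paper).
lexicon = {"the": "DT", "can": "MD", "rust": "VB"}
words = ["the", "can", "can", "rust"]
gold = ["DT", "NN", "MD", "VB"]
rules = learn_rules(words, gold, lexicon)
```

On this toy input the learner finds one rule, which retags a modal as a noun after a determiner. A realistic system would use many templates and a far larger corpus.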

Detailed description

Bibliographic details
Main authors:	Ramshaw, Lance A; Marcus, Mitchell P
Format:	Article
Language:	eng
Subjects:	Computer Science - Computation and Language

creator	Ramshaw, Lance A ; Marcus, Mitchell P
description	ACL Balancing Act Workshop proceedings, July 94, pp. 86-95. Eric Brill has recently proposed a simple and powerful corpus-based language modeling approach that can be applied to various tasks including part-of-speech tagging and building phrase structure trees. The method learns a series of symbolic transformational rules, which can then be applied in sequence to a test corpus to produce predictions. The learning process only requires counting matches for a given set of rule templates, allowing the method to survey a very large space of possible contextual factors. This paper analyses Brill's approach as an interesting variation on existing decision tree methods, based on experiments involving part-of-speech tagging for both English and ancient Greek corpora. In particular, the analysis throws light on why the new mechanism seems surprisingly resistant to overtraining. A fast, incremental implementation and a mechanism for recording the dependencies that underlie the resulting rule sequence are also described.
doi_str_mv	10.48550/arxiv.cmp-lg/9406011
format	Article
fullrecord	<record><control><sourceid>arxiv_GOX</sourceid><recordid>TN_cdi_arxiv_primary_cmp_lg_9406011</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>cmp_lg_9406011</sourcerecordid><originalsourceid>FETCH-LOGICAL-a751-55dc8984690e446c2a21b389d68b0875aeb8e6e406f3fe13bf0b544aece0622a3</originalsourceid><addsrcrecordid>eNotj81qwzAQhHXpoaR9hIKgZyWSLSnysaTpDwRSGt_NSlk5AsV2ZSekb1-R5rTMzjDMR8iT4HNplOILSJdwnrvjwGK7qCTXXIh7YteXIfYpdC2dDkh3E0xhnIKDSF8xhXOWfUd7T-sE3ej7dLx-sv19ijmPPyfsHI40W_QL0sR6z3YDojvQGto2Fz-QOw9xxMfbnZH6bV2vPthm-_65etkwWCrBlNo7UxmpK45SaldAIWxpqr02lpulArQGNebdvvQoSuu5VVICOuS6KKCckef_2itoM6RwhPTbZOAmts0NuPwDNd9Uhw</addsrcrecordid><sourcetype>Open Access Repository</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype></control><display><type>article</type><title>Exploring the Statistical Derivation of Transformational Rule Sequences for Part-of-Speech Tagging</title><source>arXiv.org</source><creator>Ramshaw, Lance A ; Marcus, Mitchell P</creator><creatorcontrib>Ramshaw, Lance A ; Marcus, Mitchell P</creatorcontrib><description>ACL Balancing Act Workshop proceedings, July 94, pp. 86-95 Eric Brill has recently proposed a simple and powerful corpus-based language modeling approach that can be applied to various tasks including part-of-speech tagging and building phrase structure trees. The method learns a series of symbolic transformational rules, which can then be applied in sequence to a test corpus to produce predictions. The learning process only requires counting matches for a given set of rule templates, allowing the method to survey a very large space of possible contextual factors. This paper analyses Brill's approach as an interesting variation on existing decision tree methods, based on experiments involving part-of-speech tagging for both English and ancient Greek corpora. In particular, the analysis throws light on why the new mechanism seems surprisingly resistant to overtraining. 
A fast, incremental implementation and a mechanism for recording the dependencies that underlie the resulting rule sequence are also described.</description><identifier>DOI: 10.48550/arxiv.cmp-lg/9406011</identifier><language>eng</language><subject>Computer Science - Computation and Language</subject><creationdate>1994-06</creationdate><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><link.rule.ids>228,230,776,881</link.rule.ids><linktorsrc>$$Uhttps://arxiv.org/abs/cmp-lg/9406011$$EView_record_in_Cornell_University$$FView_record_in_$$GCornell_University$$Hfree_for_read</linktorsrc><backlink>$$Uhttps://doi.org/10.48550/arXiv.cmp-lg/9406011$$DView paper in arXiv$$Hfree_for_read</backlink></links><search><creatorcontrib>Ramshaw, Lance A</creatorcontrib><creatorcontrib>Marcus, Mitchell P</creatorcontrib><title>Exploring the Statistical Derivation of Transformational Rule Sequences for Part-of-Speech Tagging</title><description>ACL Balancing Act Workshop proceedings, July 94, pp. 86-95 Eric Brill has recently proposed a simple and powerful corpus-based language modeling approach that can be applied to various tasks including part-of-speech tagging and building phrase structure trees. The method learns a series of symbolic transformational rules, which can then be applied in sequence to a test corpus to produce predictions. The learning process only requires counting matches for a given set of rule templates, allowing the method to survey a very large space of possible contextual factors. This paper analyses Brill's approach as an interesting variation on existing decision tree methods, based on experiments involving part-of-speech tagging for both English and ancient Greek corpora. 
In particular, the analysis throws light on why the new mechanism seems surprisingly resistant to overtraining. A fast, incremental implementation and a mechanism for recording the dependencies that underlie the resulting rule sequence are also described.</description><subject>Computer Science - Computation and Language</subject><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>1994</creationdate><recordtype>article</recordtype><sourceid>GOX</sourceid><recordid>eNotj81qwzAQhHXpoaR9hIKgZyWSLSnysaTpDwRSGt_NSlk5AsV2ZSekb1-R5rTMzjDMR8iT4HNplOILSJdwnrvjwGK7qCTXXIh7YteXIfYpdC2dDkh3E0xhnIKDSF8xhXOWfUd7T-sE3ej7dLx-sv19ijmPPyfsHI40W_QL0sR6z3YDojvQGto2Fz-QOw9xxMfbnZH6bV2vPthm-_65etkwWCrBlNo7UxmpK45SaldAIWxpqr02lpulArQGNebdvvQoSuu5VVICOuS6KKCckef_2itoM6RwhPTbZOAmts0NuPwDNd9Uhw</recordid><startdate>19940603</startdate><enddate>19940603</enddate><creator>Ramshaw, Lance A</creator><creator>Marcus, Mitchell P</creator><scope>AKY</scope><scope>GOX</scope></search><sort><creationdate>19940603</creationdate><title>Exploring the Statistical Derivation of Transformational Rule Sequences for Part-of-Speech Tagging</title><author>Ramshaw, Lance A ; Marcus, Mitchell P</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-a751-55dc8984690e446c2a21b389d68b0875aeb8e6e406f3fe13bf0b544aece0622a3</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>1994</creationdate><topic>Computer Science - Computation and Language</topic><toplevel>online_resources</toplevel><creatorcontrib>Ramshaw, Lance A</creatorcontrib><creatorcontrib>Marcus, Mitchell P</creatorcontrib><collection>arXiv Computer Science</collection><collection>arXiv.org</collection></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>Ramshaw, Lance A</au><au>Marcus, Mitchell P</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Exploring 
the Statistical Derivation of Transformational Rule Sequences for Part-of-Speech Tagging</atitle><date>1994-06-03</date><risdate>1994</risdate><abstract>ACL Balancing Act Workshop proceedings, July 94, pp. 86-95 Eric Brill has recently proposed a simple and powerful corpus-based language modeling approach that can be applied to various tasks including part-of-speech tagging and building phrase structure trees. The method learns a series of symbolic transformational rules, which can then be applied in sequence to a test corpus to produce predictions. The learning process only requires counting matches for a given set of rule templates, allowing the method to survey a very large space of possible contextual factors. This paper analyses Brill's approach as an interesting variation on existing decision tree methods, based on experiments involving part-of-speech tagging for both English and ancient Greek corpora. In particular, the analysis throws light on why the new mechanism seems surprisingly resistant to overtraining. A fast, incremental implementation and a mechanism for recording the dependencies that underlie the resulting rule sequence are also described.</abstract><doi>10.48550/arxiv.cmp-lg/9406011</doi><oa>free_for_read</oa></addata></record>
fulltext	fulltext_linktorsrc
identifier	DOI: 10.48550/arxiv.cmp-lg/9406011
language	eng
recordid	cdi_arxiv_primary_cmp_lg_9406011
source	arXiv.org
subjects	Computer Science - Computation and Language
title	Exploring the Statistical Derivation of Transformational Rule Sequences for Part-of-Speech Tagging
url	https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-08T03%3A11%3A17IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-arxiv_GOX&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Exploring%20the%20Statistical%20Derivation%20of%20Transformational%20Rule%20Sequences%20for%20Part-of-Speech%20Tagging&rft.au=Ramshaw,%20Lance%20A&rft.date=1994-06-03&rft_id=info:doi/10.48550/arxiv.cmp-lg/9406011&rft_dat=%3Carxiv_GOX%3Ecmp_lg_9406011%3C/arxiv_GOX%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rfr_iscdi=true