A new regular grammar pattern matching algorithm

This paper presents a Boyer–Moore type algorithm for regular grammar pattern matching, answering a variant of an open problem posed by Aho (Pattern Matching in Strings, Academic Press, New York, 1980, p. 342). The new algorithm handles patterns specified by regular (left linear) grammars—a generaliz...
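As background for the Boyer–Moore family named in the abstract, here is a minimal sketch of single-keyword matching with a precomputed shift table. It uses the Horspool simplification of Boyer–Moore, not the regular-grammar algorithm of the paper and not Commentz-Walter's shift functions. The function names `build_shift_table` and `horspool_search` are illustrative assumptions.

```python
from __future__ import annotations


def build_shift_table(pattern: str) -> dict[str, int]:
    """Precompute Horspool bad-character shifts (illustrative helper).

    For every character of the pattern except the last, store the distance
    from its rightmost occurrence to the end of the pattern.
    """
    m = len(pattern)
    # Later occurrences overwrite earlier ones, so the rightmost one wins.
    return {ch: m - 1 - i for i, ch in enumerate(pattern[:-1])}


def horspool_search(text: str, pattern: str) -> list[int]:
    """Return all start positions of pattern in text (overlaps included)."""
    m, n = len(pattern), len(text)
    if m == 0 or m > n:
        return []
    shift = build_shift_table(pattern)
    matches = []
    pos = 0
    while pos <= n - m:
        if text[pos:pos + m] == pattern:
            matches.append(pos)
        # Shift by the tabulated distance for the text character aligned
        # with the last pattern position; unknown characters shift by m.
        pos += shift.get(text[pos + m - 1], m)
    return matches
```

The table depends only on the pattern, so it is built once and reused across the whole text. That precompute-and-tabulate property is the one the abstract says carries over to the generalized algorithm.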

Detailed description

Bibliographic details
Published in: Theoretical computer science 2003-04, Vol.299 (1-3), p.509-521
Main author: Watson, Bruce W.
Format: Article
Language: eng
Subjects:
Online access: Full text
container_end_page 521
container_issue 1-3
container_start_page 509
container_title Theoretical computer science
container_volume 299
creator Watson, Bruce W.
description This paper presents a Boyer–Moore type algorithm for regular grammar pattern matching, answering a variant of an open problem posed by Aho (Pattern Matching in Strings, Academic Press, New York, 1980, p. 342). The new algorithm handles patterns specified by regular (left linear) grammars—a generalization of the Boyer–Moore (single keyword) and Commentz-Walter (multiple keyword) algorithms. Like the Boyer–Moore and Commentz-Walter algorithms, the new algorithm makes use of shift functions which can be precomputed and tabulated. The precomputation functions are derived, and it is shown that they can be obtained from Commentz-Walter's d1 and d2 shift functions. In most cases, the Boyer–Moore (respectively, Commentz-Walter) algorithm has greatly outperformed the Knuth–Morris–Pratt (respectively, Aho–Corasick) algorithm. In practice, an earlier version of the algorithm presented in this paper also frequently outperforms the regular grammar generalization of the Aho–Corasick algorithm.
doi_str_mv 10.1016/S0304-3975(02)00532-7
format Article
fullrecord <record><control><sourceid>proquest_cross</sourceid><recordid>TN_cdi_proquest_miscellaneous_27925858</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><els_id>S0304397502005327</els_id><sourcerecordid>27925858</sourcerecordid><originalsourceid>FETCH-LOGICAL-c368t-55b89d146d01ebb1dadc7209d90d503dfa20ecbaf3e4438d77c869eb04a189773</originalsourceid><addsrcrecordid>eNqFkEtPwzAQhC0EEqXwE5ByAcEhsHbi2D6hquIlVeIAnC3H3qRGeRQ7BfHvSVsER_Yyl292NEPIKYUrCrS4foYM8jRTgl8AuwTgGUvFHplQKVTKmMr3yeQXOSRHMb7BeFwUEwKzpMPPJGC9bkxI6mDadtSVGQYMXdKawS59Vyemqfvgh2V7TA4q00Q8-dEpeb27fZk_pIun-8f5bJHarJBDynkplaN54YBiWVJnnBUMlFPgOGSuMgzQlqbKMM8z6YSwslBYQm6oVEJkU3K--7sK_fsa46BbHy02jemwX0fNhGJccjmCfAfa0McYsNKr4McOX5qC3uyjt_voTXkNTG_30ZuAs58AE61pqmA66-OfOS8KmdNi5G52HI5tPzwGHa3HzqLzAe2gXe__SfoGzJ55Dw</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>27925858</pqid></control><display><type>article</type><title>A new regular grammar pattern matching algorithm</title><source>Elsevier ScienceDirect Journals Complete</source><source>EZB-FREE-00999 freely available EZB journals</source><creator>Watson, Bruce W.</creator><creatorcontrib>Watson, Bruce W.</creatorcontrib><description>This paper presents a Boyer–Moore type algorithm for regular grammar pattern matching, answering a variant of an open problem posed by Aho (Pattern Matching in Strings, Academic Press, New York, 1980, p. 342). The new algorithm handles patterns specified by regular (left linear) grammars—a generalization of the Boyer–Moore (single keyword) and Commentz-Walter (multiple keyword) algorithms. Like the Boyer–Moore and Commentz-Walter algorithms, the new algorithm makes use of shift functions which can be precomputed and tabulated. The precomputation functions are derived, and it is shown that they can be obtained from Commentz-Walter's d1 and d2 shift functions. 
In most cases, the Boyer–Moore (respectively, Commentz-Walter) algorithm has greatly outperformed the Knuth–Morris–Pratt (respectively, Aho–Corasick) algorithm. In practice, an earlier version of the algorithm presented in this paper also frequently outperforms the regular grammar generalization of the Aho–Corasick algorithm.</description><identifier>ISSN: 0304-3975</identifier><identifier>EISSN: 1879-2294</identifier><identifier>DOI: 10.1016/S0304-3975(02)00532-7</identifier><identifier>CODEN: TCSCDI</identifier><language>eng</language><publisher>Amsterdam: Elsevier B.V</publisher><subject>Algorithmics. Computability. Computer arithmetics ; Applied sciences ; Artificial intelligence ; Boyer–Moore algorithm ; Computer science; control theory; systems ; Data processing. List processing. Character string processing ; Exact sciences and technology ; Language theory and syntactical analysis ; Memory organisation. Data processing ; Pattern matching algorithms ; Pattern recognition. Digital image processing. 
Computational geometry ; Regular grammars ; Software ; Theoretical computing</subject><ispartof>Theoretical computer science, 2003-04, Vol.299 (1-3), p.509-521</ispartof><rights>2002 Elsevier Science B.V.</rights><rights>2003 INIST-CNRS</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c368t-55b89d146d01ebb1dadc7209d90d503dfa20ecbaf3e4438d77c869eb04a189773</citedby><cites>FETCH-LOGICAL-c368t-55b89d146d01ebb1dadc7209d90d503dfa20ecbaf3e4438d77c869eb04a189773</cites></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://dx.doi.org/10.1016/S0304-3975(02)00532-7$$EHTML$$P50$$Gelsevier$$Hfree_for_read</linktohtml><link.rule.ids>314,780,784,3550,27924,27925,45995</link.rule.ids><backlink>$$Uhttp://pascal-francis.inist.fr/vibad/index.php?action=getRecordDetail&amp;idt=14668416$$DView record in Pascal Francis$$Hfree_for_read</backlink></links><search><creatorcontrib>Watson, Bruce W.</creatorcontrib><title>A new regular grammar pattern matching algorithm</title><title>Theoretical computer science</title><description>This paper presents a Boyer–Moore type algorithm for regular grammar pattern matching, answering a variant of an open problem posed by Aho (Pattern Matching in Strings, Academic Press, New York, 1980, p. 342). The new algorithm handles patterns specified by regular (left linear) grammars—a generalization of the Boyer–Moore (single keyword) and Commentz-Walter (multiple keyword) algorithms. Like the Boyer–Moore and Commentz-Walter algorithms, the new algorithm makes use of shift functions which can be precomputed and tabulated. The precomputation functions are derived, and it is shown that they can be obtained from Commentz-Walter's d1 and d2 shift functions. 
In most cases, the Boyer–Moore (respectively, Commentz-Walter) algorithm has greatly outperformed the Knuth–Morris–Pratt (respectively, Aho–Corasick) algorithm. In practice, an earlier version of the algorithm presented in this paper also frequently outperforms the regular grammar generalization of the Aho–Corasick algorithm.</description><subject>Algorithmics. Computability. Computer arithmetics</subject><subject>Applied sciences</subject><subject>Artificial intelligence</subject><subject>Boyer–Moore algorithm</subject><subject>Computer science; control theory; systems</subject><subject>Data processing. List processing. Character string processing</subject><subject>Exact sciences and technology</subject><subject>Language theory and syntactical analysis</subject><subject>Memory organisation. Data processing</subject><subject>Pattern matching algorithms</subject><subject>Pattern recognition. Digital image processing. Computational geometry</subject><subject>Regular grammars</subject><subject>Software</subject><subject>Theoretical computing</subject><issn>0304-3975</issn><issn>1879-2294</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2003</creationdate><recordtype>article</recordtype><recordid>eNqFkEtPwzAQhC0EEqXwE5ByAcEhsHbi2D6hquIlVeIAnC3H3qRGeRQ7BfHvSVsER_Yyl292NEPIKYUrCrS4foYM8jRTgl8AuwTgGUvFHplQKVTKmMr3yeQXOSRHMb7BeFwUEwKzpMPPJGC9bkxI6mDadtSVGQYMXdKawS59Vyemqfvgh2V7TA4q00Q8-dEpeb27fZk_pIun-8f5bJHarJBDynkplaN54YBiWVJnnBUMlFPgOGSuMgzQlqbKMM8z6YSwslBYQm6oVEJkU3K--7sK_fsa46BbHy02jemwX0fNhGJccjmCfAfa0McYsNKr4McOX5qC3uyjt_voTXkNTG_30ZuAs58AE61pqmA66-OfOS8KmdNi5G52HI5tPzwGHa3HzqLzAe2gXe__SfoGzJ55Dw</recordid><startdate>20030418</startdate><enddate>20030418</enddate><creator>Watson, Bruce W.</creator><general>Elsevier 
B.V</general><general>Elsevier</general><scope>6I.</scope><scope>AAFTH</scope><scope>IQODW</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>7SC</scope><scope>8FD</scope><scope>JQ2</scope><scope>L7M</scope><scope>L~C</scope><scope>L~D</scope></search><sort><creationdate>20030418</creationdate><title>A new regular grammar pattern matching algorithm</title><author>Watson, Bruce W.</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c368t-55b89d146d01ebb1dadc7209d90d503dfa20ecbaf3e4438d77c869eb04a189773</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2003</creationdate><topic>Algorithmics. Computability. Computer arithmetics</topic><topic>Applied sciences</topic><topic>Artificial intelligence</topic><topic>Boyer–Moore algorithm</topic><topic>Computer science; control theory; systems</topic><topic>Data processing. List processing. Character string processing</topic><topic>Exact sciences and technology</topic><topic>Language theory and syntactical analysis</topic><topic>Memory organisation. Data processing</topic><topic>Pattern matching algorithms</topic><topic>Pattern recognition. Digital image processing. 
Computational geometry</topic><topic>Regular grammars</topic><topic>Software</topic><topic>Theoretical computing</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Watson, Bruce W.</creatorcontrib><collection>ScienceDirect Open Access Titles</collection><collection>Elsevier:ScienceDirect:Open Access</collection><collection>Pascal-Francis</collection><collection>CrossRef</collection><collection>Computer and Information Systems Abstracts</collection><collection>Technology Research Database</collection><collection>ProQuest Computer Science Collection</collection><collection>Advanced Technologies Database with Aerospace</collection><collection>Computer and Information Systems Abstracts – Academic</collection><collection>Computer and Information Systems Abstracts Professional</collection><jtitle>Theoretical computer science</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Watson, Bruce W.</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>A new regular grammar pattern matching algorithm</atitle><jtitle>Theoretical computer science</jtitle><date>2003-04-18</date><risdate>2003</risdate><volume>299</volume><issue>1-3</issue><spage>509</spage><epage>521</epage><pages>509-521</pages><issn>0304-3975</issn><eissn>1879-2294</eissn><coden>TCSCDI</coden><abstract>This paper presents a Boyer–Moore type algorithm for regular grammar pattern matching, answering a variant of an open problem posed by Aho (Pattern Matching in Strings, Academic Press, New York, 1980, p. 342). The new algorithm handles patterns specified by regular (left linear) grammars—a generalization of the Boyer–Moore (single keyword) and Commentz-Walter (multiple keyword) algorithms. Like the Boyer–Moore and Commentz-Walter algorithms, the new algorithm makes use of shift functions which can be precomputed and tabulated. 
The precomputation functions are derived, and it is shown that they can be obtained from Commentz-Walter's d1 and d2 shift functions. In most cases, the Boyer–Moore (respectively, Commentz-Walter) algorithm has greatly outperformed the Knuth–Morris–Pratt (respectively, Aho–Corasick) algorithm. In practice, an earlier version of the algorithm presented in this paper also frequently outperforms the regular grammar generalization of the Aho–Corasick algorithm.</abstract><cop>Amsterdam</cop><pub>Elsevier B.V</pub><doi>10.1016/S0304-3975(02)00532-7</doi><tpages>13</tpages><oa>free_for_read</oa></addata></record>
fulltext fulltext
identifier ISSN: 0304-3975
ispartof Theoretical computer science, 2003-04, Vol.299 (1-3), p.509-521
issn 0304-3975
1879-2294
language eng
recordid cdi_proquest_miscellaneous_27925858
source Elsevier ScienceDirect Journals Complete; EZB-FREE-00999 freely available EZB journals
subjects Algorithmics. Computability. Computer arithmetics
Applied sciences
Artificial intelligence
Boyer–Moore algorithm
Computer science; control theory; systems
Data processing. List processing. Character string processing
Exact sciences and technology
Language theory and syntactical analysis
Memory organisation. Data processing
Pattern matching algorithms
Pattern recognition. Digital image processing. Computational geometry
Regular grammars
Software
Theoretical computing
title A new regular grammar pattern matching algorithm
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-26T06%3A21%3A42IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=A%20new%20regular%20grammar%20pattern%20matching%20algorithm&rft.jtitle=Theoretical%20computer%20science&rft.au=Watson,%20Bruce%20W.&rft.date=2003-04-18&rft.volume=299&rft.issue=1-3&rft.spage=509&rft.epage=521&rft.pages=509-521&rft.issn=0304-3975&rft.eissn=1879-2294&rft.coden=TCSCDI&rft_id=info:doi/10.1016/S0304-3975(02)00532-7&rft_dat=%3Cproquest_cross%3E27925858%3C/proquest_cross%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=27925858&rft_id=info:pmid/&rft_els_id=S0304397502005327&rfr_iscdi=true