On optimizing syntactic pattern recognition using tries and AI-based heuristic-search strategies
This paper deals with the problem of estimating, using enhanced artificial-intelligence (AI) techniques, a transmitted string $X^{*}$ by processing the corresponding string $Y$, which is a noisy version of $X^{*}$. It is assumed that $Y$ contains substitution, insertion, and deletion (SID) errors. The...
Published in: | IEEE transactions on cybernetics 2006-06, Vol.36 (3), p.611-622 |
---|---|
Main authors: | Badr, G. ; Oommen, B.J. |
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Order full text |
container_end_page | 622 |
---|---|
container_issue | 3 |
container_start_page | 611 |
container_title | IEEE transactions on cybernetics |
container_volume | 36 |
creator | Badr, G. ; Oommen, B.J. |
description | This paper deals with the problem of estimating, using enhanced artificial-intelligence (AI) techniques, a transmitted string $X^{*}$ by processing the corresponding string $Y$, which is a noisy version of $X^{*}$. It is assumed that $Y$ contains substitution, insertion, and deletion (SID) errors. The best estimate $X^{+}$ of $X^{*}$ is defined as that element of a dictionary $H$ that minimizes the generalized Levenshtein distance (GLD) $D(X,Y)$ between $X$ and $Y$, for all $X \in H$. In this paper, it is shown how to evaluate $D(X,Y)$ for every $X \in H$ simultaneously, when the edit distances are general and the maximum number of errors is not given a priori, and when $H$ is stored as a trie. A new scheme called clustered beam search (CBS) is first introduced, which is a heuristic-based search approach that enhances the well-known beam-search (BS) techniques used in AI. The new scheme is then applied to the approximate string-matching problem when the dictionary is stored as a trie. The new technique is compared with the benchmark depth-first search (DFS) trie-based technique (with respect to time and accuracy) using large and small dictionaries. The results demonstrate a marked improvement of up to 75% with respect to the total number of operations needed on three benchmark dictionaries, while yielding an accuracy comparable to the optimal. Experiments are also done to show the benefits of the CBS over the BS when the search is done on the trie. The results also demonstrate a marked improvement (more than 91%) for large dictionaries. |
doi_str_mv | 10.1109/TSMCB.2005.861860 |
format | Article |
fullrecord | <record><control><sourceid>proquest_RIE</sourceid><recordid>TN_cdi_proquest_miscellaneous_28060085</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><ieee_id>1634653</ieee_id><sourcerecordid>28978451</sourcerecordid><originalsourceid>FETCH-LOGICAL-c441t-118a4a6824cb8e7b0e42b180e79086149a60fda9b731d2eb09799b56cff1f1813</originalsourceid><addsrcrecordid>eNqNkU1v1DAQhi0EoqXwAxASijggLllmHMexj-2Kj0pFPVDOxnEmW1e7zmI7h_LrcdiVQByAky35eUd-52HsOcIKEfTbm8-f1hcrDtCulEQl4QE7RS2wBqH5w3IH1dRCoD5hT1K6AwANunvMTlB2hUdxyr5eh2raZ7_z333YVOk-ZOuyd9Xe5kwxVJHctAk--ylUc1qYHD2lyoahOr-se5toqG5pjj6VWJ3IRndbpRxtpk0Bn7JHo90menY8z9iX9-9u1h_rq-sPl-vzq9qVD-YaUVlhpeLC9Yq6HkjwHhVQp6F0E9pKGAer-67BgVNfemjdt9KNI46lSnPGXh_m7uP0baaUzc4nR9utDTTNyUgFbcN5-0-QK90p0eJ_gCAB1DLxzV_Bsm3kXav1gr76A72b5hjKYoxSWhVVShYID5CLU0qRRrOPfmfjvUEwi3jzU7xZxJuD-JJ5eRw89zsafiWOpgvw4gB4IvrtuRGybZoftcuxLA</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>889841986</pqid></control><display><type>article</type><title>On optimizing syntactic pattern recognition using tries and AI-based heuristic-search strategies</title><source>IEEE Electronic Library (IEL)</source><creator>Badr, G. ; Oommen, B.J.</creator><creatorcontrib>Badr, G. ; Oommen, B.J.</creatorcontrib><description>This paper deals with the problem of estimating, using enhanced artificial-intelligence (AI) techniques, a transmitted string X/sup */ by processing the corresponding string Y, which is a noisy version of X/sup */. It is assumed that Y contains substitution, insertion, and deletion (SID) errors. The best estimate X/sup +/ of X/sup */ is defined as that element of a dictionary H that minimizes the generalized Levenshtein distance (GLD) D(X,Y) between X and Y, for all X/spl isin/H. 
In this paper, it is shown how to evaluate D(X,Y) for every X/spl isin/H simultaneously, when the edit distances are general and the maximum number of errors is not given a priori, and when H is stored as a trie. A new scheme called clustered beam search (CBS) is first introduced, which is a heuristic-based search approach that enhances the well-known beam-search (BS) techniques used in AI. The new scheme is then applied to the approximate string-matching problem when the dictionary is stored as a trie. The new technique is compared with the benchmark depth-first search (DFS) trie-based technique (with respect to time and accuracy) using large and small dictionaries. The results demonstrate a marked improvement of up to 75% with respect to the total number of operations needed on three benchmark dictionaries, while yielding an accuracy comparable to the optimal. Experiments are also done to show the benefits of the CBS over the BS when the search is done on the trie. The results also demonstrate a marked improvement (more than 91%) for large dictionaries.</description><identifier>ISSN: 1083-4419</identifier><identifier>ISSN: 2168-2267</identifier><identifier>EISSN: 1941-0492</identifier><identifier>EISSN: 2168-2275</identifier><identifier>DOI: 10.1109/TSMCB.2005.861860</identifier><identifier>PMID: 16761814</identifier><identifier>CODEN: ITSCFI</identifier><language>eng</language><publisher>United States: IEEE</publisher><subject>Algorithms ; Approximate string matching ; Artificial Intelligence ; artificial intelligence (AI) ; Benchmarking ; Computer science ; Costs ; Cybernetics ; Data structures ; Dictionaries ; Errors ; Information Storage and Retrieval - methods ; Language ; local beam search (BS) ; Natural Language Processing ; noisy syntactic recognition using tries ; Optimization ; Pattern matching ; Pattern recognition ; Pattern Recognition, Automated - methods ; Searching ; Speech Recognition Software ; Strings ; trie-based syntactic pattern recognition 
(PR)</subject><ispartof>IEEE transactions on cybernetics, 2006-06, Vol.36 (3), p.611-622</ispartof><rights>Copyright The Institute of Electrical and Electronics Engineers, Inc. (IEEE) 2006</rights><lds50>peer_reviewed</lds50><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c441t-118a4a6824cb8e7b0e42b180e79086149a60fda9b731d2eb09799b56cff1f1813</citedby><cites>FETCH-LOGICAL-c441t-118a4a6824cb8e7b0e42b180e79086149a60fda9b731d2eb09799b56cff1f1813</cites></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://ieeexplore.ieee.org/document/1634653$$EHTML$$P50$$Gieee$$H</linktohtml><link.rule.ids>314,776,780,792,27903,27904,54737</link.rule.ids><linktorsrc>$$Uhttps://ieeexplore.ieee.org/document/1634653$$EView_record_in_IEEE$$FView_record_in_$$GIEEE</linktorsrc><backlink>$$Uhttps://www.ncbi.nlm.nih.gov/pubmed/16761814$$D View this record in MEDLINE/PubMed$$Hfree_for_read</backlink></links><search><creatorcontrib>Badr, G.</creatorcontrib><creatorcontrib>Oommen, B.J.</creatorcontrib><title>On optimizing syntactic pattern recognition using tries and AI-based heuristic-search strategies</title><title>IEEE transactions on cybernetics</title><addtitle>TSMCB</addtitle><addtitle>IEEE Trans Syst Man Cybern B Cybern</addtitle><description>This paper deals with the problem of estimating, using enhanced artificial-intelligence (AI) techniques, a transmitted string X/sup */ by processing the corresponding string Y, which is a noisy version of X/sup */. It is assumed that Y contains substitution, insertion, and deletion (SID) errors. The best estimate X/sup +/ of X/sup */ is defined as that element of a dictionary H that minimizes the generalized Levenshtein distance (GLD) D(X,Y) between X and Y, for all X/spl isin/H. 
In this paper, it is shown how to evaluate D(X,Y) for every X/spl isin/H simultaneously, when the edit distances are general and the maximum number of errors is not given a priori, and when H is stored as a trie. A new scheme called clustered beam search (CBS) is first introduced, which is a heuristic-based search approach that enhances the well-known beam-search (BS) techniques used in AI. The new scheme is then applied to the approximate string-matching problem when the dictionary is stored as a trie. The new technique is compared with the benchmark depth-first search (DFS) trie-based technique (with respect to time and accuracy) using large and small dictionaries. The results demonstrate a marked improvement of up to 75% with respect to the total number of operations needed on three benchmark dictionaries, while yielding an accuracy comparable to the optimal. Experiments are also done to show the benefits of the CBS over the BS when the search is done on the trie. The results also demonstrate a marked improvement (more than 91%) for large dictionaries.</description><subject>Algorithms</subject><subject>Approximate string matching</subject><subject>Artificial Intelligence</subject><subject>artificial intelligence (AI)</subject><subject>Benchmarking</subject><subject>Computer science</subject><subject>Costs</subject><subject>Cybernetics</subject><subject>Data structures</subject><subject>Dictionaries</subject><subject>Errors</subject><subject>Information Storage and Retrieval - methods</subject><subject>Language</subject><subject>local beam search (BS)</subject><subject>Natural Language Processing</subject><subject>noisy syntactic recognition using tries</subject><subject>Optimization</subject><subject>Pattern matching</subject><subject>Pattern recognition</subject><subject>Pattern Recognition, Automated - methods</subject><subject>Searching</subject><subject>Speech Recognition Software</subject><subject>Strings</subject><subject>trie-based syntactic pattern 
recognition (PR)</subject><issn>1083-4419</issn><issn>2168-2267</issn><issn>1941-0492</issn><issn>2168-2275</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2006</creationdate><recordtype>article</recordtype><sourceid>RIE</sourceid><sourceid>EIF</sourceid><recordid>eNqNkU1v1DAQhi0EoqXwAxASijggLllmHMexj-2Kj0pFPVDOxnEmW1e7zmI7h_LrcdiVQByAky35eUd-52HsOcIKEfTbm8-f1hcrDtCulEQl4QE7RS2wBqH5w3IH1dRCoD5hT1K6AwANunvMTlB2hUdxyr5eh2raZ7_z333YVOk-ZOuyd9Xe5kwxVJHctAk--ylUc1qYHD2lyoahOr-se5toqG5pjj6VWJ3IRndbpRxtpk0Bn7JHo90menY8z9iX9-9u1h_rq-sPl-vzq9qVD-YaUVlhpeLC9Yq6HkjwHhVQp6F0E9pKGAer-67BgVNfemjdt9KNI46lSnPGXh_m7uP0baaUzc4nR9utDTTNyUgFbcN5-0-QK90p0eJ_gCAB1DLxzV_Bsm3kXav1gr76A72b5hjKYoxSWhVVShYID5CLU0qRRrOPfmfjvUEwi3jzU7xZxJuD-JJ5eRw89zsafiWOpgvw4gB4IvrtuRGybZoftcuxLA</recordid><startdate>20060601</startdate><enddate>20060601</enddate><creator>Badr, G.</creator><creator>Oommen, B.J.</creator><general>IEEE</general><general>The Institute of Electrical and Electronics Engineers, Inc. (IEEE)</general><scope>97E</scope><scope>RIA</scope><scope>RIE</scope><scope>CGR</scope><scope>CUY</scope><scope>CVF</scope><scope>ECM</scope><scope>EIF</scope><scope>NPM</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>7SC</scope><scope>7SP</scope><scope>7TB</scope><scope>8FD</scope><scope>F28</scope><scope>FR3</scope><scope>H8D</scope><scope>JQ2</scope><scope>L7M</scope><scope>L~C</scope><scope>L~D</scope><scope>7X8</scope></search><sort><creationdate>20060601</creationdate><title>On optimizing syntactic pattern recognition using tries and AI-based heuristic-search strategies</title><author>Badr, G. 
; Oommen, B.J.</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c441t-118a4a6824cb8e7b0e42b180e79086149a60fda9b731d2eb09799b56cff1f1813</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2006</creationdate><topic>Algorithms</topic><topic>Approximate string matching</topic><topic>Artificial Intelligence</topic><topic>artificial intelligence (AI)</topic><topic>Benchmarking</topic><topic>Computer science</topic><topic>Costs</topic><topic>Cybernetics</topic><topic>Data structures</topic><topic>Dictionaries</topic><topic>Errors</topic><topic>Information Storage and Retrieval - methods</topic><topic>Language</topic><topic>local beam search (BS)</topic><topic>Natural Language Processing</topic><topic>noisy syntactic recognition using tries</topic><topic>Optimization</topic><topic>Pattern matching</topic><topic>Pattern recognition</topic><topic>Pattern Recognition, Automated - methods</topic><topic>Searching</topic><topic>Speech Recognition Software</topic><topic>Strings</topic><topic>trie-based syntactic pattern recognition (PR)</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Badr, G.</creatorcontrib><creatorcontrib>Oommen, B.J.</creatorcontrib><collection>IEEE All-Society Periodicals Package (ASPP) 2005-present</collection><collection>IEEE All-Society Periodicals Package (ASPP) 1998–Present</collection><collection>IEEE Electronic Library (IEL)</collection><collection>Medline</collection><collection>MEDLINE</collection><collection>MEDLINE (Ovid)</collection><collection>MEDLINE</collection><collection>MEDLINE</collection><collection>PubMed</collection><collection>CrossRef</collection><collection>Computer and Information Systems Abstracts</collection><collection>Electronics & Communications Abstracts</collection><collection>Mechanical & Transportation Engineering Abstracts</collection><collection>Technology Research 
Database</collection><collection>ANTE: Abstracts in New Technology & Engineering</collection><collection>Engineering Research Database</collection><collection>Aerospace Database</collection><collection>ProQuest Computer Science Collection</collection><collection>Advanced Technologies Database with Aerospace</collection><collection>Computer and Information Systems Abstracts Academic</collection><collection>Computer and Information Systems Abstracts Professional</collection><collection>MEDLINE - Academic</collection><jtitle>IEEE transactions on cybernetics</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>Badr, G.</au><au>Oommen, B.J.</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>On optimizing syntactic pattern recognition using tries and AI-based heuristic-search strategies</atitle><jtitle>IEEE transactions on cybernetics</jtitle><stitle>TSMCB</stitle><addtitle>IEEE Trans Syst Man Cybern B Cybern</addtitle><date>2006-06-01</date><risdate>2006</risdate><volume>36</volume><issue>3</issue><spage>611</spage><epage>622</epage><pages>611-622</pages><issn>1083-4419</issn><issn>2168-2267</issn><eissn>1941-0492</eissn><eissn>2168-2275</eissn><coden>ITSCFI</coden><abstract>This paper deals with the problem of estimating, using enhanced artificial-intelligence (AI) techniques, a transmitted string X/sup */ by processing the corresponding string Y, which is a noisy version of X/sup */. It is assumed that Y contains substitution, insertion, and deletion (SID) errors. The best estimate X/sup +/ of X/sup */ is defined as that element of a dictionary H that minimizes the generalized Levenshtein distance (GLD) D(X,Y) between X and Y, for all X/spl isin/H. 
In this paper, it is shown how to evaluate D(X,Y) for every X/spl isin/H simultaneously, when the edit distances are general and the maximum number of errors is not given a priori, and when H is stored as a trie. A new scheme called clustered beam search (CBS) is first introduced, which is a heuristic-based search approach that enhances the well-known beam-search (BS) techniques used in AI. The new scheme is then applied to the approximate string-matching problem when the dictionary is stored as a trie. The new technique is compared with the benchmark depth-first search (DFS) trie-based technique (with respect to time and accuracy) using large and small dictionaries. The results demonstrate a marked improvement of up to 75% with respect to the total number of operations needed on three benchmark dictionaries, while yielding an accuracy comparable to the optimal. Experiments are also done to show the benefits of the CBS over the BS when the search is done on the trie. The results also demonstrate a marked improvement (more than 91%) for large dictionaries.</abstract><cop>United States</cop><pub>IEEE</pub><pmid>16761814</pmid><doi>10.1109/TSMCB.2005.861860</doi><tpages>12</tpages></addata></record> |
fulltext | fulltext_linktorsrc |
identifier | ISSN: 1083-4419 |
ispartof | IEEE transactions on cybernetics, 2006-06, Vol.36 (3), p.611-622 |
issn | 1083-4419 2168-2267 1941-0492 2168-2275 |
language | eng |
recordid | cdi_proquest_miscellaneous_28060085 |
source | IEEE Electronic Library (IEL) |
subjects | Algorithms ; Approximate string matching ; Artificial Intelligence ; artificial intelligence (AI) ; Benchmarking ; Computer science ; Costs ; Cybernetics ; Data structures ; Dictionaries ; Errors ; Information Storage and Retrieval - methods ; Language ; local beam search (BS) ; Natural Language Processing ; noisy syntactic recognition using tries ; Optimization ; Pattern matching ; Pattern recognition ; Pattern Recognition, Automated - methods ; Searching ; Speech Recognition Software ; Strings ; trie-based syntactic pattern recognition (PR) |
title | On optimizing syntactic pattern recognition using tries and AI-based heuristic-search strategies |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-22T14%3A32%3A00IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_RIE&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=On%20optimizing%20syntactic%20pattern%20recognition%20using%20tries%20and%20AI-based%20heuristic-search%20strategies&rft.jtitle=IEEE%20transactions%20on%20cybernetics&rft.au=Badr,%20G.&rft.date=2006-06-01&rft.volume=36&rft.issue=3&rft.spage=611&rft.epage=622&rft.pages=611-622&rft.issn=1083-4419&rft.eissn=1941-0492&rft.coden=ITSCFI&rft_id=info:doi/10.1109/TSMCB.2005.861860&rft_dat=%3Cproquest_RIE%3E28978451%3C/proquest_RIE%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=889841986&rft_id=info:pmid/16761814&rft_ieee_id=1634653&rfr_iscdi=true |
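
The description field in this record mentions evaluating the edit distance between a noisy string and every dictionary word at once when the dictionary is stored as a trie. The following is a minimal Python sketch of that general idea in its depth-first form, where words sharing a prefix also share the rows of the dynamic-programming table. It is not the clustered beam search (CBS) of the paper, whose details are not given in this record. The names `TrieNode`, `build_trie`, `nearest_word`, and the three cost parameters are hypothetical, and the pruning step assumes all edit costs are non-negative.

```python
from typing import Callable, Dict, List, Optional, Tuple


class TrieNode:
    """One trie node; the path from the root spells a dictionary prefix."""

    def __init__(self) -> None:
        self.children: Dict[str, "TrieNode"] = {}
        self.is_word = False


def build_trie(words: List[str]) -> TrieNode:
    """Store every dictionary word in a trie and return the root."""
    root = TrieNode()
    for word in words:
        node = root
        for ch in word:
            node = node.children.setdefault(ch, TrieNode())
        node.is_word = True
    return root


def nearest_word(
    root: TrieNode,
    y: str,
    sub_cost: Callable[[str, str], float] = lambda a, b: 0.0 if a == b else 1.0,
    ins_cost: Callable[[str], float] = lambda b: 1.0,
    del_cost: Callable[[str], float] = lambda a: 1.0,
) -> Tuple[Optional[str], float]:
    """Return (word, distance) minimizing the weighted edit distance to y.

    Assumption: all costs are non-negative, which makes the pruning safe.
    """
    # Row for the empty prefix: y[:j] arises purely from insertions.
    first_row = [0.0]
    for y_ch in y:
        first_row.append(first_row[-1] + ins_cost(y_ch))

    best: Tuple[Optional[str], float] = (None, float("inf"))

    def visit(node: TrieNode, prefix: str, prev_row: List[float]) -> None:
        nonlocal best
        # The last cell is the distance between this prefix and all of y.
        if node.is_word and prev_row[-1] < best[1]:
            best = (prefix, prev_row[-1])
        for ch, child in node.children.items():
            # One new DP row per trie edge, shared by all words below it.
            row = [prev_row[0] + del_cost(ch)]
            for j, y_ch in enumerate(y, start=1):
                row.append(min(
                    prev_row[j - 1] + sub_cost(ch, y_ch),  # substitute or match
                    prev_row[j] + del_cost(ch),            # symbol lost from X
                    row[j - 1] + ins_cost(y_ch),           # symbol added to Y
                ))
            # With non-negative costs, no descendant can beat min(row).
            if min(row) < best[1]:
                visit(child, prefix + ch, row)

    visit(root, "", first_row)
    return best
```

Calling `nearest_word(build_trie(["cart", "card", "care", "dog", "door"]), "crad")` explores the shared prefix `car` once for three words. A beam-style heuristic would replace the exhaustive recursion with a bounded set of candidate nodes per level, trading guaranteed optimality for fewer operations.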