TVSBS: A fast exact pattern matching algorithm for biological sequences

The post-genomic era is witnessing a remarkable increase in the number of nucleotide and amino acid sequences. The content of biological sequence databases nearly doubles at frequent intervals. Pattern matching has emerged as a powerful tool for locating nucleotide or amino acid sequence patterns in the biological...

Bibliographic details
Published in: Current science (Bangalore) 2006-07, Vol.91 (1), p.47-53
Main authors: Thathoo, Rahul; Virmani, Ashish; Lakshmi, S. Sai; Balakrishnan, N.; Sekar, K.
Format: Article
Language: eng
Online access: Full text
container_end_page 53
container_issue 1
container_start_page 47
container_title Current science (Bangalore)
container_volume 91
creator Thathoo, Rahul
Virmani, Ashish
Lakshmi, S. Sai
Balakrishnan, N.
Sekar, K.
description The post-genomic era is witnessing a remarkable increase in the number of nucleotide and amino acid sequences. The content of biological sequence databases nearly doubles at frequent intervals. Pattern matching has emerged as a powerful tool for locating nucleotide or amino acid sequence patterns in biological sequence databases. Several pattern-matching algorithms are available in the literature, ranging from the basic brute-force algorithm to the recent SSABS. The efficiency of an algorithm depends on how quickly it identifies exact occurrences of the pattern in the text. In this article, we propose an exact pattern-matching algorithm for biological sequences. The proposed algorithm, TVSBS, combines the Berry–Ravindran and SSABS algorithms. Its performance is improved by using the shift from the Berry–Ravindran bad-character table, which leads to fewer character comparisons. It works consistently well for both nucleotide and amino acid sequences. The proposed algorithm has been compared with the recent SSABS algorithm. The results show the robustness of the proposed algorithm, so it can be incorporated into any exact pattern-matching application involving biological sequences. The best- and worst-case time complexities of the new algorithm are also outlined.
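The description above names the two ingredients of TVSBS but gives no listing. The following is a minimal Python sketch of how such a combination is commonly formulated, not the authors' implementation. It assumes the usual Berry–Ravindran shift (computed from the two text characters immediately to the right of the current window) and the SSABS comparison order (last pattern character first, then the first, then the remainder). The function names `build_br_pairs`, `br_shift` and `tvsbs_search` are hypothetical and do not come from the article.

```python
def build_br_pairs(pattern):
    """Map each adjacent character pair of the pattern to its shift (assumed BR rule)."""
    m = len(pattern)
    pairs = {}
    for i in range(m - 1):
        # A later (rightmost) occurrence overwrites an earlier one,
        # which leaves the smallest safe shift for that pair.
        pairs[(pattern[i], pattern[i + 1])] = m - i
    return pairs


def br_shift(pattern, pairs, a, b):
    """Shift for the two characters a, b that follow the current window."""
    m = len(pattern)
    if a == pattern[-1]:
        return 1            # a can align with the last pattern character
    if (a, b) in pairs:
        return pairs[(a, b)]  # a, b align with an adjacent pair in the pattern
    if b == pattern[0]:
        return m + 1        # only b can align, with the first pattern character
    return m + 2            # neither character can take part in a match


def tvsbs_search(text, pattern):
    """Return the start indices of all exact occurrences of pattern in text."""
    n, m = len(text), len(pattern)
    if m == 0 or m > n:
        return []
    pairs = build_br_pairs(pattern)
    hits = []
    j = 0
    while j <= n - m:
        # SSABS-style order: last character, then first, then the middle part.
        if text[j + m - 1] == pattern[-1] and text[j] == pattern[0]:
            if text[j + 1:j + m - 1] == pattern[1:m - 1]:
                hits.append(j)
        # Characters past the end of the text are treated as None (a sentinel).
        a = text[j + m] if j + m < n else None
        b = text[j + m + 1] if j + m + 1 < n else None
        j += br_shift(pattern, pairs, a, b)
    return hits
```

Under these assumptions, `tvsbs_search("GCATCGCAGAGAGTATACAGTACG", "GCAGAGAG")` returns `[5]`. The same function works unchanged on amino acid strings because the shift table is a dictionary keyed by character pairs rather than a fixed-size array over a 4-letter alphabet.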
format Article
fullrecord <record><control><sourceid>jstor</sourceid><recordid>TN_cdi_jstor_primary_24094174</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><jstor_id>24094174</jstor_id><sourcerecordid>24094174</sourcerecordid><originalsourceid>FETCH-LOGICAL-j177t-17eaba4894d50b037de6074b367808885fec649a407bdc699a4caf37a88f9b453</originalsourceid><addsrcrecordid>eNotzEFPwyAYgGEOmjjnfoIJf6DJR6EFvM1Fp8kSD5u7Lh8UOpq2TMBE_70menqf03tFFgCMVVxpdkNucx4Aal6DXpDt4bh_3D_QNfWYC3VfaAu9YCkuzXTCYs9h7imOfUyhnCfqY6ImxDH2weJIs_v4dLN1-Y5cexyzW_13Sd6fnw6bl2r3tn3drHfVwKQsFZMODQqlRdeAAS4714IUhrdSgVKq8c62QqMAaTrb6l9Z9FyiUl4b0fAluf_7DrnEdLqkMGH6PtUCtGBS8B89_UQ4</addsrcrecordid><sourcetype>Publisher</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype></control><display><type>article</type><title>TVSBS: A fast exact pattern matching algorithm for biological sequences</title><source>Elektronische Zeitschriftenbibliothek - Frei zugängliche E-Journals</source><source>JSTOR Archive Collection A-Z Listing</source><creator>Thathoo, Rahul ; Virmani, Ashish ; Lakshmi, S. Sai ; Balakrishnan, N. ; Sekar, K.</creator><creatorcontrib>Thathoo, Rahul ; Virmani, Ashish ; Lakshmi, S. Sai ; Balakrishnan, N. ; Sekar, K.</creatorcontrib><description>The post-genomic era is witnessing a remarkable increase in the number of nucleotide and amino acid sequences. The content of biological sequence databases almost doubles frequently. Pattern matching emerges as a powerful tool in locating nucleotide or amino acid sequence patterns in the biological sequence databases. Presently, several pattern-matching algorithms are available in the literature right from the basic Brute Force algorithm to the recent SSABS. The efficiency of the various algorithms depends on faster and exact identification of the pattern in the text. In this article, we propose an exact pattern-matching algorithm for biological sequences. The proposed algorithm, TVSBS, is a combination of Berry–Ravindran and SSABS algorithms. 
The performance of the new algorithm has been improved using the shift of Berry–Ravindran bad character table, which leads to lesser number of character comparisons. It works consistently well for both nucleotide and amino acid sequences. The proposed algorithm has been compared with the recent algorithm, SSABS. The results show the robustness of the proposed algorithm and thus it can be incorporated in any exact pattern-matching applications involving biological sequences. The best- and worst-case time complexities of the new algorithm are also outlined.</description><identifier>ISSN: 0011-3891</identifier><language>eng</language><publisher>Current Science Association</publisher><subject>Algorithms ; Alphabets ; Amino acids ; Bioinformatics ; Chromosomes ; Genes ; Nucleotide sequences ; Nucleotides</subject><ispartof>Current science (Bangalore), 2006-07, Vol.91 (1), p.47-53</ispartof><rights>2006 Current Science Association</rights><lds50>peer_reviewed</lds50><woscitedreferencessubscribed>false</woscitedreferencessubscribed></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktopdf>$$Uhttps://www.jstor.org/stable/pdf/24094174$$EPDF$$P50$$Gjstor$$H</linktopdf><linktohtml>$$Uhttps://www.jstor.org/stable/24094174$$EHTML$$P50$$Gjstor$$H</linktohtml><link.rule.ids>314,780,784,803,58017,58250</link.rule.ids></links><search><creatorcontrib>Thathoo, Rahul</creatorcontrib><creatorcontrib>Virmani, Ashish</creatorcontrib><creatorcontrib>Lakshmi, S. Sai</creatorcontrib><creatorcontrib>Balakrishnan, N.</creatorcontrib><creatorcontrib>Sekar, K.</creatorcontrib><title>TVSBS: A fast exact pattern matching algorithm for biological sequences</title><title>Current science (Bangalore)</title><description>The post-genomic era is witnessing a remarkable increase in the number of nucleotide and amino acid sequences. 
The content of biological sequence databases almost doubles frequently. Pattern matching emerges as a powerful tool in locating nucleotide or amino acid sequence patterns in the biological sequence databases. Presently, several pattern-matching algorithms are available in the literature right from the basic Brute Force algorithm to the recent SSABS. The efficiency of the various algorithms depends on faster and exact identification of the pattern in the text. In this article, we propose an exact pattern-matching algorithm for biological sequences. The proposed algorithm, TVSBS, is a combination of Berry–Ravindran and SSABS algorithms. The performance of the new algorithm has been improved using the shift of Berry–Ravindran bad character table, which leads to lesser number of character comparisons. It works consistently well for both nucleotide and amino acid sequences. The proposed algorithm has been compared with the recent algorithm, SSABS. The results show the robustness of the proposed algorithm and thus it can be incorporated in any exact pattern-matching applications involving biological sequences. 
The best- and worst-case time complexities of the new algorithm are also outlined.</description><subject>Algorithms</subject><subject>Alphabets</subject><subject>Amino acids</subject><subject>Bioinformatics</subject><subject>Chromosomes</subject><subject>Genes</subject><subject>Nucleotide sequences</subject><subject>Nucleotides</subject><issn>0011-3891</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2006</creationdate><recordtype>article</recordtype><sourceid/><recordid>eNotzEFPwyAYgGEOmjjnfoIJf6DJR6EFvM1Fp8kSD5u7Lh8UOpq2TMBE_70menqf03tFFgCMVVxpdkNucx4Aal6DXpDt4bh_3D_QNfWYC3VfaAu9YCkuzXTCYs9h7imOfUyhnCfqY6ImxDH2weJIs_v4dLN1-Y5cexyzW_13Sd6fnw6bl2r3tn3drHfVwKQsFZMODQqlRdeAAS4714IUhrdSgVKq8c62QqMAaTrb6l9Z9FyiUl4b0fAluf_7DrnEdLqkMGH6PtUCtGBS8B89_UQ4</recordid><startdate>20060710</startdate><enddate>20060710</enddate><creator>Thathoo, Rahul</creator><creator>Virmani, Ashish</creator><creator>Lakshmi, S. Sai</creator><creator>Balakrishnan, N.</creator><creator>Sekar, K.</creator><general>Current Science Association</general><scope/></search><sort><creationdate>20060710</creationdate><title>TVSBS: A fast exact pattern matching algorithm for biological sequences</title><author>Thathoo, Rahul ; Virmani, Ashish ; Lakshmi, S. Sai ; Balakrishnan, N. ; Sekar, K.</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-j177t-17eaba4894d50b037de6074b367808885fec649a407bdc699a4caf37a88f9b453</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2006</creationdate><topic>Algorithms</topic><topic>Alphabets</topic><topic>Amino acids</topic><topic>Bioinformatics</topic><topic>Chromosomes</topic><topic>Genes</topic><topic>Nucleotide sequences</topic><topic>Nucleotides</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Thathoo, Rahul</creatorcontrib><creatorcontrib>Virmani, Ashish</creatorcontrib><creatorcontrib>Lakshmi, S. 
Sai</creatorcontrib><creatorcontrib>Balakrishnan, N.</creatorcontrib><creatorcontrib>Sekar, K.</creatorcontrib><jtitle>Current science (Bangalore)</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Thathoo, Rahul</au><au>Virmani, Ashish</au><au>Lakshmi, S. Sai</au><au>Balakrishnan, N.</au><au>Sekar, K.</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>TVSBS: A fast exact pattern matching algorithm for biological sequences</atitle><jtitle>Current science (Bangalore)</jtitle><date>2006-07-10</date><risdate>2006</risdate><volume>91</volume><issue>1</issue><spage>47</spage><epage>53</epage><pages>47-53</pages><issn>0011-3891</issn><abstract>The post-genomic era is witnessing a remarkable increase in the number of nucleotide and amino acid sequences. The content of biological sequence databases almost doubles frequently. Pattern matching emerges as a powerful tool in locating nucleotide or amino acid sequence patterns in the biological sequence databases. Presently, several pattern-matching algorithms are available in the literature right from the basic Brute Force algorithm to the recent SSABS. The efficiency of the various algorithms depends on faster and exact identification of the pattern in the text. In this article, we propose an exact pattern-matching algorithm for biological sequences. The proposed algorithm, TVSBS, is a combination of Berry–Ravindran and SSABS algorithms. The performance of the new algorithm has been improved using the shift of Berry–Ravindran bad character table, which leads to lesser number of character comparisons. It works consistently well for both nucleotide and amino acid sequences. The proposed algorithm has been compared with the recent algorithm, SSABS. The results show the robustness of the proposed algorithm and thus it can be incorporated in any exact pattern-matching applications involving biological sequences. 
The best- and worst-case time complexities of the new algorithm are also outlined.</abstract><pub>Current Science Association</pub><tpages>7</tpages></addata></record>
fulltext fulltext
identifier ISSN: 0011-3891
ispartof Current science (Bangalore), 2006-07, Vol.91 (1), p.47-53
issn 0011-3891
language eng
recordid cdi_jstor_primary_24094174
source Elektronische Zeitschriftenbibliothek - Frei zugängliche E-Journals; JSTOR Archive Collection A-Z Listing
subjects Algorithms
Alphabets
Amino acids
Bioinformatics
Chromosomes
Genes
Nucleotide sequences
Nucleotides
title TVSBS: A fast exact pattern matching algorithm for biological sequences
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-05T02%3A03%3A38IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-jstor&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=TVSBS:%20A%20fast%20exact%20pattern%20matching%20algorithm%20for%20biological%20sequences&rft.jtitle=Current%20science%20(Bangalore)&rft.au=Thathoo,%20Rahul&rft.date=2006-07-10&rft.volume=91&rft.issue=1&rft.spage=47&rft.epage=53&rft.pages=47-53&rft.issn=0011-3891&rft_id=info:doi/&rft_dat=%3Cjstor%3E24094174%3C/jstor%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rft_jstor_id=24094174&rfr_iscdi=true