PatternHunter: faster and more sensitive homology search

Motivation: Genomics and proteomics studies routinely depend on homology searches based on the strategy of finding short seed matches which are then extended. The exploding genomic data growth presents a dilemma for DNA homology search techniques: increasing seed size decreases sensitivity whereas d...

Detailed description

Bibliographic details
Published in: Bioinformatics 2002-03, Vol.18 (3), p.440-445
Main authors: Ma, Bin; Tromp, John; Li, Ming
Format: Article
Language: English
Subjects:
Online access: Full text
container_end_page 445
container_issue 3
container_start_page 440
container_title Bioinformatics
container_volume 18
creator Ma, Bin
Tromp, John
Li, Ming
description Motivation: Genomics and proteomics studies routinely depend on homology searches based on the strategy of finding short seed matches which are then extended. The exploding genomic data growth presents a dilemma for DNA homology search techniques: increasing seed size decreases sensitivity whereas decreasing seed size slows down computation. Results: We present a new homology search algorithm ‘PatternHunter’ that uses a novel seed model for increased sensitivity and new hit-processing techniques for significantly increased speed. At Blast levels of sensitivity, PatternHunter is able to find homologies between sequences as large as human chromosomes, in mere hours on a desktop. Availability: PatternHunter is available at http://www.bioinformaticssolutions.com, as a commercial package. It runs on all platforms that support Java. PatternHunter technology is being patented; commercial use requires a license from BSI, while non-commercial use will be free. Contact: mli@cs.ucsb.edu
doi_str_mv 10.1093/bioinformatics/18.3.440
format Article
fullrecord <record><control><sourceid>proquest_cross</sourceid><recordid>TN_cdi_proquest_miscellaneous_71583080</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>71583080</sourcerecordid><originalsourceid>FETCH-LOGICAL-c519t-4321a8be2dcef3940af19b0f20d762414e82534aa752af387e058f4ac7c668793</originalsourceid><addsrcrecordid>eNpdkFFrFDEQx4Motp5-BV0EfdtrJplssr5pUU85UFBBfAlzucSm7m5qsiv225tyh0Wf_sPkN8Pkx9gT4GvgvTzbxRSnkPJIc3TlDMxarhH5HXYK2PFWcNXfrbXsdIuGyxP2oJRLzhUg4n12AtBL1ChPmflI8-zztFmmGi-aQKVmQ9O-GVP2TfFTiXP85ZuLNKYhfb-uLcru4iG7F2go_tExV-zLm9efzzft9sPbd-cvt61T0M8tSgFkdl7snQ-yR04B-h0Pgu91JxDQG6EkEmklKEijPVcmIDntus7oXq7Y88Peq5x-Lr7MdozF-WGgyaelWA3KSF7_uGJP_wMv05KnepuF3nSIQogK6QPkciol-2CvchwpX1vg9sas_desBWOlrWbr5OPj-mU3-v3t3FFlBZ4dASqOhpBpcrHccii06uCGaw9crKZ__32n_MN2WmplN1-_2e1WvYdXQthP8g8KyJP-</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>198644222</pqid></control><display><type>article</type><title>PatternHunter: faster and more sensitive homology search</title><source>MEDLINE</source><source>Oxford Journals Open Access Collection</source><source>EZB-FREE-00999 freely available EZB journals</source><source>Alma/SFX Local Collection</source><creator>Ma, Bin ; Tromp, John ; Li, Ming</creator><creatorcontrib>Ma, Bin ; Tromp, John ; Li, Ming</creatorcontrib><description>Motivation: Genomics and proteomics studies routinely depend on homology searches based on the strategy of finding short seed matches which are then extended. The exploding genomic data growth presents a dilemma for DNA homology search techniques: increasing seed size decreases sensitivity whereas decreasing seed size slows down computation. Results: We present a new homology search algorithm ‘PatternHunter’ that uses a novel seed model for increased sensitivity and new hit-processing techniques for significantly increased speed. 
At Blast levels of sensitivity, PatternHunter is able to find homologies between sequences as large as human chromosomes, in mere hours on a desktop. Availability: PatternHunter is available at http://www.bioinformaticssolutions.com, as a commercial package. It runs on all platforms that support Java. PatternHunter technology is being patented; commercial use requires a license from BSI, while non-commercial use will be free. Contact: mli@cs.ucsb.edu</description><identifier>ISSN: 1367-4803</identifier><identifier>EISSN: 1460-2059</identifier><identifier>EISSN: 1367-4811</identifier><identifier>DOI: 10.1093/bioinformatics/18.3.440</identifier><identifier>PMID: 11934743</identifier><identifier>CODEN: BOINFP</identifier><language>eng</language><publisher>Oxford: Oxford University Press</publisher><subject>Algorithms ; Base Sequence ; Biological and medical sciences ; Databases, Nucleic Acid ; DNA - genetics ; Fundamental and applied biological sciences. Psychology ; General aspects ; Genome ; Mathematics in biology. Statistical analysis. Models. Metrology. Data processing in biology (general aspects) ; Models, Statistical ; Molecular Sequence Data ; National Library of Medicine (U.S.) 
; Quality Control ; Sensitivity and Specificity ; Sequence Alignment - methods ; Sequence Alignment - statistics &amp; numerical data ; Sequence Homology ; Software ; Time Factors ; United States</subject><ispartof>Bioinformatics, 2002-03, Vol.18 (3), p.440-445</ispartof><rights>Copyright Oxford University Press(England) Mar 2002</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c519t-4321a8be2dcef3940af19b0f20d762414e82534aa752af387e058f4ac7c668793</citedby></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><link.rule.ids>314,780,784,27924,27925</link.rule.ids><backlink>$$Uhttp://pascal-francis.inist.fr/vibad/index.php?action=getRecordDetail&amp;idt=14275613$$DView record in Pascal Francis$$Hfree_for_read</backlink><backlink>$$Uhttps://www.ncbi.nlm.nih.gov/pubmed/11934743$$D View this record in MEDLINE/PubMed$$Hfree_for_read</backlink></links><search><creatorcontrib>Ma, Bin</creatorcontrib><creatorcontrib>Tromp, John</creatorcontrib><creatorcontrib>Li, Ming</creatorcontrib><title>PatternHunter: faster and more sensitive homology search</title><title>Bioinformatics</title><addtitle>Bioinformatics</addtitle><description>Motivation: Genomics and proteomics studies routinely depend on homology searches based on the strategy of finding short seed matches which are then extended. The exploding genomic data growth presents a dilemma for DNA homology search techniques: increasing seed size decreases sensitivity whereas decreasing seed size slows down computation. Results: We present a new homology search algorithm ‘PatternHunter’ that uses a novel seed model for increased sensitivity and new hit-processing techniques for significantly increased speed. 
At Blast levels of sensitivity, PatternHunter is able to find homologies between sequences as large as human chromosomes, in mere hours on a desktop. Availability: PatternHunter is available at http://www.bioinformaticssolutions.com, as a commercial package. It runs on all platforms that support Java. PatternHunter technology is being patented; commercial use requires a license from BSI, while non-commercial use will be free. Contact: mli@cs.ucsb.edu</description><subject>Algorithms</subject><subject>Base Sequence</subject><subject>Biological and medical sciences</subject><subject>Databases, Nucleic Acid</subject><subject>DNA - genetics</subject><subject>Fundamental and applied biological sciences. Psychology</subject><subject>General aspects</subject><subject>Genome</subject><subject>Mathematics in biology. Statistical analysis. Models. Metrology. Data processing in biology (general aspects)</subject><subject>Models, Statistical</subject><subject>Molecular Sequence Data</subject><subject>National Library of Medicine (U.S.)</subject><subject>Quality Control</subject><subject>Sensitivity and Specificity</subject><subject>Sequence Alignment - methods</subject><subject>Sequence Alignment - statistics &amp; numerical data</subject><subject>Sequence Homology</subject><subject>Software</subject><subject>Time Factors</subject><subject>United 
States</subject><issn>1367-4803</issn><issn>1460-2059</issn><issn>1367-4811</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2002</creationdate><recordtype>article</recordtype><sourceid>EIF</sourceid><recordid>eNpdkFFrFDEQx4Motp5-BV0EfdtrJplssr5pUU85UFBBfAlzucSm7m5qsiv225tyh0Wf_sPkN8Pkx9gT4GvgvTzbxRSnkPJIc3TlDMxarhH5HXYK2PFWcNXfrbXsdIuGyxP2oJRLzhUg4n12AtBL1ChPmflI8-zztFmmGi-aQKVmQ9O-GVP2TfFTiXP85ZuLNKYhfb-uLcru4iG7F2go_tExV-zLm9efzzft9sPbd-cvt61T0M8tSgFkdl7snQ-yR04B-h0Pgu91JxDQG6EkEmklKEijPVcmIDntus7oXq7Y88Peq5x-Lr7MdozF-WGgyaelWA3KSF7_uGJP_wMv05KnepuF3nSIQogK6QPkciol-2CvchwpX1vg9sas_desBWOlrWbr5OPj-mU3-v3t3FFlBZ4dASqOhpBpcrHccii06uCGaw9crKZ__32n_MN2WmplN1-_2e1WvYdXQthP8g8KyJP-</recordid><startdate>20020301</startdate><enddate>20020301</enddate><creator>Ma, Bin</creator><creator>Tromp, John</creator><creator>Li, Ming</creator><general>Oxford University Press</general><general>Oxford Publishing Limited (England)</general><scope>BSCLL</scope><scope>IQODW</scope><scope>CGR</scope><scope>CUY</scope><scope>CVF</scope><scope>ECM</scope><scope>EIF</scope><scope>NPM</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>7QF</scope><scope>7QO</scope><scope>7QQ</scope><scope>7SC</scope><scope>7SE</scope><scope>7SP</scope><scope>7SR</scope><scope>7TA</scope><scope>7TB</scope><scope>7TM</scope><scope>7TO</scope><scope>7U5</scope><scope>8BQ</scope><scope>8FD</scope><scope>F28</scope><scope>FR3</scope><scope>H8D</scope><scope>H8G</scope><scope>H94</scope><scope>JG9</scope><scope>JQ2</scope><scope>K9.</scope><scope>KR7</scope><scope>L7M</scope><scope>L~C</scope><scope>L~D</scope><scope>P64</scope><scope>7X8</scope></search><sort><creationdate>20020301</creationdate><title>PatternHunter: faster and more sensitive homology search</title><author>Ma, Bin ; Tromp, John ; Li, 
Ming</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c519t-4321a8be2dcef3940af19b0f20d762414e82534aa752af387e058f4ac7c668793</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2002</creationdate><topic>Algorithms</topic><topic>Base Sequence</topic><topic>Biological and medical sciences</topic><topic>Databases, Nucleic Acid</topic><topic>DNA - genetics</topic><topic>Fundamental and applied biological sciences. Psychology</topic><topic>General aspects</topic><topic>Genome</topic><topic>Mathematics in biology. Statistical analysis. Models. Metrology. Data processing in biology (general aspects)</topic><topic>Models, Statistical</topic><topic>Molecular Sequence Data</topic><topic>National Library of Medicine (U.S.)</topic><topic>Quality Control</topic><topic>Sensitivity and Specificity</topic><topic>Sequence Alignment - methods</topic><topic>Sequence Alignment - statistics &amp; numerical data</topic><topic>Sequence Homology</topic><topic>Software</topic><topic>Time Factors</topic><topic>United States</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Ma, Bin</creatorcontrib><creatorcontrib>Tromp, John</creatorcontrib><creatorcontrib>Li, Ming</creatorcontrib><collection>Istex</collection><collection>Pascal-Francis</collection><collection>Medline</collection><collection>MEDLINE</collection><collection>MEDLINE (Ovid)</collection><collection>MEDLINE</collection><collection>MEDLINE</collection><collection>PubMed</collection><collection>CrossRef</collection><collection>Aluminium Industry Abstracts</collection><collection>Biotechnology Research Abstracts</collection><collection>Ceramic Abstracts</collection><collection>Computer and Information Systems Abstracts</collection><collection>Corrosion Abstracts</collection><collection>Electronics &amp; Communications Abstracts</collection><collection>Engineered Materials 
Abstracts</collection><collection>Materials Business File</collection><collection>Mechanical &amp; Transportation Engineering Abstracts</collection><collection>Nucleic Acids Abstracts</collection><collection>Oncogenes and Growth Factors Abstracts</collection><collection>Solid State and Superconductivity Abstracts</collection><collection>METADEX</collection><collection>Technology Research Database</collection><collection>ANTE: Abstracts in New Technology &amp; Engineering</collection><collection>Engineering Research Database</collection><collection>Aerospace Database</collection><collection>Copper Technical Reference Library</collection><collection>AIDS and Cancer Research Abstracts</collection><collection>Materials Research Database</collection><collection>ProQuest Computer Science Collection</collection><collection>ProQuest Health &amp; Medical Complete (Alumni)</collection><collection>Civil Engineering Abstracts</collection><collection>Advanced Technologies Database with Aerospace</collection><collection>Computer and Information Systems Abstracts – Academic</collection><collection>Computer and Information Systems Abstracts Professional</collection><collection>Biotechnology and BioEngineering Abstracts</collection><collection>MEDLINE - Academic</collection><jtitle>Bioinformatics</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Ma, Bin</au><au>Tromp, John</au><au>Li, Ming</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>PatternHunter: faster and more sensitive homology search</atitle><jtitle>Bioinformatics</jtitle><addtitle>Bioinformatics</addtitle><date>2002-03-01</date><risdate>2002</risdate><volume>18</volume><issue>3</issue><spage>440</spage><epage>445</epage><pages>440-445</pages><issn>1367-4803</issn><eissn>1460-2059</eissn><eissn>1367-4811</eissn><coden>BOINFP</coden><abstract>Motivation: Genomics and proteomics studies routinely depend on homology 
searches based on the strategy of finding short seed matches which are then extended. The exploding genomic data growth presents a dilemma for DNA homology search techniques: increasing seed size decreases sensitivity whereas decreasing seed size slows down computation. Results: We present a new homology search algorithm ‘PatternHunter’ that uses a novel seed model for increased sensitivity and new hit-processing techniques for significantly increased speed. At Blast levels of sensitivity, PatternHunter is able to find homologies between sequences as large as human chromosomes, in mere hours on a desktop. Availability: PatternHunter is available at http://www.bioinformaticssolutions.com, as a commercial package. It runs on all platforms that support Java. PatternHunter technology is being patented; commercial use requires a license from BSI, while non-commercial use will be free. Contact: mli@cs.ucsb.edu</abstract><cop>Oxford</cop><pub>Oxford University Press</pub><pmid>11934743</pmid><doi>10.1093/bioinformatics/18.3.440</doi><tpages>6</tpages><oa>free_for_read</oa></addata></record>
fulltext fulltext
identifier ISSN: 1367-4803
ispartof Bioinformatics, 2002-03, Vol.18 (3), p.440-445
issn 1367-4803
1460-2059
1367-4811
language eng
recordid cdi_proquest_miscellaneous_71583080
source MEDLINE; Oxford Journals Open Access Collection; EZB-FREE-00999 freely available EZB journals; Alma/SFX Local Collection
subjects Algorithms
Base Sequence
Biological and medical sciences
Databases, Nucleic Acid
DNA - genetics
Fundamental and applied biological sciences. Psychology
General aspects
Genome
Mathematics in biology. Statistical analysis. Models. Metrology. Data processing in biology (general aspects)
Models, Statistical
Molecular Sequence Data
National Library of Medicine (U.S.)
Quality Control
Sensitivity and Specificity
Sequence Alignment - methods
Sequence Alignment - statistics & numerical data
Sequence Homology
Software
Time Factors
United States
title PatternHunter: faster and more sensitive homology search
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-22T10%3A56%3A31IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=PatternHunter:%20faster%20and%20more%20sensitive%20homology%20search&rft.jtitle=Bioinformatics&rft.au=Ma,%20Bin&rft.date=2002-03-01&rft.volume=18&rft.issue=3&rft.spage=440&rft.epage=445&rft.pages=440-445&rft.issn=1367-4803&rft.eissn=1460-2059&rft.coden=BOINFP&rft_id=info:doi/10.1093/bioinformatics/18.3.440&rft_dat=%3Cproquest_cross%3E71583080%3C/proquest_cross%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=198644222&rft_id=info:pmid/11934743&rfr_iscdi=true