PatternHunter: faster and more sensitive homology search
Motivation: Genomics and proteomics studies routinely depend on homology searches based on the strategy of finding short seed matches which are then extended. The exploding genomic data growth presents a dilemma for DNA homology search techniques: increasing seed size decreases sensitivity whereas d...
Published in: | Bioinformatics 2002-03, Vol.18 (3), p.440-445 |
---|---|
Main authors: | Ma, Bin; Tromp, John; Li, Ming |
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Full text |
container_end_page | 445 |
---|---|
container_issue | 3 |
container_start_page | 440 |
container_title | Bioinformatics |
container_volume | 18 |
creator | Ma, Bin; Tromp, John; Li, Ming |
description | Motivation: Genomics and proteomics studies routinely depend on homology searches based on the strategy of finding short seed matches which are then extended. The exploding genomic data growth presents a dilemma for DNA homology search techniques: increasing seed size decreases sensitivity whereas decreasing seed size slows down computation. Results: We present a new homology search algorithm ‘PatternHunter’ that uses a novel seed model for increased sensitivity and new hit-processing techniques for significantly increased speed. At Blast levels of sensitivity, PatternHunter is able to find homologies between sequences as large as human chromosomes, in mere hours on a desktop. Availability: PatternHunter is available at http://www.bioinformaticssolutions.com, as a commercial package. It runs on all platforms that support Java. PatternHunter technology is being patented; commercial use requires a license from BSI, while non-commercial use will be free. Contact: mli@cs.ucsb.edu |
doi_str_mv | 10.1093/bioinformatics/18.3.440 |
format | Article |
fullrecord | <record><control><sourceid>proquest_cross</sourceid><recordid>TN_cdi_proquest_miscellaneous_71583080</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>71583080</sourcerecordid><originalsourceid>FETCH-LOGICAL-c519t-4321a8be2dcef3940af19b0f20d762414e82534aa752af387e058f4ac7c668793</originalsourceid><addsrcrecordid>eNpdkFFrFDEQx4Motp5-BV0EfdtrJplssr5pUU85UFBBfAlzucSm7m5qsiv225tyh0Wf_sPkN8Pkx9gT4GvgvTzbxRSnkPJIc3TlDMxarhH5HXYK2PFWcNXfrbXsdIuGyxP2oJRLzhUg4n12AtBL1ChPmflI8-zztFmmGi-aQKVmQ9O-GVP2TfFTiXP85ZuLNKYhfb-uLcru4iG7F2go_tExV-zLm9efzzft9sPbd-cvt61T0M8tSgFkdl7snQ-yR04B-h0Pgu91JxDQG6EkEmklKEijPVcmIDntus7oXq7Y88Peq5x-Lr7MdozF-WGgyaelWA3KSF7_uGJP_wMv05KnepuF3nSIQogK6QPkciol-2CvchwpX1vg9sas_desBWOlrWbr5OPj-mU3-v3t3FFlBZ4dASqOhpBpcrHccii06uCGaw9crKZ__32n_MN2WmplN1-_2e1WvYdXQthP8g8KyJP-</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>198644222</pqid></control><display><type>article</type><title>PatternHunter: faster and more sensitive homology search</title><source>MEDLINE</source><source>Oxford Journals Open Access Collection</source><source>EZB-FREE-00999 freely available EZB journals</source><source>Alma/SFX Local Collection</source><creator>Ma, Bin ; Tromp, John ; Li, Ming</creator><creatorcontrib>Ma, Bin ; Tromp, John ; Li, Ming</creatorcontrib><description>Motivation: Genomics and proteomics studies routinely depend on homology searches based on the strategy of finding short seed matches which are then extended. The exploding genomic data growth presents a dilemma for DNA homology search techniques: increasing seed size decreases sensitivity whereas decreasing seed size slows down computation. Results: We present a new homology search algorithm ‘PatternHunter’ that uses a novel seed model for increased sensitivity and new hit-processing techniques for significantly increased speed. 
At Blast levels of sensitivity, PatternHunter is able to find homologies between sequences as large as human chromosomes, in mere hours on a desktop. Availability: PatternHunter is available at http://www.bioinformaticssolutions.com, as a commercial package. It runs on all platforms that support Java. PatternHunter technology is being patented; commercial use requires a license from BSI, while non-commercial use will be free. Contact: mli@cs.ucsb.edu</description><identifier>ISSN: 1367-4803</identifier><identifier>EISSN: 1460-2059</identifier><identifier>EISSN: 1367-4811</identifier><identifier>DOI: 10.1093/bioinformatics/18.3.440</identifier><identifier>PMID: 11934743</identifier><identifier>CODEN: BOINFP</identifier><language>eng</language><publisher>Oxford: Oxford University Press</publisher><subject>Algorithms ; Base Sequence ; Biological and medical sciences ; Databases, Nucleic Acid ; DNA - genetics ; Fundamental and applied biological sciences. Psychology ; General aspects ; Genome ; Mathematics in biology. Statistical analysis. Models. Metrology. Data processing in biology (general aspects) ; Models, Statistical ; Molecular Sequence Data ; National Library of Medicine (U.S.) 
; Quality Control ; Sensitivity and Specificity ; Sequence Alignment - methods ; Sequence Alignment - statistics & numerical data ; Sequence Homology ; Software ; Time Factors ; United States</subject><ispartof>Bioinformatics, 2002-03, Vol.18 (3), p.440-445</ispartof><rights>Copyright Oxford University Press(England) Mar 2002</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c519t-4321a8be2dcef3940af19b0f20d762414e82534aa752af387e058f4ac7c668793</citedby></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><link.rule.ids>314,780,784,27924,27925</link.rule.ids><backlink>$$Uhttp://pascal-francis.inist.fr/vibad/index.php?action=getRecordDetail&idt=14275613$$DView record in Pascal Francis$$Hfree_for_read</backlink><backlink>$$Uhttps://www.ncbi.nlm.nih.gov/pubmed/11934743$$D View this record in MEDLINE/PubMed$$Hfree_for_read</backlink></links><search><creatorcontrib>Ma, Bin</creatorcontrib><creatorcontrib>Tromp, John</creatorcontrib><creatorcontrib>Li, Ming</creatorcontrib><title>PatternHunter: faster and more sensitive homology search</title><title>Bioinformatics</title><addtitle>Bioinformatics</addtitle><description>Motivation: Genomics and proteomics studies routinely depend on homology searches based on the strategy of finding short seed matches which are then extended. The exploding genomic data growth presents a dilemma for DNA homology search techniques: increasing seed size decreases sensitivity whereas decreasing seed size slows down computation. Results: We present a new homology search algorithm ‘PatternHunter’ that uses a novel seed model for increased sensitivity and new hit-processing techniques for significantly increased speed. 
At Blast levels of sensitivity, PatternHunter is able to find homologies between sequences as large as human chromosomes, in mere hours on a desktop. Availability: PatternHunter is available at http://www.bioinformaticssolutions.com, as a commercial package. It runs on all platforms that support Java. PatternHunter technology is being patented; commercial use requires a license from BSI, while non-commercial use will be free. Contact: mli@cs.ucsb.edu</description><subject>Algorithms</subject><subject>Base Sequence</subject><subject>Biological and medical sciences</subject><subject>Databases, Nucleic Acid</subject><subject>DNA - genetics</subject><subject>Fundamental and applied biological sciences. Psychology</subject><subject>General aspects</subject><subject>Genome</subject><subject>Mathematics in biology. Statistical analysis. Models. Metrology. Data processing in biology (general aspects)</subject><subject>Models, Statistical</subject><subject>Molecular Sequence Data</subject><subject>National Library of Medicine (U.S.)</subject><subject>Quality Control</subject><subject>Sensitivity and Specificity</subject><subject>Sequence Alignment - methods</subject><subject>Sequence Alignment - statistics & numerical data</subject><subject>Sequence Homology</subject><subject>Software</subject><subject>Time Factors</subject><subject>United 
States</subject><issn>1367-4803</issn><issn>1460-2059</issn><issn>1367-4811</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2002</creationdate><recordtype>article</recordtype><sourceid>EIF</sourceid><recordid>eNpdkFFrFDEQx4Motp5-BV0EfdtrJplssr5pUU85UFBBfAlzucSm7m5qsiv225tyh0Wf_sPkN8Pkx9gT4GvgvTzbxRSnkPJIc3TlDMxarhH5HXYK2PFWcNXfrbXsdIuGyxP2oJRLzhUg4n12AtBL1ChPmflI8-zztFmmGi-aQKVmQ9O-GVP2TfFTiXP85ZuLNKYhfb-uLcru4iG7F2go_tExV-zLm9efzzft9sPbd-cvt61T0M8tSgFkdl7snQ-yR04B-h0Pgu91JxDQG6EkEmklKEijPVcmIDntus7oXq7Y88Peq5x-Lr7MdozF-WGgyaelWA3KSF7_uGJP_wMv05KnepuF3nSIQogK6QPkciol-2CvchwpX1vg9sas_desBWOlrWbr5OPj-mU3-v3t3FFlBZ4dASqOhpBpcrHccii06uCGaw9crKZ__32n_MN2WmplN1-_2e1WvYdXQthP8g8KyJP-</recordid><startdate>20020301</startdate><enddate>20020301</enddate><creator>Ma, Bin</creator><creator>Tromp, John</creator><creator>Li, Ming</creator><general>Oxford University Press</general><general>Oxford Publishing Limited (England)</general><scope>BSCLL</scope><scope>IQODW</scope><scope>CGR</scope><scope>CUY</scope><scope>CVF</scope><scope>ECM</scope><scope>EIF</scope><scope>NPM</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>7QF</scope><scope>7QO</scope><scope>7QQ</scope><scope>7SC</scope><scope>7SE</scope><scope>7SP</scope><scope>7SR</scope><scope>7TA</scope><scope>7TB</scope><scope>7TM</scope><scope>7TO</scope><scope>7U5</scope><scope>8BQ</scope><scope>8FD</scope><scope>F28</scope><scope>FR3</scope><scope>H8D</scope><scope>H8G</scope><scope>H94</scope><scope>JG9</scope><scope>JQ2</scope><scope>K9.</scope><scope>KR7</scope><scope>L7M</scope><scope>L~C</scope><scope>L~D</scope><scope>P64</scope><scope>7X8</scope></search><sort><creationdate>20020301</creationdate><title>PatternHunter: faster and more sensitive homology search</title><author>Ma, Bin ; Tromp, John ; Li, 
Ming</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c519t-4321a8be2dcef3940af19b0f20d762414e82534aa752af387e058f4ac7c668793</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2002</creationdate><topic>Algorithms</topic><topic>Base Sequence</topic><topic>Biological and medical sciences</topic><topic>Databases, Nucleic Acid</topic><topic>DNA - genetics</topic><topic>Fundamental and applied biological sciences. Psychology</topic><topic>General aspects</topic><topic>Genome</topic><topic>Mathematics in biology. Statistical analysis. Models. Metrology. Data processing in biology (general aspects)</topic><topic>Models, Statistical</topic><topic>Molecular Sequence Data</topic><topic>National Library of Medicine (U.S.)</topic><topic>Quality Control</topic><topic>Sensitivity and Specificity</topic><topic>Sequence Alignment - methods</topic><topic>Sequence Alignment - statistics & numerical data</topic><topic>Sequence Homology</topic><topic>Software</topic><topic>Time Factors</topic><topic>United States</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Ma, Bin</creatorcontrib><creatorcontrib>Tromp, John</creatorcontrib><creatorcontrib>Li, Ming</creatorcontrib><collection>Istex</collection><collection>Pascal-Francis</collection><collection>Medline</collection><collection>MEDLINE</collection><collection>MEDLINE (Ovid)</collection><collection>MEDLINE</collection><collection>MEDLINE</collection><collection>PubMed</collection><collection>CrossRef</collection><collection>Aluminium Industry Abstracts</collection><collection>Biotechnology Research Abstracts</collection><collection>Ceramic Abstracts</collection><collection>Computer and Information Systems Abstracts</collection><collection>Corrosion Abstracts</collection><collection>Electronics & Communications Abstracts</collection><collection>Engineered Materials 
Abstracts</collection><collection>Materials Business File</collection><collection>Mechanical & Transportation Engineering Abstracts</collection><collection>Nucleic Acids Abstracts</collection><collection>Oncogenes and Growth Factors Abstracts</collection><collection>Solid State and Superconductivity Abstracts</collection><collection>METADEX</collection><collection>Technology Research Database</collection><collection>ANTE: Abstracts in New Technology & Engineering</collection><collection>Engineering Research Database</collection><collection>Aerospace Database</collection><collection>Copper Technical Reference Library</collection><collection>AIDS and Cancer Research Abstracts</collection><collection>Materials Research Database</collection><collection>ProQuest Computer Science Collection</collection><collection>ProQuest Health & Medical Complete (Alumni)</collection><collection>Civil Engineering Abstracts</collection><collection>Advanced Technologies Database with Aerospace</collection><collection>Computer and Information Systems Abstracts Academic</collection><collection>Computer and Information Systems Abstracts Professional</collection><collection>Biotechnology and BioEngineering Abstracts</collection><collection>MEDLINE - Academic</collection><jtitle>Bioinformatics</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Ma, Bin</au><au>Tromp, John</au><au>Li, Ming</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>PatternHunter: faster and more sensitive homology search</atitle><jtitle>Bioinformatics</jtitle><addtitle>Bioinformatics</addtitle><date>2002-03-01</date><risdate>2002</risdate><volume>18</volume><issue>3</issue><spage>440</spage><epage>445</epage><pages>440-445</pages><issn>1367-4803</issn><eissn>1460-2059</eissn><eissn>1367-4811</eissn><coden>BOINFP</coden><abstract>Motivation: Genomics and proteomics studies routinely depend on homology searches based 
on the strategy of finding short seed matches which are then extended. The exploding genomic data growth presents a dilemma for DNA homology search techniques: increasing seed size decreases sensitivity whereas decreasing seed size slows down computation. Results: We present a new homology search algorithm ‘PatternHunter’ that uses a novel seed model for increased sensitivity and new hit-processing techniques for significantly increased speed. At Blast levels of sensitivity, PatternHunter is able to find homologies between sequences as large as human chromosomes, in mere hours on a desktop. Availability: PatternHunter is available at http://www.bioinformaticssolutions.com, as a commercial package. It runs on all platforms that support Java. PatternHunter technology is being patented; commercial use requires a license from BSI, while non-commercial use will be free. Contact: mli@cs.ucsb.edu</abstract><cop>Oxford</cop><pub>Oxford University Press</pub><pmid>11934743</pmid><doi>10.1093/bioinformatics/18.3.440</doi><tpages>6</tpages><oa>free_for_read</oa></addata></record> |
fulltext | fulltext |
identifier | ISSN: 1367-4803 |
ispartof | Bioinformatics, 2002-03, Vol.18 (3), p.440-445 |
issn | 1367-4803; 1460-2059; 1367-4811 |
language | eng |
recordid | cdi_proquest_miscellaneous_71583080 |
source | MEDLINE; Oxford Journals Open Access Collection; EZB-FREE-00999 freely available EZB journals; Alma/SFX Local Collection |
subjects | Algorithms; Base Sequence; Biological and medical sciences; Databases, Nucleic Acid; DNA - genetics; Fundamental and applied biological sciences. Psychology; General aspects; Genome; Mathematics in biology. Statistical analysis. Models. Metrology. Data processing in biology (general aspects); Models, Statistical; Molecular Sequence Data; National Library of Medicine (U.S.); Quality Control; Sensitivity and Specificity; Sequence Alignment - methods; Sequence Alignment - statistics & numerical data; Sequence Homology; Software; Time Factors; United States |
title | PatternHunter: faster and more sensitive homology search |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-22T10%3A56%3A31IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=PatternHunter:%20faster%20and%20more%20sensitive%20homology%20search&rft.jtitle=Bioinformatics&rft.au=Ma,%20Bin&rft.date=2002-03-01&rft.volume=18&rft.issue=3&rft.spage=440&rft.epage=445&rft.pages=440-445&rft.issn=1367-4803&rft.eissn=1460-2059&rft.coden=BOINFP&rft_id=info:doi/10.1093/bioinformatics/18.3.440&rft_dat=%3Cproquest_cross%3E71583080%3C/proquest_cross%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=198644222&rft_id=info:pmid/11934743&rfr_iscdi=true |
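
The seed-size trade-off stated in the description (a larger seed lowers sensitivity, a smaller seed slows the search) can be illustrated with a minimal sketch. This is not PatternHunter's implementation, and the record does not specify its seed model. The function names `seed_hits` and `hit_probability`, the example sequences, and the simplifying assumption that bases match independently are all hypothetical choices made for illustration. The mask notation (`1` for a position that must match, `0` for a position that is ignored) is likewise an illustrative generalization and is not taken from this record.

```python
def seed_hits(mask, a, b):
    """Return start offsets where a and b agree at every '1' position of mask.

    mask is a string of '1' (must match) and '0' (ignored) characters;
    a mask of all '1's is an ordinary contiguous seed. (Hypothetical helper.)
    """
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    span = len(mask)
    required = [i for i, c in enumerate(mask) if c == "1"]
    return [
        start
        for start in range(len(a) - span + 1)
        if all(a[start + i] == b[start + i] for i in required)
    ]


def hit_probability(weight, identity):
    """Chance that `weight` required positions all match at one fixed offset.

    Assumes each base matches independently with probability `identity`
    (a simplification made for this sketch only).
    """
    return identity ** weight


# Two illustrative sequences that differ at positions 6 and 15.
a = "ACGTACGTACGTACGT"
b = "ACGTACCTACGTACGA"

short_hits = seed_hits("1111", a, b)      # small seed: many hits to process
long_hits = seed_hits("11111111", a, b)   # large seed: few hits, easier to miss

# In unrelated DNA (identity 0.25) a small seed fires far more often by chance,
# which is the extra work that slows the search down.
random_rate_small = hit_probability(8, 0.25)
random_rate_large = hit_probability(11, 0.25)

# In a homologous region (identity 0.7, an arbitrary example value) a large
# seed is less likely to match at any given offset, which lowers sensitivity.
true_rate_small = hit_probability(8, 0.7)
true_rate_large = hit_probability(11, 0.7)
```

On these example sequences the four-base seed reports eight hit positions while the eight-base seed reports one, and under the independence assumption the chance-hit rate falls by a factor of 4 for every base added to the seed.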