A search for common patterns in many sequences
A new approach to search for common patterns in many sequences is presented. The idea is that one sequence from the set of sequences to be compared is considered as a ‘basic’ one and all its similarities with other sequences are found. Multiple similarities are then reconstructed using these data. T...
Published in: | Bioinformatics 1992-02, Vol.8 (1), p.57-64 |
---|---|
Main author: | ROYTBERG, M. A |
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Full text |
container_end_page | 64 |
---|---|
container_issue | 1 |
container_start_page | 57 |
container_title | Bioinformatics |
container_volume | 8 |
creator | ROYTBERG, M. A |
description | A new approach to search for common patterns in many sequences is presented. The idea is that one sequence from the set of sequences to be compared is considered as a ‘basic’ one and all its similarities with other sequences are found. Multiple similarities are then reconstructed using these data. This approach allows one to search for similar segments which can differ in both substitutions and deletions/insertions. These segments can be situated at different positions in various sequences. No regions of complete or strong similarity within the segments are required. The other parts of the sequences can have no similarity at all. The only requirement is that the similar segments can be found in all the sequences (or in the majority of them, given the common segments are present in the basic sequence). Working time of an algorithm presented is proportional to n·L² when n sequences of length L are analyzed. The algorithm proposed is implemented as programs for the IBM-PC and IBM/370. Its applications to the analysis of biopolymer primary structures as well as the dependence of the results on the choice of basic sequence are discussed. |
doi_str_mv | 10.1093/bioinformatics/8.1.57 |
format | Article |
fullrecord | <record><control><sourceid>proquest_cross</sourceid><recordid>TN_cdi_proquest_miscellaneous_72903728</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>72903728</sourcerecordid><originalsourceid>FETCH-LOGICAL-c371t-d22303dc17413b2cfa4aac1afa5141f6b244aca1b82a286339953fc941e08f173</originalsourceid><addsrcrecordid>eNpVkE9Lw0AQxRdRaq1-hEIO4i3pzv7JJsdS1AqiHiqUXpbJdhejTVJ3U7Df3pSUiqcZeL95b3iEjIEmQHM-KcqmrF3jK2xLEyZZAolUZ2QIIqUxozI_73aeqlhklF-SqxA-KZUghBiQAcg0A6aGJJlGwaI3H1FnFZmmqpo62mLbWl-HqKyjCut9h3zvbG1suCYXDjfB3hzniLw_3C9m8_j59fFpNn2ODVfQxmvGOOVrA0oAL5hxKBANoMPuAXBpwYRAg1BkDFmWcp7nkjuTC7A0c6D4iNz1vlvfdNGh1VUZjN1ssLbNLmjFcsoVyzpQ9qDxTQjeOr31ZYV-r4HqQ0_6f08606DlIWB8DNgVlV3_XfXFdPrtUcdgcOM81qYMJ0wCValiHRb3WBla-3OS0X_pVHEl9Xy50m9qxZazxYue81_oSINm</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>72903728</pqid></control><display><type>article</type><title>A search for common patterns in many sequences</title><source>MEDLINE</source><source>Oxford University Press Archive</source><source>Oxford Journals Open Access Collection</source><source>Alma/SFX Local Collection</source><creator>ROYTBERG, M. A</creator><creatorcontrib>ROYTBERG, M. A</creatorcontrib><description>A new approach to search for common patterns in many sequences is presented. The idea is that one sequence from the set of sequences to be compared is considered as a ‘basic’ one and all its similarities with other sequences are found. Multiple similarities are then reconstructed using these data. This approach allows one to search for similar segments which can differ in both substitutions and deletions/insertions. These segments can be situated at different positions in various sequences. No regions of complete or strong similarity within the segments are required. The other parts of the sequences can have no similarity at all. 
The only requirement is that the similar segments can be found in all the sequences (or in the majority of them, given the common segments are present in the basic sequence). Working time of an algorithm presented is proportional to n·L² when n sequences of length L are analyzed. The algorithm proposed is implemented as programs for the IBM-PC and IBM/370. Its applications to the analysis of biopolymer primary structures as well as the dependence of the results on the choice of basic sequence are discussed.</description><identifier>ISSN: 1367-4803</identifier><identifier>ISSN: 0266-7061</identifier><identifier>EISSN: 1460-2059</identifier><identifier>DOI: 10.1093/bioinformatics/8.1.57</identifier><identifier>PMID: 1568127</identifier><identifier>CODEN: COABER</identifier><language>eng</language><publisher>Washington, DC: Oxford University Press</publisher><subject>Algorithms ; Amino Acid Sequence ; Base Sequence ; Biological and medical sciences ; Fundamental and applied biological sciences. Psychology ; General aspects ; Humans ; Mathematics in biology. Statistical analysis. Models. Metrology. 
Data processing in biology (general aspects) ; Molecular Sequence Data ; Pattern Recognition, Automated ; Sequence Alignment - methods ; Sequence Alignment - statistics & numerical data ; Software</subject><ispartof>Bioinformatics, 1992-02, Vol.8 (1), p.57-64</ispartof><rights>1992 INIST-CNRS</rights><lds50>peer_reviewed</lds50><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c371t-d22303dc17413b2cfa4aac1afa5141f6b244aca1b82a286339953fc941e08f173</citedby></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><link.rule.ids>314,780,784,27924,27925</link.rule.ids><backlink>$$Uhttp://pascal-francis.inist.fr/vibad/index.php?action=getRecordDetail&idt=5107672$$DView record in Pascal Francis$$Hfree_for_read</backlink><backlink>$$Uhttps://www.ncbi.nlm.nih.gov/pubmed/1568127$$D View this record in MEDLINE/PubMed$$Hfree_for_read</backlink></links><search><creatorcontrib>ROYTBERG, M. A</creatorcontrib><title>A search for common patterns in many sequences</title><title>Bioinformatics</title><addtitle>Comput Appl Biosci</addtitle><description>A new approach to search for common patterns in many sequences is presented. The idea is that one sequence from the set of sequences to be compared is considered as a ‘basic’ one and all its similarities with other sequences are found. Multiple similarities are then reconstructed using these data. This approach allows one to search for similar segments which can differ in both substitutions and deletions/insertions. These segments can be situated at different positions in various sequences. No regions of complete or strong similarity within the segments are required. The other parts of the sequences can have no similarity at all. 
The only requirement is that the similar segments can be found in all the sequences (or in the majority of them, given the common segments are present in the basic sequence). Working time of an algorithm presented is proportional to n·L² when n sequences of length L are analyzed. The algorithm proposed is implemented as programs for the IBM-PC and IBM/370. Its applications to the analysis of biopolymer primary structures as well as the dependence of the results on the choice of basic sequence are discussed.</description><subject>Algorithms</subject><subject>Amino Acid Sequence</subject><subject>Base Sequence</subject><subject>Biological and medical sciences</subject><subject>Fundamental and applied biological sciences. Psychology</subject><subject>General aspects</subject><subject>Humans</subject><subject>Mathematics in biology. Statistical analysis. Models. Metrology. Data processing in biology (general aspects)</subject><subject>Molecular Sequence Data</subject><subject>Pattern Recognition, Automated</subject><subject>Sequence Alignment - methods</subject><subject>Sequence Alignment - statistics & numerical data</subject><subject>Software</subject><issn>1367-4803</issn><issn>0266-7061</issn><issn>1460-2059</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>1992</creationdate><recordtype>article</recordtype><sourceid>EIF</sourceid><recordid>eNpVkE9Lw0AQxRdRaq1-hEIO4i3pzv7JJsdS1AqiHiqUXpbJdhejTVJ3U7Df3pSUiqcZeL95b3iEjIEmQHM-KcqmrF3jK2xLEyZZAolUZ2QIIqUxozI_73aeqlhklF-SqxA-KZUghBiQAcg0A6aGJJlGwaI3H1FnFZmmqpo62mLbWl-HqKyjCut9h3zvbG1suCYXDjfB3hzniLw_3C9m8_j59fFpNn2ODVfQxmvGOOVrA0oAL5hxKBANoMPuAXBpwYRAg1BkDFmWcp7nkjuTC7A0c6D4iNz1vlvfdNGh1VUZjN1ssLbNLmjFcsoVyzpQ9qDxTQjeOr31ZYV-r4HqQ0_6f08606DlIWB8DNgVlV3_XfXFdPrtUcdgcOM81qYMJ0wCValiHRb3WBla-3OS0X_pVHEl9Xy50m9qxZazxYue81_oSINm</recordid><startdate>19920201</startdate><enddate>19920201</enddate><creator>ROYTBERG, M. 
A</creator><general>Oxford University Press</general><scope>BSCLL</scope><scope>IQODW</scope><scope>CGR</scope><scope>CUY</scope><scope>CVF</scope><scope>ECM</scope><scope>EIF</scope><scope>NPM</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>7X8</scope></search><sort><creationdate>19920201</creationdate><title>A search for common patterns in many sequences</title><author>ROYTBERG, M. A</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c371t-d22303dc17413b2cfa4aac1afa5141f6b244aca1b82a286339953fc941e08f173</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>1992</creationdate><topic>Algorithms</topic><topic>Amino Acid Sequence</topic><topic>Base Sequence</topic><topic>Biological and medical sciences</topic><topic>Fundamental and applied biological sciences. Psychology</topic><topic>General aspects</topic><topic>Humans</topic><topic>Mathematics in biology. Statistical analysis. Models. Metrology. Data processing in biology (general aspects)</topic><topic>Molecular Sequence Data</topic><topic>Pattern Recognition, Automated</topic><topic>Sequence Alignment - methods</topic><topic>Sequence Alignment - statistics & numerical data</topic><topic>Software</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>ROYTBERG, M. A</creatorcontrib><collection>Istex</collection><collection>Pascal-Francis</collection><collection>Medline</collection><collection>MEDLINE</collection><collection>MEDLINE (Ovid)</collection><collection>MEDLINE</collection><collection>MEDLINE</collection><collection>PubMed</collection><collection>CrossRef</collection><collection>MEDLINE - Academic</collection><jtitle>Bioinformatics</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>ROYTBERG, M. 
A</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>A search for common patterns in many sequences</atitle><jtitle>Bioinformatics</jtitle><addtitle>Comput Appl Biosci</addtitle><date>1992-02-01</date><risdate>1992</risdate><volume>8</volume><issue>1</issue><spage>57</spage><epage>64</epage><pages>57-64</pages><issn>1367-4803</issn><issn>0266-7061</issn><eissn>1460-2059</eissn><coden>COABER</coden><abstract>A new approach to search for common patterns in many sequences is presented. The idea is that one sequence from the set of sequences to be compared is considered as a ‘basic’ one and all its similarities with other sequences are found. Multiple similarities are then reconstructed using these data. This approach allows one to search for similar segments which can differ in both substitutions and deletions/insertions. These segments can be situated at different positions in various sequences. No regions of complete or strong similarity within the segments are required. The other parts of the sequences can have no similarity at all. The only requirement is that the similar segments can be found in all the sequences (or in the majority of them, given the common segments are present in the basic sequence). Working time of an algorithm presented is proportional to n·L² when n sequences of length L are analyzed. The algorithm proposed is implemented as programs for the IBM-PC and IBM/370. Its applications to the analysis of biopolymer primary structures as well as the dependence of the results on the choice of basic sequence are discussed.</abstract><cop>Washington, DC</cop><cop>Oxford</cop><pub>Oxford University Press</pub><pmid>1568127</pmid><doi>10.1093/bioinformatics/8.1.57</doi><tpages>8</tpages></addata></record> |
fulltext | fulltext |
identifier | ISSN: 1367-4803 |
ispartof | Bioinformatics, 1992-02, Vol.8 (1), p.57-64 |
issn | 1367-4803 0266-7061 1460-2059 |
language | eng |
recordid | cdi_proquest_miscellaneous_72903728 |
source | MEDLINE; Oxford University Press Archive; Oxford Journals Open Access Collection; Alma/SFX Local Collection |
subjects | Algorithms Amino Acid Sequence Base Sequence Biological and medical sciences Fundamental and applied biological sciences. Psychology General aspects Humans Mathematics in biology. Statistical analysis. Models. Metrology. Data processing in biology (general aspects) Molecular Sequence Data Pattern Recognition, Automated Sequence Alignment - methods Sequence Alignment - statistics & numerical data Software |
title | A search for common patterns in many sequences |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-29T13%3A30%3A19IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=A%20search%20for%20common%20patterns%20in%20many%20sequences&rft.jtitle=Bioinformatics&rft.au=ROYTBERG,%20M.%20A&rft.date=1992-02-01&rft.volume=8&rft.issue=1&rft.spage=57&rft.epage=64&rft.pages=57-64&rft.issn=1367-4803&rft.eissn=1460-2059&rft.coden=COABER&rft_id=info:doi/10.1093/bioinformatics/8.1.57&rft_dat=%3Cproquest_cross%3E72903728%3C/proquest_cross%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=72903728&rft_id=info:pmid/1568127&rfr_iscdi=true |
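
The 'basic sequence' idea summarized in the description field can be illustrated with a rough sketch. This is not the algorithm from the article, whose details the record does not give. It assumes a plain Smith-Waterman-style local alignment with a linear gap penalty as the pairwise similarity search, and it reconstructs a common segment simply by intersecting the intervals of the basic sequence that each pairwise alignment covers. The function names, scoring values, and the `min_score` threshold are all hypothetical.

```python
def best_local_match(base, other, match=2, mismatch=-1, gap=-2):
    """Best local alignment of `base` against `other` (assumed scoring scheme).

    Returns (score, start, end), where base[start:end] is the part of the
    basic sequence covered by the best-scoring local alignment.
    """
    cols = len(other) + 1
    prev_score = [0] * cols
    prev_start = [0] * cols  # where in `base` the alignment ending here began
    best = (0, 0, 0)
    for i in range(1, len(base) + 1):
        cur_score = [0] * cols
        cur_start = [i] * cols
        for j in range(1, cols):
            sub = match if base[i - 1] == other[j - 1] else mismatch
            # Extend the diagonal alignment, or start a fresh one at base[i-1].
            diag_start = prev_start[j - 1] if prev_score[j - 1] > 0 else i - 1
            candidates = [
                (prev_score[j - 1] + sub, diag_start),     # substitution/match
                (prev_score[j] + gap, prev_start[j]),      # base symbol vs gap
                (cur_score[j - 1] + gap, cur_start[j - 1]),  # gap vs other symbol
            ]
            score, start = max(candidates)
            if score > 0:
                cur_score[j], cur_start[j] = score, start
                if score > best[0]:
                    best = (score, start, i)
        prev_score, prev_start = cur_score, cur_start
    return best


def common_segment(base, others, min_score=6):
    """Intersect the base intervals matched in every other sequence.

    Simplification: only the single best match per sequence is used, and the
    segment must occur in all sequences (no 'majority' option).
    """
    start, end = 0, len(base)
    for other in others:
        score, s, e = best_local_match(base, other)
        if score < min_score:
            return ""
        start, end = max(start, s), min(end, e)
    return base[start:end] if start < end else ""


segment = common_segment("TTTACGTACGGG", ["CCACGTACAA", "GGGACGTACTT"])
print(segment)
```

Each pairwise comparison fills a table of roughly L × L cells, and the basic sequence is compared with each of the other n − 1 sequences, which is consistent with the n·L² running time quoted in the description. The result depends on which sequence is chosen as basic, a dependence the description says the article discusses.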