A search for common patterns in many sequences

A new approach to search for common patterns in many sequences is presented. The idea is that one sequence from the set of sequences to be compared is considered as a ‘basic’ one and all its similarities with other sequences are found. Multiple similarities are then reconstructed using these data. T...

Bibliographic details
Published in:	Bioinformatics 1992-02, Vol.8 (1), p.57-64
Main author:	ROYTBERG, M. A
Format:	Article
Language:	eng
Subjects:	Algorithms; Amino Acid Sequence; Base Sequence; Biological and medical sciences; Fundamental and applied biological sciences. Psychology; General aspects; Humans; Mathematics in biology. Statistical analysis. Models. Metrology. Data processing in biology (general aspects); Molecular Sequence Data; Pattern Recognition, Automated; Sequence Alignment - methods; Sequence Alignment - statistics & numerical data; Software
Online access:	Full text

container_end_page	64
container_issue	1
container_start_page	57
container_title	Bioinformatics
container_volume	8
creator	ROYTBERG, M. A
description	A new approach to search for common patterns in many sequences is presented. The idea is that one sequence from the set of sequences to be compared is considered as a ‘basic’ one and all its similarities with other sequences are found. Multiple similarities are then reconstructed using these data. This approach allows one to search for similar segments which can differ in both substitutions and deletions/insertions. These segments can be situated at different positions in various sequences. No regions of complete or strong similarity within the segments are required. The other parts of the sequences can have no similarity at all. The only requirement is that the similar segments can be found in all the sequences (or in the majority of them, given the common segments are present in the basic sequence). The running time of the presented algorithm is proportional to n·L² when n sequences of length L are analyzed. The proposed algorithm is implemented as programs for the IBM-PC and IBM/370. Its applications to the analysis of biopolymer primary structures, as well as the dependence of the results on the choice of basic sequence, are discussed.
doi_str_mv	10.1093/bioinformatics/8.1.57
format	Article
fulltext	fulltext
identifier	ISSN: 1367-4803
ispartof	Bioinformatics, 1992-02, Vol.8 (1), p.57-64
issn	1367-4803 0266-7061 1460-2059
language	eng
recordid	cdi_proquest_miscellaneous_72903728
source	MEDLINE; Oxford University Press Archive; Oxford Journals Open Access Collection; Alma/SFX Local Collection
subjects	Algorithms; Amino Acid Sequence; Base Sequence; Biological and medical sciences; Fundamental and applied biological sciences. Psychology; General aspects; Humans; Mathematics in biology. Statistical analysis. Models. Metrology. Data processing in biology (general aspects); Molecular Sequence Data; Pattern Recognition, Automated; Sequence Alignment - methods; Sequence Alignment - statistics & numerical data; Software
title	A search for common patterns in many sequences
url	https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-29T13%3A30%3A19IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=A%20search%20for%20common%20patterns%20in%20many%20sequences&rft.jtitle=Bioinformatics&rft.au=ROYTBERG,%20M.%20A&rft.date=1992-02-01&rft.volume=8&rft.issue=1&rft.spage=57&rft.epage=64&rft.pages=57-64&rft.issn=1367-4803&rft.eissn=1460-2059&rft.coden=COABER&rft_id=info:doi/10.1093/bioinformatics/8.1.57&rft_dat=%3Cproquest_cross%3E72903728%3C/proquest_cross%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=72903728&rft_id=info:pmid/1568127&rfr_iscdi=true
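
The 'basic sequence' idea summarized in the description field can be illustrated with a small sketch. This is not the algorithm from the article, whose details are not given in this record. It is a simplified stand-in that assumes ordinary Smith-Waterman local alignment for the pairwise step and a per-position coverage count for the "reconstruction" step. The function names, scoring values, and thresholds (`best_local_similarity`, `common_segments`, `min_score`, `min_support`) are hypothetical. Each of the n - 1 pairwise comparisons costs on the order of L² operations, which is consistent with the n·L² running time quoted in the abstract.

```python
from typing import List, Optional, Tuple


def best_local_similarity(
    basic: str, other: str, match: int = 2, mismatch: int = -1, gap: int = -2
) -> Tuple[int, int, int]:
    """Smith-Waterman local alignment of `basic` against `other`.

    Returns (score, start, end): the best score and the half-open interval
    [start, end) of `basic` covered by one best-scoring local alignment.
    Runs in O(len(basic) * len(other)) time. Scoring values are assumptions.
    """
    cols = len(other) + 1
    prev_score = [0] * cols
    prev_start = [0] * cols  # start position in `basic` of the alignment ending here
    best = (0, 0, 0)
    for i in range(1, len(basic) + 1):
        cur_score = [0] * cols
        cur_start = [i] * cols
        for j in range(1, cols):
            sub = match if basic[i - 1] == other[j - 1] else mismatch
            candidates = [
                (prev_score[j - 1] + sub, prev_start[j - 1]),  # match or substitution
                (prev_score[j] + gap, prev_start[j]),          # basic[i-1] aligned to a gap
                (cur_score[j - 1] + gap, cur_start[j - 1]),    # other[j-1] aligned to a gap
            ]
            score, start = max(candidates, key=lambda c: c[0])
            if score > 0:
                cur_score[j], cur_start[j] = score, start
            else:
                cur_score[j], cur_start[j] = 0, i  # restart: empty alignment
            if cur_score[j] > best[0]:
                best = (cur_score[j], cur_start[j], i)
        prev_score, prev_start = cur_score, cur_start
    return best


def common_segments(
    sequences: List[str],
    basic_index: int = 0,
    min_score: int = 8,
    min_support: Optional[int] = None,
) -> List[Tuple[int, int, str]]:
    """Report segments of the basic sequence similar to enough other sequences.

    `min_support` is how many other sequences must cover a position; by
    default all of them (pass a smaller value for the 'majority' case).
    """
    basic = sequences[basic_index]
    others = [s for k, s in enumerate(sequences) if k != basic_index]
    if min_support is None:
        min_support = max(1, len(others))
    coverage = [0] * len(basic)
    for other in others:  # n - 1 pairwise comparisons against the basic sequence
        score, start, end = best_local_similarity(basic, other)
        if score >= min_score:
            for pos in range(start, end):
                coverage[pos] += 1
    segments = []
    run_start = None
    for pos, count in enumerate(coverage + [0]):  # trailing 0 closes the last run
        if count >= min_support and run_start is None:
            run_start = pos
        elif count < min_support and run_start is not None:
            segments.append((run_start, pos, basic[run_start:pos]))
            run_start = None
    return segments


if __name__ == "__main__":
    seqs = ["CCGATTACACC", "TTGATTACATT", "GGGGATTACAG"]
    print(common_segments(seqs))  # [(2, 9, 'GATTACA')]
```

In this sketch the shared segment sits at a different offset in each sequence, and only the first (basic) sequence is ever aligned against the others. Choosing a different `basic_index` can change the result, which mirrors the dependence on the choice of basic sequence mentioned in the abstract. Only the single best local similarity per pair is kept here, whereas the abstract speaks of finding all similarities with the basic sequence.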