Rigorous pattern-recognition methods for DNA sequences: Analysis of promoter sequences from Escherichia coli
The basic nature of the sequence features that define a promoter sequence for Escherichia coli RNA polymerase has been established by a variety of biochemical and genetic methods. We have developed rigorous analytical methods for finding unknown patterns that occur imperfectly in a set of several s...
Published in: | Journal of molecular biology 1985-01, Vol.186 (1), p.117-128 |
---|---|
Main authors: | Galas, David J.; Eggert, Mark; Waterman, Michael S. |
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Full text |
container_end_page | 128 |
---|---|
container_issue | 1 |
container_start_page | 117 |
container_title | Journal of molecular biology |
container_volume | 186 |
creator | Galas, David J.; Eggert, Mark; Waterman, Michael S. |
description | The basic nature of the sequence features that define a promoter sequence for Escherichia coli RNA polymerase has been established by a variety of biochemical and genetic methods. We have developed rigorous analytical methods for finding unknown patterns that occur imperfectly in a set of several sequences, and have used them to examine a set of bacterial promoters. The algorithm easily discovers the “consensus” sequences for the −10 and −35 regions, which are essentially identical to the results of previous analyses, but requires no prior assumptions about the common patterns. By explicitly specifying the nature of the search for consensus sequences, we give a rigorous definition to this concept that should be widely applicable. We also have provided estimates for the statistical significance of common patterns discovered in sets of sequences.
In addition to providing a rigorous basis for defining known consensus regions, we have found additional features in these promoters that may have functional significance. These added features were located on either side of the −35 region. The pattern 5′, or upstream, from the −35 region was found using the standard alphabet (A, G, C and T), but the pattern between the −10 and the −35 regions was detectable only in a sub-alphabet. Recent results relating DNA sequence to helix conformation suggest that the former (upstream) pattern may have a functional significance. Possible roles in promoter function are discussed in this light, and an observation of altered promoter function involving the upstream region is reported that appears to support the suggestion of function in at least one case. |
doi_str_mv | 10.1016/0022-2836(85)90262-1 |
format | Article |
fullrecord | <record><control><sourceid>proquest_pubme</sourceid><recordid>TN_cdi_proquest_miscellaneous_76532871</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><els_id>0022283685902621</els_id><sourcerecordid>14338478</sourcerecordid><originalsourceid>FETCH-LOGICAL-e345t-d1e5596ccfbc3d01f2125f0233f59aeec221d5ce5aaf41b378db26a4b505d8ca3</originalsourceid><addsrcrecordid>eNqFkkFv1DAQhS1EVZbCPwDhA0L0EBjbceJwQFqVtiBVIAE9W4493jVK4q2dReq_x2FX5cjJkt_n8cx7Q8gLBu8YsOY9AOcVV6J5q-R5B7zhFXtEVgxUV6lGqMdk9YA8IU9z_gUAUtTqlJyKDlSjuhUZvodNTHGf6c7MM6apSmjjZgpziBMdcd5Gl6mPiX76uqYZ7_Y4Wcwf6Hoyw30OmUZPdymOsTz-p1NfruhltltMwW6DoTYO4Rk58WbI-Px4npHbq8ufF5-rm2_XXy7WNxWKWs6VYyhl11jreyscMM8Zlx64EF52BtFyzpy0KI3xNetFq1zPG1P3EqRT1ogz8uZQtzRWGsqzHkO2OAxmwjKqbhspuGrZf0FWC6HqVhXw5RHc9yM6vUthNOleH30s-uujbrI1g09msiE_YEryuhML9uqAeRO12aSC3P7gwASwVgrolo8-Hggs_vwOmHS2YbHUhRLMrF0MmoFeFkAv6eol3VJf_10AzcQfE_Og-g</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>14338478</pqid></control><display><type>article</type><title>Rigorous pattern-recognition methods for DNA sequences: Analysis of promoter sequences from Escherichia coli</title><source>MEDLINE</source><source>Elsevier ScienceDirect Journals</source><creator>Galas, David J. ; Eggert, Mark ; Waterman, Michael S.</creator><creatorcontrib>Galas, David J. ; Eggert, Mark ; Waterman, Michael S.</creatorcontrib><description>The basic nature of the sequence features that define a promoter sequence for
Escherichia coli RNA polymerase have been established by a variety of biochemical and genetic methods. We have developed rigorous analytical methods for finding unknown patterns that occur imperfectly in a set of several sequences, and have used them to examine a set of bacterial promoters. The algorithm easily discovers the “consensus” sequences for the −10 and −35 regions, which are essentially identical to the results of previous analyses, but requires no prior assumptions about the common patterns. By explicitly specifying the nature of the search for consensus sequences, we give a rigorous definition to this concept that should be widely applicable. We also have provided estimates for the statistical significance of common patterns discovered in sets of sequences.
In addition to providing a rigorous basis for defining known consensus regions, we have found additional features in these promoters that may have functional significance. These added features were located on either side of the −35 region. The pattern 5′, or upstream, from the −35 region was found using the standard alphabet (A, G, C and T), but the pattern between the −10 and the −35 regions was detectable only in a sub-alphabet. Recent results relating DNA sequence to helix conformation suggest that the former (upstream) pattern may have a functional significance. Possible roles in promoter function are discussed in this light, and an observation of altered promoter function involving the upstream region is reported that appears to support the suggestion of function in at least one case.</description><identifier>ISSN: 0022-2836</identifier><identifier>EISSN: 1089-8638</identifier><identifier>DOI: 10.1016/0022-2836(85)90262-1</identifier><identifier>PMID: 3908689</identifier><identifier>CODEN: JMOBAK</identifier><language>eng</language><publisher>Oxford: Elsevier Ltd</publisher><subject>algorithms ; Analytical, structural and metabolic biochemistry ; Base Sequence ; Biological and medical sciences ; Biotechnology ; computer analysis ; computer techniques ; DNA ; DNA conformation ; DNA, Bacterial - genetics ; Dna, deoxyribonucleoproteins ; Escherichia coli ; Escherichia coli - genetics ; Fundamental and applied biological sciences. Psychology ; Genetic engineering ; Genetic technics ; Methods ; Methods. Procedures. Technologies ; molecular genetics ; Nucleic acids ; nucleotide sequences ; Pattern Recognition, Automated ; promoter regions ; Promoter Regions, Genetic ; sequence alignment ; Sequence Homology, Nucleic Acid ; Synthetic digonucleotides and genes. 
Sequencing ; Transcription, Genetic</subject><ispartof>Journal of molecular biology, 1985-01, Vol.186 (1), p.117-128</ispartof><rights>1985</rights><rights>1986 INIST-CNRS</rights><lds50>peer_reviewed</lds50><woscitedreferencessubscribed>false</woscitedreferencessubscribed></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://www.sciencedirect.com/science/article/pii/0022283685902621$$EHTML$$P50$$Gelsevier$$H</linktohtml><link.rule.ids>314,776,780,3536,27903,27904,65309</link.rule.ids><backlink>$$Uhttp://pascal-francis.inist.fr/vibad/index.php?action=getRecordDetail&idt=8524939$$DView record in Pascal Francis$$Hfree_for_read</backlink><backlink>$$Uhttps://www.ncbi.nlm.nih.gov/pubmed/3908689$$D View this record in MEDLINE/PubMed$$Hfree_for_read</backlink></links><search><creatorcontrib>Galas, David J.</creatorcontrib><creatorcontrib>Eggert, Mark</creatorcontrib><creatorcontrib>Waterman, Michael S.</creatorcontrib><title>Rigorous pattern-recognition methods for DNA sequences: Analysis of promoter sequences from Escherichia coli</title><title>Journal of molecular biology</title><addtitle>J Mol Biol</addtitle><description>The basic nature of the sequence features that define a promoter sequence for
Escherichia coli RNA polymerase have been established by a variety of biochemical and genetic methods. We have developed rigorous analytical methods for finding unknown patterns that occur imperfectly in a set of several sequences, and have used them to examine a set of bacterial promoters. The algorithm easily discovers the “consensus” sequences for the −10 and −35 regions, which are essentially identical to the results of previous analyses, but requires no prior assumptions about the common patterns. By explicitly specifying the nature of the search for consensus sequences, we give a rigorous definition to this concept that should be widely applicable. We also have provided estimates for the statistical significance of common patterns discovered in sets of sequences.
In addition to providing a rigorous basis for defining known consensus regions, we have found additional features in these promoters that may have functional significance. These added features were located on either side of the −35 region. The pattern 5′, or upstream, from the −35 region was found using the standard alphabet (A, G, C and T), but the pattern between the −10 and the −35 regions was detectable only in a sub-alphabet. Recent results relating DNA sequence to helix conformation suggest that the former (upstream) pattern may have a functional significance. Possible roles in promoter function are discussed in this light, and an observation of altered promoter function involving the upstream region is reported that appears to support the suggestion of function in at least one case.</description><subject>algorithms</subject><subject>Analytical, structural and metabolic biochemistry</subject><subject>Base Sequence</subject><subject>Biological and medical sciences</subject><subject>Biotechnology</subject><subject>computer analysis</subject><subject>computer techniques</subject><subject>DNA</subject><subject>DNA conformation</subject><subject>DNA, Bacterial - genetics</subject><subject>Dna, deoxyribonucleoproteins</subject><subject>Escherichia coli</subject><subject>Escherichia coli - genetics</subject><subject>Fundamental and applied biological sciences. Psychology</subject><subject>Genetic engineering</subject><subject>Genetic technics</subject><subject>Methods</subject><subject>Methods. Procedures. Technologies</subject><subject>molecular genetics</subject><subject>Nucleic acids</subject><subject>nucleotide sequences</subject><subject>Pattern Recognition, Automated</subject><subject>promoter regions</subject><subject>Promoter Regions, Genetic</subject><subject>sequence alignment</subject><subject>Sequence Homology, Nucleic Acid</subject><subject>Synthetic digonucleotides and genes. 
Sequencing</subject><subject>Transcription, Genetic</subject><issn>0022-2836</issn><issn>1089-8638</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>1985</creationdate><recordtype>article</recordtype><sourceid>EIF</sourceid><recordid>eNqFkkFv1DAQhS1EVZbCPwDhA0L0EBjbceJwQFqVtiBVIAE9W4493jVK4q2dReq_x2FX5cjJkt_n8cx7Q8gLBu8YsOY9AOcVV6J5q-R5B7zhFXtEVgxUV6lGqMdk9YA8IU9z_gUAUtTqlJyKDlSjuhUZvodNTHGf6c7MM6apSmjjZgpziBMdcd5Gl6mPiX76uqYZ7_Y4Wcwf6Hoyw30OmUZPdymOsTz-p1NfruhltltMwW6DoTYO4Rk58WbI-Px4npHbq8ufF5-rm2_XXy7WNxWKWs6VYyhl11jreyscMM8Zlx64EF52BtFyzpy0KI3xNetFq1zPG1P3EqRT1ogz8uZQtzRWGsqzHkO2OAxmwjKqbhspuGrZf0FWC6HqVhXw5RHc9yM6vUthNOleH30s-uujbrI1g09msiE_YEryuhML9uqAeRO12aSC3P7gwASwVgrolo8-Hggs_vwOmHS2YbHUhRLMrF0MmoFeFkAv6eol3VJf_10AzcQfE_Og-g</recordid><startdate>19850101</startdate><enddate>19850101</enddate><creator>Galas, David J.</creator><creator>Eggert, Mark</creator><creator>Waterman, Michael S.</creator><general>Elsevier Ltd</general><general>Elsevier</general><scope>FBQ</scope><scope>IQODW</scope><scope>CGR</scope><scope>CUY</scope><scope>CVF</scope><scope>ECM</scope><scope>EIF</scope><scope>NPM</scope><scope>7QL</scope><scope>7TM</scope><scope>8FD</scope><scope>C1K</scope><scope>FR3</scope><scope>P64</scope><scope>RC3</scope><scope>7X8</scope></search><sort><creationdate>19850101</creationdate><title>Rigorous pattern-recognition methods for DNA sequences: Analysis of promoter sequences from Escherichia coli</title><author>Galas, David J. 
; Eggert, Mark ; Waterman, Michael S.</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-e345t-d1e5596ccfbc3d01f2125f0233f59aeec221d5ce5aaf41b378db26a4b505d8ca3</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>1985</creationdate><topic>algorithms</topic><topic>Analytical, structural and metabolic biochemistry</topic><topic>Base Sequence</topic><topic>Biological and medical sciences</topic><topic>Biotechnology</topic><topic>computer analysis</topic><topic>computer techniques</topic><topic>DNA</topic><topic>DNA conformation</topic><topic>DNA, Bacterial - genetics</topic><topic>Dna, deoxyribonucleoproteins</topic><topic>Escherichia coli</topic><topic>Escherichia coli - genetics</topic><topic>Fundamental and applied biological sciences. Psychology</topic><topic>Genetic engineering</topic><topic>Genetic technics</topic><topic>Methods</topic><topic>Methods. Procedures. Technologies</topic><topic>molecular genetics</topic><topic>Nucleic acids</topic><topic>nucleotide sequences</topic><topic>Pattern Recognition, Automated</topic><topic>promoter regions</topic><topic>Promoter Regions, Genetic</topic><topic>sequence alignment</topic><topic>Sequence Homology, Nucleic Acid</topic><topic>Synthetic digonucleotides and genes. 
Sequencing</topic><topic>Transcription, Genetic</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Galas, David J.</creatorcontrib><creatorcontrib>Eggert, Mark</creatorcontrib><creatorcontrib>Waterman, Michael S.</creatorcontrib><collection>AGRIS</collection><collection>Pascal-Francis</collection><collection>Medline</collection><collection>MEDLINE</collection><collection>MEDLINE (Ovid)</collection><collection>MEDLINE</collection><collection>MEDLINE</collection><collection>PubMed</collection><collection>Bacteriology Abstracts (Microbiology B)</collection><collection>Nucleic Acids Abstracts</collection><collection>Technology Research Database</collection><collection>Environmental Sciences and Pollution Management</collection><collection>Engineering Research Database</collection><collection>Biotechnology and BioEngineering Abstracts</collection><collection>Genetics Abstracts</collection><collection>MEDLINE - Academic</collection><jtitle>Journal of molecular biology</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Galas, David J.</au><au>Eggert, Mark</au><au>Waterman, Michael S.</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Rigorous pattern-recognition methods for DNA sequences: Analysis of promoter sequences from Escherichia coli</atitle><jtitle>Journal of molecular biology</jtitle><addtitle>J Mol Biol</addtitle><date>1985-01-01</date><risdate>1985</risdate><volume>186</volume><issue>1</issue><spage>117</spage><epage>128</epage><pages>117-128</pages><issn>0022-2836</issn><eissn>1089-8638</eissn><coden>JMOBAK</coden><abstract>The basic nature of the sequence features that define a promoter sequence for
Escherichia coli RNA polymerase have been established by a variety of biochemical and genetic methods. We have developed rigorous analytical methods for finding unknown patterns that occur imperfectly in a set of several sequences, and have used them to examine a set of bacterial promoters. The algorithm easily discovers the “consensus” sequences for the −10 and −35 regions, which are essentially identical to the results of previous analyses, but requires no prior assumptions about the common patterns. By explicitly specifying the nature of the search for consensus sequences, we give a rigorous definition to this concept that should be widely applicable. We also have provided estimates for the statistical significance of common patterns discovered in sets of sequences.
In addition to providing a rigorous basis for defining known consensus regions, we have found additional features in these promoters that may have functional significance. These added features were located on either side of the −35 region. The pattern 5′, or upstream, from the −35 region was found using the standard alphabet (A, G, C and T), but the pattern between the −10 and the −35 regions was detectable only in a sub-alphabet. Recent results relating DNA sequence to helix conformation suggest that the former (upstream) pattern may have a functional significance. Possible roles in promoter function are discussed in this light, and an observation of altered promoter function involving the upstream region is reported that appears to support the suggestion of function in at least one case.</abstract><cop>Oxford</cop><pub>Elsevier Ltd</pub><pmid>3908689</pmid><doi>10.1016/0022-2836(85)90262-1</doi><tpages>12</tpages></addata></record> |
fulltext | fulltext |
identifier | ISSN: 0022-2836 |
ispartof | Journal of molecular biology, 1985-01, Vol.186 (1), p.117-128 |
issn | 0022-2836 1089-8638 |
language | eng |
recordid | cdi_proquest_miscellaneous_76532871 |
source | MEDLINE; Elsevier ScienceDirect Journals |
subjects | algorithms ; Analytical, structural and metabolic biochemistry ; Base Sequence ; Biological and medical sciences ; Biotechnology ; computer analysis ; computer techniques ; DNA ; DNA conformation ; DNA, Bacterial - genetics ; Dna, deoxyribonucleoproteins ; Escherichia coli ; Escherichia coli - genetics ; Fundamental and applied biological sciences. Psychology ; Genetic engineering ; Genetic technics ; Methods ; Methods. Procedures. Technologies ; molecular genetics ; Nucleic acids ; nucleotide sequences ; Pattern Recognition, Automated ; promoter regions ; Promoter Regions, Genetic ; sequence alignment ; Sequence Homology, Nucleic Acid ; Synthetic oligonucleotides and genes. Sequencing ; Transcription, Genetic |
title | Rigorous pattern-recognition methods for DNA sequences: Analysis of promoter sequences from Escherichia coli |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-26T10%3A07%3A26IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_pubme&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Rigorous%20pattern-recognition%20methods%20for%20DNA%20sequences:%20Analysis%20of%20promoter%20sequences%20from%20Escherichia%20coli&rft.jtitle=Journal%20of%20molecular%20biology&rft.au=Galas,%20David%20J.&rft.date=1985-01-01&rft.volume=186&rft.issue=1&rft.spage=117&rft.epage=128&rft.pages=117-128&rft.issn=0022-2836&rft.eissn=1089-8638&rft.coden=JMOBAK&rft_id=info:doi/10.1016/0022-2836(85)90262-1&rft_dat=%3Cproquest_pubme%3E14338478%3C/proquest_pubme%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=14338478&rft_id=info:pmid/3908689&rft_els_id=0022283685902621&rfr_iscdi=true |
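
The description above speaks of finding unknown patterns that occur imperfectly in a set of sequences, and of a pattern that was detectable only in a sub-alphabet. This record does not give the details of the published algorithm, so the following is only a naive illustration of the general idea, not the method of Galas, Eggert and Waterman. It scores every k-letter word by the number of sequences that contain it with at most a given number of mismatches. The function names, the purine/pyrimidine recoding, and the four toy sequences are all assumptions made up for this sketch.

```python
from itertools import product

# Hypothetical two-letter sub-alphabet: purines (R) versus pyrimidines (Y).
# The record does not say which sub-alphabet the paper used.
PURINE_PYRIMIDINE = {"A": "R", "G": "R", "C": "Y", "T": "Y"}


def to_subalphabet(seq, mapping=PURINE_PYRIMIDINE):
    """Recode a DNA string letter by letter into a reduced alphabet."""
    return "".join(mapping[base] for base in seq)


def hamming(a, b):
    """Number of positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def best_match(word, seq):
    """Smallest Hamming distance between word and any window of seq."""
    k = len(word)
    return min(hamming(word, seq[i:i + k]) for i in range(len(seq) - k + 1))


def consensus_words(seqs, k, max_mismatch, alphabet="ACGT"):
    """Score every k-letter word over the alphabet by the number of
    sequences containing it with at most max_mismatch mismatches.
    Returns (hits, word) pairs, highest hit count first."""
    scored = []
    for letters in product(alphabet, repeat=k):
        word = "".join(letters)
        hits = sum(best_match(word, s) <= max_mismatch for s in seqs)
        scored.append((hits, word))
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return scored


# Made-up toy sequences, each carrying an imperfect copy of TATAAT,
# the familiar -10 consensus hexamer. They are not data from the paper.
toy_seqs = ["GGTATAATCC", "CATATGATGG", "CCTAAAATGA", "AGTATACTTC"]

ranking = consensus_words(toy_seqs, k=6, max_mismatch=1)
hits_by_word = {word: hits for hits, word in ranking}
print(hits_by_word["TATAAT"], "of", len(toy_seqs), "sequences match TATAAT")

# The same search in the reduced alphabet only needs a recoding step.
recoded = [to_subalphabet(s) for s in toy_seqs]
ry_ranking = consensus_words(recoded, k=6, max_mismatch=0, alphabet="RY")
```

Exhaustive enumeration of all words is practical only for short k, since the number of candidates grows as the alphabet size to the power k. It is used here for clarity. The sketch also makes no attempt at the statistical significance estimates that the description mentions.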