Predicting Gene Expression from Sequence

We describe a systematic genome-wide approach for learning the complex combinatorial code underlying gene expression. Our probabilistic approach identifies local DNA-sequence elements and the positional and combinatorial constraints that determine their context-dependent role in transcriptional regu...

Detailed description

Bibliographic details
Published in: Cell 2004-04, Vol.117 (2), p.185-198
Main authors: Beer, Michael A.; Tavazoie, Saeed
Format: Article
Language: eng
Subjects:
Online access: Full text
container_end_page 198
container_issue 2
container_start_page 185
container_title Cell
container_volume 117
creator Beer, Michael A.
Tavazoie, Saeed
description We describe a systematic genome-wide approach for learning the complex combinatorial code underlying gene expression. Our probabilistic approach identifies local DNA-sequence elements and the positional and combinatorial constraints that determine their context-dependent role in transcriptional regulation. The inferred regulatory rules correctly predict expression patterns for 73% of genes in Saccharomyces cerevisiae, utilizing microarray expression data and sequences in the 800 bp upstream of genes. Application to Caenorhabditis elegans identifies predictive regulatory elements and combinatorial rules that control the phased temporal expression of transcription factors, histones, and germline specific genes. Successful prediction requires diverse and complex rules utilizing AND, OR, and NOT logic, with significant constraints on motif strength, orientation, and relative position. This system generates a large number of mechanistic hypotheses for focused experimental validation, and establishes a predictive dynamical framework for understanding cellular behavior from genomic sequence.
doi_str_mv 10.1016/S0092-8674(04)00304-6
format Article
fullrecord <record><control><sourceid>proquest_cross</sourceid><recordid>TN_cdi_proquest_miscellaneous_71832958</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><els_id>S0092867404003046</els_id><sourcerecordid>17942814</sourcerecordid><originalsourceid>FETCH-LOGICAL-c491t-63c22ea5028f8a46ada6d27208940c2c599e0910087223089ad33773d74608683</originalsourceid><addsrcrecordid>eNqFkE1LAzEQhoMotlZ_gtKT1MPqJJvPk0ipVSgoVM9hTWYl0t2tyVb037trix6FgYHhmXmHh5BTCpcUqLxaAhiWaan4BPgFQA48k3tkSMGojFPF9snwFxmQo5TeAEALIQ7JgArQnAk1JJPHiD64NtSv4znWOJ59riOmFJp6XMamGi_xfYO1w2NyUBarhCe7PiLPt7On6V22eJjfT28WmeOGtpnMHWNYCGC61AWXhS-kZ4qBNhwcc8IYBEO7TxRjeTctfJ4rlXvFJWip8xE5395dx6ZLTq2tQnK4WhU1NptkFdU5M-J_kCrDmaa8A8UWdLFJKWJp1zFURfyyFGzv0v64tL0oC131Lq3s9s52AZuXCv3f1k5eB1xvAex8fASMNrnQu_Ihomutb8I_Ed_5AIAM</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>17942814</pqid></control><display><type>article</type><title>Predicting Gene Expression from Sequence</title><source>MEDLINE</source><source>Cell Press Free Archives</source><source>Elsevier ScienceDirect Journals</source><source>Elektronische Zeitschriftenbibliothek - Frei zugängliche E-Journals</source><creator>Beer, Michael A. ; Tavazoie, Saeed</creator><creatorcontrib>Beer, Michael A. ; Tavazoie, Saeed</creatorcontrib><description>We describe a systematic genome-wide approach for learning the complex combinatorial code underlying gene expression. Our probabilistic approach identifies local DNA-sequence elements and the positional and combinatorial constraints that determine their context-dependent role in transcriptional regulation. The inferred regulatory rules correctly predict expression patterns for 73% of genes in Saccharomyces cerevisiae, utilizing microarray expression data and sequences in the 800 bp upstream of genes. 
Application to Caenorhabditis elegans identifies predictive regulatory elements and combinatorial rules that control the phased temporal expression of transcription factors, histones, and germline specific genes. Successful prediction requires diverse and complex rules utilizing AND, OR, and NOT logic, with significant constraints on motif strength, orientation, and relative position. This system generates a large number of mechanistic hypotheses for focused experimental validation, and establishes a predictive dynamical framework for understanding cellular behavior from genomic sequence.</description><identifier>ISSN: 0092-8674</identifier><identifier>EISSN: 1097-4172</identifier><identifier>DOI: 10.1016/S0092-8674(04)00304-6</identifier><identifier>PMID: 15084257</identifier><language>eng</language><publisher>United States: Elsevier Inc</publisher><subject>Animals ; Base Sequence - genetics ; Bayes Theorem ; Caenorhabditis elegans ; Gene Expression Profiling - methods ; Gene Expression Regulation - genetics ; Genes - genetics ; Genome ; Models, Statistical ; Multigene Family - genetics ; Oligonucleotide Array Sequence Analysis ; Predictive Value of Tests ; Recombination, Genetic - genetics ; Reproducibility of Results ; Saccharomyces cerevisiae ; Transcription Factors - genetics</subject><ispartof>Cell, 2004-04, Vol.117 (2), p.185-198</ispartof><rights>2004 Cell 
Press</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c491t-63c22ea5028f8a46ada6d27208940c2c599e0910087223089ad33773d74608683</citedby><cites>FETCH-LOGICAL-c491t-63c22ea5028f8a46ada6d27208940c2c599e0910087223089ad33773d74608683</cites></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://www.sciencedirect.com/science/article/pii/S0092867404003046$$EHTML$$P50$$Gelsevier$$Hfree_for_read</linktohtml><link.rule.ids>314,776,780,3537,27901,27902,65534</link.rule.ids><backlink>$$Uhttps://www.ncbi.nlm.nih.gov/pubmed/15084257$$D View this record in MEDLINE/PubMed$$Hfree_for_read</backlink></links><search><creatorcontrib>Beer, Michael A.</creatorcontrib><creatorcontrib>Tavazoie, Saeed</creatorcontrib><title>Predicting Gene Expression from Sequence</title><title>Cell</title><addtitle>Cell</addtitle><description>We describe a systematic genome-wide approach for learning the complex combinatorial code underlying gene expression. Our probabilistic approach identifies local DNA-sequence elements and the positional and combinatorial constraints that determine their context-dependent role in transcriptional regulation. The inferred regulatory rules correctly predict expression patterns for 73% of genes in Saccharomyces cerevisiae, utilizing microarray expression data and sequences in the 800 bp upstream of genes. Application to Caenorhabditis elegans identifies predictive regulatory elements and combinatorial rules that control the phased temporal expression of transcription factors, histones, and germline specific genes. Successful prediction requires diverse and complex rules utilizing AND, OR, and NOT logic, with significant constraints on motif strength, orientation, and relative position. 
This system generates a large number of mechanistic hypotheses for focused experimental validation, and establishes a predictive dynamical framework for understanding cellular behavior from genomic sequence.</description><subject>Animals</subject><subject>Base Sequence - genetics</subject><subject>Bayes Theorem</subject><subject>Caenorhabditis elegans</subject><subject>Gene Expression Profiling - methods</subject><subject>Gene Expression Regulation - genetics</subject><subject>Genes - genetics</subject><subject>Genome</subject><subject>Models, Statistical</subject><subject>Multigene Family - genetics</subject><subject>Oligonucleotide Array Sequence Analysis</subject><subject>Predictive Value of Tests</subject><subject>Recombination, Genetic - genetics</subject><subject>Reproducibility of Results</subject><subject>Saccharomyces cerevisiae</subject><subject>Transcription Factors - genetics</subject><issn>0092-8674</issn><issn>1097-4172</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2004</creationdate><recordtype>article</recordtype><sourceid>EIF</sourceid><recordid>eNqFkE1LAzEQhoMotlZ_gtKT1MPqJJvPk0ipVSgoVM9hTWYl0t2tyVb037trix6FgYHhmXmHh5BTCpcUqLxaAhiWaan4BPgFQA48k3tkSMGojFPF9snwFxmQo5TeAEALIQ7JgArQnAk1JJPHiD64NtSv4znWOJ59riOmFJp6XMamGi_xfYO1w2NyUBarhCe7PiLPt7On6V22eJjfT28WmeOGtpnMHWNYCGC61AWXhS-kZ4qBNhwcc8IYBEO7TxRjeTctfJ4rlXvFJWip8xE5395dx6ZLTq2tQnK4WhU1NptkFdU5M-J_kCrDmaa8A8UWdLFJKWJp1zFURfyyFGzv0v64tL0oC131Lq3s9s52AZuXCv3f1k5eB1xvAex8fASMNrnQu_Ihomutb8I_Ed_5AIAM</recordid><startdate>20040416</startdate><enddate>20040416</enddate><creator>Beer, Michael A.</creator><creator>Tavazoie, Saeed</creator><general>Elsevier 
Inc</general><scope>6I.</scope><scope>AAFTH</scope><scope>CGR</scope><scope>CUY</scope><scope>CVF</scope><scope>ECM</scope><scope>EIF</scope><scope>NPM</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>8FD</scope><scope>FR3</scope><scope>P64</scope><scope>RC3</scope><scope>7X8</scope></search><sort><creationdate>20040416</creationdate><title>Predicting Gene Expression from Sequence</title><author>Beer, Michael A. ; Tavazoie, Saeed</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c491t-63c22ea5028f8a46ada6d27208940c2c599e0910087223089ad33773d74608683</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2004</creationdate><topic>Animals</topic><topic>Base Sequence - genetics</topic><topic>Bayes Theorem</topic><topic>Caenorhabditis elegans</topic><topic>Gene Expression Profiling - methods</topic><topic>Gene Expression Regulation - genetics</topic><topic>Genes - genetics</topic><topic>Genome</topic><topic>Models, Statistical</topic><topic>Multigene Family - genetics</topic><topic>Oligonucleotide Array Sequence Analysis</topic><topic>Predictive Value of Tests</topic><topic>Recombination, Genetic - genetics</topic><topic>Reproducibility of Results</topic><topic>Saccharomyces cerevisiae</topic><topic>Transcription Factors - genetics</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Beer, Michael A.</creatorcontrib><creatorcontrib>Tavazoie, Saeed</creatorcontrib><collection>ScienceDirect Open Access Titles</collection><collection>Elsevier:ScienceDirect:Open Access</collection><collection>Medline</collection><collection>MEDLINE</collection><collection>MEDLINE (Ovid)</collection><collection>MEDLINE</collection><collection>MEDLINE</collection><collection>PubMed</collection><collection>CrossRef</collection><collection>Technology Research Database</collection><collection>Engineering Research Database</collection><collection>Biotechnology and 
BioEngineering Abstracts</collection><collection>Genetics Abstracts</collection><collection>MEDLINE - Academic</collection><jtitle>Cell</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Beer, Michael A.</au><au>Tavazoie, Saeed</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Predicting Gene Expression from Sequence</atitle><jtitle>Cell</jtitle><addtitle>Cell</addtitle><date>2004-04-16</date><risdate>2004</risdate><volume>117</volume><issue>2</issue><spage>185</spage><epage>198</epage><pages>185-198</pages><issn>0092-8674</issn><eissn>1097-4172</eissn><abstract>We describe a systematic genome-wide approach for learning the complex combinatorial code underlying gene expression. Our probabilistic approach identifies local DNA-sequence elements and the positional and combinatorial constraints that determine their context-dependent role in transcriptional regulation. The inferred regulatory rules correctly predict expression patterns for 73% of genes in Saccharomyces cerevisiae, utilizing microarray expression data and sequences in the 800 bp upstream of genes. Application to Caenorhabditis elegans identifies predictive regulatory elements and combinatorial rules that control the phased temporal expression of transcription factors, histones, and germline specific genes. Successful prediction requires diverse and complex rules utilizing AND, OR, and NOT logic, with significant constraints on motif strength, orientation, and relative position. This system generates a large number of mechanistic hypotheses for focused experimental validation, and establishes a predictive dynamical framework for understanding cellular behavior from genomic sequence.</abstract><cop>United States</cop><pub>Elsevier Inc</pub><pmid>15084257</pmid><doi>10.1016/S0092-8674(04)00304-6</doi><tpages>14</tpages><oa>free_for_read</oa></addata></record>
fulltext fulltext
identifier ISSN: 0092-8674
ispartof Cell, 2004-04, Vol.117 (2), p.185-198
issn 0092-8674
1097-4172
language eng
recordid cdi_proquest_miscellaneous_71832958
source MEDLINE; Cell Press Free Archives; Elsevier ScienceDirect Journals; Elektronische Zeitschriftenbibliothek - Frei zugängliche E-Journals
subjects Animals
Base Sequence - genetics
Bayes Theorem
Caenorhabditis elegans
Gene Expression Profiling - methods
Gene Expression Regulation - genetics
Genes - genetics
Genome
Models, Statistical
Multigene Family - genetics
Oligonucleotide Array Sequence Analysis
Predictive Value of Tests
Recombination, Genetic - genetics
Reproducibility of Results
Saccharomyces cerevisiae
Transcription Factors - genetics
title Predicting Gene Expression from Sequence
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-16T00%3A27%3A37IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Predicting%20Gene%20Expression%20from%20Sequence&rft.jtitle=Cell&rft.au=Beer,%20Michael%20A.&rft.date=2004-04-16&rft.volume=117&rft.issue=2&rft.spage=185&rft.epage=198&rft.pages=185-198&rft.issn=0092-8674&rft.eissn=1097-4172&rft_id=info:doi/10.1016/S0092-8674(04)00304-6&rft_dat=%3Cproquest_cross%3E17942814%3C/proquest_cross%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=17942814&rft_id=info:pmid/15084257&rft_els_id=S0092867404003046&rfr_iscdi=true