Supervised enhancer prediction with epigenetic pattern recognition and targeted validation
Enhancers are important non-coding elements, but they have traditionally been hard to characterize experimentally. The development of massively parallel assays allows the characterization of large numbers of enhancers for the first time. Here, we developed a framework using Drosophila STARR-seq to c...
Published in: | Nature methods 2020-08, Vol.17 (8), p.807-814 |
---|---|
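Main authors: | Sethi, Anurag; Gu, Mengting; Gumusgoz, Emrah; Chan, Landon; Yan, Koon-Kiu; Rozowsky, Joel; Barozzi, Iros; Afzal, Veena; Akiyama, Jennifer A.; Plajzer-Frick, Ingrid; Yan, Chengfei; Novak, Catherine S.; Kato, Momoe; Garvin, Tyler H.; Pham, Quan; Harrington, Anne; Mannion, Brandon J.; Lee, Elizabeth A.; Fukuda-Yuzawa, Yoko; Visel, Axel; Dickel, Diane E.; Yip, Kevin Y.; Sutton, Richard; Pennacchio, Len A.; Gerstein, Mark |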
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Full text |
container_end_page | 814 |
---|---|
container_issue | 8 |
container_start_page | 807 |
container_title | Nature methods |
container_volume | 17 |
creator | Sethi, Anurag; Gu, Mengting; Gumusgoz, Emrah; Chan, Landon; Yan, Koon-Kiu; Rozowsky, Joel; Barozzi, Iros; Afzal, Veena; Akiyama, Jennifer A.; Plajzer-Frick, Ingrid; Yan, Chengfei; Novak, Catherine S.; Kato, Momoe; Garvin, Tyler H.; Pham, Quan; Harrington, Anne; Mannion, Brandon J.; Lee, Elizabeth A.; Fukuda-Yuzawa, Yoko; Visel, Axel; Dickel, Diane E.; Yip, Kevin Y.; Sutton, Richard; Pennacchio, Len A.; Gerstein, Mark |
description | Enhancers are important non-coding elements, but they have traditionally been hard to characterize experimentally. The development of massively parallel assays allows the characterization of large numbers of enhancers for the first time. Here, we developed a framework using Drosophila STARR-seq to create shape-matching filters based on meta-profiles of epigenetic features. We integrated these features with supervised machine-learning algorithms to predict enhancers. We further demonstrated that our model could be transferred to predict enhancers in mammals. We comprehensively validated the predictions using a combination of in vivo and in vitro approaches, involving transgenic assays in mice and transduction-based reporter assays in human cell lines (153 enhancers in total). The results confirmed that our model can accurately predict enhancers in different species without re-parameterization. Finally, we examined the transcription factor binding patterns at predicted enhancers versus promoters. We demonstrated that these patterns enable the construction of a secondary model that effectively distinguishes enhancers and promoters. Supervised machine-learning models trained using Drosophila epigenetic and STARR-seq data can be transferred to predict mouse and human enhancers. |
doi_str_mv | 10.1038/s41592-020-0907-8 |
format | Article |
fullrecord | <record><control><sourceid>proquest_pubme</sourceid><recordid>TN_cdi_pubmedcentral_primary_oai_pubmedcentral_nih_gov_8073243</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>2429348945</sourcerecordid><originalsourceid>FETCH-LOGICAL-c497t-4dab678efd15449da08a431fa119104f454b1876292792c456297b8d063407193</originalsourceid><addsrcrecordid>eNp1kctu1DAUhq0KREvhAbqpItiwCfiWsb1BqiqglSqxADZsLI99knGVsYPtDOLt62l6A4mVj3y-85_Lj9AJwe8JZvJD5qRTtMUUt1hh0coDdEQ6LltBcPfsPsaKHKKXOV9jzBin3Qt0yKhgggt2hH5-mydIO5_BNRA2JlhIzZTAeVt8DM1vXzYNTH6AAMXbZjKlQApNAhuH4G8ZE1xTTBqgVJGdGb0z-_9X6Hlvxgyv795j9OPzp-_nF-3V1y-X52dXreVKlJY7s14JCb2r43LlDJaGM9IbQhTBvOcdXxMpVlRRoajlXY3EWjq8YhwLotgx-rjoTvN6C85CKMmMekp-a9IfHY3Xf2eC3-gh7rTEglHOqsCbRSDm4nW2voDd2BgC2KJJPWwnaYXe3XVJ8dcMueitzxbG0QSIc9aU17EkEYJU9O0_6HWcU6g3uKUYl4p3lSILZVPMOUH_MDHBem-vXuzV1V69t1fLWnP6dNWHins_K0AXINdUGCA9tv6_6g2xybA9</addsrcrecordid><sourcetype>Open Access Repository</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2429348945</pqid></control><display><type>article</type><title>Supervised enhancer prediction with epigenetic pattern recognition and targeted validation</title><source>MEDLINE</source><source>Nature Journals Online</source><source>SpringerLink Journals - AutoHoldings</source><creator>Sethi, Anurag ; Gu, Mengting ; Gumusgoz, Emrah ; Chan, Landon ; Yan, Koon-Kiu ; Rozowsky, Joel ; Barozzi, Iros ; Afzal, Veena ; Akiyama, Jennifer A. ; Plajzer-Frick, Ingrid ; Yan, Chengfei ; Novak, Catherine S. ; Kato, Momoe ; Garvin, Tyler H. ; Pham, Quan ; Harrington, Anne ; Mannion, Brandon J. ; Lee, Elizabeth A. ; Fukuda-Yuzawa, Yoko ; Visel, Axel ; Dickel, Diane E. ; Yip, Kevin Y. ; Sutton, Richard ; Pennacchio, Len A. ; Gerstein, Mark</creator><creatorcontrib>Sethi, Anurag ; Gu, Mengting ; Gumusgoz, Emrah ; Chan, Landon ; Yan, Koon-Kiu ; Rozowsky, Joel ; Barozzi, Iros ; Afzal, Veena ; Akiyama, Jennifer A. ; Plajzer-Frick, Ingrid ; Yan, Chengfei ; Novak, Catherine S. 
; Kato, Momoe ; Garvin, Tyler H. ; Pham, Quan ; Harrington, Anne ; Mannion, Brandon J. ; Lee, Elizabeth A. ; Fukuda-Yuzawa, Yoko ; Visel, Axel ; Dickel, Diane E. ; Yip, Kevin Y. ; Sutton, Richard ; Pennacchio, Len A. ; Gerstein, Mark ; Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States)</creatorcontrib><description>Enhancers are important non-coding elements, but they have traditionally been hard to characterize experimentally. The development of massively parallel assays allows the characterization of large numbers of enhancers for the first time. Here, we developed a framework using
Drosophila
STARR-seq to create shape-matching filters based on meta-profiles of epigenetic features. We integrated these features with supervised machine-learning algorithms to predict enhancers. We further demonstrated that our model could be transferred to predict enhancers in mammals. We comprehensively validated the predictions using a combination of in vivo and in vitro approaches, involving transgenic assays in mice and transduction-based reporter assays in human cell lines (153 enhancers in total). The results confirmed that our model can accurately predict enhancers in different species without re-parameterization. Finally, we examined the transcription factor binding patterns at predicted enhancers versus promoters. We demonstrated that these patterns enable the construction of a secondary model that effectively distinguishes enhancers and promoters.
Supervised machine-learning models trained using
Drosophila
epigenetic and STARR-seq data can be transferred to predict mouse and human enhancers.</description><identifier>ISSN: 1548-7091</identifier><identifier>EISSN: 1548-7105</identifier><identifier>DOI: 10.1038/s41592-020-0907-8</identifier><identifier>PMID: 32737473</identifier><language>eng</language><publisher>New York: Nature Publishing Group US</publisher><subject>631/114/2397 ; 631/114/2415 ; 631/208/176 ; 631/208/200 ; Algorithms ; Animals ; Assaying ; BASIC BIOLOGICAL SCIENCES ; Bioinformatics ; Biological Microscopy ; Biological Techniques ; Biomedical and Life Sciences ; Biomedical Engineering/Biotechnology ; Cell Line ; Cell lines ; Computational models ; Drosophila ; Enhancers ; Epigenesis, Genetic - physiology ; Epigenetics ; Gene regulation ; Histones - genetics ; Histones - metabolism ; Humans ; Insects ; Learning algorithms ; Life Sciences ; Machine learning ; Mice ; Mice, Transgenic ; Parameterization ; Pattern recognition ; Pattern Recognition, Automated - methods ; Promoters ; Proteomics ; Reproducibility of Results ; Statistical methods ; Transgenic mice</subject><ispartof>Nature methods, 2020-08, Vol.17 (8), p.807-814</ispartof><rights>The Author(s), under exclusive licence to Springer Nature America, Inc. 2020</rights><rights>The Author(s), under exclusive licence to Springer Nature America, Inc. 
2020.</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c497t-4dab678efd15449da08a431fa119104f454b1876292792c456297b8d063407193</citedby><cites>FETCH-LOGICAL-c497t-4dab678efd15449da08a431fa119104f454b1876292792c456297b8d063407193</cites><orcidid>0000-0003-3295-3116 ; 0000-0002-4130-7784 ; 0000-0001-5497-6824 ; 0000-0002-9746-3719 ; 0000-0001-5516-9944 ; 0000-0002-1149-3790 ; 0000-0003-1684-0146 ; 0000-0002-3565-0762 ; 0000-0003-2337-485X ; 0000000332953116 ; 0000000241307784 ; 0000000154976824 ; 0000000235650762 ; 0000000211493790 ; 0000000155169944 ; 0000000316840146 ; 000000032337485X ; 0000000297463719</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktopdf>$$Uhttps://link.springer.com/content/pdf/10.1038/s41592-020-0907-8$$EPDF$$P50$$Gspringer$$H</linktopdf><linktohtml>$$Uhttps://link.springer.com/10.1038/s41592-020-0907-8$$EHTML$$P50$$Gspringer$$H</linktohtml><link.rule.ids>230,314,776,780,881,27901,27902,41464,42533,51294</link.rule.ids><backlink>$$Uhttps://www.ncbi.nlm.nih.gov/pubmed/32737473$$D View this record in MEDLINE/PubMed$$Hfree_for_read</backlink><backlink>$$Uhttps://www.osti.gov/servlets/purl/1907582$$D View this record in Osti.gov$$Hfree_for_read</backlink></links><search><creatorcontrib>Sethi, Anurag</creatorcontrib><creatorcontrib>Gu, Mengting</creatorcontrib><creatorcontrib>Gumusgoz, Emrah</creatorcontrib><creatorcontrib>Chan, Landon</creatorcontrib><creatorcontrib>Yan, Koon-Kiu</creatorcontrib><creatorcontrib>Rozowsky, Joel</creatorcontrib><creatorcontrib>Barozzi, Iros</creatorcontrib><creatorcontrib>Afzal, Veena</creatorcontrib><creatorcontrib>Akiyama, Jennifer A.</creatorcontrib><creatorcontrib>Plajzer-Frick, Ingrid</creatorcontrib><creatorcontrib>Yan, Chengfei</creatorcontrib><creatorcontrib>Novak, Catherine 
S.</creatorcontrib><creatorcontrib>Kato, Momoe</creatorcontrib><creatorcontrib>Garvin, Tyler H.</creatorcontrib><creatorcontrib>Pham, Quan</creatorcontrib><creatorcontrib>Harrington, Anne</creatorcontrib><creatorcontrib>Mannion, Brandon J.</creatorcontrib><creatorcontrib>Lee, Elizabeth A.</creatorcontrib><creatorcontrib>Fukuda-Yuzawa, Yoko</creatorcontrib><creatorcontrib>Visel, Axel</creatorcontrib><creatorcontrib>Dickel, Diane E.</creatorcontrib><creatorcontrib>Yip, Kevin Y.</creatorcontrib><creatorcontrib>Sutton, Richard</creatorcontrib><creatorcontrib>Pennacchio, Len A.</creatorcontrib><creatorcontrib>Gerstein, Mark</creatorcontrib><creatorcontrib>Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States)</creatorcontrib><title>Supervised enhancer prediction with epigenetic pattern recognition and targeted validation</title><title>Nature methods</title><addtitle>Nat Methods</addtitle><addtitle>Nat Methods</addtitle><description>Enhancers are important non-coding elements, but they have traditionally been hard to characterize experimentally. The development of massively parallel assays allows the characterization of large numbers of enhancers for the first time. Here, we developed a framework using
Drosophila
STARR-seq to create shape-matching filters based on meta-profiles of epigenetic features. We integrated these features with supervised machine-learning algorithms to predict enhancers. We further demonstrated that our model could be transferred to predict enhancers in mammals. We comprehensively validated the predictions using a combination of in vivo and in vitro approaches, involving transgenic assays in mice and transduction-based reporter assays in human cell lines (153 enhancers in total). The results confirmed that our model can accurately predict enhancers in different species without re-parameterization. Finally, we examined the transcription factor binding patterns at predicted enhancers versus promoters. We demonstrated that these patterns enable the construction of a secondary model that effectively distinguishes enhancers and promoters.
Supervised machine-learning models trained using
Drosophila
epigenetic and STARR-seq data can be transferred to predict mouse and human enhancers.</description><subject>631/114/2397</subject><subject>631/114/2415</subject><subject>631/208/176</subject><subject>631/208/200</subject><subject>Algorithms</subject><subject>Animals</subject><subject>Assaying</subject><subject>BASIC BIOLOGICAL SCIENCES</subject><subject>Bioinformatics</subject><subject>Biological Microscopy</subject><subject>Biological Techniques</subject><subject>Biomedical and Life Sciences</subject><subject>Biomedical Engineering/Biotechnology</subject><subject>Cell Line</subject><subject>Cell lines</subject><subject>Computational models</subject><subject>Drosophila</subject><subject>Enhancers</subject><subject>Epigenesis, Genetic - physiology</subject><subject>Epigenetics</subject><subject>Gene regulation</subject><subject>Histones - genetics</subject><subject>Histones - metabolism</subject><subject>Humans</subject><subject>Insects</subject><subject>Learning algorithms</subject><subject>Life Sciences</subject><subject>Machine learning</subject><subject>Mice</subject><subject>Mice, Transgenic</subject><subject>Parameterization</subject><subject>Pattern recognition</subject><subject>Pattern Recognition, Automated - methods</subject><subject>Promoters</subject><subject>Proteomics</subject><subject>Reproducibility of Results</subject><subject>Statistical methods</subject><subject>Transgenic 
mice</subject><issn>1548-7091</issn><issn>1548-7105</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2020</creationdate><recordtype>article</recordtype><sourceid>EIF</sourceid><sourceid>BENPR</sourceid><recordid>eNp1kctu1DAUhq0KREvhAbqpItiwCfiWsb1BqiqglSqxADZsLI99knGVsYPtDOLt62l6A4mVj3y-85_Lj9AJwe8JZvJD5qRTtMUUt1hh0coDdEQ6LltBcPfsPsaKHKKXOV9jzBin3Qt0yKhgggt2hH5-mydIO5_BNRA2JlhIzZTAeVt8DM1vXzYNTH6AAMXbZjKlQApNAhuH4G8ZE1xTTBqgVJGdGb0z-_9X6Hlvxgyv795j9OPzp-_nF-3V1y-X52dXreVKlJY7s14JCb2r43LlDJaGM9IbQhTBvOcdXxMpVlRRoajlXY3EWjq8YhwLotgx-rjoTvN6C85CKMmMekp-a9IfHY3Xf2eC3-gh7rTEglHOqsCbRSDm4nW2voDd2BgC2KJJPWwnaYXe3XVJ8dcMueitzxbG0QSIc9aU17EkEYJU9O0_6HWcU6g3uKUYl4p3lSILZVPMOUH_MDHBem-vXuzV1V69t1fLWnP6dNWHins_K0AXINdUGCA9tv6_6g2xybA9</recordid><startdate>20200801</startdate><enddate>20200801</enddate><creator>Sethi, Anurag</creator><creator>Gu, Mengting</creator><creator>Gumusgoz, Emrah</creator><creator>Chan, Landon</creator><creator>Yan, Koon-Kiu</creator><creator>Rozowsky, Joel</creator><creator>Barozzi, Iros</creator><creator>Afzal, Veena</creator><creator>Akiyama, Jennifer A.</creator><creator>Plajzer-Frick, Ingrid</creator><creator>Yan, Chengfei</creator><creator>Novak, Catherine S.</creator><creator>Kato, Momoe</creator><creator>Garvin, Tyler H.</creator><creator>Pham, Quan</creator><creator>Harrington, Anne</creator><creator>Mannion, Brandon J.</creator><creator>Lee, Elizabeth A.</creator><creator>Fukuda-Yuzawa, Yoko</creator><creator>Visel, Axel</creator><creator>Dickel, Diane E.</creator><creator>Yip, Kevin Y.</creator><creator>Sutton, Richard</creator><creator>Pennacchio, Len A.</creator><creator>Gerstein, Mark</creator><general>Nature Publishing Group US</general><general>Nature Publishing 
Group</general><scope>CGR</scope><scope>CUY</scope><scope>CVF</scope><scope>ECM</scope><scope>EIF</scope><scope>NPM</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>3V.</scope><scope>7QL</scope><scope>7QO</scope><scope>7SS</scope><scope>7TK</scope><scope>7U9</scope><scope>7X2</scope><scope>7X7</scope><scope>7XB</scope><scope>88E</scope><scope>88I</scope><scope>8AO</scope><scope>8FD</scope><scope>8FE</scope><scope>8FG</scope><scope>8FH</scope><scope>8FI</scope><scope>8FJ</scope><scope>8FK</scope><scope>ABJCF</scope><scope>ABUWG</scope><scope>AEUYN</scope><scope>AFKRA</scope><scope>ARAPS</scope><scope>ATCPS</scope><scope>AZQEC</scope><scope>BBNVY</scope><scope>BENPR</scope><scope>BGLVJ</scope><scope>BHPHI</scope><scope>BKSAR</scope><scope>C1K</scope><scope>CCPQU</scope><scope>D1I</scope><scope>DWQXO</scope><scope>FR3</scope><scope>FYUFA</scope><scope>GHDGH</scope><scope>GNUQQ</scope><scope>H94</scope><scope>HCIFZ</scope><scope>K9.</scope><scope>KB.</scope><scope>L6V</scope><scope>LK8</scope><scope>M0K</scope><scope>M0S</scope><scope>M1P</scope><scope>M2P</scope><scope>M7N</scope><scope>M7P</scope><scope>M7S</scope><scope>P5Z</scope><scope>P62</scope><scope>P64</scope><scope>PATMY</scope><scope>PCBAR</scope><scope>PDBOC</scope><scope>PQEST</scope><scope>PQQKQ</scope><scope>PQUKI</scope><scope>PTHSS</scope><scope>PYCSY</scope><scope>Q9U</scope><scope>RC3</scope><scope>7X8</scope><scope>OIOZB</scope><scope>OTOTI</scope><scope>5PM</scope><orcidid>https://orcid.org/0000-0003-3295-3116</orcidid><orcidid>https://orcid.org/0000-0002-4130-7784</orcidid><orcidid>https://orcid.org/0000-0001-5497-6824</orcidid><orcidid>https://orcid.org/0000-0002-9746-3719</orcidid><orcidid>https://orcid.org/0000-0001-5516-9944</orcidid><orcidid>https://orcid.org/0000-0002-1149-3790</orcidid><orcidid>https://orcid.org/0000-0003-1684-0146</orcidid><orcidid>https://orcid.org/0000-0002-3565-0762</orcidid><orcidid>https://orcid.org/0000-0003-2337-485X</orcidid><orcidid>https://orcid.org/00000
00332953116</orcidid><orcidid>https://orcid.org/0000000241307784</orcidid><orcidid>https://orcid.org/0000000154976824</orcidid><orcidid>https://orcid.org/0000000235650762</orcidid><orcidid>https://orcid.org/0000000211493790</orcidid><orcidid>https://orcid.org/0000000155169944</orcidid><orcidid>https://orcid.org/0000000316840146</orcidid><orcidid>https://orcid.org/000000032337485X</orcidid><orcidid>https://orcid.org/0000000297463719</orcidid></search><sort><creationdate>20200801</creationdate><title>Supervised enhancer prediction with epigenetic pattern recognition and targeted validation</title><author>Sethi, Anurag ; Gu, Mengting ; Gumusgoz, Emrah ; Chan, Landon ; Yan, Koon-Kiu ; Rozowsky, Joel ; Barozzi, Iros ; Afzal, Veena ; Akiyama, Jennifer A. ; Plajzer-Frick, Ingrid ; Yan, Chengfei ; Novak, Catherine S. ; Kato, Momoe ; Garvin, Tyler H. ; Pham, Quan ; Harrington, Anne ; Mannion, Brandon J. ; Lee, Elizabeth A. ; Fukuda-Yuzawa, Yoko ; Visel, Axel ; Dickel, Diane E. ; Yip, Kevin Y. ; Sutton, Richard ; Pennacchio, Len A. 
; Gerstein, Mark</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c497t-4dab678efd15449da08a431fa119104f454b1876292792c456297b8d063407193</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2020</creationdate><topic>631/114/2397</topic><topic>631/114/2415</topic><topic>631/208/176</topic><topic>631/208/200</topic><topic>Algorithms</topic><topic>Animals</topic><topic>Assaying</topic><topic>BASIC BIOLOGICAL SCIENCES</topic><topic>Bioinformatics</topic><topic>Biological Microscopy</topic><topic>Biological Techniques</topic><topic>Biomedical and Life Sciences</topic><topic>Biomedical Engineering/Biotechnology</topic><topic>Cell Line</topic><topic>Cell lines</topic><topic>Computational models</topic><topic>Drosophila</topic><topic>Enhancers</topic><topic>Epigenesis, Genetic - physiology</topic><topic>Epigenetics</topic><topic>Gene regulation</topic><topic>Histones - genetics</topic><topic>Histones - metabolism</topic><topic>Humans</topic><topic>Insects</topic><topic>Learning algorithms</topic><topic>Life Sciences</topic><topic>Machine learning</topic><topic>Mice</topic><topic>Mice, Transgenic</topic><topic>Parameterization</topic><topic>Pattern recognition</topic><topic>Pattern Recognition, Automated - methods</topic><topic>Promoters</topic><topic>Proteomics</topic><topic>Reproducibility of Results</topic><topic>Statistical methods</topic><topic>Transgenic mice</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Sethi, Anurag</creatorcontrib><creatorcontrib>Gu, Mengting</creatorcontrib><creatorcontrib>Gumusgoz, Emrah</creatorcontrib><creatorcontrib>Chan, Landon</creatorcontrib><creatorcontrib>Yan, Koon-Kiu</creatorcontrib><creatorcontrib>Rozowsky, Joel</creatorcontrib><creatorcontrib>Barozzi, Iros</creatorcontrib><creatorcontrib>Afzal, Veena</creatorcontrib><creatorcontrib>Akiyama, Jennifer A.</creatorcontrib><creatorcontrib>Plajzer-Frick, 
Ingrid</creatorcontrib><creatorcontrib>Yan, Chengfei</creatorcontrib><creatorcontrib>Novak, Catherine S.</creatorcontrib><creatorcontrib>Kato, Momoe</creatorcontrib><creatorcontrib>Garvin, Tyler H.</creatorcontrib><creatorcontrib>Pham, Quan</creatorcontrib><creatorcontrib>Harrington, Anne</creatorcontrib><creatorcontrib>Mannion, Brandon J.</creatorcontrib><creatorcontrib>Lee, Elizabeth A.</creatorcontrib><creatorcontrib>Fukuda-Yuzawa, Yoko</creatorcontrib><creatorcontrib>Visel, Axel</creatorcontrib><creatorcontrib>Dickel, Diane E.</creatorcontrib><creatorcontrib>Yip, Kevin Y.</creatorcontrib><creatorcontrib>Sutton, Richard</creatorcontrib><creatorcontrib>Pennacchio, Len A.</creatorcontrib><creatorcontrib>Gerstein, Mark</creatorcontrib><creatorcontrib>Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States)</creatorcontrib><collection>Medline</collection><collection>MEDLINE</collection><collection>MEDLINE (Ovid)</collection><collection>MEDLINE</collection><collection>MEDLINE</collection><collection>PubMed</collection><collection>CrossRef</collection><collection>ProQuest Central (Corporate)</collection><collection>Bacteriology Abstracts (Microbiology B)</collection><collection>Biotechnology Research Abstracts</collection><collection>Entomology Abstracts (Full archive)</collection><collection>Neurosciences Abstracts</collection><collection>Virology and AIDS Abstracts</collection><collection>Agricultural Science Collection</collection><collection>Health & Medical Collection</collection><collection>ProQuest Central (purchase pre-March 2016)</collection><collection>Medical Database (Alumni Edition)</collection><collection>Science Database (Alumni Edition)</collection><collection>ProQuest Pharma Collection</collection><collection>Technology Research Database</collection><collection>ProQuest SciTech Collection</collection><collection>ProQuest Technology Collection</collection><collection>ProQuest Natural Science Collection</collection><collection>Hospital 
Premium Collection</collection><collection>Hospital Premium Collection (Alumni Edition)</collection><collection>ProQuest Central (Alumni) (purchase pre-March 2016)</collection><collection>Materials Science & Engineering Collection</collection><collection>ProQuest Central (Alumni Edition)</collection><collection>ProQuest One Sustainability</collection><collection>ProQuest Central UK/Ireland</collection><collection>Advanced Technologies & Aerospace Collection</collection><collection>Agricultural & Environmental Science Collection</collection><collection>ProQuest Central Essentials</collection><collection>Biological Science Collection</collection><collection>ProQuest Central</collection><collection>Technology Collection</collection><collection>Natural Science Collection</collection><collection>Earth, Atmospheric & Aquatic Science Collection</collection><collection>Environmental Sciences and Pollution Management</collection><collection>ProQuest One Community College</collection><collection>ProQuest Materials Science Collection</collection><collection>ProQuest Central Korea</collection><collection>Engineering Research Database</collection><collection>Health Research Premium Collection</collection><collection>Health Research Premium Collection (Alumni)</collection><collection>ProQuest Central Student</collection><collection>AIDS and Cancer Research Abstracts</collection><collection>SciTech Premium Collection</collection><collection>ProQuest Health & Medical Complete (Alumni)</collection><collection>Materials Science Database</collection><collection>ProQuest Engineering Collection</collection><collection>ProQuest Biological Science Collection</collection><collection>Agricultural Science Database</collection><collection>Health & Medical Collection (Alumni Edition)</collection><collection>Medical Database</collection><collection>Science Database</collection><collection>Algology Mycology and Protozoology Abstracts (Microbiology C)</collection><collection>Biological Science 
Database</collection><collection>Engineering Database</collection><collection>Advanced Technologies & Aerospace Database</collection><collection>ProQuest Advanced Technologies & Aerospace Collection</collection><collection>Biotechnology and BioEngineering Abstracts</collection><collection>Environmental Science Database</collection><collection>Earth, Atmospheric & Aquatic Science Database</collection><collection>Materials Science Collection</collection><collection>ProQuest One Academic Eastern Edition (DO NOT USE)</collection><collection>ProQuest One Academic</collection><collection>ProQuest One Academic UKI Edition</collection><collection>Engineering Collection</collection><collection>Environmental Science Collection</collection><collection>ProQuest Central Basic</collection><collection>Genetics Abstracts</collection><collection>MEDLINE - Academic</collection><collection>OSTI.GOV - Hybrid</collection><collection>OSTI.GOV</collection><collection>PubMed Central (Full Participant titles)</collection><jtitle>Nature methods</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Sethi, Anurag</au><au>Gu, Mengting</au><au>Gumusgoz, Emrah</au><au>Chan, Landon</au><au>Yan, Koon-Kiu</au><au>Rozowsky, Joel</au><au>Barozzi, Iros</au><au>Afzal, Veena</au><au>Akiyama, Jennifer A.</au><au>Plajzer-Frick, Ingrid</au><au>Yan, Chengfei</au><au>Novak, Catherine S.</au><au>Kato, Momoe</au><au>Garvin, Tyler H.</au><au>Pham, Quan</au><au>Harrington, Anne</au><au>Mannion, Brandon J.</au><au>Lee, Elizabeth A.</au><au>Fukuda-Yuzawa, Yoko</au><au>Visel, Axel</au><au>Dickel, Diane E.</au><au>Yip, Kevin Y.</au><au>Sutton, Richard</au><au>Pennacchio, Len A.</au><au>Gerstein, Mark</au><aucorp>Lawrence Berkeley National Lab. 
(LBNL), Berkeley, CA (United States)</aucorp><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Supervised enhancer prediction with epigenetic pattern recognition and targeted validation</atitle><jtitle>Nature methods</jtitle><stitle>Nat Methods</stitle><addtitle>Nat Methods</addtitle><date>2020-08-01</date><risdate>2020</risdate><volume>17</volume><issue>8</issue><spage>807</spage><epage>814</epage><pages>807-814</pages><issn>1548-7091</issn><eissn>1548-7105</eissn><abstract>Enhancers are important non-coding elements, but they have traditionally been hard to characterize experimentally. The development of massively parallel assays allows the characterization of large numbers of enhancers for the first time. Here, we developed a framework using
Drosophila
STARR-seq to create shape-matching filters based on meta-profiles of epigenetic features. We integrated these features with supervised machine-learning algorithms to predict enhancers. We further demonstrated that our model could be transferred to predict enhancers in mammals. We comprehensively validated the predictions using a combination of in vivo and in vitro approaches, involving transgenic assays in mice and transduction-based reporter assays in human cell lines (153 enhancers in total). The results confirmed that our model can accurately predict enhancers in different species without re-parameterization. Finally, we examined the transcription factor binding patterns at predicted enhancers versus promoters. We demonstrated that these patterns enable the construction of a secondary model that effectively distinguishes enhancers and promoters.
Supervised machine-learning models trained using
Drosophila
epigenetic and STARR-seq data can be transferred to predict mouse and human enhancers.</abstract><cop>New York</cop><pub>Nature Publishing Group US</pub><pmid>32737473</pmid><doi>10.1038/s41592-020-0907-8</doi><tpages>8</tpages><orcidid>https://orcid.org/0000-0003-3295-3116</orcidid><orcidid>https://orcid.org/0000-0002-4130-7784</orcidid><orcidid>https://orcid.org/0000-0001-5497-6824</orcidid><orcidid>https://orcid.org/0000-0002-9746-3719</orcidid><orcidid>https://orcid.org/0000-0001-5516-9944</orcidid><orcidid>https://orcid.org/0000-0002-1149-3790</orcidid><orcidid>https://orcid.org/0000-0003-1684-0146</orcidid><orcidid>https://orcid.org/0000-0002-3565-0762</orcidid><orcidid>https://orcid.org/0000-0003-2337-485X</orcidid><orcidid>https://orcid.org/0000000332953116</orcidid><orcidid>https://orcid.org/0000000241307784</orcidid><orcidid>https://orcid.org/0000000154976824</orcidid><orcidid>https://orcid.org/0000000235650762</orcidid><orcidid>https://orcid.org/0000000211493790</orcidid><orcidid>https://orcid.org/0000000155169944</orcidid><orcidid>https://orcid.org/0000000316840146</orcidid><orcidid>https://orcid.org/000000032337485X</orcidid><orcidid>https://orcid.org/0000000297463719</orcidid><oa>free_for_read</oa></addata></record> |
fulltext | fulltext |
identifier | ISSN: 1548-7091 |
ispartof | Nature methods, 2020-08, Vol.17 (8), p.807-814 |
issn | 1548-7091; 1548-7105 |
language | eng |
recordid | cdi_pubmedcentral_primary_oai_pubmedcentral_nih_gov_8073243 |
source | MEDLINE; Nature Journals Online; SpringerLink Journals - AutoHoldings |
subjects | 631/114/2397; 631/114/2415; 631/208/176; 631/208/200; Algorithms; Animals; Assaying; BASIC BIOLOGICAL SCIENCES; Bioinformatics; Biological Microscopy; Biological Techniques; Biomedical and Life Sciences; Biomedical Engineering/Biotechnology; Cell Line; Cell lines; Computational models; Drosophila; Enhancers; Epigenesis, Genetic - physiology; Epigenetics; Gene regulation; Histones - genetics; Histones - metabolism; Humans; Insects; Learning algorithms; Life Sciences; Machine learning; Mice; Mice, Transgenic; Parameterization; Pattern recognition; Pattern Recognition, Automated - methods; Promoters; Proteomics; Reproducibility of Results; Statistical methods; Transgenic mice |
title | Supervised enhancer prediction with epigenetic pattern recognition and targeted validation |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-05T20%3A06%3A48IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_pubme&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Supervised%20enhancer%20prediction%20with%20epigenetic%20pattern%20recognition%20and%20targeted%20validation&rft.jtitle=Nature%20methods&rft.au=Sethi,%20Anurag&rft.aucorp=Lawrence%20Berkeley%20National%20Lab.%20(LBNL),%20Berkeley,%20CA%20(United%20States)&rft.date=2020-08-01&rft.volume=17&rft.issue=8&rft.spage=807&rft.epage=814&rft.pages=807-814&rft.issn=1548-7091&rft.eissn=1548-7105&rft_id=info:doi/10.1038/s41592-020-0907-8&rft_dat=%3Cproquest_pubme%3E2429348945%3C/proquest_pubme%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2429348945&rft_id=info:pmid/32737473&rfr_iscdi=true |