cis-Regulatory element prediction in mammalian genomes

The identification of cis-regulatory elements and modules is an important step in understanding the regulation of genes. We have developed a pipeline capable of running multiple motif prediction methods on a whole genome scale. Using gene expression datasets to identify co-expressed genes and the En...

Bibliographic details
Main authors: Siddiqui, A., Robertson, G., Bilenky, M., Astakhova, T., Griffith, O.L., Hassel, M., Lin, K., Montgomery, S., Oveisi, M., Pleasance, E., Robertson, N., Sleumer, M.C., Teague, K., Varhol, R., Zhang, M., Jones, S.
Format: Conference Proceeding
Language: eng
Subjects:
Online access: Order full text
container_end_page 204
container_issue
container_start_page 203
container_title
container_volume
creator Siddiqui, A.
Robertson, G.
Bilenky, M.
Astakhova, T.
Griffith, O.L.
Hassel, M.
Lin, K.
Montgomery, S.
Oveisi, M.
Pleasance, E.
Robertson, N.
Sleumer, M.C.
Teague, K.
Varhol, R.
Zhang, M.
Jones, S.
description The identification of cis-regulatory elements and modules is an important step in understanding the regulation of genes. We have developed a pipeline capable of running multiple motif prediction methods on a whole genome scale. Using gene expression datasets to identify co-expressed genes and the Ensembl Compara database orthologues, we assemble input sequence sets comprised of the upstream regions of a target gene, its orthologues and co-expressed genes on the premise that such genes will share promoters by evolution (orthologues) or share regulatory control mechanisms (co-expressed genes). Co-expressed genes are identified by an approach that combines Pearson distances from multiple gene expression datasets derived from multiple experimental approaches and calibrated against the GO database. Our pipeline runs a number of established motif detection algorithms with a range of parameter settings on the input dataset. We integrate the diverse result sets by scoring motifs with a method-independent function. For each target gene, we assign p-values to the motif score by running the discovery pipeline on multiple sets of input sequences containing the target gene, non-coexpressed genes and "fake" orthologues generated by neutral numerical evolution. We have predicted 30,636 motif binding sites in human for 4,182 genes and an initial set of 472 motif binding sites in mouse for 92 genes with p < 0.001. The positive predictive value against a library of biologically confirmed regulatory sites approaches 0.4 at the highest p-value threshold. Predicted regulatory elements and other resources from the project are available at www.cisred.org.
doi_str_mv 10.1109/CSBW.2005.35
format Conference Proceeding
fullrecord <record><control><sourceid>ieee_6IE</sourceid><recordid>TN_cdi_ieee_primary_1540599</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><ieee_id>1540599</ieee_id><sourcerecordid>1540599</sourcerecordid><originalsourceid>FETCH-LOGICAL-i90t-d034800d6a127319c9de83294309891eecd9e160bda62b88dc97f7ae921ac4a03</originalsourceid><addsrcrecordid>eNotzE1Lw0AQgOEFEdTamzcv-QOJM_uVzFGDX1AQ2oLHMt2dlpVsUpJ46L9X0Pfy3F6l7hAqRKCHdvP0WWkAVxl3oW6g9uS0tbq-Ustp-oLfrEOL_lr5kKZyLcfvjudhPBfSSZZ-Lk6jxBTmNPRF6ovMOXOXuC-O0g9Zplt1eeBukuW_C7V9ed62b-Xq4_W9fVyViWAuIxjbAETPqGuDFChKYzRZA9QQioRIgh72kb3eN00MVB9qFtLIwTKYhbr_2yYR2Z3GlHk879BZcETmB4nbQuM</addsrcrecordid><sourcetype>Publisher</sourcetype><iscdi>true</iscdi><recordtype>conference_proceeding</recordtype></control><display><type>conference_proceeding</type><title>cis-Regulatory element prediction in mammalian genomes</title><source>IEEE Electronic Library (IEL) Conference Proceedings</source><creator>Siddiqui, A. ; Robertson, G. ; Bilenky, M. ; Astakhova, T. ; Griffith, O.L. ; Hassel, M. ; Lin, K. ; Montgomery, S. ; Oveisi, M. ; Pleasance, E. ; Robertson, N. ; Sleumer, M.C. ; Teague, K. ; Varhol, R. ; Zhang, M. ; Jones, S.</creator><creatorcontrib>Siddiqui, A. ; Robertson, G. ; Bilenky, M. ; Astakhova, T. ; Griffith, O.L. ; Hassel, M. ; Lin, K. ; Montgomery, S. ; Oveisi, M. ; Pleasance, E. ; Robertson, N. ; Sleumer, M.C. ; Teague, K. ; Varhol, R. ; Zhang, M. ; Jones, S.</creatorcontrib><description>The identification of cis-regulatory elements and modules is an important step in understanding the regulation of genes. We have developed a pipeline capable of running multiple motif prediction methods on a whole genome scale. 
Using gene expression datasets to identify co-expressed genes and the Ensembl Compara database orthologues, we assemble input sequence sets comprised of the upstream regions of a target gene, its orthologues and co-expressed genes on the premise that such genes will share promoters by evolution (orthologues) or share regulatory control mechanisms (co-expressed genes). Co-expressed genes are identified by an approach that combines Pearson distances from multiple gene expression datasets derived from multiple experimental approaches and calibrated against the GO database. Our pipeline runs a number of established motif detection algorithms with a range of parameter settings on the input dataset. We integrate the diverse result sets by scoring motifs with a method-independent function. For each target gene, we assign p-values to the motif score by running the discovery pipeline on multiple sets of input sequences containing the target gene, non-coexpressed genes and "fake" orthologues generated by neutral numerical evolution. We have predicted 30,636 motif binding sites in human for 4,182 genes and an initial set of 472 motif binding sites in mouse for 92 genes with p&lt;0.001. The positive predictive value against a library of biologically confirmed regulatory sites approaches 0.4 at the highest p-value threshold. 
Predicted regulatory elements and other resources from the project are available at www.cisred.org.</description><identifier>ISBN: 0769524427</identifier><identifier>ISBN: 9780769524429</identifier><identifier>DOI: 10.1109/CSBW.2005.35</identifier><language>eng</language><publisher>IEEE</publisher><subject>Assembly ; Bioinformatics ; Detection algorithms ; Evolution (biology) ; Gene expression ; Genomics ; Humans ; Pipelines ; Prediction methods</subject><ispartof>2005 IEEE Computational Systems Bioinformatics Conference - Workshops (CSBW'05), 2005, p.203-204</ispartof><woscitedreferencessubscribed>false</woscitedreferencessubscribed></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://ieeexplore.ieee.org/document/1540599$$EHTML$$P50$$Gieee$$H</linktohtml><link.rule.ids>309,310,776,780,785,786,2051,4035,4036,27904,54899</link.rule.ids><linktorsrc>$$Uhttps://ieeexplore.ieee.org/document/1540599$$EView_record_in_IEEE$$FView_record_in_$$GIEEE</linktorsrc></links><search><creatorcontrib>Siddiqui, A.</creatorcontrib><creatorcontrib>Robertson, G.</creatorcontrib><creatorcontrib>Bilenky, M.</creatorcontrib><creatorcontrib>Astakhova, T.</creatorcontrib><creatorcontrib>Griffith, O.L.</creatorcontrib><creatorcontrib>Hassel, M.</creatorcontrib><creatorcontrib>Lin, K.</creatorcontrib><creatorcontrib>Montgomery, S.</creatorcontrib><creatorcontrib>Oveisi, M.</creatorcontrib><creatorcontrib>Pleasance, E.</creatorcontrib><creatorcontrib>Robertson, N.</creatorcontrib><creatorcontrib>Sleumer, M.C.</creatorcontrib><creatorcontrib>Teague, K.</creatorcontrib><creatorcontrib>Varhol, R.</creatorcontrib><creatorcontrib>Zhang, M.</creatorcontrib><creatorcontrib>Jones, S.</creatorcontrib><title>cis-Regulatory element prediction in mammalian genomes</title><title>2005 IEEE Computational Systems Bioinformatics Conference - Workshops 
(CSBW'05)</title><addtitle>CSBW</addtitle><description>The identification of cis-regulatory elements and modules is an important step in understanding the regulation of genes. We have developed a pipeline capable of running multiple motif prediction methods on a whole genome scale. Using gene expression datasets to identify co-expressed genes and the Ensembl Compara database orthologues, we assemble input sequence sets comprised of the upstream regions of a target gene, its orthologues and co-expressed genes on the premise that such genes will share promoters by evolution (orthologues) or share regulatory control mechanisms (co-expressed genes). Co-expressed genes are identified by an approach that combines Pearson distances from multiple gene expression datasets derived from multiple experimental approaches and calibrated against the GO database. Our pipeline runs a number of established motif detection algorithms with a range of parameter settings on the input dataset. We integrate the diverse result sets by scoring motifs with a method-independent function. For each target gene, we assign p-values to the motif score by running the discovery pipeline on multiple sets of input sequences containing the target gene, non-coexpressed genes and "fake" orthologues generated by neutral numerical evolution. We have predicted 30,636 motif binding sites in human for 4,182 genes and an initial set of 472 motif binding sites in mouse for 92 genes with p&lt;0.001. The positive predictive value against a library of biologically confirmed regulatory sites approaches 0.4 at the highest p-value threshold. 
Predicted regulatory elements and other resources from the project are available at www.cisred.org.</description><subject>Assembly</subject><subject>Bioinformatics</subject><subject>Detection algorithms</subject><subject>Evolution (biology)</subject><subject>Gene expression</subject><subject>Genomics</subject><subject>Humans</subject><subject>Pipelines</subject><subject>Prediction methods</subject><isbn>0769524427</isbn><isbn>9780769524429</isbn><fulltext>true</fulltext><rsrctype>conference_proceeding</rsrctype><creationdate>2005</creationdate><recordtype>conference_proceeding</recordtype><sourceid>6IE</sourceid><sourceid>RIE</sourceid><recordid>eNotzE1Lw0AQgOEFEdTamzcv-QOJM_uVzFGDX1AQ2oLHMt2dlpVsUpJ46L9X0Pfy3F6l7hAqRKCHdvP0WWkAVxl3oW6g9uS0tbq-Ustp-oLfrEOL_lr5kKZyLcfvjudhPBfSSZZ-Lk6jxBTmNPRF6ovMOXOXuC-O0g9Zplt1eeBukuW_C7V9ed62b-Xq4_W9fVyViWAuIxjbAETPqGuDFChKYzRZA9QQioRIgh72kb3eN00MVB9qFtLIwTKYhbr_2yYR2Z3GlHk879BZcETmB4nbQuM</recordid><startdate>2005</startdate><enddate>2005</enddate><creator>Siddiqui, A.</creator><creator>Robertson, G.</creator><creator>Bilenky, M.</creator><creator>Astakhova, T.</creator><creator>Griffith, O.L.</creator><creator>Hassel, M.</creator><creator>Lin, K.</creator><creator>Montgomery, S.</creator><creator>Oveisi, M.</creator><creator>Pleasance, E.</creator><creator>Robertson, N.</creator><creator>Sleumer, M.C.</creator><creator>Teague, K.</creator><creator>Varhol, R.</creator><creator>Zhang, M.</creator><creator>Jones, S.</creator><general>IEEE</general><scope>6IE</scope><scope>6IL</scope><scope>CBEJK</scope><scope>RIE</scope><scope>RIL</scope></search><sort><creationdate>2005</creationdate><title>cis-Regulatory element prediction in mammalian genomes</title><author>Siddiqui, A. ; Robertson, G. ; Bilenky, M. ; Astakhova, T. ; Griffith, O.L. ; Hassel, M. ; Lin, K. ; Montgomery, S. ; Oveisi, M. ; Pleasance, E. ; Robertson, N. ; Sleumer, M.C. ; Teague, K. ; Varhol, R. ; Zhang, M. 
; Jones, S.</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-i90t-d034800d6a127319c9de83294309891eecd9e160bda62b88dc97f7ae921ac4a03</frbrgroupid><rsrctype>conference_proceedings</rsrctype><prefilter>conference_proceedings</prefilter><language>eng</language><creationdate>2005</creationdate><topic>Assembly</topic><topic>Bioinformatics</topic><topic>Detection algorithms</topic><topic>Evolution (biology)</topic><topic>Gene expression</topic><topic>Genomics</topic><topic>Humans</topic><topic>Pipelines</topic><topic>Prediction methods</topic><toplevel>online_resources</toplevel><creatorcontrib>Siddiqui, A.</creatorcontrib><creatorcontrib>Robertson, G.</creatorcontrib><creatorcontrib>Bilenky, M.</creatorcontrib><creatorcontrib>Astakhova, T.</creatorcontrib><creatorcontrib>Griffith, O.L.</creatorcontrib><creatorcontrib>Hassel, M.</creatorcontrib><creatorcontrib>Lin, K.</creatorcontrib><creatorcontrib>Montgomery, S.</creatorcontrib><creatorcontrib>Oveisi, M.</creatorcontrib><creatorcontrib>Pleasance, E.</creatorcontrib><creatorcontrib>Robertson, N.</creatorcontrib><creatorcontrib>Sleumer, M.C.</creatorcontrib><creatorcontrib>Teague, K.</creatorcontrib><creatorcontrib>Varhol, R.</creatorcontrib><creatorcontrib>Zhang, M.</creatorcontrib><creatorcontrib>Jones, S.</creatorcontrib><collection>IEEE Electronic Library (IEL) Conference Proceedings</collection><collection>IEEE Proceedings Order Plan All Online (POP All Online) 1998-present by volume</collection><collection>IEEE Xplore All Conference Proceedings</collection><collection>IEEE Electronic Library (IEL)</collection><collection>IEEE Proceedings Order Plans (POP All) 1998-Present</collection></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>Siddiqui, A.</au><au>Robertson, G.</au><au>Bilenky, M.</au><au>Astakhova, T.</au><au>Griffith, O.L.</au><au>Hassel, M.</au><au>Lin, K.</au><au>Montgomery, S.</au><au>Oveisi, 
M.</au><au>Pleasance, E.</au><au>Robertson, N.</au><au>Sleumer, M.C.</au><au>Teague, K.</au><au>Varhol, R.</au><au>Zhang, M.</au><au>Jones, S.</au><format>book</format><genre>proceeding</genre><ristype>CONF</ristype><atitle>cis-Regulatory element prediction in mammalian genomes</atitle><btitle>2005 IEEE Computational Systems Bioinformatics Conference - Workshops (CSBW'05)</btitle><stitle>CSBW</stitle><date>2005</date><risdate>2005</risdate><spage>203</spage><epage>204</epage><pages>203-204</pages><isbn>0769524427</isbn><isbn>9780769524429</isbn><abstract>The identification of cis-regulatory elements and modules is an important step in understanding the regulation of genes. We have developed a pipeline capable of running multiple motif prediction methods on a whole genome scale. Using gene expression datasets to identify co-expressed genes and the Ensembl Compara database orthologues, we assemble input sequence sets comprised of the upstream regions of a target gene, its orthologues and co-expressed genes on the premise that such genes will share promoters by evolution (orthologues) or share regulatory control mechanisms (co-expressed genes). Co-expressed genes are identified by an approach that combines Pearson distances from multiple gene expression datasets derived from multiple experimental approaches and calibrated against the GO database. Our pipeline runs a number of established motif detection algorithms with a range of parameter settings on the input dataset. We integrate the diverse result sets by scoring motifs with a method-independent function. For each target gene, we assign p-values to the motif score by running the discovery pipeline on multiple sets of input sequences containing the target gene, non-coexpressed genes and "fake" orthologues generated by neutral numerical evolution. We have predicted 30,636 motif binding sites in human for 4,182 genes and an initial set of 472 motif binding sites in mouse for 92 genes with p&lt;0.001. 
The positive predictive value against a library of biologically confirmed regulatory sites approaches 0.4 at the highest p-value threshold. Predicted regulatory elements and other resources from the project are available at www.cisred.org.</abstract><pub>IEEE</pub><doi>10.1109/CSBW.2005.35</doi><tpages>2</tpages></addata></record>
fulltext fulltext_linktorsrc
identifier ISBN: 0769524427
ispartof 2005 IEEE Computational Systems Bioinformatics Conference - Workshops (CSBW'05), 2005, p.203-204
issn
language eng
recordid cdi_ieee_primary_1540599
source IEEE Electronic Library (IEL) Conference Proceedings
subjects Assembly
Bioinformatics
Detection algorithms
Evolution (biology)
Gene expression
Genomics
Humans
Pipelines
Prediction methods
title cis-Regulatory element prediction in mammalian genomes
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-21T13%3A36%3A01IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-ieee_6IE&rft_val_fmt=info:ofi/fmt:kev:mtx:book&rft.genre=proceeding&rft.atitle=cis-Regulatory%20element%20prediction%20in%20mammalian%20genomes&rft.btitle=2005%20IEEE%20Computational%20Systems%20Bioinformatics%20Conference%20-%20Workshops%20(CSBW'05)&rft.au=Siddiqui,%20A.&rft.date=2005&rft.spage=203&rft.epage=204&rft.pages=203-204&rft.isbn=0769524427&rft.isbn_list=9780769524429&rft_id=info:doi/10.1109/CSBW.2005.35&rft_dat=%3Cieee_6IE%3E1540599%3C/ieee_6IE%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rft_ieee_id=1540599&rfr_iscdi=true
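
The abstract in this record mentions two computations without giving formulas: Pearson distances between gene expression profiles, and p-values obtained by rerunning the discovery pipeline on background sequence sets. The Python sketch below illustrates one plausible reading of each and is not the authors' implementation. It assumes that the Pearson distance is defined as 1 - r and that the p-value is an empirical tail fraction with an add-one correction; the record confirms neither choice. The function names `pearson_distance` and `empirical_p_value` are hypothetical.

```python
from math import sqrt


def pearson_distance(x, y):
    """Return 1 - r for two expression profiles (assumed definition).

    r is the Pearson correlation coefficient, so the result is 0 for
    perfectly correlated profiles and 2 for perfectly anti-correlated ones.
    """
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("profiles must have equal length of at least 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant profile")
    return 1.0 - sxy / sqrt(sxx * syy)


def empirical_p_value(observed_score, background_scores):
    """Estimate a p-value from scores obtained on background runs.

    Assumption: the p-value is the fraction of background scores that are
    at least as large as the observed score, with an add-one correction so
    that the estimate is never exactly zero.
    """
    exceed = sum(1 for s in background_scores if s >= observed_score)
    return (exceed + 1) / (len(background_scores) + 1)
```

In this reading, `background_scores` would hold the motif scores from pipeline runs on the sets built from non-coexpressed genes and "fake" orthologues, and `observed_score` would be the score from the real input set. With this estimator, resolving p-values below 0.001 requires at least 1,000 background runs.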