Integrative analysis using module-guided random forests reveals correlated genetic factors related to mouse weight

Complex traits such as obesity are manifestations of intricate interactions of multiple genetic factors. However, such relationships are difficult to identify. Thanks to the recent advance in high-throughput technology, a large amount of data has been collected for various complex traits, including...

Bibliographic details
Published in: PLoS computational biology 2013-03, Vol.9 (3), p.e1002956-e1002956
Main authors: Chen, Zheng; Zhang, Weixiong
Format: Article
Language: eng
Subjects:
Online access: Full text
container_end_page e1002956
container_issue 3
container_start_page e1002956
container_title PLoS computational biology
container_volume 9
creator Chen, Zheng
Zhang, Weixiong
description Complex traits such as obesity are manifestations of intricate interactions of multiple genetic factors. However, such relationships are difficult to identify. Thanks to the recent advance in high-throughput technology, a large amount of data has been collected for various complex traits, including obesity. These data often measure different biological aspects of the traits of interest, including genotypic variations at the DNA level and gene expression alterations at the RNA level. Integration of such heterogeneous data provides promising opportunities to understand the genetic components and possibly genetic architecture of complex traits. In this paper, we propose a machine learning based method, module-guided Random Forests (mgRF), to integrate genotypic and gene expression data to investigate genetic factors and molecular mechanism underlying complex traits. mgRF is an augmented Random Forests method enhanced by a network analysis for identifying multiple correlated variables of different types. We applied mgRF to genetic markers and gene expression data from a cohort of F2 female mouse intercross. mgRF outperformed several existing methods in our extensive comparison. Our new approach has an improved performance when combining both genotypic and gene expression data compared to using either one of the two types of data alone. The resulting predictive variables identified by mgRF provide information of perturbed pathways that are related to body weight. More importantly, the results uncovered intricate interactions among genetic markers and genes that have been overlooked if only one type of data was examined. Our results shed light on genetic mechanisms of obesity and our approach provides a promising complementary framework to the "genetics of gene expression" analysis for integrating genotypic and gene expression information for analyzing complex traits.
doi_str_mv 10.1371/journal.pcbi.1002956
format Article
fullrecord <record><control><sourceid>gale_plos_</sourceid><recordid>TN_cdi_plos_journals_1327763105</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><galeid>A326658588</galeid><doaj_id>oai_doaj_org_article_a40be3e7701b4ddaba827e86a59ae3e2</doaj_id><sourcerecordid>A326658588</sourcerecordid><originalsourceid>FETCH-LOGICAL-c671t-189c9867787d672e386758fe23a72e6125baf178f7002253e27e2d240a3231143</originalsourceid><addsrcrecordid>eNqVkltv1DAQhSMEomXhHyDIIzzsEtvrS16QqorLShVIXJ6tiTNJvUriYjsL_ffMstuq-4jykMnkO8eeoymKl6xaMaHZu22Y4wTD6sY1fsWqitdSPSrOmZRiqYU0jx_UZ8WzlLZVRWWtnhZnXMhKCsXPi7iZMvYRst9hCeR3m3wq5-SnvhxDOw-47GffYltGmNowll2ImHIqI-4QhlS6ECMOkInoccLsXdmByyHukUM_B7KaE5a_0ffX-XnxpCMlvji-F8XPjx9-XH5eXn39tLm8uFo6pVleMlO72iitjW6V5iiolqZDLoC-FOOygY5p02manUuBXCNv-boCwQVja7EoXh98b4aQ7DGuZJngWivBKIBFsTkQbYCtvYl-hHhrA3j7rxFibyHSRANaWFcNCtS6Ys26baEBQ-cZBbIG6nPyen88bW5GbB1OOcJwYnr6Z_LXtg87K2TNuBJk8OZoEMOvmTK2o08OhwEmpPTo3jSs5NooQlcHtAe6mp-6QI6OnhZH78KEnaf-heBKSSONIcHbEwExGf_kHuaU7Ob7t_9gv5yy6wPrYkgpYnc_L6vsfkvvYrf7LbXHLSXZq4dZ3Yvu1lL8BWC65bk</addsrcrecordid><sourcetype>Open Website</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>1317852786</pqid></control><display><type>article</type><title>Integrative analysis using module-guided random forests reveals correlated genetic factors related to mouse weight</title><source>MEDLINE</source><source>DOAJ Directory of Open Access Journals</source><source>Elektronische Zeitschriftenbibliothek - Frei zugängliche E-Journals</source><source>PubMed Central</source><source>Public Library of Science (PLoS)</source><creator>Chen, Zheng ; Zhang, Weixiong</creator><creatorcontrib>Chen, Zheng ; Zhang, Weixiong</creatorcontrib><description>Complex traits such as obesity are manifestations of intricate interactions of multiple genetic factors. However, such relationships are difficult to identify. 
Thanks to the recent advance in high-throughput technology, a large amount of data has been collected for various complex traits, including obesity. These data often measure different biological aspects of the traits of interest, including genotypic variations at the DNA level and gene expression alterations at the RNA level. Integration of such heterogeneous data provides promising opportunities to understand the genetic components and possibly genetic architecture of complex traits. In this paper, we propose a machine learning based method, module-guided Random Forests (mgRF), to integrate genotypic and gene expression data to investigate genetic factors and molecular mechanism underlying complex traits. mgRF is an augmented Random Forests method enhanced by a network analysis for identifying multiple correlated variables of different types. We applied mgRF to genetic markers and gene expression data from a cohort of F2 female mouse intercross. mgRF outperformed several existing methods in our extensive comparison. Our new approach has an improved performance when combining both genotypic and gene expression data compared to using either one of the two types of data alone. The resulting predictive variables identified by mgRF provide information of perturbed pathways that are related to body weight. More importantly, the results uncovered intricate interactions among genetic markers and genes that have been overlooked if only one type of data was examined. 
Our results shed light on genetic mechanisms of obesity and our approach provides a promising complementary framework to the "genetics of gene expression" analysis for integrating genotypic and gene expression information for analyzing complex traits.</description><identifier>ISSN: 1553-7358</identifier><identifier>ISSN: 1553-734X</identifier><identifier>EISSN: 1553-7358</identifier><identifier>DOI: 10.1371/journal.pcbi.1002956</identifier><identifier>PMID: 23505362</identifier><language>eng</language><publisher>United States: Public Library of Science</publisher><subject>Algorithms ; Animals ; Artificial Intelligence ; Biology ; Body Weight - genetics ; Computational Biology - methods ; Decision Trees ; DNA sequencing ; Female ; Gene expression ; Genetics ; Methods ; Mice ; Models, Genetic ; Molecular genetics ; Nucleotide sequencing ; Obesity ; Quantitative genetics ; Quantitative trait loci ; RNA sequencing ; Studies ; Variables</subject><ispartof>PLoS computational biology, 2013-03, Vol.9 (3), p.e1002956-e1002956</ispartof><rights>COPYRIGHT 2013 Public Library of Science</rights><rights>2013 Chen, Zhang 2013 Chen, Zhang</rights><rights>2013 Chen, Zhang. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited: Chen Z, Zhang W (2013) Integrative Analysis Using Module-Guided Random Forests Reveals Correlated Genetic Factors Related to Mouse Weight. PLoS Comput Biol 9(3): e1002956. 
doi:10.1371/journal.pcbi.1002956</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c671t-189c9867787d672e386758fe23a72e6125baf178f7002253e27e2d240a3231143</citedby><cites>FETCH-LOGICAL-c671t-189c9867787d672e386758fe23a72e6125baf178f7002253e27e2d240a3231143</cites></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktopdf>$$Uhttps://www.ncbi.nlm.nih.gov/pmc/articles/PMC3591263/pdf/$$EPDF$$P50$$Gpubmedcentral$$Hfree_for_read</linktopdf><linktohtml>$$Uhttps://www.ncbi.nlm.nih.gov/pmc/articles/PMC3591263/$$EHTML$$P50$$Gpubmedcentral$$Hfree_for_read</linktohtml><link.rule.ids>230,314,723,776,780,860,881,2096,2915,23845,27901,27902,53766,53768,79343,79344</link.rule.ids><backlink>$$Uhttps://www.ncbi.nlm.nih.gov/pubmed/23505362$$D View this record in MEDLINE/PubMed$$Hfree_for_read</backlink></links><search><creatorcontrib>Chen, Zheng</creatorcontrib><creatorcontrib>Zhang, Weixiong</creatorcontrib><title>Integrative analysis using module-guided random forests reveals correlated genetic factors related to mouse weight</title><title>PLoS computational biology</title><addtitle>PLoS Comput Biol</addtitle><description>Complex traits such as obesity are manifestations of intricate interactions of multiple genetic factors. However, such relationships are difficult to identify. Thanks to the recent advance in high-throughput technology, a large amount of data has been collected for various complex traits, including obesity. These data often measure different biological aspects of the traits of interest, including genotypic variations at the DNA level and gene expression alterations at the RNA level. Integration of such heterogeneous data provides promising opportunities to understand the genetic components and possibly genetic architecture of complex traits. 
In this paper, we propose a machine learning based method, module-guided Random Forests (mgRF), to integrate genotypic and gene expression data to investigate genetic factors and molecular mechanism underlying complex traits. mgRF is an augmented Random Forests method enhanced by a network analysis for identifying multiple correlated variables of different types. We applied mgRF to genetic markers and gene expression data from a cohort of F2 female mouse intercross. mgRF outperformed several existing methods in our extensive comparison. Our new approach has an improved performance when combining both genotypic and gene expression data compared to using either one of the two types of data alone. The resulting predictive variables identified by mgRF provide information of perturbed pathways that are related to body weight. More importantly, the results uncovered intricate interactions among genetic markers and genes that have been overlooked if only one type of data was examined. Our results shed light on genetic mechanisms of obesity and our approach provides a promising complementary framework to the "genetics of gene expression" analysis for integrating genotypic and gene expression information for analyzing complex traits.</description><subject>Algorithms</subject><subject>Animals</subject><subject>Artificial Intelligence</subject><subject>Biology</subject><subject>Body Weight - genetics</subject><subject>Computational Biology - methods</subject><subject>Decision Trees</subject><subject>DNA sequencing</subject><subject>Female</subject><subject>Gene expression</subject><subject>Genetics</subject><subject>Methods</subject><subject>Mice</subject><subject>Models, Genetic</subject><subject>Molecular genetics</subject><subject>Nucleotide sequencing</subject><subject>Obesity</subject><subject>Quantitative genetics</subject><subject>Quantitative trait loci</subject><subject>RNA 
sequencing</subject><subject>Studies</subject><subject>Variables</subject><issn>1553-7358</issn><issn>1553-734X</issn><issn>1553-7358</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2013</creationdate><recordtype>article</recordtype><sourceid>EIF</sourceid><sourceid>DOA</sourceid><recordid>eNqVkltv1DAQhSMEomXhHyDIIzzsEtvrS16QqorLShVIXJ6tiTNJvUriYjsL_ffMstuq-4jykMnkO8eeoymKl6xaMaHZu22Y4wTD6sY1fsWqitdSPSrOmZRiqYU0jx_UZ8WzlLZVRWWtnhZnXMhKCsXPi7iZMvYRst9hCeR3m3wq5-SnvhxDOw-47GffYltGmNowll2ImHIqI-4QhlS6ECMOkInoccLsXdmByyHukUM_B7KaE5a_0ffX-XnxpCMlvji-F8XPjx9-XH5eXn39tLm8uFo6pVleMlO72iitjW6V5iiolqZDLoC-FOOygY5p02manUuBXCNv-boCwQVja7EoXh98b4aQ7DGuZJngWivBKIBFsTkQbYCtvYl-hHhrA3j7rxFibyHSRANaWFcNCtS6Ys26baEBQ-cZBbIG6nPyen88bW5GbB1OOcJwYnr6Z_LXtg87K2TNuBJk8OZoEMOvmTK2o08OhwEmpPTo3jSs5NooQlcHtAe6mp-6QI6OnhZH78KEnaf-heBKSSONIcHbEwExGf_kHuaU7Ob7t_9gv5yy6wPrYkgpYnc_L6vsfkvvYrf7LbXHLSXZq4dZ3Yvu1lL8BWC65bk</recordid><startdate>20130301</startdate><enddate>20130301</enddate><creator>Chen, Zheng</creator><creator>Zhang, Weixiong</creator><general>Public Library of Science</general><general>Public Library of Science (PLoS)</general><scope>CGR</scope><scope>CUY</scope><scope>CVF</scope><scope>ECM</scope><scope>EIF</scope><scope>NPM</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>ISN</scope><scope>ISR</scope><scope>7X8</scope><scope>5PM</scope><scope>DOA</scope></search><sort><creationdate>20130301</creationdate><title>Integrative analysis using module-guided random forests reveals correlated genetic factors related to mouse weight</title><author>Chen, Zheng ; Zhang, Weixiong</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c671t-189c9867787d672e386758fe23a72e6125baf178f7002253e27e2d240a3231143</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2013</creationdate><topic>Algorithms</topic><topic>Animals</topic><topic>Artificial 
Intelligence</topic><topic>Biology</topic><topic>Body Weight - genetics</topic><topic>Computational Biology - methods</topic><topic>Decision Trees</topic><topic>DNA sequencing</topic><topic>Female</topic><topic>Gene expression</topic><topic>Genetics</topic><topic>Methods</topic><topic>Mice</topic><topic>Models, Genetic</topic><topic>Molecular genetics</topic><topic>Nucleotide sequencing</topic><topic>Obesity</topic><topic>Quantitative genetics</topic><topic>Quantitative trait loci</topic><topic>RNA sequencing</topic><topic>Studies</topic><topic>Variables</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Chen, Zheng</creatorcontrib><creatorcontrib>Zhang, Weixiong</creatorcontrib><collection>Medline</collection><collection>MEDLINE</collection><collection>MEDLINE (Ovid)</collection><collection>MEDLINE</collection><collection>MEDLINE</collection><collection>PubMed</collection><collection>CrossRef</collection><collection>Gale In Context: Canada</collection><collection>Gale In Context: Science</collection><collection>MEDLINE - Academic</collection><collection>PubMed Central (Full Participant titles)</collection><collection>DOAJ Directory of Open Access Journals</collection><jtitle>PLoS computational biology</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Chen, Zheng</au><au>Zhang, Weixiong</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Integrative analysis using module-guided random forests reveals correlated genetic factors related to mouse weight</atitle><jtitle>PLoS computational biology</jtitle><addtitle>PLoS Comput Biol</addtitle><date>2013-03-01</date><risdate>2013</risdate><volume>9</volume><issue>3</issue><spage>e1002956</spage><epage>e1002956</epage><pages>e1002956-e1002956</pages><issn>1553-7358</issn><issn>1553-734X</issn><eissn>1553-7358</eissn><abstract>Complex traits such as obesity are manifestations 
of intricate interactions of multiple genetic factors. However, such relationships are difficult to identify. Thanks to the recent advance in high-throughput technology, a large amount of data has been collected for various complex traits, including obesity. These data often measure different biological aspects of the traits of interest, including genotypic variations at the DNA level and gene expression alterations at the RNA level. Integration of such heterogeneous data provides promising opportunities to understand the genetic components and possibly genetic architecture of complex traits. In this paper, we propose a machine learning based method, module-guided Random Forests (mgRF), to integrate genotypic and gene expression data to investigate genetic factors and molecular mechanism underlying complex traits. mgRF is an augmented Random Forests method enhanced by a network analysis for identifying multiple correlated variables of different types. We applied mgRF to genetic markers and gene expression data from a cohort of F2 female mouse intercross. mgRF outperformed several existing methods in our extensive comparison. Our new approach has an improved performance when combining both genotypic and gene expression data compared to using either one of the two types of data alone. The resulting predictive variables identified by mgRF provide information of perturbed pathways that are related to body weight. More importantly, the results uncovered intricate interactions among genetic markers and genes that have been overlooked if only one type of data was examined. 
Our results shed light on genetic mechanisms of obesity and our approach provides a promising complementary framework to the "genetics of gene expression" analysis for integrating genotypic and gene expression information for analyzing complex traits.</abstract><cop>United States</cop><pub>Public Library of Science</pub><pmid>23505362</pmid><doi>10.1371/journal.pcbi.1002956</doi><oa>free_for_read</oa></addata></record>
fulltext fulltext
identifier ISSN: 1553-7358
ispartof PLoS computational biology, 2013-03, Vol.9 (3), p.e1002956-e1002956
issn 1553-7358
1553-734X
1553-7358
language eng
recordid cdi_plos_journals_1327763105
source MEDLINE; DOAJ Directory of Open Access Journals; Elektronische Zeitschriftenbibliothek - Frei zugängliche E-Journals; PubMed Central; Public Library of Science (PLoS)
subjects Algorithms
Animals
Artificial Intelligence
Biology
Body Weight - genetics
Computational Biology - methods
Decision Trees
DNA sequencing
Female
Gene expression
Genetics
Methods
Mice
Models, Genetic
Molecular genetics
Nucleotide sequencing
Obesity
Quantitative genetics
Quantitative trait loci
RNA sequencing
Studies
Variables
title Integrative analysis using module-guided random forests reveals correlated genetic factors related to mouse weight
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-02T12%3A43%3A47IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-gale_plos_&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Integrative%20analysis%20using%20module-guided%20random%20forests%20reveals%20correlated%20genetic%20factors%20related%20to%20mouse%20weight&rft.jtitle=PLoS%20computational%20biology&rft.au=Chen,%20Zheng&rft.date=2013-03-01&rft.volume=9&rft.issue=3&rft.spage=e1002956&rft.epage=e1002956&rft.pages=e1002956-e1002956&rft.issn=1553-7358&rft.eissn=1553-7358&rft_id=info:doi/10.1371/journal.pcbi.1002956&rft_dat=%3Cgale_plos_%3EA326658588%3C/gale_plos_%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=1317852786&rft_id=info:pmid/23505362&rft_galeid=A326658588&rft_doaj_id=oai_doaj_org_article_a40be3e7701b4ddaba827e86a59ae3e2&rfr_iscdi=true