MINT: a multivariate integrative method to identify reproducible molecular signatures across independent experiments and platforms
Molecular signatures identified from high-throughput transcriptomic studies often have poor reliability and fail to reproduce across studies. One solution is to combine independent studies into a single integrative analysis, additionally increasing sample size. However, the different protocols and t...
Published in: | BMC bioinformatics 2017-02, Vol.18 (1), p.128-128, Article 128 |
---|---|
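Main authors: | Rohart, Florian; Eslami, Aida; Matigian, Nicholas; Bougeard, Stéphanie; Lê Cao, Kim-Anh |
Format: | Article |
Language: | eng |
Subjects: | Gene Expression Profiling; Humans; Methodology; Multivariate Analysis; Reproducibility of Results; Sample Size |
Online access: | Full text |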
container_end_page | 128 |
---|---|
container_issue | 1 |
container_start_page | 128 |
container_title | BMC bioinformatics |
container_volume | 18 |
creator | Rohart, Florian; Eslami, Aida; Matigian, Nicholas; Bougeard, Stéphanie; Lê Cao, Kim-Anh |
description | Molecular signatures identified from high-throughput transcriptomic studies often have poor reliability and fail to reproduce across studies. One solution is to combine independent studies into a single integrative analysis, additionally increasing sample size. However, the different protocols and technological platforms across transcriptomic studies produce unwanted systematic variation that strongly confounds the integrative analysis results. When studies aim to discriminate an outcome of interest, the common approach is a sequential two-step procedure; unwanted systematic variation removal techniques are applied prior to classification methods.
To limit the risk of overfitting and over-optimistic results of a two-step procedure, we developed a novel multivariate integration method, MINT, that simultaneously accounts for unwanted systematic variation and identifies predictive gene signatures with greater reproducibility and accuracy. In two biological examples on the classification of three human cell types and four subtypes of breast cancer, we combined high-dimensional microarray and RNA-seq data sets and MINT identified highly reproducible and relevant gene signatures predictive of a given phenotype. MINT led to superior classification and prediction accuracy compared to the existing sequential two-step procedures.
MINT is a powerful approach and the first of its kind to address integrative classification in a single step by combining multiple independent studies. MINT is computationally fast and is implemented in the mixOmics R package on CRAN, available at http://www.mixOmics.org/mixMINT/ and http://cran.r-project.org/web/packages/mixOmics/ . |
doi_str_mv | 10.1186/s12859-017-1553-8 |
format | Article |
fullrecord | PMID: 28241739; DOI: 10.1186/s12859-017-1553-8; ISSN: 1471-2105; EISSN: 1471-2105; publisher: England: BioMed Central; publication date: 2017-02-27; rights: Copyright BioMed Central 2017, The Author(s) 2017; peer reviewed; free to read; full text: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5327533/ |
fulltext | fulltext |
identifier | ISSN: 1471-2105 |
ispartof | BMC bioinformatics, 2017-02, Vol.18 (1), p.128-128, Article 128 |
issn | ISSN: 1471-2105; EISSN: 1471-2105 |
language | eng |
recordid | cdi_pubmedcentral_primary_oai_pubmedcentral_nih_gov_5327533 |
source | MEDLINE; DOAJ Directory of Open Access Journals; SpringerLink Journals; Elektronische Zeitschriftenbibliothek - Frei zugängliche E-Journals; PubMed Central Open Access; Springer Nature OA Free Journals; PubMed Central |
subjects | Gene Expression Profiling; Humans; Methodology; Multivariate Analysis; Reproducibility of Results; Sample Size |
title | MINT: a multivariate integrative method to identify reproducible molecular signatures across independent experiments and platforms |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-20T05%3A30%3A22IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_pubme&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=MINT:%20a%20multivariate%20integrative%20method%20to%20identify%20reproducible%20molecular%20signatures%20across%20independent%20experiments%20and%20platforms&rft.jtitle=BMC%20bioinformatics&rft.au=Rohart,%20Florian&rft.date=2017-02-27&rft.volume=18&rft.issue=1&rft.spage=128&rft.epage=128&rft.pages=128-128&rft.artnum=128&rft.issn=1471-2105&rft.eissn=1471-2105&rft_id=info:doi/10.1186/s12859-017-1553-8&rft_dat=%3Cproquest_pubme%3E4317818751%3C/proquest_pubme%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=1873291496&rft_id=info:pmid/28241739&rfr_iscdi=true |
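
Illustration only: the description above says that MINT accounts for unwanted between-study variation inside the model instead of in a separate preprocessing step. One ingredient often used for this kind of integration, assumed here and not stated in this record, is to center and scale each variable separately within each study. The Python sketch below shows only that ingredient on toy data. The helper name `center_scale_within_study` is hypothetical, and this is not the mixOmics implementation, which is written in R.

```python
from statistics import mean, pstdev


def center_scale_within_study(rows, studies):
    """Standardize every column separately within each study.

    Hypothetical helper for illustration; not part of mixOmics.
    rows    : list of samples, each a list of numeric feature values
    studies : list of study labels, one per sample
    """
    out = [None] * len(rows)
    for study in set(studies):
        # Indices of the samples that belong to this study
        idx = [i for i, s in enumerate(studies) if s == study]
        n_cols = len(rows[idx[0]])
        # Per-study column means and (population) standard deviations;
        # a zero standard deviation is replaced by 1.0 to avoid division by zero
        mus = [mean(rows[i][j] for i in idx) for j in range(n_cols)]
        sds = [pstdev([rows[i][j] for i in idx]) or 1.0 for j in range(n_cols)]
        for i in idx:
            out[i] = [(rows[i][j] - mus[j]) / sds[j] for j in range(n_cols)]
    return out


# Toy data: study "B" measures the same signal as study "A" with an offset of +10
rows = [[1.0], [3.0], [11.0], [13.0]]
studies = ["A", "A", "B", "B"]
scaled = center_scale_within_study(rows, studies)
print(scaled)
```

After this step the two toy studies are on the same scale, so the platform offset no longer separates them. In MINT itself, according to the description, handling of study effects and signature selection happen together in a single model; the sketch does not attempt that.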