MINT: a multivariate integrative method to identify reproducible molecular signatures across independent experiments and platforms

Molecular signatures identified from high-throughput transcriptomic studies often have poor reliability and fail to reproduce across studies. One solution is to combine independent studies into a single integrative analysis, additionally increasing sample size. However, the different protocols and t...

Bibliographic details
Published in: BMC bioinformatics 2017-02, Vol.18 (1), p.128-128, Article 128
Main authors: Rohart, Florian; Eslami, Aida; Matigian, Nicholas; Bougeard, Stéphanie; Lê Cao, Kim-Anh
Format: Article
Language: eng
Subjects:
Online access: Full text
container_end_page 128
container_issue 1
container_start_page 128
container_title BMC bioinformatics
container_volume 18
creator Rohart, Florian
Eslami, Aida
Matigian, Nicholas
Bougeard, Stéphanie
Lê Cao, Kim-Anh
description Molecular signatures identified from high-throughput transcriptomic studies often have poor reliability and fail to reproduce across studies. One solution is to combine independent studies into a single integrative analysis, additionally increasing sample size. However, the different protocols and technological platforms across transcriptomic studies produce unwanted systematic variation that strongly confounds the integrative analysis results. When studies aim to discriminate an outcome of interest, the common approach is a sequential two-step procedure; unwanted systematic variation removal techniques are applied prior to classification methods. To limit the risk of overfitting and over-optimistic results of a two-step procedure, we developed a novel multivariate integration method, MINT, that simultaneously accounts for unwanted systematic variation and identifies predictive gene signatures with greater reproducibility and accuracy. In two biological examples on the classification of three human cell types and four subtypes of breast cancer, we combined high-dimensional microarray and RNA-seq data sets and MINT identified highly reproducible and relevant gene signatures predictive of a given phenotype. MINT led to superior classification and prediction accuracy compared to the existing sequential two-step procedures. MINT is a powerful approach and the first of its kind to solve the integrative classification framework in a single step by combining multiple independent studies. MINT is computationally fast as part of the mixOmics R CRAN package, available at http://www.mixOmics.org/mixMINT/ and http://cran.r-project.org/web/packages/mixOmics/ .
doi_str_mv 10.1186/s12859-017-1553-8
format Article
fullrecord <record><control><sourceid>proquest_pubme</sourceid><recordid>TN_cdi_pubmedcentral_primary_oai_pubmedcentral_nih_gov_5327533</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>4317818751</sourcerecordid><originalsourceid>FETCH-LOGICAL-c427t-69eb21419effb63fd4110540b5c803e93634ed51e5e2cc8d5b75216507c19e8b3</originalsourceid><addsrcrecordid>eNpdUc1u1jAQtBAVLYUH4IIsceES6rXjxOGAhCp-KrVwKWfLsTdfXSVxsJ2KXnlyHL5StZy88s6MZnYIeQXsHYBqThJwJbuKQVuBlKJST8gR1C1UHJh8-mA-JM9TumYFqJh8Rg654jW0ojsivy_Ovl2-p4ZO65j9jYneZKR-zriLpnwgnTBfBUdzoN7hnP1wSyMuMbjV-n4s-zCiXUcTafK72eQ1YqLGxpBS0XG44LzxKP5aMPqpjGU9O7qMJg8hTukFORjMmPDl3XtMfnz-dHn6tTr__uXs9ON5ZWve5qrpsOdQQ4fD0DdicDWUZDXrpVVMYCcaUaOTgBK5tcrJvpUcGslaWziqF8fkw153WfsJnS1Oohn1UkyZeKuD8frxZvZXehdutBS8lUIUgbd3AjH8XDFlPflkcRzNjGFNGlTLVdtIvkHf_Ae9DmucS7wNJXgHddcUFOxRf68Vcbg3A0xvDet9w7oUp7eGtSqc1w9T3DP-VSr-AHyPpRw</addsrcrecordid><sourcetype>Open Access Repository</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>1873291496</pqid></control><display><type>article</type><title>MINT: a multivariate integrative method to identify reproducible molecular signatures across independent experiments and platforms</title><source>MEDLINE</source><source>DOAJ Directory of Open Access Journals</source><source>SpringerLink Journals</source><source>Elektronische Zeitschriftenbibliothek - Frei zugängliche E-Journals</source><source>PubMed Central Open Access</source><source>Springer Nature OA Free Journals</source><source>PubMed Central</source><creator>Rohart, Florian ; Eslami, Aida ; Matigian, Nicholas ; Bougeard, Stéphanie ; Lê Cao, Kim-Anh</creator><creatorcontrib>Rohart, Florian ; Eslami, Aida ; Matigian, Nicholas ; Bougeard, Stéphanie ; Lê Cao, Kim-Anh</creatorcontrib><description>Molecular signatures identified from high-throughput transcriptomic studies often have poor reliability and fail to reproduce across studies. 
One solution is to combine independent studies into a single integrative analysis, additionally increasing sample size. However, the different protocols and technological platforms across transcriptomic studies produce unwanted systematic variation that strongly confounds the integrative analysis results. When studies aim to discriminate an outcome of interest, the common approach is a sequential two-step procedure; unwanted systematic variation removal techniques are applied prior to classification methods. To limit the risk of overfitting and over-optimistic results of a two-step procedure, we developed a novel multivariate integration method, MINT, that simultaneously accounts for unwanted systematic variation and identifies predictive gene signatures with greater reproducibility and accuracy. In two biological examples on the classification of three human cell types and four subtypes of breast cancer, we combined high-dimensional microarray and RNA-seq data sets and MINT identified highly reproducible and relevant gene signatures predictive of a given phenotype. MINT led to superior classification and prediction accuracy compared to the existing sequential two-step procedures. MINT is a powerful approach and the first of its kind to solve the integrative classification framework in a single step by combining multiple independent studies. 
MINT is computationally fast as part of the mixOmics R CRAN package, available at http://www.mixOmics.org/mixMINT/ and http://cran.r-project.org/web/packages/mixOmics/ .</description><identifier>ISSN: 1471-2105</identifier><identifier>EISSN: 1471-2105</identifier><identifier>DOI: 10.1186/s12859-017-1553-8</identifier><identifier>PMID: 28241739</identifier><language>eng</language><publisher>England: BioMed Central</publisher><subject>Gene Expression Profiling ; Humans ; Methodology ; Multivariate Analysis ; Reproducibility of Results ; Sample Size</subject><ispartof>BMC bioinformatics, 2017-02, Vol.18 (1), p.128-128, Article 128</ispartof><rights>Copyright BioMed Central 2017</rights><rights>The Author(s) 2017</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c427t-69eb21419effb63fd4110540b5c803e93634ed51e5e2cc8d5b75216507c19e8b3</citedby><cites>FETCH-LOGICAL-c427t-69eb21419effb63fd4110540b5c803e93634ed51e5e2cc8d5b75216507c19e8b3</cites></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktopdf>$$Uhttps://www.ncbi.nlm.nih.gov/pmc/articles/PMC5327533/pdf/$$EPDF$$P50$$Gpubmedcentral$$Hfree_for_read</linktopdf><linktohtml>$$Uhttps://www.ncbi.nlm.nih.gov/pmc/articles/PMC5327533/$$EHTML$$P50$$Gpubmedcentral$$Hfree_for_read</linktohtml><link.rule.ids>230,314,724,777,781,861,882,27905,27906,53772,53774</link.rule.ids><backlink>$$Uhttps://www.ncbi.nlm.nih.gov/pubmed/28241739$$D View this record in MEDLINE/PubMed$$Hfree_for_read</backlink></links><search><creatorcontrib>Rohart, Florian</creatorcontrib><creatorcontrib>Eslami, Aida</creatorcontrib><creatorcontrib>Matigian, Nicholas</creatorcontrib><creatorcontrib>Bougeard, Stéphanie</creatorcontrib><creatorcontrib>Lê Cao, Kim-Anh</creatorcontrib><title>MINT: a multivariate integrative method to identify reproducible molecular 
signatures across independent experiments and platforms</title><title>BMC bioinformatics</title><addtitle>BMC Bioinformatics</addtitle><description>Molecular signatures identified from high-throughput transcriptomic studies often have poor reliability and fail to reproduce across studies. One solution is to combine independent studies into a single integrative analysis, additionally increasing sample size. However, the different protocols and technological platforms across transcriptomic studies produce unwanted systematic variation that strongly confounds the integrative analysis results. When studies aim to discriminate an outcome of interest, the common approach is a sequential two-step procedure; unwanted systematic variation removal techniques are applied prior to classification methods. To limit the risk of overfitting and over-optimistic results of a two-step procedure, we developed a novel multivariate integration method, MINT, that simultaneously accounts for unwanted systematic variation and identifies predictive gene signatures with greater reproducibility and accuracy. In two biological examples on the classification of three human cell types and four subtypes of breast cancer, we combined high-dimensional microarray and RNA-seq data sets and MINT identified highly reproducible and relevant gene signatures predictive of a given phenotype. MINT led to superior classification and prediction accuracy compared to the existing sequential two-step procedures. MINT is a powerful approach and the first of its kind to solve the integrative classification framework in a single step by combining multiple independent studies. 
MINT is computationally fast as part of the mixOmics R CRAN package, available at http://www.mixOmics.org/mixMINT/ and http://cran.r-project.org/web/packages/mixOmics/ .</description><subject>Gene Expression Profiling</subject><subject>Humans</subject><subject>Methodology</subject><subject>Multivariate Analysis</subject><subject>Reproducibility of Results</subject><subject>Sample Size</subject><issn>1471-2105</issn><issn>1471-2105</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2017</creationdate><recordtype>article</recordtype><sourceid>EIF</sourceid><sourceid>ABUWG</sourceid><sourceid>AFKRA</sourceid><sourceid>AZQEC</sourceid><sourceid>BENPR</sourceid><sourceid>CCPQU</sourceid><sourceid>DWQXO</sourceid><sourceid>GNUQQ</sourceid><recordid>eNpdUc1u1jAQtBAVLYUH4IIsceES6rXjxOGAhCp-KrVwKWfLsTdfXSVxsJ2KXnlyHL5StZy88s6MZnYIeQXsHYBqThJwJbuKQVuBlKJST8gR1C1UHJh8-mA-JM9TumYFqJh8Rg654jW0ojsivy_Ovl2-p4ZO65j9jYneZKR-zriLpnwgnTBfBUdzoN7hnP1wSyMuMbjV-n4s-zCiXUcTafK72eQ1YqLGxpBS0XG44LzxKP5aMPqpjGU9O7qMJg8hTukFORjMmPDl3XtMfnz-dHn6tTr__uXs9ON5ZWve5qrpsOdQQ4fD0DdicDWUZDXrpVVMYCcaUaOTgBK5tcrJvpUcGslaWziqF8fkw153WfsJnS1Oohn1UkyZeKuD8frxZvZXehdutBS8lUIUgbd3AjH8XDFlPflkcRzNjGFNGlTLVdtIvkHf_Ae9DmucS7wNJXgHddcUFOxRf68Vcbg3A0xvDet9w7oUp7eGtSqc1w9T3DP-VSr-AHyPpRw</recordid><startdate>20170227</startdate><enddate>20170227</enddate><creator>Rohart, Florian</creator><creator>Eslami, Aida</creator><creator>Matigian, Nicholas</creator><creator>Bougeard, Stéphanie</creator><creator>Lê Cao, Kim-Anh</creator><general>BioMed 
Central</general><scope>CGR</scope><scope>CUY</scope><scope>CVF</scope><scope>ECM</scope><scope>EIF</scope><scope>NPM</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>3V.</scope><scope>7QO</scope><scope>7SC</scope><scope>7X7</scope><scope>7XB</scope><scope>88E</scope><scope>8AL</scope><scope>8AO</scope><scope>8FD</scope><scope>8FE</scope><scope>8FG</scope><scope>8FH</scope><scope>8FI</scope><scope>8FJ</scope><scope>8FK</scope><scope>ABUWG</scope><scope>AEUYN</scope><scope>AFKRA</scope><scope>ARAPS</scope><scope>AZQEC</scope><scope>BBNVY</scope><scope>BENPR</scope><scope>BGLVJ</scope><scope>BHPHI</scope><scope>CCPQU</scope><scope>DWQXO</scope><scope>FR3</scope><scope>FYUFA</scope><scope>GHDGH</scope><scope>GNUQQ</scope><scope>HCIFZ</scope><scope>JQ2</scope><scope>K7-</scope><scope>K9.</scope><scope>L7M</scope><scope>LK8</scope><scope>L~C</scope><scope>L~D</scope><scope>M0N</scope><scope>M0S</scope><scope>M1P</scope><scope>M7P</scope><scope>P5Z</scope><scope>P62</scope><scope>P64</scope><scope>PIMPY</scope><scope>PQEST</scope><scope>PQQKQ</scope><scope>PQUKI</scope><scope>PRINS</scope><scope>Q9U</scope><scope>7X8</scope><scope>5PM</scope></search><sort><creationdate>20170227</creationdate><title>MINT: a multivariate integrative method to identify reproducible molecular signatures across independent experiments and platforms</title><author>Rohart, Florian ; Eslami, Aida ; Matigian, Nicholas ; Bougeard, Stéphanie ; Lê Cao, Kim-Anh</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c427t-69eb21419effb63fd4110540b5c803e93634ed51e5e2cc8d5b75216507c19e8b3</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2017</creationdate><topic>Gene Expression Profiling</topic><topic>Humans</topic><topic>Methodology</topic><topic>Multivariate Analysis</topic><topic>Reproducibility of Results</topic><topic>Sample 
Size</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Rohart, Florian</creatorcontrib><creatorcontrib>Eslami, Aida</creatorcontrib><creatorcontrib>Matigian, Nicholas</creatorcontrib><creatorcontrib>Bougeard, Stéphanie</creatorcontrib><creatorcontrib>Lê Cao, Kim-Anh</creatorcontrib><collection>Medline</collection><collection>MEDLINE</collection><collection>MEDLINE (Ovid)</collection><collection>MEDLINE</collection><collection>MEDLINE</collection><collection>PubMed</collection><collection>CrossRef</collection><collection>ProQuest Central (Corporate)</collection><collection>Biotechnology Research Abstracts</collection><collection>Computer and Information Systems Abstracts</collection><collection>Health &amp; Medical Collection</collection><collection>ProQuest Central (purchase pre-March 2016)</collection><collection>Medical Database (Alumni Edition)</collection><collection>Computing Database (Alumni Edition)</collection><collection>ProQuest Pharma Collection</collection><collection>Technology Research Database</collection><collection>ProQuest SciTech Collection</collection><collection>ProQuest Technology Collection</collection><collection>ProQuest Natural Science Collection</collection><collection>Hospital Premium Collection</collection><collection>Hospital Premium Collection (Alumni Edition)</collection><collection>ProQuest Central (Alumni) (purchase pre-March 2016)</collection><collection>ProQuest Central (Alumni Edition)</collection><collection>ProQuest One Sustainability</collection><collection>ProQuest Central UK/Ireland</collection><collection>Advanced Technologies &amp; Aerospace Collection</collection><collection>ProQuest Central Essentials</collection><collection>Biological Science Collection</collection><collection>ProQuest Central</collection><collection>Technology Collection</collection><collection>Natural Science Collection</collection><collection>ProQuest One Community College</collection><collection>ProQuest 
Central Korea</collection><collection>Engineering Research Database</collection><collection>Health Research Premium Collection</collection><collection>Health Research Premium Collection (Alumni)</collection><collection>ProQuest Central Student</collection><collection>SciTech Premium Collection</collection><collection>ProQuest Computer Science Collection</collection><collection>Computer Science Database</collection><collection>ProQuest Health &amp; Medical Complete (Alumni)</collection><collection>Advanced Technologies Database with Aerospace</collection><collection>ProQuest Biological Science Collection</collection><collection>Computer and Information Systems Abstracts – Academic</collection><collection>Computer and Information Systems Abstracts Professional</collection><collection>Computing Database</collection><collection>Health &amp; Medical Collection (Alumni Edition)</collection><collection>Medical Database</collection><collection>Biological Science Database</collection><collection>Advanced Technologies &amp; Aerospace Database</collection><collection>ProQuest Advanced Technologies &amp; Aerospace Collection</collection><collection>Biotechnology and BioEngineering Abstracts</collection><collection>Publicly Available Content Database</collection><collection>ProQuest One Academic Eastern Edition (DO NOT USE)</collection><collection>ProQuest One Academic</collection><collection>ProQuest One Academic UKI Edition</collection><collection>ProQuest Central China</collection><collection>ProQuest Central Basic</collection><collection>MEDLINE - Academic</collection><collection>PubMed Central (Full Participant titles)</collection><jtitle>BMC bioinformatics</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Rohart, Florian</au><au>Eslami, Aida</au><au>Matigian, Nicholas</au><au>Bougeard, Stéphanie</au><au>Lê Cao, 
Kim-Anh</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>MINT: a multivariate integrative method to identify reproducible molecular signatures across independent experiments and platforms</atitle><jtitle>BMC bioinformatics</jtitle><addtitle>BMC Bioinformatics</addtitle><date>2017-02-27</date><risdate>2017</risdate><volume>18</volume><issue>1</issue><spage>128</spage><epage>128</epage><pages>128-128</pages><artnum>128</artnum><issn>1471-2105</issn><eissn>1471-2105</eissn><abstract>Molecular signatures identified from high-throughput transcriptomic studies often have poor reliability and fail to reproduce across studies. One solution is to combine independent studies into a single integrative analysis, additionally increasing sample size. However, the different protocols and technological platforms across transcriptomic studies produce unwanted systematic variation that strongly confounds the integrative analysis results. When studies aim to discriminate an outcome of interest, the common approach is a sequential two-step procedure; unwanted systematic variation removal techniques are applied prior to classification methods. To limit the risk of overfitting and over-optimistic results of a two-step procedure, we developed a novel multivariate integration method, MINT, that simultaneously accounts for unwanted systematic variation and identifies predictive gene signatures with greater reproducibility and accuracy. In two biological examples on the classification of three human cell types and four subtypes of breast cancer, we combined high-dimensional microarray and RNA-seq data sets and MINT identified highly reproducible and relevant gene signatures predictive of a given phenotype. MINT led to superior classification and prediction accuracy compared to the existing sequential two-step procedures. 
MINT is a powerful approach and the first of its kind to solve the integrative classification framework in a single step by combining multiple independent studies. MINT is computationally fast as part of the mixOmics R CRAN package, available at http://www.mixOmics.org/mixMINT/ and http://cran.r-project.org/web/packages/mixOmics/ .</abstract><cop>England</cop><pub>BioMed Central</pub><pmid>28241739</pmid><doi>10.1186/s12859-017-1553-8</doi><tpages>1</tpages><oa>free_for_read</oa></addata></record>
fulltext fulltext
identifier ISSN: 1471-2105
ispartof BMC bioinformatics, 2017-02, Vol.18 (1), p.128-128, Article 128
issn 1471-2105
1471-2105
language eng
recordid cdi_pubmedcentral_primary_oai_pubmedcentral_nih_gov_5327533
source MEDLINE; DOAJ Directory of Open Access Journals; SpringerLink Journals; Elektronische Zeitschriftenbibliothek - Frei zugängliche E-Journals; PubMed Central Open Access; Springer Nature OA Free Journals; PubMed Central
subjects Gene Expression Profiling
Humans
Methodology
Multivariate Analysis
Reproducibility of Results
Sample Size
title MINT: a multivariate integrative method to identify reproducible molecular signatures across independent experiments and platforms
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-20T05%3A30%3A22IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_pubme&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=MINT:%20a%20multivariate%20integrative%20method%20to%20identify%20reproducible%20molecular%20signatures%20across%20independent%20experiments%20and%20platforms&rft.jtitle=BMC%20bioinformatics&rft.au=Rohart,%20Florian&rft.date=2017-02-27&rft.volume=18&rft.issue=1&rft.spage=128&rft.epage=128&rft.pages=128-128&rft.artnum=128&rft.issn=1471-2105&rft.eissn=1471-2105&rft_id=info:doi/10.1186/s12859-017-1553-8&rft_dat=%3Cproquest_pubme%3E4317818751%3C/proquest_pubme%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=1873291496&rft_id=info:pmid/28241739&rfr_iscdi=true