RNA-Seq Accurately Identifies Cancer Biomarker Signatures to Distinguish Tissue of Origin
Abstract Metastatic cancer of unknown primary (CUP) accounts for up to 5% of all new cancer cases, with a 5-year survival rate of only 10%. Accurate identification of tissue of origin would allow for directed, personalized therapies to improve clinical outcomes. Our objective was to use transcriptom...
Published in: | Neoplasia (New York, N.Y.), 2014-11, Vol.16 (11), p.918-927 |
---|---|
Main authors: | Wei, Iris H; Shi, Yang; Jiang, Hui; Kumar-Sinha, Chandan; Chinnaiyan, Arul M |
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Full text |
container_end_page | 927 |
---|---|
container_issue | 11 |
container_start_page | 918 |
container_title | Neoplasia (New York, N.Y.) |
container_volume | 16 |
creator | Wei, Iris H; Shi, Yang; Jiang, Hui; Kumar-Sinha, Chandan; Chinnaiyan, Arul M |
description | Abstract Metastatic cancer of unknown primary (CUP) accounts for up to 5% of all new cancer cases, with a 5-year survival rate of only 10%. Accurate identification of tissue of origin would allow for directed, personalized therapies to improve clinical outcomes. Our objective was to use transcriptome sequencing (RNA-Seq) to identify lineage-specific biomarker signatures for the cancer types that most commonly metastasize as CUP (colorectum, kidney, liver, lung, ovary, pancreas, prostate, and stomach). RNA-Seq data of 17,471 transcripts from a total of 3,244 cancer samples across 26 different tissue types were compiled from in-house sequencing data and publicly available International Cancer Genome Consortium and The Cancer Genome Atlas datasets. Robust cancer biomarker signatures were extracted using a 10-fold cross-validation method of log transformation, quantile normalization, transcript ranking by area under the receiver operating characteristic curve, and stepwise logistic regression. The entire algorithm was then repeated with a new set of randomly generated training and test sets, yielding highly concordant biomarker signatures. External validation of the cancer-specific signatures yielded high sensitivity (92.0% ± 3.15%; mean ± standard deviation) and specificity (97.7% ± 2.99%) for each cancer biomarker signature. The overall performance of this RNA-Seq biomarker-generating algorithm yielded an accuracy of 90.5%. In conclusion, we demonstrate a computational model for producing highly sensitive and specific cancer biomarker signatures from RNA-Seq data, generating signatures for the top eight cancer types responsible for CUP to accurately identify tumor origin. |
doi_str_mv | 10.1016/j.neo.2014.09.007 |
format | Article |
fullrecord | <record><control><sourceid>proquest_cross</sourceid><recordid>TN_cdi_proquest_miscellaneous_1787972467</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><els_id>S1476558614001420</els_id><sourcerecordid>1628529018</sourcerecordid><originalsourceid>FETCH-LOGICAL-c484t-d123d67646945e10d54a4177d46ab3ff9e78592ba906bfc6da3a34ca9f91cd3c3</originalsourceid><addsrcrecordid>eNqFkU9r3DAQxUVoyP8P0EvxsRc7kixLFoXCdtukgZBANjnkJLTSeDsbr51IdmG_fbRsUkIPyWkG5r3H8HuEfGa0YJTJ02XRQV9wykRBdUGp2iEHTCiZV1UtP73Z98lhjEuaPEypPbLPK8ErLeUBub-5muQzeMomzo3BDtCuswsP3YANQsymtnMQsh_Yr2x4SNsMF50dxpBuQ5_9xDhgtxgx_sluMcYRsr7JrgMusDsmu41tI5y8zCNyd_brdvo7v7w-v5hOLnMnajHknvHSSyWF1KICRn0lrEhfeiHtvGwaDaquNJ9bTeW8cdLb0pbCWd1o5nzpyiPydZv7GPqnEeJgVhgdtK1NcMZomKqVVlxI9bFU8rrimrI6SdlW6kIfY4DGPAZMDNaGUbOBb5YmmcwGvqHaJPjJ8-UlfpyvwP9zvNJOgm9bASQefxGCiQ4hEfYYwA3G9_hu_Pf_3K7FDp1tH2ANcdmPoUugDTORG2pmm_Y35TORihecls9Hj6ke</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>1628529018</pqid></control><display><type>article</type><title>RNA-Seq Accurately Identifies Cancer Biomarker Signatures to Distinguish Tissue of Origin</title><source>MEDLINE</source><source>DOAJ Directory of Open Access Journals</source><source>EZB-FREE-00999 freely available EZB journals</source><source>PubMed Central</source><source>Alma/SFX Local Collection</source><creator>Wei, Iris H ; Shi, Yang ; Jiang, Hui ; Kumar-Sinha, Chandan ; Chinnaiyan, Arul M</creator><creatorcontrib>Wei, Iris H ; Shi, Yang ; Jiang, Hui ; Kumar-Sinha, Chandan ; Chinnaiyan, Arul M</creatorcontrib><description>Abstract Metastatic cancer of unknown primary (CUP) accounts for up to 5% of all new cancer cases, with a 5-year survival rate of only 10%. Accurate identification of tissue of origin would allow for directed, personalized therapies to improve clinical outcomes. 
Our objective was to use transcriptome sequencing (RNA-Seq) to identify lineage-specific biomarker signatures for the cancer types that most commonly metastasize as CUP (colorectum, kidney, liver, lung, ovary, pancreas, prostate, and stomach). RNA-Seq data of 17,471 transcripts from a total of 3,244 cancer samples across 26 different tissue types were compiled from in-house sequencing data and publically available International Cancer Genome Consortium and The Cancer Genome Atlas datasets. Robust cancer biomarker signatures were extracted using a 10-fold cross-validation method of log transformation, quantile normalization, transcript ranking by area under the receiver operating characteristic curve, and stepwise logistic regression. The entire algorithm was then repeated with a new set of randomly generated training and test sets, yielding highly concordant biomarker signatures. External validation of the cancer-specific signatures yielded high sensitivity (92.0% ± 3.15%; mean ± standard deviation) and specificity (97.7% ± 2.99%) for each cancer biomarker signature. The overall performance of this RNA-Seq biomarker-generating algorithm yielded an accuracy of 90.5%. 
In conclusion, we demonstrate a computational model for producing highly sensitive and specific cancer biomarker signatures from RNA-Seq data, generating signatures for the top eight cancer types responsible for CUP to accurately identify tumor origin.</description><identifier>ISSN: 1476-5586</identifier><identifier>ISSN: 1522-8002</identifier><identifier>EISSN: 1476-5586</identifier><identifier>DOI: 10.1016/j.neo.2014.09.007</identifier><identifier>PMID: 25425966</identifier><language>eng</language><publisher>United States: Elsevier Inc</publisher><subject>Algorithms ; Biomarkers, Tumor - genetics ; Cell Line, Tumor ; Female ; Gene Expression Profiling - methods ; Gene Expression Regulation, Neoplastic ; Humans ; Logistic Models ; Male ; Models, Genetic ; Neoplasms, Unknown Primary - genetics ; Neoplasms, Unknown Primary - pathology ; Oncology ; Reproducibility of Results ; Sequence Analysis, RNA - methods</subject><ispartof>Neoplasia (New York, N.Y.), 2014-11, Vol.16 (11), p.918-927</ispartof><rights>Neoplasia Press, Inc.</rights><rights>2014 Neoplasia Press, Inc.</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c484t-d123d67646945e10d54a4177d46ab3ff9e78592ba906bfc6da3a34ca9f91cd3c3</citedby><cites>FETCH-LOGICAL-c484t-d123d67646945e10d54a4177d46ab3ff9e78592ba906bfc6da3a34ca9f91cd3c3</cites><orcidid>0000-0003-2718-9811 ; 0000-0001-7046-2743</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><link.rule.ids>314,776,780,860,27902,27903</link.rule.ids><backlink>$$Uhttps://www.ncbi.nlm.nih.gov/pubmed/25425966$$D View this record in MEDLINE/PubMed$$Hfree_for_read</backlink></links><search><creatorcontrib>Wei, Iris H</creatorcontrib><creatorcontrib>Shi, Yang</creatorcontrib><creatorcontrib>Jiang, Hui</creatorcontrib><creatorcontrib>Kumar-Sinha, 
Chandan</creatorcontrib><creatorcontrib>Chinnaiyan, Arul M</creatorcontrib><title>RNA-Seq Accurately Identifies Cancer Biomarker Signatures to Distinguish Tissue of Origin</title><title>Neoplasia (New York, N.Y.)</title><addtitle>Neoplasia</addtitle><description>Abstract Metastatic cancer of unknown primary (CUP) accounts for up to 5% of all new cancer cases, with a 5-year survival rate of only 10%. Accurate identification of tissue of origin would allow for directed, personalized therapies to improve clinical outcomes. Our objective was to use transcriptome sequencing (RNA-Seq) to identify lineage-specific biomarker signatures for the cancer types that most commonly metastasize as CUP (colorectum, kidney, liver, lung, ovary, pancreas, prostate, and stomach). RNA-Seq data of 17,471 transcripts from a total of 3,244 cancer samples across 26 different tissue types were compiled from in-house sequencing data and publically available International Cancer Genome Consortium and The Cancer Genome Atlas datasets. Robust cancer biomarker signatures were extracted using a 10-fold cross-validation method of log transformation, quantile normalization, transcript ranking by area under the receiver operating characteristic curve, and stepwise logistic regression. The entire algorithm was then repeated with a new set of randomly generated training and test sets, yielding highly concordant biomarker signatures. External validation of the cancer-specific signatures yielded high sensitivity (92.0% ± 3.15%; mean ± standard deviation) and specificity (97.7% ± 2.99%) for each cancer biomarker signature. The overall performance of this RNA-Seq biomarker-generating algorithm yielded an accuracy of 90.5%. 
In conclusion, we demonstrate a computational model for producing highly sensitive and specific cancer biomarker signatures from RNA-Seq data, generating signatures for the top eight cancer types responsible for CUP to accurately identify tumor origin.</description><subject>Algorithms</subject><subject>Biomarkers, Tumor - genetics</subject><subject>Cell Line, Tumor</subject><subject>Female</subject><subject>Gene Expression Profiling - methods</subject><subject>Gene Expression Regulation, Neoplastic</subject><subject>Humans</subject><subject>Logistic Models</subject><subject>Male</subject><subject>Models, Genetic</subject><subject>Neoplasms, Unknown Primary - genetics</subject><subject>Neoplasms, Unknown Primary - pathology</subject><subject>Oncology</subject><subject>Reproducibility of Results</subject><subject>Sequence Analysis, RNA - methods</subject><issn>1476-5586</issn><issn>1522-8002</issn><issn>1476-5586</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2014</creationdate><recordtype>article</recordtype><sourceid>EIF</sourceid><recordid>eNqFkU9r3DAQxUVoyP8P0EvxsRc7kixLFoXCdtukgZBANjnkJLTSeDsbr51IdmG_fbRsUkIPyWkG5r3H8HuEfGa0YJTJ02XRQV9wykRBdUGp2iEHTCiZV1UtP73Z98lhjEuaPEypPbLPK8ErLeUBub-5muQzeMomzo3BDtCuswsP3YANQsymtnMQsh_Yr2x4SNsMF50dxpBuQ5_9xDhgtxgx_sluMcYRsr7JrgMusDsmu41tI5y8zCNyd_brdvo7v7w-v5hOLnMnajHknvHSSyWF1KICRn0lrEhfeiHtvGwaDaquNJ9bTeW8cdLb0pbCWd1o5nzpyiPydZv7GPqnEeJgVhgdtK1NcMZomKqVVlxI9bFU8rrimrI6SdlW6kIfY4DGPAZMDNaGUbOBb5YmmcwGvqHaJPjJ8-UlfpyvwP9zvNJOgm9bASQefxGCiQ4hEfYYwA3G9_hu_Pf_3K7FDp1tH2ANcdmPoUugDTORG2pmm_Y35TORihecls9Hj6ke</recordid><startdate>20141101</startdate><enddate>20141101</enddate><creator>Wei, Iris H</creator><creator>Shi, Yang</creator><creator>Jiang, Hui</creator><creator>Kumar-Sinha, Chandan</creator><creator>Chinnaiyan, Arul M</creator><general>Elsevier 
Inc</general><scope>6I.</scope><scope>AAFTH</scope><scope>CGR</scope><scope>CUY</scope><scope>CVF</scope><scope>ECM</scope><scope>EIF</scope><scope>NPM</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>7X8</scope><scope>7TM</scope><scope>7TO</scope><scope>H94</scope><orcidid>https://orcid.org/0000-0003-2718-9811</orcidid><orcidid>https://orcid.org/0000-0001-7046-2743</orcidid></search><sort><creationdate>20141101</creationdate><title>RNA-Seq Accurately Identifies Cancer Biomarker Signatures to Distinguish Tissue of Origin</title><author>Wei, Iris H ; Shi, Yang ; Jiang, Hui ; Kumar-Sinha, Chandan ; Chinnaiyan, Arul M</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c484t-d123d67646945e10d54a4177d46ab3ff9e78592ba906bfc6da3a34ca9f91cd3c3</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2014</creationdate><topic>Algorithms</topic><topic>Biomarkers, Tumor - genetics</topic><topic>Cell Line, Tumor</topic><topic>Female</topic><topic>Gene Expression Profiling - methods</topic><topic>Gene Expression Regulation, Neoplastic</topic><topic>Humans</topic><topic>Logistic Models</topic><topic>Male</topic><topic>Models, Genetic</topic><topic>Neoplasms, Unknown Primary - genetics</topic><topic>Neoplasms, Unknown Primary - pathology</topic><topic>Oncology</topic><topic>Reproducibility of Results</topic><topic>Sequence Analysis, RNA - methods</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Wei, Iris H</creatorcontrib><creatorcontrib>Shi, Yang</creatorcontrib><creatorcontrib>Jiang, Hui</creatorcontrib><creatorcontrib>Kumar-Sinha, Chandan</creatorcontrib><creatorcontrib>Chinnaiyan, Arul M</creatorcontrib><collection>ScienceDirect Open Access Titles</collection><collection>Elsevier:ScienceDirect:Open Access</collection><collection>Medline</collection><collection>MEDLINE</collection><collection>MEDLINE 
(Ovid)</collection><collection>MEDLINE</collection><collection>MEDLINE</collection><collection>PubMed</collection><collection>CrossRef</collection><collection>MEDLINE - Academic</collection><collection>Nucleic Acids Abstracts</collection><collection>Oncogenes and Growth Factors Abstracts</collection><collection>AIDS and Cancer Research Abstracts</collection><jtitle>Neoplasia (New York, N.Y.)</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Wei, Iris H</au><au>Shi, Yang</au><au>Jiang, Hui</au><au>Kumar-Sinha, Chandan</au><au>Chinnaiyan, Arul M</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>RNA-Seq Accurately Identifies Cancer Biomarker Signatures to Distinguish Tissue of Origin</atitle><jtitle>Neoplasia (New York, N.Y.)</jtitle><addtitle>Neoplasia</addtitle><date>2014-11-01</date><risdate>2014</risdate><volume>16</volume><issue>11</issue><spage>918</spage><epage>927</epage><pages>918-927</pages><issn>1476-5586</issn><issn>1522-8002</issn><eissn>1476-5586</eissn><abstract>Abstract Metastatic cancer of unknown primary (CUP) accounts for up to 5% of all new cancer cases, with a 5-year survival rate of only 10%. Accurate identification of tissue of origin would allow for directed, personalized therapies to improve clinical outcomes. Our objective was to use transcriptome sequencing (RNA-Seq) to identify lineage-specific biomarker signatures for the cancer types that most commonly metastasize as CUP (colorectum, kidney, liver, lung, ovary, pancreas, prostate, and stomach). RNA-Seq data of 17,471 transcripts from a total of 3,244 cancer samples across 26 different tissue types were compiled from in-house sequencing data and publically available International Cancer Genome Consortium and The Cancer Genome Atlas datasets. 
Robust cancer biomarker signatures were extracted using a 10-fold cross-validation method of log transformation, quantile normalization, transcript ranking by area under the receiver operating characteristic curve, and stepwise logistic regression. The entire algorithm was then repeated with a new set of randomly generated training and test sets, yielding highly concordant biomarker signatures. External validation of the cancer-specific signatures yielded high sensitivity (92.0% ± 3.15%; mean ± standard deviation) and specificity (97.7% ± 2.99%) for each cancer biomarker signature. The overall performance of this RNA-Seq biomarker-generating algorithm yielded an accuracy of 90.5%. In conclusion, we demonstrate a computational model for producing highly sensitive and specific cancer biomarker signatures from RNA-Seq data, generating signatures for the top eight cancer types responsible for CUP to accurately identify tumor origin.</abstract><cop>United States</cop><pub>Elsevier Inc</pub><pmid>25425966</pmid><doi>10.1016/j.neo.2014.09.007</doi><tpages>10</tpages><orcidid>https://orcid.org/0000-0003-2718-9811</orcidid><orcidid>https://orcid.org/0000-0001-7046-2743</orcidid><oa>free_for_read</oa></addata></record> |
fulltext | fulltext |
identifier | ISSN: 1476-5586 |
ispartof | Neoplasia (New York, N.Y.), 2014-11, Vol.16 (11), p.918-927 |
issn | 1476-5586 1522-8002 1476-5586 |
language | eng |
recordid | cdi_proquest_miscellaneous_1787972467 |
source | MEDLINE; DOAJ Directory of Open Access Journals; EZB-FREE-00999 freely available EZB journals; PubMed Central; Alma/SFX Local Collection |
subjects | Algorithms; Biomarkers, Tumor - genetics; Cell Line, Tumor; Female; Gene Expression Profiling - methods; Gene Expression Regulation, Neoplastic; Humans; Logistic Models; Male; Models, Genetic; Neoplasms, Unknown Primary - genetics; Neoplasms, Unknown Primary - pathology; Oncology; Reproducibility of Results; Sequence Analysis, RNA - methods |
title | RNA-Seq Accurately Identifies Cancer Biomarker Signatures to Distinguish Tissue of Origin |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-27T09%3A29%3A20IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=RNA-Seq%20Accurately%20Identifies%20Cancer%20Biomarker%20Signatures%20to%20Distinguish%20Tissue%20of%20Origin&rft.jtitle=Neoplasia%20(New%20York,%20N.Y.)&rft.au=Wei,%20Iris%20H&rft.date=2014-11-01&rft.volume=16&rft.issue=11&rft.spage=918&rft.epage=927&rft.pages=918-927&rft.issn=1476-5586&rft.eissn=1476-5586&rft_id=info:doi/10.1016/j.neo.2014.09.007&rft_dat=%3Cproquest_cross%3E1628529018%3C/proquest_cross%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=1628529018&rft_id=info:pmid/25425966&rft_els_id=S1476558614001420&rfr_iscdi=true |
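
The abstract in the description field names three preprocessing and ranking steps: log transformation, quantile normalization, and transcript ranking by area under the ROC curve. The following is a minimal sketch of those steps only, not the authors' implementation. The function names, the `log2(x + 1)` pseudocount, the samples-by-transcripts matrix layout, and the tie handling are all assumptions for illustration; the stepwise logistic regression and the 10-fold cross-validation loop are left out.

```python
import numpy as np


def log_transform(expr):
    """Return log2(x + 1) of a (samples x transcripts) matrix.

    The +1 pseudocount is an assumption; the abstract does not give the exact transform.
    """
    return np.log2(np.asarray(expr, dtype=float) + 1.0)


def quantile_normalize(expr):
    """Force every sample (row) onto one shared distribution.

    The shared distribution is the mean of the row-wise sorted values.
    Ties within a row are broken arbitrarily by argsort (a simplification).
    """
    expr = np.asarray(expr, dtype=float)
    order = np.argsort(expr, axis=1)           # positions that would sort each row
    ranks = np.argsort(order, axis=1)          # rank of every value within its row
    reference = np.sort(expr, axis=1).mean(axis=0)
    return reference[ranks]


def auc(scores, labels):
    """Area under the ROC curve via the Mann-Whitney U statistic; ties count half."""
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=bool)
    pos, neg = scores[labels], scores[~labels]
    greater = (pos[:, None] > neg[None, :]).sum()
    ties = (pos[:, None] == neg[None, :]).sum()
    return (greater + 0.5 * ties) / (pos.size * neg.size)


def rank_transcripts(expr, labels):
    """Return transcript indices sorted by descending AUC, plus the AUC values."""
    expr = np.asarray(expr, dtype=float)
    aucs = np.array([auc(expr[:, j], labels) for j in range(expr.shape[1])])
    return np.argsort(-aucs, kind="stable"), aucs


# Hypothetical toy data: 4 samples x 3 transcripts, first two samples are the target cancer type.
toy_expr = np.array([[10, 0, 5], [8, 1, 6], [1, 9, 5], [0, 7, 4]])
toy_labels = np.array([1, 1, 0, 0])
normalized = quantile_normalize(log_transform(toy_expr))
order, aucs = rank_transcripts(normalized, toy_labels)
```

In a full pipeline along the lines of the abstract, the top-ranked transcripts would then be passed to a stepwise logistic regression inside each cross-validation fold, with the ranking computed on the training portion only to avoid leaking test labels.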