Identifying cancer sub-types from genomic scale data sets using confidence based integration (CBI)

•Disease subtyping involves extracting finer differences between samples.
•The extraction of differences is a consensual process.
•At every transit phase, decisions are taken based on the confidence of each participating feature.
•This research accommodates outliers in data sets by a...

Detailed description

Bibliographic details
Published in: Journal of biomedical informatics 2022-02, Vol.126, p.103997-103997, Article 103997
Main authors: Sreekumar, R., Khursheed, Farida
Format: Article
Language: eng
Subjects:
Online access: Full text
container_end_page 103997
container_issue
container_start_page 103997
container_title Journal of biomedical informatics
container_volume 126
creator Sreekumar, R.
Khursheed, Farida
description •Disease subtyping involves extracting finer differences between samples.
•The extraction of differences is a consensual process.
•At every transit phase, decisions are taken based on the confidence of each participating feature.
•This research accommodates outliers in data sets by a smooth transition process.
•A measure of self-confidence is devised based on the assumption that “close neighbors have common neighbors”.
Precision medicine involves refined diagnosis of patients and a search for causes that go unseen in patient cohorts whose members otherwise have largely similar health conditions. As technology has evolved to extract features from a wide variety of sources, including genetics, a large quantity of data has become available to researchers for conducting micro studies in the field of disease and cures. In cancer research, integrative methods using genomic data sets have become a major area of interest. The petabytes of data available at The Cancer Genome Atlas (TCGA), a program run jointly by NCI and the National Human Genome Research Institute, have made more nuanced research in cancer genomics possible. Our method, Confidence Based Integration (CBI), is an integration method that extracts similar as well as complementary information from the genomic data sets. This information provides insight into the status of patients and their prospects. We used gene expression, miRNA expression and DNA methylation data sets in our fusion experiments on five different cancer types. After fusion, these data sets are clustered using the spectral clustering algorithm, which derives the clusters that form the disease subtypes. The survival properties of each subtype demonstrate the reasons for considering the samples inside it highly similar. We report that CBI performs better than other methods, such as similarity network fusion (SNF), at forming significant clusters, as measured by the P-value of the log-rank test. Individual features clustered extremely poorly compared to CBI in most of the experiments.
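The assumption that “close neighbors have common neighbors” can be illustrated with a small sketch. The record does not give the paper's actual confidence formula, so the score below is a hypothetical stand-in: for each sample it averages the Jaccard overlap between the sample's k-nearest-neighbor set and the neighbor sets of those neighbors. The function names `k_nearest` and `shared_neighbor_confidence`, the Euclidean distance, and the choice of k are all assumptions made for illustration.

```python
import math


def k_nearest(points, k):
    """For each point, return the set of indices of its k nearest neighbors (Euclidean)."""
    neighbors = []
    for i, p in enumerate(points):
        # Rank every other point by its distance to point i.
        order = sorted(
            (j for j in range(len(points)) if j != i),
            key=lambda j: math.dist(p, points[j]),
        )
        neighbors.append(set(order[:k]))
    return neighbors


def shared_neighbor_confidence(points, k):
    """Hypothetical score (not the paper's formula): mean Jaccard overlap between
    a sample's neighborhood and the neighborhoods of each of its neighbors."""
    nn = k_nearest(points, k)
    scores = []
    for i, mine in enumerate(nn):
        own = mine | {i}  # neighborhood of i, including i itself
        overlaps = []
        for j in mine:
            other = nn[j] | {j}  # neighborhood of neighbor j, including j
            overlaps.append(len(own & other) / len(own | other))
        scores.append(sum(overlaps) / len(overlaps))
    return scores


# Two tight groups of three samples and one isolated sample (index 6).
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10), (5, 30)]
scores = shared_neighbor_confidence(points, k=2)
print(scores)
```

In this toy input the six grouped samples share all of their neighbors and score 1.0, while the isolated sample scores 0.5 because the neighbors it picks do not pick it back. A score like this could be used to down-weight outliers, though how CBI actually uses its confidence measure is not stated in this record.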
doi_str_mv 10.1016/j.jbi.2022.103997
format Article
addtitle J Biomed Inform
eissn 1532-0480
pmid 35051618
publisher United States: Elsevier Inc
rights Copyright © 2022 Elsevier Inc. All rights reserved.
oa free_for_read
linktohtml https://dx.doi.org/10.1016/j.jbi.2022.103997
backlink https://www.ncbi.nlm.nih.gov/pubmed/35051618 (View this record in MEDLINE/PubMed)
fulltext fulltext
identifier ISSN: 1532-0464
ispartof Journal of biomedical informatics, 2022-02, Vol.126, p.103997-103997, Article 103997
issn 1532-0464
1532-0480
language eng
recordid cdi_proquest_miscellaneous_2622277819
source MEDLINE; Elsevier ScienceDirect Journals; Elektronische Zeitschriftenbibliothek - Frei zugängliche E-Journals
subjects Cluster Analysis
Genome
Genomics - methods
Humans
MicroRNAs - genetics
MicroRNAs - metabolism
Neoplasms - genetics
title Identifying cancer sub-types from genomic scale data sets using confidence based integration (CBI)
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-21T08%3A24%3A21IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Identifying%20cancer%20sub-types%20from%20genomic%20scale%20data%20sets%20using%20confidence%20based%20integration%20(CBI)&rft.jtitle=Journal%20of%20biomedical%20informatics&rft.au=Sreekumar,%20R.&rft.date=2022-02&rft.volume=126&rft.spage=103997&rft.epage=103997&rft.pages=103997-103997&rft.artnum=103997&rft.issn=1532-0464&rft.eissn=1532-0480&rft_id=info:doi/10.1016/j.jbi.2022.103997&rft_dat=%3Cproquest_cross%3E2622277819%3C/proquest_cross%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2622277819&rft_id=info:pmid/35051618&rft_els_id=S1532046422000132&rfr_iscdi=true