Identifying cancer sub-types from genomic scale data sets using confidence based integration (CBI)

• Disease subtyping involves extracting finer differences between samples.
• The extraction of differences is a consensual process.
• At every transit phase, decisions are taken based on the confidence of each participating feature.
• This research accommodates outliers in data sets by a...

Bibliographic Details
Published in:	Journal of biomedical informatics 2022-02, Vol.126, p.103997-103997, Article 103997
Main Authors:	Sreekumar, R. ; Khursheed, Farida
Format:	Article
Language:	eng
Subjects:	Cluster Analysis ; Genome ; Genomics - methods ; Humans ; MicroRNAs - genetics ; MicroRNAs - metabolism ; Neoplasms - genetics
Online Access:	Full text

container_end_page	103997
container_issue
container_start_page	103997
container_title	Journal of biomedical informatics
container_volume	126
creator	Sreekumar, R. ; Khursheed, Farida
description	• Disease subtyping involves extracting finer differences between samples. • The extraction of differences is a consensual process. • At every transit phase, decisions are taken based on the confidence of each participating feature. • This research accommodates outliers in data sets by a smooth transition process. • A measure of self-confidence is devised based on the assumption that “close neighbors have common neighbors”. Precision medicine involves refined diagnosis of patients and a search for causes that go unseen in patient cohorts whose members otherwise have largely similar health conditions. As technology has evolved to extract features from a wide variety of sources, including genetics, a large quantity of data has become available to researchers for conducting micro studies of diseases and cures. In cancer research, integrative methods using genomic data sets have become a major area of interest. The petabytes of data available at The Cancer Genome Atlas (TCGA), a program run jointly by the NCI and the National Human Genome Research Institute, have made more nuanced research in cancer genomics possible. Our method, Confidence Based Integration (CBI), is an integration method that extracts similar as well as complementary information from genomic data sets. This information provides insight into the status of patients and their prospects. We used gene expression, miRNA expression and DNA methylation data sets in our fusion experiments on five different cancer types. After fusion, these data sets are clustered using the spectral clustering algorithm, which derives the clusters that form the disease subtypes. The survival properties of each subtype demonstrate why the samples within it can be considered highly similar. We report that CBI performs better, in terms of the P-value of the log-rank test, than other methods such as similarity network fusion (SNF) in forming significant clusters. Individual features clustered extremely poorly compared to CBI in most of the experiments.
doi_str_mv	10.1016/j.jbi.2022.103997
format	Article
fullrecord	<record><control><sourceid>proquest_cross</sourceid><recordid>TN_cdi_proquest_miscellaneous_2622277819</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><els_id>S1532046422000132</els_id><sourcerecordid>2622277819</sourcerecordid><originalsourceid>FETCH-LOGICAL-c348t-c5cddf6c6696152699d1d3ee8aa853af3f07a965771c7c1cbbae6a33cf1c7dce3</originalsourceid><addsrcrecordid>eNp9kMtOwzAQRS0E4v0BbJCXsEjxo3YSsYKKRyUkNrC2nPG4ctUkxXaQ-vekFFiymhnp3CvNIeSCswlnXN8sJ8smTAQTYrxlXZd75JgrKQo2rdj-366nR-QkpSVjnCulD8mRVExxzatj0swddjn4TegWFGwHGGkamiJv1pioj31LF9j1bQCawK6QOpstTZgTHdJ3pu98GDsAaWMTOhq6jItoc-g7ejW7n1-fkQNvVwnPf-YpeX98eJs9Fy-vT_PZ3UsBclrlAhQ45zVoXWuuhK5rx51ErKytlLReelbaWquy5FACh6axqK2U4MfbAcpTcrXrXcf-Y8CUTRsS4GplO-yHZIQWQpRlxesR5TsUYp9SRG_WMbQ2bgxnZqvWLM2o1mzVmp3aMXP5Uz80Lbq_xK_LEbjdATg--RkwmgRhK8aFiJCN68M_9V_tEIpN</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2622277819</pqid></control><display><type>article</type><title>Identifying cancer sub-types from genomic scale data sets using confidence based integration (CBI)</title><source>MEDLINE</source><source>Elsevier ScienceDirect Journals</source><source>Elektronische Zeitschriftenbibliothek - Frei zugängliche E-Journals</source><creator>Sreekumar, R. ; Khursheed, Farida</creator><creatorcontrib>Sreekumar, R. ; Khursheed, Farida</creatorcontrib><description>[Display omitted] •Disease subtyping involves extracting finer differences between samples.•The extraction of differences is a consensual process.•At every transit phase, decisions are taken based on the confidence of each participating feature.•This research accommodates outliers in data sets by a smooth transition process.•A measure of self confidence is devised based on an assumption that “close neighbors have common neighbors”. 
Precision medicine is a method involving refined diagnosis of patients and searching for causes that are unseen in their patient cohorts who otherwise have largely similar health conditions. As the technology evolved to extract features from a wide variety of sources including genetics, a large quantum of data is available to the researchers for conducting micro studies in the field of disease and cures. In cancer research, integrative methods using genomic data sets has become a major area of interest. The petabytes of data that is available at The Cancer Genome Atlas (TCGA), a program jointly under NCI and National Human Genome Research Institute, has made possible more nuanced research in cancer genomics. Our method, Confidence Based Integration (CBI) is an integration method to extract similar as well as complementing information from the genomic data sets. This information will provide insight into the status of patients and their prospects. We used the expression data sets of gene, miRNA and DNA methylation in our fusion experiments on five different cancer types. These data sets, after fusion, are clustered using 'Spectral Clustering' algorithm, which derives clusters that form the disease sub types. Survival properties of each sub type demonstrates the reasons to consider the samples inside them highly similar. The performance of CBI, we report, is better, in terms of P-value in log-rank test, than other methods like similarity network fusion or SNF in forming clusters of significance. 
Individual features clustered extremely poor compared to CBI in most of the experiments.</description><identifier>ISSN: 1532-0464</identifier><identifier>EISSN: 1532-0480</identifier><identifier>DOI: 10.1016/j.jbi.2022.103997</identifier><identifier>PMID: 35051618</identifier><language>eng</language><publisher>United States: Elsevier Inc</publisher><subject>Cluster Analysis ; Genome ; Genomics - methods ; Humans ; MicroRNAs - genetics ; MicroRNAs - metabolism ; Neoplasms - genetics</subject><ispartof>Journal of biomedical informatics, 2022-02, Vol.126, p.103997-103997, Article 103997</ispartof><rights>2022 Elsevier Inc.</rights><rights>Copyright © 2022 Elsevier Inc. All rights reserved.</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><cites>FETCH-LOGICAL-c348t-c5cddf6c6696152699d1d3ee8aa853af3f07a965771c7c1cbbae6a33cf1c7dce3</cites></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://dx.doi.org/10.1016/j.jbi.2022.103997$$EHTML$$P50$$Gelsevier$$Hfree_for_read</linktohtml><link.rule.ids>314,777,781,3537,27905,27906,45976</link.rule.ids><backlink>$$Uhttps://www.ncbi.nlm.nih.gov/pubmed/35051618$$D View this record in MEDLINE/PubMed$$Hfree_for_read</backlink></links><search><creatorcontrib>Sreekumar, R.</creatorcontrib><creatorcontrib>Khursheed, Farida</creatorcontrib><title>Identifying cancer sub-types from genomic scale data sets using confidence based integration (CBI)</title><title>Journal of biomedical informatics</title><addtitle>J Biomed Inform</addtitle><description>[Display omitted] •Disease subtyping involves extracting finer differences between samples.•The extraction of differences is a consensual process.•At every transit phase, decisions are taken based on the confidence of each participating feature.•This research accommodates outliers in data sets by a 
smooth transition process.•A measure of self confidence is devised based on an assumption that “close neighbors have common neighbors”. Precision medicine is a method involving refined diagnosis of patients and searching for causes that are unseen in their patient cohorts who otherwise have largely similar health conditions. As the technology evolved to extract features from a wide variety of sources including genetics, a large quantum of data is available to the researchers for conducting micro studies in the field of disease and cures. In cancer research, integrative methods using genomic data sets has become a major area of interest. The petabytes of data that is available at The Cancer Genome Atlas (TCGA), a program jointly under NCI and National Human Genome Research Institute, has made possible more nuanced research in cancer genomics. Our method, Confidence Based Integration (CBI) is an integration method to extract similar as well as complementing information from the genomic data sets. This information will provide insight into the status of patients and their prospects. We used the expression data sets of gene, miRNA and DNA methylation in our fusion experiments on five different cancer types. These data sets, after fusion, are clustered using 'Spectral Clustering' algorithm, which derives clusters that form the disease sub types. Survival properties of each sub type demonstrates the reasons to consider the samples inside them highly similar. The performance of CBI, we report, is better, in terms of P-value in log-rank test, than other methods like similarity network fusion or SNF in forming clusters of significance. 
Individual features clustered extremely poor compared to CBI in most of the experiments.</description><subject>Cluster Analysis</subject><subject>Genome</subject><subject>Genomics - methods</subject><subject>Humans</subject><subject>MicroRNAs - genetics</subject><subject>MicroRNAs - metabolism</subject><subject>Neoplasms - genetics</subject><issn>1532-0464</issn><issn>1532-0480</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2022</creationdate><recordtype>article</recordtype><sourceid>EIF</sourceid><recordid>eNp9kMtOwzAQRS0E4v0BbJCXsEjxo3YSsYKKRyUkNrC2nPG4ctUkxXaQ-vekFFiymhnp3CvNIeSCswlnXN8sJ8smTAQTYrxlXZd75JgrKQo2rdj-366nR-QkpSVjnCulD8mRVExxzatj0swddjn4TegWFGwHGGkamiJv1pioj31LF9j1bQCawK6QOpstTZgTHdJ3pu98GDsAaWMTOhq6jItoc-g7ejW7n1-fkQNvVwnPf-YpeX98eJs9Fy-vT_PZ3UsBclrlAhQ45zVoXWuuhK5rx51ErKytlLReelbaWquy5FACh6axqK2U4MfbAcpTcrXrXcf-Y8CUTRsS4GplO-yHZIQWQpRlxesR5TsUYp9SRG_WMbQ2bgxnZqvWLM2o1mzVmp3aMXP5Uz80Lbq_xK_LEbjdATg--RkwmgRhK8aFiJCN68M_9V_tEIpN</recordid><startdate>202202</startdate><enddate>202202</enddate><creator>Sreekumar, R.</creator><creator>Khursheed, Farida</creator><general>Elsevier Inc</general><scope>6I.</scope><scope>AAFTH</scope><scope>CGR</scope><scope>CUY</scope><scope>CVF</scope><scope>ECM</scope><scope>EIF</scope><scope>NPM</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>7X8</scope></search><sort><creationdate>202202</creationdate><title>Identifying cancer sub-types from genomic scale data sets using confidence based integration (CBI)</title><author>Sreekumar, R. 
; Khursheed, Farida</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c348t-c5cddf6c6696152699d1d3ee8aa853af3f07a965771c7c1cbbae6a33cf1c7dce3</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2022</creationdate><topic>Cluster Analysis</topic><topic>Genome</topic><topic>Genomics - methods</topic><topic>Humans</topic><topic>MicroRNAs - genetics</topic><topic>MicroRNAs - metabolism</topic><topic>Neoplasms - genetics</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Sreekumar, R.</creatorcontrib><creatorcontrib>Khursheed, Farida</creatorcontrib><collection>ScienceDirect Open Access Titles</collection><collection>Elsevier:ScienceDirect:Open Access</collection><collection>Medline</collection><collection>MEDLINE</collection><collection>MEDLINE (Ovid)</collection><collection>MEDLINE</collection><collection>MEDLINE</collection><collection>PubMed</collection><collection>CrossRef</collection><collection>MEDLINE - Academic</collection><jtitle>Journal of biomedical informatics</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Sreekumar, R.</au><au>Khursheed, Farida</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Identifying cancer sub-types from genomic scale data sets using confidence based integration (CBI)</atitle><jtitle>Journal of biomedical informatics</jtitle><addtitle>J Biomed Inform</addtitle><date>2022-02</date><risdate>2022</risdate><volume>126</volume><spage>103997</spage><epage>103997</epage><pages>103997-103997</pages><artnum>103997</artnum><issn>1532-0464</issn><eissn>1532-0480</eissn><abstract>[Display omitted] •Disease subtyping involves extracting finer differences between samples.•The extraction of differences is a consensual process.•At every transit phase, decisions are taken based on the confidence of each 
participating feature.•This research accommodates outliers in data sets by a smooth transition process.•A measure of self confidence is devised based on an assumption that “close neighbors have common neighbors”. Precision medicine is a method involving refined diagnosis of patients and searching for causes that are unseen in their patient cohorts who otherwise have largely similar health conditions. As the technology evolved to extract features from a wide variety of sources including genetics, a large quantum of data is available to the researchers for conducting micro studies in the field of disease and cures. In cancer research, integrative methods using genomic data sets has become a major area of interest. The petabytes of data that is available at The Cancer Genome Atlas (TCGA), a program jointly under NCI and National Human Genome Research Institute, has made possible more nuanced research in cancer genomics. Our method, Confidence Based Integration (CBI) is an integration method to extract similar as well as complementing information from the genomic data sets. This information will provide insight into the status of patients and their prospects. We used the expression data sets of gene, miRNA and DNA methylation in our fusion experiments on five different cancer types. These data sets, after fusion, are clustered using 'Spectral Clustering' algorithm, which derives clusters that form the disease sub types. Survival properties of each sub type demonstrates the reasons to consider the samples inside them highly similar. The performance of CBI, we report, is better, in terms of P-value in log-rank test, than other methods like similarity network fusion or SNF in forming clusters of significance. Individual features clustered extremely poor compared to CBI in most of the experiments.</abstract><cop>United States</cop><pub>Elsevier Inc</pub><pmid>35051618</pmid><doi>10.1016/j.jbi.2022.103997</doi><tpages>1</tpages><oa>free_for_read</oa></addata></record>
fulltext	fulltext
identifier	ISSN: 1532-0464
ispartof	Journal of biomedical informatics, 2022-02, Vol.126, p.103997-103997, Article 103997
issn	1532-0464 1532-0480
language	eng
recordid	cdi_proquest_miscellaneous_2622277819
source	MEDLINE; Elsevier ScienceDirect Journals; Elektronische Zeitschriftenbibliothek - Frei zugängliche E-Journals
subjects	Cluster Analysis ; Genome ; Genomics - methods ; Humans ; MicroRNAs - genetics ; MicroRNAs - metabolism ; Neoplasms - genetics
title	Identifying cancer sub-types from genomic scale data sets using confidence based integration (CBI)
url	https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-21T08%3A24%3A21IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Identifying%20cancer%20sub-types%20from%20genomic%20scale%20data%20sets%20using%20confidence%20based%20integration%20(CBI)&rft.jtitle=Journal%20of%20biomedical%20informatics&rft.au=Sreekumar,%20R.&rft.date=2022-02&rft.volume=126&rft.spage=103997&rft.epage=103997&rft.pages=103997-103997&rft.artnum=103997&rft.issn=1532-0464&rft.eissn=1532-0480&rft_id=info:doi/10.1016/j.jbi.2022.103997&rft_dat=%3Cproquest_cross%3E2622277819%3C/proquest_cross%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2622277819&rft_id=info:pmid/35051618&rft_els_id=S1532046422000132&rfr_iscdi=true