Integrative analysis of RNA expression data unveils distinct cancer types through machine learning techniques

Cancer is a highly complex and heterogeneous disease. Traditional methods of cancer classification based on histopathology have limitations in guiding personalized prognosis and therapy. Gene expression profiling provides a powerful approach to unraveling molecular intricacies and better-stratifying...

Bibliographic details
Published in: Saudi journal of biological sciences 2024-03, Vol.31 (3), p.103918, Article 103918
Main authors: Alanazi, Saad Awadh, Alshammari, Nasser, Alruwaili, Maddalah, Junaid, Kashaf, Abid, Muhammad Rizwan, Ahmad, Fahad
Format: Article
Language: eng
Subjects:
Online access: Full text
container_end_page
container_issue 3
container_start_page 103918
container_title Saudi journal of biological sciences
container_volume 31
creator Alanazi, Saad Awadh
Alshammari, Nasser
Alruwaili, Maddalah
Junaid, Kashaf
Abid, Muhammad Rizwan
Ahmad, Fahad
description Cancer is a highly complex and heterogeneous disease. Traditional methods of cancer classification based on histopathology have limitations in guiding personalized prognosis and therapy. Gene expression profiling provides a powerful approach to unraveling molecular intricacies and better stratifying cancer subtypes. In this study, we performed an integrative analysis of RNA sequencing data from five cancer types: BRCA, KIRC, COAD, LUAD, and PRAD. A machine learning workflow consisting of dataset identification, normalization, feature selection, dimensionality reduction, clustering, and classification was implemented. The k-means algorithm was applied to categorize samples into distinct clusters based solely on gene expression patterns. Five unique clusters emerged from the unsupervised machine-learning-based analysis, significantly correlating with the known cancer types. BRCA aligned predominantly with one cluster, while COAD spanned three clusters. KIRC was represented within two main clusters. LUAD was associated strongly with a single cluster and PRAD with another cluster. This demonstrates the ability of machine learning approaches to unravel complex signatures within transcriptomic profiles that can delineate cancer subtypes. The proposed study highlights the potential of integrative analytics to derive meaningful biological insights from high-dimensional omics datasets. Molecular subtyping through machine learning clustering enhances our understanding of the intrinsic heterogeneities and pathways dysregulated in different cancers. Overall, this study exemplifies a powerful computational framework to classify the gene expression profiles of patients with different cancer types and guide personalized therapeutic decisions. Finally, the Wide Neural Network classifier achieved an accuracy of 99.834% on the validation set and 99.995% on the test set.
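The normalization and clustering steps named in the abstract can be illustrated with a minimal sketch. This is not the authors' pipeline: the data are synthetic two-feature samples, the group names (`T1`, `T2`, `T3`) are hypothetical, and the z-score helper and the basic Lloyd's k-means with farthest-point initialization are assumptions chosen for illustration. The last step cross-tabulates cluster labels against the known group labels, which is the kind of comparison the abstract reports for the five cancer types.

```python
import random
from collections import Counter


def zscore(rows):
    """Standardize each column to zero mean and unit variance."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    # Fall back to 1.0 when a column is constant, to avoid dividing by zero.
    stds = [(sum((x - m) ** 2 for x in c) / len(c)) ** 0.5 or 1.0
            for c, m in zip(cols, means)]
    return [[(x - m) / s for x, m, s in zip(r, means, stds)] for r in rows]


def sqdist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(rows, k, iters=50):
    """Lloyd's algorithm with deterministic farthest-point initialization."""
    centers = [rows[0]]
    while len(centers) < k:
        # Pick the point farthest from its nearest already-chosen center.
        centers.append(max(rows, key=lambda r: min(sqdist(r, c) for c in centers)))
    labels = []
    for _ in range(iters):
        # Assignment step: nearest center for every sample.
        labels = [min(range(k), key=lambda j: sqdist(r, centers[j])) for r in rows]
        # Update step: each center moves to the mean of its members.
        new_centers = []
        for j in range(k):
            members = [r for r, lab in zip(rows, labels) if lab == j]
            new_centers.append(
                [sum(c) / len(c) for c in zip(*members)] if members else centers[j]
            )
        if new_centers == centers:
            break
        centers = new_centers
    return labels


# Synthetic stand-in for expression data: three hypothetical groups, two features.
rng = random.Random(0)
group_means = {"T1": (0.0, 0.0), "T2": (10.0, 0.0), "T3": (0.0, 10.0)}
samples, truth = [], []
for name, (mx, my) in group_means.items():
    for _ in range(20):
        samples.append([rng.gauss(mx, 0.5), rng.gauss(my, 0.5)])
        truth.append(name)

labels = kmeans(zscore(samples), k=3)

# Cross-tabulate known group versus assigned cluster.
for (name, cluster), count in sorted(Counter(zip(truth, labels)).items()):
    print(name, "cluster", cluster, "n =", count)
```

With groups this well separated, each one lands in a single cluster. Real expression data would typically need the feature selection and dimensionality reduction steps the abstract mentions before clustering, and a known type may still spread over several clusters.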
doi_str_mv 10.1016/j.sjbs.2023.103918
format Article
fullrecord <record><control><sourceid>proquest_pubme</sourceid><recordid>TN_cdi_pubmedcentral_primary_oai_pubmedcentral_nih_gov_10821588</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><els_id>S1319562X23003637</els_id><sourcerecordid>2919744238</sourcerecordid><originalsourceid>FETCH-LOGICAL-c3228-5f294c29143078c4458e9eca3c99e4afa3956f1325e7d02db6276edf0656939e3</originalsourceid><addsrcrecordid>eNp9kU9r3DAQxUVpaDZpv0APRcdevNUfW5agUEJom0BoIKTQm9DK47UWW95q5KX77WuzaWgvPQ3MvHnzmB8hbzlbc8bVh90adxtcCybk3JCG6xdkJQSXRc2ZeklWXHJTVEr8OCcXiDvGlJaavyLnUgst61qsyHAbM2yTy-EA1EXXHzEgHVv68O2Kwq99AsQwRtq47OgUDxB6pE3AHKLP1LvoIdF83APS3KVx2nZ0cL4LEWgPLsUQtzSD72L4OQG-Jmet6xHePNVL8v3L58frm-Lu_uvt9dVd4aUQuqhaYUovDC8lq7Uvy0qDAe-kNwZK1zppKtVyKSqoGyaajRK1gqZlqlJGGpCX5NPJdz9tBmg8xJxcb_cpDC4d7eiC_XcSQ2e348FypgWvtJ4d3j85pHFJnu0Q0EPfuwjjhHYOZ-qyFHKRipPUpxExQft8hzO7gLI7u4CyCyh7AjUvvfs74fPKHzKz4ONJAPOfDgGSRR9gfncTEvhsmzH8z_8323OnPg</addsrcrecordid><sourcetype>Open Access Repository</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2919744238</pqid></control><display><type>article</type><title>Integrative analysis of RNA expression data unveils distinct cancer types through machine learning techniques</title><source>Elsevier ScienceDirect Journals</source><source>Elektronische Zeitschriftenbibliothek - Frei zugängliche E-Journals</source><source>PubMed Central</source><creator>Alanazi, Saad Awadh ; Alshammari, Nasser ; Alruwaili, Maddalah ; Junaid, Kashaf ; Abid, Muhammad Rizwan ; Ahmad, Fahad</creator><creatorcontrib>Alanazi, Saad Awadh ; Alshammari, Nasser ; Alruwaili, Maddalah ; Junaid, Kashaf ; Abid, Muhammad Rizwan ; Ahmad, Fahad</creatorcontrib><description>Cancer is a highly complex and heterogeneous disease. Traditional methods of cancer classification based on histopathology have limitations in guiding personalized prognosis and therapy. 
Gene expression profiling provides a powerful approach to unraveling molecular intricacies and better-stratifying cancer subtypes. In this study, we performed an integrative analysis of RNA sequencing data from five cancer types - BRCA, KIRC, COAD, LUAD, and PRAD. A machine learning workflow consisting of dataset identification, normalization, feature selection, dimensionality reduction, clustering, and classification was implemented. The k-means algorithm was applied to categorize samples into distinct clusters based solely on gene expression patterns. Five unique clusters emerged from the unsupervised machine learning based analysis, significantly correlating with the known cancer types. BRCA aligned predominantly with one cluster, while COAD spanned three clusters. KIRC was represented within two main clusters. LUAD is associated strongly with a single cluster and PRAD with another cluster. This demonstrates the ability of machine learning approaches to unravel complex signatures within transcriptomic profiles that can delineate cancer subtypes. The proposed study highlights the potential of integrative analytics to derive meaningful biological insights from high-dimensional omics datasets. Molecular subtyping through machine learning clustering enhances our understanding of the intrinsic heterogeneities and pathways dysregulated in different cancers. Overall, this study exemplifies a powerful computational framework to classify gene expressions of patients having different types of cancers and guide personalized therapeutic decisions. 
Finally, Wide Neural Network demonstrates a significantly higher accuracy, achieving 99.834% on the validation set and an even more impressive 99.995% on the test set.</description><identifier>ISSN: 1319-562X</identifier><identifier>EISSN: 2213-7106</identifier><identifier>DOI: 10.1016/j.sjbs.2023.103918</identifier><identifier>PMID: 38283772</identifier><language>eng</language><publisher>Saudi Arabia: Elsevier B.V</publisher><subject>Classification ; Clustering ; Gene ; Machine learning ; Original ; RNA expression, cancer ; Tumour, diagnosis</subject><ispartof>Saudi journal of biological sciences, 2024-03, Vol.31 (3), p.103918, Article 103918</ispartof><rights>2023</rights><rights>2023 Published by Elsevier B.V. on behalf of King Saud University.</rights><rights>2023 Published by Elsevier B.V. on behalf of King Saud University. 2023</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><cites>FETCH-LOGICAL-c3228-5f294c29143078c4458e9eca3c99e4afa3956f1325e7d02db6276edf0656939e3</cites></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktopdf>$$Uhttps://www.ncbi.nlm.nih.gov/pmc/articles/PMC10821588/pdf/$$EPDF$$P50$$Gpubmedcentral$$Hfree_for_read</linktopdf><linktohtml>$$Uhttps://www.sciencedirect.com/science/article/pii/S1319562X23003637$$EHTML$$P50$$Gelsevier$$Hfree_for_read</linktohtml><link.rule.ids>230,314,723,776,780,881,3536,27903,27904,53769,53771,65309</link.rule.ids><backlink>$$Uhttps://www.ncbi.nlm.nih.gov/pubmed/38283772$$D View this record in MEDLINE/PubMed$$Hfree_for_read</backlink></links><search><creatorcontrib>Alanazi, Saad Awadh</creatorcontrib><creatorcontrib>Alshammari, Nasser</creatorcontrib><creatorcontrib>Alruwaili, Maddalah</creatorcontrib><creatorcontrib>Junaid, Kashaf</creatorcontrib><creatorcontrib>Abid, Muhammad Rizwan</creatorcontrib><creatorcontrib>Ahmad, 
Fahad</creatorcontrib><title>Integrative analysis of RNA expression data unveils distinct cancer types through machine learning techniques</title><title>Saudi journal of biological sciences</title><addtitle>Saudi J Biol Sci</addtitle><description>Cancer is a highly complex and heterogeneous disease. Traditional methods of cancer classification based on histopathology have limitations in guiding personalized prognosis and therapy. Gene expression profiling provides a powerful approach to unraveling molecular intricacies and better-stratifying cancer subtypes. In this study, we performed an integrative analysis of RNA sequencing data from five cancer types - BRCA, KIRC, COAD, LUAD, and PRAD. A machine learning workflow consisting of dataset identification, normalization, feature selection, dimensionality reduction, clustering, and classification was implemented. The k-means algorithm was applied to categorize samples into distinct clusters based solely on gene expression patterns. Five unique clusters emerged from the unsupervised machine learning based analysis, significantly correlating with the known cancer types. BRCA aligned predominantly with one cluster, while COAD spanned three clusters. KIRC was represented within two main clusters. LUAD is associated strongly with a single cluster and PRAD with another cluster. This demonstrates the ability of machine learning approaches to unravel complex signatures within transcriptomic profiles that can delineate cancer subtypes. The proposed study highlights the potential of integrative analytics to derive meaningful biological insights from high-dimensional omics datasets. Molecular subtyping through machine learning clustering enhances our understanding of the intrinsic heterogeneities and pathways dysregulated in different cancers. Overall, this study exemplifies a powerful computational framework to classify gene expressions of patients having different types of cancers and guide personalized therapeutic decisions. 
Finally, Wide Neural Network demonstrates a significantly higher accuracy, achieving 99.834% on the validation set and an even more impressive 99.995% on the test set.</description><subject>Classification</subject><subject>Clustering</subject><subject>Gene</subject><subject>Machine learning</subject><subject>Original</subject><subject>RNA expression, cancer</subject><subject>Tumour, diagnosis</subject><issn>1319-562X</issn><issn>2213-7106</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2024</creationdate><recordtype>article</recordtype><recordid>eNp9kU9r3DAQxUVpaDZpv0APRcdevNUfW5agUEJom0BoIKTQm9DK47UWW95q5KX77WuzaWgvPQ3MvHnzmB8hbzlbc8bVh90adxtcCybk3JCG6xdkJQSXRc2ZeklWXHJTVEr8OCcXiDvGlJaavyLnUgst61qsyHAbM2yTy-EA1EXXHzEgHVv68O2Kwq99AsQwRtq47OgUDxB6pE3AHKLP1LvoIdF83APS3KVx2nZ0cL4LEWgPLsUQtzSD72L4OQG-Jmet6xHePNVL8v3L58frm-Lu_uvt9dVd4aUQuqhaYUovDC8lq7Uvy0qDAe-kNwZK1zppKtVyKSqoGyaajRK1gqZlqlJGGpCX5NPJdz9tBmg8xJxcb_cpDC4d7eiC_XcSQ2e348FypgWvtJ4d3j85pHFJnu0Q0EPfuwjjhHYOZ-qyFHKRipPUpxExQft8hzO7gLI7u4CyCyh7AjUvvfs74fPKHzKz4ONJAPOfDgGSRR9gfncTEvhsmzH8z_8323OnPg</recordid><startdate>20240301</startdate><enddate>20240301</enddate><creator>Alanazi, Saad Awadh</creator><creator>Alshammari, Nasser</creator><creator>Alruwaili, Maddalah</creator><creator>Junaid, Kashaf</creator><creator>Abid, Muhammad Rizwan</creator><creator>Ahmad, Fahad</creator><general>Elsevier B.V</general><general>Elsevier</general><scope>6I.</scope><scope>AAFTH</scope><scope>NPM</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>7X8</scope><scope>5PM</scope></search><sort><creationdate>20240301</creationdate><title>Integrative analysis of RNA expression data unveils distinct cancer types through machine learning techniques</title><author>Alanazi, Saad Awadh ; Alshammari, Nasser ; Alruwaili, Maddalah ; Junaid, Kashaf ; Abid, Muhammad Rizwan ; Ahmad, 
Fahad</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c3228-5f294c29143078c4458e9eca3c99e4afa3956f1325e7d02db6276edf0656939e3</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2024</creationdate><topic>Classification</topic><topic>Clustering</topic><topic>Gene</topic><topic>Machine learning</topic><topic>Original</topic><topic>RNA expression, cancer</topic><topic>Tumour, diagnosis</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Alanazi, Saad Awadh</creatorcontrib><creatorcontrib>Alshammari, Nasser</creatorcontrib><creatorcontrib>Alruwaili, Maddalah</creatorcontrib><creatorcontrib>Junaid, Kashaf</creatorcontrib><creatorcontrib>Abid, Muhammad Rizwan</creatorcontrib><creatorcontrib>Ahmad, Fahad</creatorcontrib><collection>ScienceDirect Open Access Titles</collection><collection>Elsevier:ScienceDirect:Open Access</collection><collection>PubMed</collection><collection>CrossRef</collection><collection>MEDLINE - Academic</collection><collection>PubMed Central (Full Participant titles)</collection><jtitle>Saudi journal of biological sciences</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Alanazi, Saad Awadh</au><au>Alshammari, Nasser</au><au>Alruwaili, Maddalah</au><au>Junaid, Kashaf</au><au>Abid, Muhammad Rizwan</au><au>Ahmad, Fahad</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Integrative analysis of RNA expression data unveils distinct cancer types through machine learning techniques</atitle><jtitle>Saudi journal of biological sciences</jtitle><addtitle>Saudi J Biol Sci</addtitle><date>2024-03-01</date><risdate>2024</risdate><volume>31</volume><issue>3</issue><spage>103918</spage><pages>103918-</pages><artnum>103918</artnum><issn>1319-562X</issn><eissn>2213-7106</eissn><abstract>Cancer is a highly complex and heterogeneous 
disease. Traditional methods of cancer classification based on histopathology have limitations in guiding personalized prognosis and therapy. Gene expression profiling provides a powerful approach to unraveling molecular intricacies and better-stratifying cancer subtypes. In this study, we performed an integrative analysis of RNA sequencing data from five cancer types - BRCA, KIRC, COAD, LUAD, and PRAD. A machine learning workflow consisting of dataset identification, normalization, feature selection, dimensionality reduction, clustering, and classification was implemented. The k-means algorithm was applied to categorize samples into distinct clusters based solely on gene expression patterns. Five unique clusters emerged from the unsupervised machine learning based analysis, significantly correlating with the known cancer types. BRCA aligned predominantly with one cluster, while COAD spanned three clusters. KIRC was represented within two main clusters. LUAD is associated strongly with a single cluster and PRAD with another cluster. This demonstrates the ability of machine learning approaches to unravel complex signatures within transcriptomic profiles that can delineate cancer subtypes. The proposed study highlights the potential of integrative analytics to derive meaningful biological insights from high-dimensional omics datasets. Molecular subtyping through machine learning clustering enhances our understanding of the intrinsic heterogeneities and pathways dysregulated in different cancers. Overall, this study exemplifies a powerful computational framework to classify gene expressions of patients having different types of cancers and guide personalized therapeutic decisions. 
Finally, Wide Neural Network demonstrates a significantly higher accuracy, achieving 99.834% on the validation set and an even more impressive 99.995% on the test set.</abstract><cop>Saudi Arabia</cop><pub>Elsevier B.V</pub><pmid>38283772</pmid><doi>10.1016/j.sjbs.2023.103918</doi><oa>free_for_read</oa></addata></record>
fulltext fulltext
identifier ISSN: 1319-562X
ispartof Saudi journal of biological sciences, 2024-03, Vol.31 (3), p.103918, Article 103918
issn 1319-562X
2213-7106
language eng
recordid cdi_pubmedcentral_primary_oai_pubmedcentral_nih_gov_10821588
source Elsevier ScienceDirect Journals; Elektronische Zeitschriftenbibliothek - Frei zugängliche E-Journals; PubMed Central
subjects Classification
Clustering
Gene
Machine learning
Original
RNA expression, cancer
Tumour, diagnosis
title Integrative analysis of RNA expression data unveils distinct cancer types through machine learning techniques
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-24T04%3A14%3A01IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_pubme&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Integrative%20analysis%20of%20RNA%20expression%20data%20unveils%20distinct%20cancer%20types%20through%20machine%20learning%20techniques&rft.jtitle=Saudi%20journal%20of%20biological%20sciences&rft.au=Alanazi,%20Saad%20Awadh&rft.date=2024-03-01&rft.volume=31&rft.issue=3&rft.spage=103918&rft.pages=103918-&rft.artnum=103918&rft.issn=1319-562X&rft.eissn=2213-7106&rft_id=info:doi/10.1016/j.sjbs.2023.103918&rft_dat=%3Cproquest_pubme%3E2919744238%3C/proquest_pubme%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2919744238&rft_id=info:pmid/38283772&rft_els_id=S1319562X23003637&rfr_iscdi=true
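The link above is an OpenURL whose query string carries the citation as key/value pairs (`rft.jtitle`, `rft.volume`, `rft_id`, and so on). As a rough sketch of how such a link could be unpacked, the snippet below uses Python's standard `urllib.parse`. The helper name `openurl_fields` and the shortened example URL are assumptions made for illustration; the example URL reuses only values that appear in this record.

```python
from urllib.parse import urlsplit, parse_qs


def openurl_fields(url):
    """Return the rft.* citation fields of an OpenURL, plus the DOI if present."""
    query = parse_qs(urlsplit(url).query)  # percent-decoding is handled here
    # Keep the first value of every "rft."-prefixed key, dropping the prefix.
    fields = {key[4:]: values[0]
              for key, values in query.items() if key.startswith("rft.")}
    # rft_id may repeat; pick out the one that carries a DOI.
    for ident in query.get("rft_id", []):
        if ident.startswith("info:doi/"):
            fields["doi"] = ident[len("info:doi/"):]
    return fields


# Shortened, hypothetical variant of the resolver link in this record.
example = (
    "https://sfx.bib-bvb.de/sfx_tum?url_ver=Z39.88-2004"
    "&rft.genre=article"
    "&rft.jtitle=Saudi%20journal%20of%20biological%20sciences"
    "&rft.volume=31&rft.issue=3&rft.spage=103918"
    "&rft.issn=1319-562X"
    "&rft_id=info:doi/10.1016/j.sjbs.2023.103918"
    "&rft_id=info:pmid/38283772"
)

fields = openurl_fields(example)
for key in sorted(fields):
    print(key, "=", fields[key])
```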