Integrative analysis of RNA expression data unveils distinct cancer types through machine learning techniques
Cancer is a highly complex and heterogeneous disease. Traditional methods of cancer classification based on histopathology have limitations in guiding personalized prognosis and therapy. Gene expression profiling provides a powerful approach to unraveling molecular intricacies and better-stratifying...
Published in: | Saudi journal of biological sciences 2024-03, Vol.31 (3), p.103918, Article 103918 |
---|---|
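Main authors: | Alanazi, Saad Awadh; Alshammari, Nasser; Alruwaili, Maddalah; Junaid, Kashaf; Abid, Muhammad Rizwan; Ahmad, Fahad |
Format: | Article |
Language: | eng |
Subjects: | Classification; Clustering; Gene; Machine learning; Original; RNA expression, cancer; Tumour, diagnosis |
Online access: | Full text |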
container_end_page | |
---|---|
container_issue | 3 |
container_start_page | 103918 |
container_title | Saudi journal of biological sciences |
container_volume | 31 |
creator | Alanazi, Saad Awadh; Alshammari, Nasser; Alruwaili, Maddalah; Junaid, Kashaf; Abid, Muhammad Rizwan; Ahmad, Fahad |
description | Cancer is a highly complex and heterogeneous disease. Traditional methods of cancer classification based on histopathology have limitations in guiding personalized prognosis and therapy. Gene expression profiling provides a powerful approach to unraveling molecular intricacies and better stratifying cancer subtypes. In this study, we performed an integrative analysis of RNA sequencing data from five cancer types: BRCA, KIRC, COAD, LUAD, and PRAD. A machine learning workflow consisting of dataset identification, normalization, feature selection, dimensionality reduction, clustering, and classification was implemented. The k-means algorithm was applied to categorize samples into distinct clusters based solely on gene expression patterns. Five unique clusters emerged from the unsupervised machine-learning-based analysis, significantly correlating with the known cancer types. BRCA aligned predominantly with one cluster, while COAD spanned three clusters. KIRC was represented within two main clusters. LUAD was strongly associated with a single cluster and PRAD with another. This demonstrates the ability of machine learning approaches to unravel complex signatures within transcriptomic profiles that can delineate cancer subtypes. The study highlights the potential of integrative analytics to derive meaningful biological insights from high-dimensional omics datasets. Molecular subtyping through machine learning clustering enhances our understanding of the intrinsic heterogeneities and dysregulated pathways in different cancers. Overall, this study exemplifies a powerful computational framework to classify the gene expression profiles of patients with different types of cancer and to guide personalized therapeutic decisions. Finally, the Wide Neural Network model achieved an accuracy of 99.834% on the validation set and 99.995% on the test set. |
doi_str_mv | 10.1016/j.sjbs.2023.103918 |
format | Article |
fullrecord | <record><control><sourceid>proquest_pubme</sourceid><recordid>TN_cdi_pubmedcentral_primary_oai_pubmedcentral_nih_gov_10821588</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><els_id>S1319562X23003637</els_id><sourcerecordid>2919744238</sourcerecordid><originalsourceid>FETCH-LOGICAL-c3228-5f294c29143078c4458e9eca3c99e4afa3956f1325e7d02db6276edf0656939e3</originalsourceid><addsrcrecordid>eNp9kU9r3DAQxUVpaDZpv0APRcdevNUfW5agUEJom0BoIKTQm9DK47UWW95q5KX77WuzaWgvPQ3MvHnzmB8hbzlbc8bVh90adxtcCybk3JCG6xdkJQSXRc2ZeklWXHJTVEr8OCcXiDvGlJaavyLnUgst61qsyHAbM2yTy-EA1EXXHzEgHVv68O2Kwq99AsQwRtq47OgUDxB6pE3AHKLP1LvoIdF83APS3KVx2nZ0cL4LEWgPLsUQtzSD72L4OQG-Jmet6xHePNVL8v3L58frm-Lu_uvt9dVd4aUQuqhaYUovDC8lq7Uvy0qDAe-kNwZK1zppKtVyKSqoGyaajRK1gqZlqlJGGpCX5NPJdz9tBmg8xJxcb_cpDC4d7eiC_XcSQ2e348FypgWvtJ4d3j85pHFJnu0Q0EPfuwjjhHYOZ-qyFHKRipPUpxExQft8hzO7gLI7u4CyCyh7AjUvvfs74fPKHzKz4ONJAPOfDgGSRR9gfncTEvhsmzH8z_8323OnPg</addsrcrecordid><sourcetype>Open Access Repository</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2919744238</pqid></control><display><type>article</type><title>Integrative analysis of RNA expression data unveils distinct cancer types through machine learning techniques</title><source>Elsevier ScienceDirect Journals</source><source>Elektronische Zeitschriftenbibliothek - Frei zugängliche E-Journals</source><source>PubMed Central</source><creator>Alanazi, Saad Awadh ; Alshammari, Nasser ; Alruwaili, Maddalah ; Junaid, Kashaf ; Abid, Muhammad Rizwan ; Ahmad, Fahad</creator><creatorcontrib>Alanazi, Saad Awadh ; Alshammari, Nasser ; Alruwaili, Maddalah ; Junaid, Kashaf ; Abid, Muhammad Rizwan ; Ahmad, Fahad</creatorcontrib><description>Cancer is a highly complex and heterogeneous disease. Traditional methods of cancer classification based on histopathology have limitations in guiding personalized prognosis and therapy. 
Gene expression profiling provides a powerful approach to unraveling molecular intricacies and better-stratifying cancer subtypes. In this study, we performed an integrative analysis of RNA sequencing data from five cancer types - BRCA, KIRC, COAD, LUAD, and PRAD. A machine learning workflow consisting of dataset identification, normalization, feature selection, dimensionality reduction, clustering, and classification was implemented. The k-means algorithm was applied to categorize samples into distinct clusters based solely on gene expression patterns. Five unique clusters emerged from the unsupervised machine learning based analysis, significantly correlating with the known cancer types. BRCA aligned predominantly with one cluster, while COAD spanned three clusters. KIRC was represented within two main clusters. LUAD is associated strongly with a single cluster and PRAD with another cluster. This demonstrates the ability of machine learning approaches to unravel complex signatures within transcriptomic profiles that can delineate cancer subtypes. The proposed study highlights the potential of integrative analytics to derive meaningful biological insights from high-dimensional omics datasets. Molecular subtyping through machine learning clustering enhances our understanding of the intrinsic heterogeneities and pathways dysregulated in different cancers. Overall, this study exemplifies a powerful computational framework to classify gene expressions of patients having different types of cancers and guide personalized therapeutic decisions. 
Finally, Wide Neural Network demonstrates a significantly higher accuracy, achieving 99.834% on the validation set and an even more impressive 99.995% on the test set.</description><identifier>ISSN: 1319-562X</identifier><identifier>EISSN: 2213-7106</identifier><identifier>DOI: 10.1016/j.sjbs.2023.103918</identifier><identifier>PMID: 38283772</identifier><language>eng</language><publisher>Saudi Arabia: Elsevier B.V</publisher><subject>Classification ; Clustering ; Gene ; Machine learning ; Original ; RNA expression, cancer ; Tumour, diagnosis</subject><ispartof>Saudi journal of biological sciences, 2024-03, Vol.31 (3), p.103918, Article 103918</ispartof><rights>2023</rights><rights>2023 Published by Elsevier B.V. on behalf of King Saud University.</rights><rights>2023 Published by Elsevier B.V. on behalf of King Saud University. 2023</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><cites>FETCH-LOGICAL-c3228-5f294c29143078c4458e9eca3c99e4afa3956f1325e7d02db6276edf0656939e3</cites></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktopdf>$$Uhttps://www.ncbi.nlm.nih.gov/pmc/articles/PMC10821588/pdf/$$EPDF$$P50$$Gpubmedcentral$$Hfree_for_read</linktopdf><linktohtml>$$Uhttps://www.sciencedirect.com/science/article/pii/S1319562X23003637$$EHTML$$P50$$Gelsevier$$Hfree_for_read</linktohtml><link.rule.ids>230,314,723,776,780,881,3536,27903,27904,53769,53771,65309</link.rule.ids><backlink>$$Uhttps://www.ncbi.nlm.nih.gov/pubmed/38283772$$D View this record in MEDLINE/PubMed$$Hfree_for_read</backlink></links><search><creatorcontrib>Alanazi, Saad Awadh</creatorcontrib><creatorcontrib>Alshammari, Nasser</creatorcontrib><creatorcontrib>Alruwaili, Maddalah</creatorcontrib><creatorcontrib>Junaid, Kashaf</creatorcontrib><creatorcontrib>Abid, Muhammad Rizwan</creatorcontrib><creatorcontrib>Ahmad, 
Fahad</creatorcontrib><title>Integrative analysis of RNA expression data unveils distinct cancer types through machine learning techniques</title><title>Saudi journal of biological sciences</title><addtitle>Saudi J Biol Sci</addtitle><description>Cancer is a highly complex and heterogeneous disease. Traditional methods of cancer classification based on histopathology have limitations in guiding personalized prognosis and therapy. Gene expression profiling provides a powerful approach to unraveling molecular intricacies and better-stratifying cancer subtypes. In this study, we performed an integrative analysis of RNA sequencing data from five cancer types - BRCA, KIRC, COAD, LUAD, and PRAD. A machine learning workflow consisting of dataset identification, normalization, feature selection, dimensionality reduction, clustering, and classification was implemented. The k-means algorithm was applied to categorize samples into distinct clusters based solely on gene expression patterns. Five unique clusters emerged from the unsupervised machine learning based analysis, significantly correlating with the known cancer types. BRCA aligned predominantly with one cluster, while COAD spanned three clusters. KIRC was represented within two main clusters. LUAD is associated strongly with a single cluster and PRAD with another cluster. This demonstrates the ability of machine learning approaches to unravel complex signatures within transcriptomic profiles that can delineate cancer subtypes. The proposed study highlights the potential of integrative analytics to derive meaningful biological insights from high-dimensional omics datasets. Molecular subtyping through machine learning clustering enhances our understanding of the intrinsic heterogeneities and pathways dysregulated in different cancers. Overall, this study exemplifies a powerful computational framework to classify gene expressions of patients having different types of cancers and guide personalized therapeutic decisions. 
Finally, Wide Neural Network demonstrates a significantly higher accuracy, achieving 99.834% on the validation set and an even more impressive 99.995% on the test set.</description><subject>Classification</subject><subject>Clustering</subject><subject>Gene</subject><subject>Machine learning</subject><subject>Original</subject><subject>RNA expression, cancer</subject><subject>Tumour, diagnosis</subject><issn>1319-562X</issn><issn>2213-7106</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2024</creationdate><recordtype>article</recordtype><recordid>eNp9kU9r3DAQxUVpaDZpv0APRcdevNUfW5agUEJom0BoIKTQm9DK47UWW95q5KX77WuzaWgvPQ3MvHnzmB8hbzlbc8bVh90adxtcCybk3JCG6xdkJQSXRc2ZeklWXHJTVEr8OCcXiDvGlJaavyLnUgst61qsyHAbM2yTy-EA1EXXHzEgHVv68O2Kwq99AsQwRtq47OgUDxB6pE3AHKLP1LvoIdF83APS3KVx2nZ0cL4LEWgPLsUQtzSD72L4OQG-Jmet6xHePNVL8v3L58frm-Lu_uvt9dVd4aUQuqhaYUovDC8lq7Uvy0qDAe-kNwZK1zppKtVyKSqoGyaajRK1gqZlqlJGGpCX5NPJdz9tBmg8xJxcb_cpDC4d7eiC_XcSQ2e348FypgWvtJ4d3j85pHFJnu0Q0EPfuwjjhHYOZ-qyFHKRipPUpxExQft8hzO7gLI7u4CyCyh7AjUvvfs74fPKHzKz4ONJAPOfDgGSRR9gfncTEvhsmzH8z_8323OnPg</recordid><startdate>20240301</startdate><enddate>20240301</enddate><creator>Alanazi, Saad Awadh</creator><creator>Alshammari, Nasser</creator><creator>Alruwaili, Maddalah</creator><creator>Junaid, Kashaf</creator><creator>Abid, Muhammad Rizwan</creator><creator>Ahmad, Fahad</creator><general>Elsevier B.V</general><general>Elsevier</general><scope>6I.</scope><scope>AAFTH</scope><scope>NPM</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>7X8</scope><scope>5PM</scope></search><sort><creationdate>20240301</creationdate><title>Integrative analysis of RNA expression data unveils distinct cancer types through machine learning techniques</title><author>Alanazi, Saad Awadh ; Alshammari, Nasser ; Alruwaili, Maddalah ; Junaid, Kashaf ; Abid, Muhammad Rizwan ; Ahmad, 
Fahad</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c3228-5f294c29143078c4458e9eca3c99e4afa3956f1325e7d02db6276edf0656939e3</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2024</creationdate><topic>Classification</topic><topic>Clustering</topic><topic>Gene</topic><topic>Machine learning</topic><topic>Original</topic><topic>RNA expression, cancer</topic><topic>Tumour, diagnosis</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Alanazi, Saad Awadh</creatorcontrib><creatorcontrib>Alshammari, Nasser</creatorcontrib><creatorcontrib>Alruwaili, Maddalah</creatorcontrib><creatorcontrib>Junaid, Kashaf</creatorcontrib><creatorcontrib>Abid, Muhammad Rizwan</creatorcontrib><creatorcontrib>Ahmad, Fahad</creatorcontrib><collection>ScienceDirect Open Access Titles</collection><collection>Elsevier:ScienceDirect:Open Access</collection><collection>PubMed</collection><collection>CrossRef</collection><collection>MEDLINE - Academic</collection><collection>PubMed Central (Full Participant titles)</collection><jtitle>Saudi journal of biological sciences</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Alanazi, Saad Awadh</au><au>Alshammari, Nasser</au><au>Alruwaili, Maddalah</au><au>Junaid, Kashaf</au><au>Abid, Muhammad Rizwan</au><au>Ahmad, Fahad</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Integrative analysis of RNA expression data unveils distinct cancer types through machine learning techniques</atitle><jtitle>Saudi journal of biological sciences</jtitle><addtitle>Saudi J Biol Sci</addtitle><date>2024-03-01</date><risdate>2024</risdate><volume>31</volume><issue>3</issue><spage>103918</spage><pages>103918-</pages><artnum>103918</artnum><issn>1319-562X</issn><eissn>2213-7106</eissn><abstract>Cancer is a highly complex and heterogeneous 
disease. Traditional methods of cancer classification based on histopathology have limitations in guiding personalized prognosis and therapy. Gene expression profiling provides a powerful approach to unraveling molecular intricacies and better-stratifying cancer subtypes. In this study, we performed an integrative analysis of RNA sequencing data from five cancer types - BRCA, KIRC, COAD, LUAD, and PRAD. A machine learning workflow consisting of dataset identification, normalization, feature selection, dimensionality reduction, clustering, and classification was implemented. The k-means algorithm was applied to categorize samples into distinct clusters based solely on gene expression patterns. Five unique clusters emerged from the unsupervised machine learning based analysis, significantly correlating with the known cancer types. BRCA aligned predominantly with one cluster, while COAD spanned three clusters. KIRC was represented within two main clusters. LUAD is associated strongly with a single cluster and PRAD with another cluster. This demonstrates the ability of machine learning approaches to unravel complex signatures within transcriptomic profiles that can delineate cancer subtypes. The proposed study highlights the potential of integrative analytics to derive meaningful biological insights from high-dimensional omics datasets. Molecular subtyping through machine learning clustering enhances our understanding of the intrinsic heterogeneities and pathways dysregulated in different cancers. Overall, this study exemplifies a powerful computational framework to classify gene expressions of patients having different types of cancers and guide personalized therapeutic decisions. 
Finally, Wide Neural Network demonstrates a significantly higher accuracy, achieving 99.834% on the validation set and an even more impressive 99.995% on the test set.</abstract><cop>Saudi Arabia</cop><pub>Elsevier B.V</pub><pmid>38283772</pmid><doi>10.1016/j.sjbs.2023.103918</doi><oa>free_for_read</oa></addata></record> |
fulltext | fulltext |
identifier | ISSN: 1319-562X |
ispartof | Saudi journal of biological sciences, 2024-03, Vol.31 (3), p.103918, Article 103918 |
issn | 1319-562X; 2213-7106 |
language | eng |
recordid | cdi_pubmedcentral_primary_oai_pubmedcentral_nih_gov_10821588 |
source | Elsevier ScienceDirect Journals; Elektronische Zeitschriftenbibliothek - Frei zugängliche E-Journals; PubMed Central |
subjects | Classification; Clustering; Gene; Machine learning; Original; RNA expression, cancer; Tumour, diagnosis |
title | Integrative analysis of RNA expression data unveils distinct cancer types through machine learning techniques |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-24T04%3A14%3A01IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_pubme&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Integrative%20analysis%20of%20RNA%20expression%20data%20unveils%20distinct%20cancer%20types%20through%20machine%20learning%20techniques&rft.jtitle=Saudi%20journal%20of%20biological%20sciences&rft.au=Alanazi,%20Saad%20Awadh&rft.date=2024-03-01&rft.volume=31&rft.issue=3&rft.spage=103918&rft.pages=103918-&rft.artnum=103918&rft.issn=1319-562X&rft.eissn=2213-7106&rft_id=info:doi/10.1016/j.sjbs.2023.103918&rft_dat=%3Cproquest_pubme%3E2919744238%3C/proquest_pubme%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2919744238&rft_id=info:pmid/38283772&rft_els_id=S1319562X23003637&rfr_iscdi=true |
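
The description above outlines a workflow of normalization, k-means clustering, and comparison of the resulting clusters with known cancer types. The following is a minimal sketch of that idea only, not the authors' implementation: the data are synthetic two-dimensional points rather than RNA expression values, the three labels are borrowed purely as names, and the helpers `zscore`, `dist2`, and `kmeans` (including the deterministic farthest-first initialization) are hypothetical choices made for this illustration.

```python
import math
import random
from collections import Counter


def zscore(rows):
    """Standardize each column to zero mean and unit variance."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    stds = [math.sqrt(sum((v - m) ** 2 for v in c) / len(c)) or 1.0
            for c, m in zip(cols, means)]
    return [[(v - m) / s for v, m, s in zip(r, means, stds)] for r in rows]


def dist2(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(rows, k, iters=50):
    """Plain k-means; returns one cluster index per row."""
    # Assumption of this sketch: deterministic farthest-first initialization.
    centers = [rows[0]]
    while len(centers) < k:
        centers.append(max(rows, key=lambda r: min(dist2(r, c) for c in centers)))
    assign = None
    for _ in range(iters):
        new_assign = [min(range(k), key=lambda j: dist2(r, centers[j])) for r in rows]
        if new_assign == assign:  # assignments stable: converged
            break
        assign = new_assign
        for j in range(k):
            members = [r for r, a in zip(rows, assign) if a == j]
            if members:  # keep the old center if a cluster is empty
                centers[j] = [sum(c) / len(members) for c in zip(*members)]
    return assign


# Synthetic, well-separated toy data (NOT real expression data).
rng = random.Random(0)
blob_centers = {"BRCA": (0.0, 0.0), "KIRC": (10.0, 10.0), "LUAD": (-10.0, 10.0)}
labels, samples = [], []
for name, (cx, cy) in blob_centers.items():
    for _ in range(20):
        labels.append(name)
        samples.append([cx + rng.uniform(-1, 1), cy + rng.uniform(-1, 1)])

clusters = kmeans(zscore(samples), k=3)

# Cross-tabulate known labels against cluster indices.
table = Counter(zip(labels, clusters))
for (label, cluster), n in sorted(table.items()):
    print(f"{label} -> cluster {cluster}: {n} samples")
```

The cross-tabulation at the end is the kind of label-versus-cluster comparison the abstract refers to when it says one cancer type aligned with a single cluster while another spanned several. On real high-dimensional expression data, feature selection and dimensionality reduction would normally precede the clustering step.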