Genetic data visualization using literature text-based neural networks: Examples associated with myocardial infarction
Data visualization is critical to unraveling hidden information from complex and high-dimensional data. Interpretable visualization methods are critical, especially in the biology and medical fields; however, there are few effective visualization methods for large genetic data. Current visualiza...
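Published in: | Neural networks 2023-08, Vol.165, p.562-595 |
---|---|
Main authors: | Moon, Jihye ; Posada-Quintero, Hugo F. ; Chon, Ki H. |
Format: | Article |
Language: | eng |
Subjects: | Cardiovascular Disease risk prediction ; Cross-modal representation ; Data visualization ; Explainable Artificial Intelligence ; Natural language processing ; Unsupervised learning |
Online access: | Full text |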
container_end_page | 595 |
---|---|
container_issue | |
container_start_page | 562 |
container_title | Neural networks |
container_volume | 165 |
creator | Moon, Jihye ; Posada-Quintero, Hugo F. ; Chon, Ki H. |
description | Data visualization is critical to unraveling hidden information from complex and high-dimensional data. Interpretable visualization methods are critical, especially in the biology and medical fields; however, there are few effective visualization methods for large genetic data. Current visualization methods are limited to lower-dimensional data, and their performance suffers if there is missing data. In this study, we propose a literature-based visualization method to reduce high-dimensional data without compromising the dynamics of the single nucleotide polymorphisms (SNP) and textual interpretability. Our method is innovative because it is shown to (1) preserve both global and local structures of SNP while reducing the dimension of the data using literature text representations, and (2) enable interpretable visualizations using textual information. For performance evaluations, we used the proposed approach to classify several categories, including race, myocardial infarction event age groups, and sex, using several machine learning models on the literature-derived SNP data. We used visualization approaches to examine clustering of data as well as quantitative performance metrics for the classification of the risk factors examined above. Our method outperformed all popular dimensionality reduction and visualization methods for both classification and visualization, and it is robust against missing and higher-dimensional data. Moreover, we found it feasible to incorporate both genetic and other risk information obtained from literature with our method.
• Literature text-based neural networks enable robust genetic data structure estimation.
• Literature information enables textual interpretability on unsupervised visualization.
• Our literature-based methods are robust for high dimension and against noise.
• Our method may combine genetic risk and other disease risks obtained from literature. |
doi_str_mv | 10.1016/j.neunet.2023.05.015 |
format | Article |
fullrecord | <record><control><sourceid>proquest_cross</sourceid><recordid>TN_cdi_proquest_miscellaneous_2830220995</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><els_id>S0893608023002599</els_id><sourcerecordid>2830220995</sourcerecordid><originalsourceid>FETCH-LOGICAL-c357t-f129c94d1c205c9a9d12694b7cd2960571e057e42db7394c46fe724941c513db3</originalsourceid><addsrcrecordid>eNp9kMtOHDEQRa0oURggfxBFXmbTTfnRD7NAQghIJKRsYG257erEk34Mtnt4fD0eDbBkU7U5da_qEPKdQcmA1SfrcsJlwlRy4KKEqgRWfSIr1jaq4E3LP5MVtEoUNbRwQA5jXANA3UrxlRyIRtRS1mpFtteYM7ylziRDtz4uZvDPJvl5okv00186-ITBpCUgTfiYis5EdDR3BzPklR7m8D-e0stHM24GjNTEOFtvUoYefPpHx6fZmuB8pv3Um2B32cfkS2-GiN9e9xG5u7q8vfhV3Py5_n1xflNYUTWp6BlXVknHLIfKKqMc47WSXWMdVzVUDcM8UHLXNUJJK-seGy6VZLZiwnXiiPzc527CfL9gTHr00eIwmAnnJWreCuAclKoyKveoDXOMAXu9CX404Ukz0Dvjeq33xvXOuIZKZ-P57Mdrw9KN6N6P3hRn4GwPYP5z6zHoaD1OFp0PaJN2s_-44QVbDpZi</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2830220995</pqid></control><display><type>article</type><title>Genetic data visualization using literature text-based neural networks: Examples associated with myocardial infarction</title><source>Elsevier ScienceDirect Journals</source><creator>Moon, Jihye ; Posada-Quintero, Hugo F. ; Chon, Ki H.</creator><creatorcontrib>Moon, Jihye ; Posada-Quintero, Hugo F. ; Chon, Ki H.</creatorcontrib><description>Data visualization is critical to unraveling hidden information from complex and high-dimensional data. Interpretable visualization methods are critical, especially in the biology and medical fields, however, there are limited effective visualization methods for large genetic data. Current visualization methods are limited to lower-dimensional data and their performance suffers if there is missing data. 
In this study, we propose a literature-based visualization method to reduce high-dimensional data without compromising the dynamics of the single nucleotide polymorphisms (SNP) and textual interpretability. Our method is innovative because it is shown to (1) preserves both global and local structures of SNP while reducing the dimension of the data using literature text representations, and (2) enables interpretable visualizations using textual information. For performance evaluations, we examined the proposed approach to classify various classification categories including race, myocardial infarction event age groups, and sex using several machine learning models on the literature-derived SNP data. We used visualization approaches to examine clustering of data as well as quantitative performance metrics for the classification of the risk factors examined above. Our method outperformed all popular dimensionality reduction and visualization methods for both classification and visualization, and it is robust against missing and higher-dimensional data. Moreover, we found it feasible to incorporate both genetic and other risk information obtained from literature with our method.
•Literature text-based neural networks enable robust genetic data structure estimation.•Literature information enables textual interpretability on unsupervised visualization.•Our literature-based methods are robust for high dimension and against noise.•Our method may combine genetic risk and other disease risks obtained from literature.</description><identifier>ISSN: 0893-6080</identifier><identifier>EISSN: 1879-2782</identifier><identifier>DOI: 10.1016/j.neunet.2023.05.015</identifier><identifier>PMID: 37364469</identifier><language>eng</language><publisher>United States: Elsevier Ltd</publisher><subject>Cardiovascular Disease risk prediction ; Cross-modal representation ; Data visualization ; Explainable Artificial Intelligence ; Natural language processing ; Unsupervised learning</subject><ispartof>Neural networks, 2023-08, Vol.165, p.562-595</ispartof><rights>2023 The Author(s)</rights><rights>Copyright © 2023 The Author(s). Published by Elsevier Ltd.. All rights reserved.</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><cites>FETCH-LOGICAL-c357t-f129c94d1c205c9a9d12694b7cd2960571e057e42db7394c46fe724941c513db3</cites><orcidid>0000-0003-4514-4772 ; 0000-0001-5501-5953</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://www.sciencedirect.com/science/article/pii/S0893608023002599$$EHTML$$P50$$Gelsevier$$Hfree_for_read</linktohtml><link.rule.ids>314,776,780,3537,27901,27902,65306</link.rule.ids><backlink>$$Uhttps://www.ncbi.nlm.nih.gov/pubmed/37364469$$D View this record in MEDLINE/PubMed$$Hfree_for_read</backlink></links><search><creatorcontrib>Moon, Jihye</creatorcontrib><creatorcontrib>Posada-Quintero, Hugo F.</creatorcontrib><creatorcontrib>Chon, Ki H.</creatorcontrib><title>Genetic data visualization using literature text-based neural networks: 
Examples associated with myocardial infarction</title><title>Neural networks</title><addtitle>Neural Netw</addtitle><description>Data visualization is critical to unraveling hidden information from complex and high-dimensional data. Interpretable visualization methods are critical, especially in the biology and medical fields, however, there are limited effective visualization methods for large genetic data. Current visualization methods are limited to lower-dimensional data and their performance suffers if there is missing data. In this study, we propose a literature-based visualization method to reduce high-dimensional data without compromising the dynamics of the single nucleotide polymorphisms (SNP) and textual interpretability. Our method is innovative because it is shown to (1) preserves both global and local structures of SNP while reducing the dimension of the data using literature text representations, and (2) enables interpretable visualizations using textual information. For performance evaluations, we examined the proposed approach to classify various classification categories including race, myocardial infarction event age groups, and sex using several machine learning models on the literature-derived SNP data. We used visualization approaches to examine clustering of data as well as quantitative performance metrics for the classification of the risk factors examined above. Our method outperformed all popular dimensionality reduction and visualization methods for both classification and visualization, and it is robust against missing and higher-dimensional data. Moreover, we found it feasible to incorporate both genetic and other risk information obtained from literature with our method.
•Literature text-based neural networks enable robust genetic data structure estimation.•Literature information enables textual interpretability on unsupervised visualization.•Our literature-based methods are robust for high dimension and against noise.•Our method may combine genetic risk and other disease risks obtained from literature.</description><subject>Cardiovascular Disease risk prediction</subject><subject>Cross-modal representation</subject><subject>Data visualization</subject><subject>Explainable Artificial Intelligence</subject><subject>Natural language processing</subject><subject>Unsupervised learning</subject><issn>0893-6080</issn><issn>1879-2782</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2023</creationdate><recordtype>article</recordtype><recordid>eNp9kMtOHDEQRa0oURggfxBFXmbTTfnRD7NAQghIJKRsYG257erEk34Mtnt4fD0eDbBkU7U5da_qEPKdQcmA1SfrcsJlwlRy4KKEqgRWfSIr1jaq4E3LP5MVtEoUNbRwQA5jXANA3UrxlRyIRtRS1mpFtteYM7ylziRDtz4uZvDPJvl5okv00186-ITBpCUgTfiYis5EdDR3BzPklR7m8D-e0stHM24GjNTEOFtvUoYefPpHx6fZmuB8pv3Um2B32cfkS2-GiN9e9xG5u7q8vfhV3Py5_n1xflNYUTWp6BlXVknHLIfKKqMc47WSXWMdVzVUDcM8UHLXNUJJK-seGy6VZLZiwnXiiPzc527CfL9gTHr00eIwmAnnJWreCuAclKoyKveoDXOMAXu9CX404Ukz0Dvjeq33xvXOuIZKZ-P57Mdrw9KN6N6P3hRn4GwPYP5z6zHoaD1OFp0PaJN2s_-44QVbDpZi</recordid><startdate>202308</startdate><enddate>202308</enddate><creator>Moon, Jihye</creator><creator>Posada-Quintero, Hugo F.</creator><creator>Chon, Ki H.</creator><general>Elsevier Ltd</general><scope>6I.</scope><scope>AAFTH</scope><scope>NPM</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>7X8</scope><orcidid>https://orcid.org/0000-0003-4514-4772</orcidid><orcidid>https://orcid.org/0000-0001-5501-5953</orcidid></search><sort><creationdate>202308</creationdate><title>Genetic data visualization using literature text-based neural networks: Examples associated with myocardial infarction</title><author>Moon, Jihye ; Posada-Quintero, Hugo F. 
; Chon, Ki H.</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c357t-f129c94d1c205c9a9d12694b7cd2960571e057e42db7394c46fe724941c513db3</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2023</creationdate><topic>Cardiovascular Disease risk prediction</topic><topic>Cross-modal representation</topic><topic>Data visualization</topic><topic>Explainable Artificial Intelligence</topic><topic>Natural language processing</topic><topic>Unsupervised learning</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Moon, Jihye</creatorcontrib><creatorcontrib>Posada-Quintero, Hugo F.</creatorcontrib><creatorcontrib>Chon, Ki H.</creatorcontrib><collection>ScienceDirect Open Access Titles</collection><collection>Elsevier:ScienceDirect:Open Access</collection><collection>PubMed</collection><collection>CrossRef</collection><collection>MEDLINE - Academic</collection><jtitle>Neural networks</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Moon, Jihye</au><au>Posada-Quintero, Hugo F.</au><au>Chon, Ki H.</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Genetic data visualization using literature text-based neural networks: Examples associated with myocardial infarction</atitle><jtitle>Neural networks</jtitle><addtitle>Neural Netw</addtitle><date>2023-08</date><risdate>2023</risdate><volume>165</volume><spage>562</spage><epage>595</epage><pages>562-595</pages><issn>0893-6080</issn><eissn>1879-2782</eissn><abstract>Data visualization is critical to unraveling hidden information from complex and high-dimensional data. Interpretable visualization methods are critical, especially in the biology and medical fields, however, there are limited effective visualization methods for large genetic data. 
Current visualization methods are limited to lower-dimensional data and their performance suffers if there is missing data. In this study, we propose a literature-based visualization method to reduce high-dimensional data without compromising the dynamics of the single nucleotide polymorphisms (SNP) and textual interpretability. Our method is innovative because it is shown to (1) preserves both global and local structures of SNP while reducing the dimension of the data using literature text representations, and (2) enables interpretable visualizations using textual information. For performance evaluations, we examined the proposed approach to classify various classification categories including race, myocardial infarction event age groups, and sex using several machine learning models on the literature-derived SNP data. We used visualization approaches to examine clustering of data as well as quantitative performance metrics for the classification of the risk factors examined above. Our method outperformed all popular dimensionality reduction and visualization methods for both classification and visualization, and it is robust against missing and higher-dimensional data. Moreover, we found it feasible to incorporate both genetic and other risk information obtained from literature with our method.
•Literature text-based neural networks enable robust genetic data structure estimation.•Literature information enables textual interpretability on unsupervised visualization.•Our literature-based methods are robust for high dimension and against noise.•Our method may combine genetic risk and other disease risks obtained from literature.</abstract><cop>United States</cop><pub>Elsevier Ltd</pub><pmid>37364469</pmid><doi>10.1016/j.neunet.2023.05.015</doi><tpages>34</tpages><orcidid>https://orcid.org/0000-0003-4514-4772</orcidid><orcidid>https://orcid.org/0000-0001-5501-5953</orcidid><oa>free_for_read</oa></addata></record> |
fulltext | fulltext |
identifier | ISSN: 0893-6080 |
ispartof | Neural networks, 2023-08, Vol.165, p.562-595 |
issn | 0893-6080 ; 1879-2782 |
language | eng |
recordid | cdi_proquest_miscellaneous_2830220995 |
source | Elsevier ScienceDirect Journals |
subjects | Cardiovascular Disease risk prediction ; Cross-modal representation ; Data visualization ; Explainable Artificial Intelligence ; Natural language processing ; Unsupervised learning |
title | Genetic data visualization using literature text-based neural networks: Examples associated with myocardial infarction |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-29T04%3A43%3A51IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Genetic%20data%20visualization%20using%20literature%20text-based%20neural%20networks:%20Examples%20associated%20with%20myocardial%20infarction&rft.jtitle=Neural%20networks&rft.au=Moon,%20Jihye&rft.date=2023-08&rft.volume=165&rft.spage=562&rft.epage=595&rft.pages=562-595&rft.issn=0893-6080&rft.eissn=1879-2782&rft_id=info:doi/10.1016/j.neunet.2023.05.015&rft_dat=%3Cproquest_cross%3E2830220995%3C/proquest_cross%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2830220995&rft_id=info:pmid/37364469&rft_els_id=S0893608023002599&rfr_iscdi=true |