Multimodal representation learning for predicting molecule-disease relations

Predicting molecule-disease indications and side effects is important for drug development and pharmacovigilance. Comprehensively mining molecule-molecule, molecule-disease and disease-disease semantic dependencies can potentially improve prediction performance. We introduce a Multi-Modal REpresenta...

Bibliographic details
Published in: Bioinformatics (Oxford, England), 2023-02, Vol.39 (2)
Main authors: Wen, Jun, Zhang, Xiang, Rush, Everett, Panickan, Vidul A, Li, Xingyu, Cai, Tianrun, Zhou, Doudou, Ho, Yuk-Lam, Costa, Lauren, Begoli, Edmon, Hong, Chuan, Gaziano, J Michael, Cho, Kelly, Lu, Junwei, Liao, Katherine P, Zitnik, Marinka, Cai, Tianxi
Format: Article
Language: eng
Subjects:
Online access: Full text
container_end_page
container_issue 2
container_start_page
container_title Bioinformatics (Oxford, England)
container_volume 39
creator Wen, Jun
Zhang, Xiang
Rush, Everett
Panickan, Vidul A
Li, Xingyu
Cai, Tianrun
Zhou, Doudou
Ho, Yuk-Lam
Costa, Lauren
Begoli, Edmon
Hong, Chuan
Gaziano, J Michael
Cho, Kelly
Lu, Junwei
Liao, Katherine P
Zitnik, Marinka
Cai, Tianxi
description Predicting molecule-disease indications and side effects is important for drug development and pharmacovigilance. Comprehensively mining molecule-molecule, molecule-disease and disease-disease semantic dependencies can potentially improve prediction performance. We introduce a Multi-Modal REpresentation Mapping Approach to Predicting molecular-disease relations (M2REMAP) by incorporating clinical semantics learned from electronic health records (EHR) of 12.6 million patients. Specifically, M2REMAP first learns a multimodal molecule representation that synthesizes chemical property and clinical semantic information by mapping molecule chemicals via a deep neural network onto the clinical semantic embedding space shared by drugs, diseases and other common clinical concepts. To infer molecule-disease relations, M2REMAP combines multimodal molecule representation and disease semantic embedding to jointly infer indications and side effects. We extensively evaluate M2REMAP on molecule indications, side effects and interactions. Results show that incorporating EHR embeddings improves performance significantly, for example, attaining an improvement over the baseline models by 23.6% in PRC-AUC on indications and 23.9% on side effects. Further, M2REMAP overcomes the limitation of existing methods and effectively predicts drugs for novel diseases and emerging pathogens. The code is available at https://github.com/celehs/M2REMAP, and prediction results are provided at https://shiny.parse-health.org/drugs-diseases-dev/. Supplementary data are available at Bioinformatics online.
doi_str_mv 10.1093/bioinformatics/btad085
format Article
fullrecord <record><control><sourceid>proquest_pubme</sourceid><recordid>TN_cdi_pubmedcentral_primary_oai_pubmedcentral_nih_gov_9940625</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>2778973834</sourcerecordid><originalsourceid>FETCH-LOGICAL-c441t-7d2c657b56dfeb586733d37cbeb74de94be45bd63a92a983237edf1da6760f993</originalsourceid><addsrcrecordid>eNpVUVtLHTEQDqLUS_sX5OCTL6vJZpNsXoQithaO-GKfQy6zGskmp0m24L9v9JyKPs0M811m-BA6JfiCYEkvjU8-TinPunpbLk3VDo9sDx0RykU3jITsf-gP0XEpzxhjhhn_gg4pH1vT0yO0vltC9XNyOqwybDIUiLVpprgKoHP08XHVbFZt47ytr-OcAtglQOd8AV2g8cIbo3xFB5MOBb7t6gn6_ePm4fq2W9___HX9fd3ZYSC1E663nAnDuJvAsJELSh0V1oARgwM5GBiYcZxq2Ws50p4KcBNxmguOJynpCbra6m4WM4Oz7eSsg9pkP-v8opL26vMm-if1mP4qKQfMe9YEzrYCqVSvivUV7JNNMYKtikjO-jfQ-c4lpz8LlKpmXyyEoCOkpaheiFEKOtKhQfkWanMqJcP0fgvB6jUv9TkvtcurEU8_fvJO-x8Q_QdYc5p_</addsrcrecordid><sourcetype>Open Access Repository</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2778973834</pqid></control><display><type>article</type><title>Multimodal representation learning for predicting molecule-disease relations</title><source>MEDLINE</source><source>DOAJ Directory of Open Access Journals</source><source>Access via Oxford University Press (Open Access Collection)</source><source>EZB-FREE-00999 freely available EZB journals</source><source>PubMed Central</source><source>Alma/SFX Local Collection</source><creator>Wen, Jun ; Zhang, Xiang ; Rush, Everett ; Panickan, Vidul A ; Li, Xingyu ; Cai, Tianrun ; Zhou, Doudou ; Ho, Yuk-Lam ; Costa, Lauren ; Begoli, Edmon ; Hong, Chuan ; Gaziano, J Michael ; Cho, Kelly ; Lu, Junwei ; Liao, Katherine P ; Zitnik, Marinka ; Cai, Tianxi</creator><contributor>Lu, Zhiyong</contributor><creatorcontrib>Wen, Jun ; Zhang, Xiang ; Rush, Everett ; Panickan, Vidul A ; Li, Xingyu ; Cai, Tianrun ; Zhou, Doudou ; Ho, Yuk-Lam ; Costa, Lauren ; Begoli, Edmon ; Hong, Chuan ; Gaziano, J Michael ; Cho, Kelly ; Lu, Junwei ; Liao, Katherine P ; Zitnik, Marinka ; Cai, Tianxi 
; Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States) ; Lu, Zhiyong</creatorcontrib><description>Predicting molecule-disease indications and side effects is important for drug development and pharmacovigilance. Comprehensively mining molecule-molecule, molecule-disease and disease-disease semantic dependencies can potentially improve prediction performance. We introduce a Multi-Modal REpresentation Mapping Approach to Predicting molecular-disease relations (M2REMAP) by incorporating clinical semantics learned from electronic health records (EHR) of 12.6 million patients. Specifically, M2REMAP first learns a multimodal molecule representation that synthesizes chemical property and clinical semantic information by mapping molecule chemicals via a deep neural network onto the clinical semantic embedding space shared by drugs, diseases and other common clinical concepts. To infer molecule-disease relations, M2REMAP combines multimodal molecule representation and disease semantic embedding to jointly infer indications and side effects. We extensively evaluate M2REMAP on molecule indications, side effects and interactions. Results show that incorporating EHR embeddings improves performance significantly, for example, attaining an improvement over the baseline models by 23.6% in PRC-AUC on indications and 23.9% on side effects. Further, M2REMAP overcomes the limitation of existing methods and effectively predicts drugs for novel diseases and emerging pathogens. The code is available at https://github.com/celehs/M2REMAP, and prediction results are provided at https://shiny.parse-health.org/drugs-diseases-dev/. 
Supplementary data are available at Bioinformatics online.</description><identifier>ISSN: 1367-4811</identifier><identifier>ISSN: 1367-4803</identifier><identifier>EISSN: 1367-4811</identifier><identifier>DOI: 10.1093/bioinformatics/btad085</identifier><identifier>PMID: 36805623</identifier><language>eng</language><publisher>England: Oxford University Press</publisher><subject>BASIC BIOLOGICAL SCIENCES ; Drug Development ; Drug-Related Side Effects and Adverse Reactions ; Electronic Health Records ; Humans ; Neural Networks, Computer ; Original Paper ; Pharmacovigilance</subject><ispartof>Bioinformatics (Oxford, England), 2023-02, Vol.39 (2)</ispartof><rights>The Author(s) 2023. Published by Oxford University Press.</rights><rights>The Author(s) 2023. Published by Oxford University Press. 2023</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c441t-7d2c657b56dfeb586733d37cbeb74de94be45bd63a92a983237edf1da6760f993</citedby><cites>FETCH-LOGICAL-c441t-7d2c657b56dfeb586733d37cbeb74de94be45bd63a92a983237edf1da6760f993</cites><orcidid>0000-0003-1727-7076 ; 0000-0002-5379-2502 ; 0000-0001-5067-2647 ; 0000-0001-8530-7228 ; 0000000256325723 ; 0000000150672647 ; 0000000221733663 ; 0000000253792502 ; 0000000185307228 ; 0000000317277076</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktopdf>$$Uhttps://www.ncbi.nlm.nih.gov/pmc/articles/PMC9940625/pdf/$$EPDF$$P50$$Gpubmedcentral$$Hfree_for_read</linktopdf><linktohtml>$$Uhttps://www.ncbi.nlm.nih.gov/pmc/articles/PMC9940625/$$EHTML$$P50$$Gpubmedcentral$$Hfree_for_read</linktohtml><link.rule.ids>230,314,727,780,784,864,885,27924,27925,53791,53793</link.rule.ids><backlink>$$Uhttps://www.ncbi.nlm.nih.gov/pubmed/36805623$$D View this record in 
MEDLINE/PubMed$$Hfree_for_read</backlink><backlink>$$Uhttps://www.osti.gov/servlets/purl/1965225$$D View this record in Osti.gov$$Hfree_for_read</backlink></links><search><contributor>Lu, Zhiyong</contributor><creatorcontrib>Wen, Jun</creatorcontrib><creatorcontrib>Zhang, Xiang</creatorcontrib><creatorcontrib>Rush, Everett</creatorcontrib><creatorcontrib>Panickan, Vidul A</creatorcontrib><creatorcontrib>Li, Xingyu</creatorcontrib><creatorcontrib>Cai, Tianrun</creatorcontrib><creatorcontrib>Zhou, Doudou</creatorcontrib><creatorcontrib>Ho, Yuk-Lam</creatorcontrib><creatorcontrib>Costa, Lauren</creatorcontrib><creatorcontrib>Begoli, Edmon</creatorcontrib><creatorcontrib>Hong, Chuan</creatorcontrib><creatorcontrib>Gaziano, J Michael</creatorcontrib><creatorcontrib>Cho, Kelly</creatorcontrib><creatorcontrib>Lu, Junwei</creatorcontrib><creatorcontrib>Liao, Katherine P</creatorcontrib><creatorcontrib>Zitnik, Marinka</creatorcontrib><creatorcontrib>Cai, Tianxi</creatorcontrib><creatorcontrib>Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States)</creatorcontrib><title>Multimodal representation learning for predicting molecule-disease relations</title><title>Bioinformatics (Oxford, England)</title><addtitle>Bioinformatics</addtitle><description>Predicting molecule-disease indications and side effects is important for drug development and pharmacovigilance. Comprehensively mining molecule-molecule, molecule-disease and disease-disease semantic dependencies can potentially improve prediction performance. We introduce a Multi-Modal REpresentation Mapping Approach to Predicting molecular-disease relations (M2REMAP) by incorporating clinical semantics learned from electronic health records (EHR) of 12.6 million patients. 
Specifically, M2REMAP first learns a multimodal molecule representation that synthesizes chemical property and clinical semantic information by mapping molecule chemicals via a deep neural network onto the clinical semantic embedding space shared by drugs, diseases and other common clinical concepts. To infer molecule-disease relations, M2REMAP combines multimodal molecule representation and disease semantic embedding to jointly infer indications and side effects. We extensively evaluate M2REMAP on molecule indications, side effects and interactions. Results show that incorporating EHR embeddings improves performance significantly, for example, attaining an improvement over the baseline models by 23.6% in PRC-AUC on indications and 23.9% on side effects. Further, M2REMAP overcomes the limitation of existing methods and effectively predicts drugs for novel diseases and emerging pathogens. The code is available at https://github.com/celehs/M2REMAP, and prediction results are provided at https://shiny.parse-health.org/drugs-diseases-dev/. 
Supplementary data are available at Bioinformatics online.</description><subject>BASIC BIOLOGICAL SCIENCES</subject><subject>Drug Development</subject><subject>Drug-Related Side Effects and Adverse Reactions</subject><subject>Electronic Health Records</subject><subject>Humans</subject><subject>Neural Networks, Computer</subject><subject>Original Paper</subject><subject>Pharmacovigilance</subject><issn>1367-4811</issn><issn>1367-4803</issn><issn>1367-4811</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2023</creationdate><recordtype>article</recordtype><sourceid>EIF</sourceid><recordid>eNpVUVtLHTEQDqLUS_sX5OCTL6vJZpNsXoQithaO-GKfQy6zGskmp0m24L9v9JyKPs0M811m-BA6JfiCYEkvjU8-TinPunpbLk3VDo9sDx0RykU3jITsf-gP0XEpzxhjhhn_gg4pH1vT0yO0vltC9XNyOqwybDIUiLVpprgKoHP08XHVbFZt47ytr-OcAtglQOd8AV2g8cIbo3xFB5MOBb7t6gn6_ePm4fq2W9___HX9fd3ZYSC1E663nAnDuJvAsJELSh0V1oARgwM5GBiYcZxq2Ws50p4KcBNxmguOJynpCbra6m4WM4Oz7eSsg9pkP-v8opL26vMm-if1mP4qKQfMe9YEzrYCqVSvivUV7JNNMYKtikjO-jfQ-c4lpz8LlKpmXyyEoCOkpaheiFEKOtKhQfkWanMqJcP0fgvB6jUv9TkvtcurEU8_fvJO-x8Q_QdYc5p_</recordid><startdate>20230203</startdate><enddate>20230203</enddate><creator>Wen, Jun</creator><creator>Zhang, Xiang</creator><creator>Rush, Everett</creator><creator>Panickan, Vidul A</creator><creator>Li, Xingyu</creator><creator>Cai, Tianrun</creator><creator>Zhou, Doudou</creator><creator>Ho, Yuk-Lam</creator><creator>Costa, Lauren</creator><creator>Begoli, Edmon</creator><creator>Hong, Chuan</creator><creator>Gaziano, J Michael</creator><creator>Cho, Kelly</creator><creator>Lu, Junwei</creator><creator>Liao, Katherine P</creator><creator>Zitnik, Marinka</creator><creator>Cai, Tianxi</creator><general>Oxford University 
Press</general><scope>CGR</scope><scope>CUY</scope><scope>CVF</scope><scope>ECM</scope><scope>EIF</scope><scope>NPM</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>7X8</scope><scope>OIOZB</scope><scope>OTOTI</scope><scope>5PM</scope><orcidid>https://orcid.org/0000-0003-1727-7076</orcidid><orcidid>https://orcid.org/0000-0002-5379-2502</orcidid><orcidid>https://orcid.org/0000-0001-5067-2647</orcidid><orcidid>https://orcid.org/0000-0001-8530-7228</orcidid><orcidid>https://orcid.org/0000000256325723</orcidid><orcidid>https://orcid.org/0000000150672647</orcidid><orcidid>https://orcid.org/0000000221733663</orcidid><orcidid>https://orcid.org/0000000253792502</orcidid><orcidid>https://orcid.org/0000000185307228</orcidid><orcidid>https://orcid.org/0000000317277076</orcidid></search><sort><creationdate>20230203</creationdate><title>Multimodal representation learning for predicting molecule-disease relations</title><author>Wen, Jun ; Zhang, Xiang ; Rush, Everett ; Panickan, Vidul A ; Li, Xingyu ; Cai, Tianrun ; Zhou, Doudou ; Ho, Yuk-Lam ; Costa, Lauren ; Begoli, Edmon ; Hong, Chuan ; Gaziano, J Michael ; Cho, Kelly ; Lu, Junwei ; Liao, Katherine P ; Zitnik, Marinka ; Cai, Tianxi</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c441t-7d2c657b56dfeb586733d37cbeb74de94be45bd63a92a983237edf1da6760f993</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2023</creationdate><topic>BASIC BIOLOGICAL SCIENCES</topic><topic>Drug Development</topic><topic>Drug-Related Side Effects and Adverse Reactions</topic><topic>Electronic Health Records</topic><topic>Humans</topic><topic>Neural Networks, Computer</topic><topic>Original Paper</topic><topic>Pharmacovigilance</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Wen, Jun</creatorcontrib><creatorcontrib>Zhang, Xiang</creatorcontrib><creatorcontrib>Rush, Everett</creatorcontrib><creatorcontrib>Panickan, 
Vidul A</creatorcontrib><creatorcontrib>Li, Xingyu</creatorcontrib><creatorcontrib>Cai, Tianrun</creatorcontrib><creatorcontrib>Zhou, Doudou</creatorcontrib><creatorcontrib>Ho, Yuk-Lam</creatorcontrib><creatorcontrib>Costa, Lauren</creatorcontrib><creatorcontrib>Begoli, Edmon</creatorcontrib><creatorcontrib>Hong, Chuan</creatorcontrib><creatorcontrib>Gaziano, J Michael</creatorcontrib><creatorcontrib>Cho, Kelly</creatorcontrib><creatorcontrib>Lu, Junwei</creatorcontrib><creatorcontrib>Liao, Katherine P</creatorcontrib><creatorcontrib>Zitnik, Marinka</creatorcontrib><creatorcontrib>Cai, Tianxi</creatorcontrib><creatorcontrib>Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States)</creatorcontrib><collection>Medline</collection><collection>MEDLINE</collection><collection>MEDLINE (Ovid)</collection><collection>MEDLINE</collection><collection>MEDLINE</collection><collection>PubMed</collection><collection>CrossRef</collection><collection>MEDLINE - Academic</collection><collection>OSTI.GOV - Hybrid</collection><collection>OSTI.GOV</collection><collection>PubMed Central (Full Participant titles)</collection><jtitle>Bioinformatics (Oxford, England)</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Wen, Jun</au><au>Zhang, Xiang</au><au>Rush, Everett</au><au>Panickan, Vidul A</au><au>Li, Xingyu</au><au>Cai, Tianrun</au><au>Zhou, Doudou</au><au>Ho, Yuk-Lam</au><au>Costa, Lauren</au><au>Begoli, Edmon</au><au>Hong, Chuan</au><au>Gaziano, J Michael</au><au>Cho, Kelly</au><au>Lu, Junwei</au><au>Liao, Katherine P</au><au>Zitnik, Marinka</au><au>Cai, Tianxi</au><au>Lu, Zhiyong</au><aucorp>Oak Ridge National Lab. 
(ORNL), Oak Ridge, TN (United States)</aucorp><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Multimodal representation learning for predicting molecule-disease relations</atitle><jtitle>Bioinformatics (Oxford, England)</jtitle><addtitle>Bioinformatics</addtitle><date>2023-02-03</date><risdate>2023</risdate><volume>39</volume><issue>2</issue><issn>1367-4811</issn><issn>1367-4803</issn><eissn>1367-4811</eissn><abstract>Predicting molecule-disease indications and side effects is important for drug development and pharmacovigilance. Comprehensively mining molecule-molecule, molecule-disease and disease-disease semantic dependencies can potentially improve prediction performance. We introduce a Multi-Modal REpresentation Mapping Approach to Predicting molecular-disease relations (M2REMAP) by incorporating clinical semantics learned from electronic health records (EHR) of 12.6 million patients. Specifically, M2REMAP first learns a multimodal molecule representation that synthesizes chemical property and clinical semantic information by mapping molecule chemicals via a deep neural network onto the clinical semantic embedding space shared by drugs, diseases and other common clinical concepts. To infer molecule-disease relations, M2REMAP combines multimodal molecule representation and disease semantic embedding to jointly infer indications and side effects. We extensively evaluate M2REMAP on molecule indications, side effects and interactions. Results show that incorporating EHR embeddings improves performance significantly, for example, attaining an improvement over the baseline models by 23.6% in PRC-AUC on indications and 23.9% on side effects. Further, M2REMAP overcomes the limitation of existing methods and effectively predicts drugs for novel diseases and emerging pathogens. The code is available at https://github.com/celehs/M2REMAP, and prediction results are provided at https://shiny.parse-health.org/drugs-diseases-dev/. 
Supplementary data are available at Bioinformatics online.</abstract><cop>England</cop><pub>Oxford University Press</pub><pmid>36805623</pmid><doi>10.1093/bioinformatics/btad085</doi><orcidid>https://orcid.org/0000-0003-1727-7076</orcidid><orcidid>https://orcid.org/0000-0002-5379-2502</orcidid><orcidid>https://orcid.org/0000-0001-5067-2647</orcidid><orcidid>https://orcid.org/0000-0001-8530-7228</orcidid><orcidid>https://orcid.org/0000000256325723</orcidid><orcidid>https://orcid.org/0000000150672647</orcidid><orcidid>https://orcid.org/0000000221733663</orcidid><orcidid>https://orcid.org/0000000253792502</orcidid><orcidid>https://orcid.org/0000000185307228</orcidid><orcidid>https://orcid.org/0000000317277076</orcidid><oa>free_for_read</oa></addata></record>
fulltext fulltext
identifier ISSN: 1367-4811
ispartof Bioinformatics (Oxford, England), 2023-02, Vol.39 (2)
issn 1367-4811
1367-4803
1367-4811
language eng
recordid cdi_pubmedcentral_primary_oai_pubmedcentral_nih_gov_9940625
source MEDLINE; DOAJ Directory of Open Access Journals; Access via Oxford University Press (Open Access Collection); EZB-FREE-00999 freely available EZB journals; PubMed Central; Alma/SFX Local Collection
subjects BASIC BIOLOGICAL SCIENCES
Drug Development
Drug-Related Side Effects and Adverse Reactions
Electronic Health Records
Humans
Neural Networks, Computer
Original Paper
Pharmacovigilance
title Multimodal representation learning for predicting molecule-disease relations
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-02T20%3A59%3A16IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_pubme&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Multimodal%20representation%20learning%20for%20predicting%20molecule-disease%20relations&rft.jtitle=Bioinformatics%20(Oxford,%20England)&rft.au=Wen,%20Jun&rft.aucorp=Oak%20Ridge%20National%20Lab.%20(ORNL),%20Oak%20Ridge,%20TN%20(United%20States)&rft.date=2023-02-03&rft.volume=39&rft.issue=2&rft.issn=1367-4811&rft.eissn=1367-4811&rft_id=info:doi/10.1093/bioinformatics/btad085&rft_dat=%3Cproquest_pubme%3E2778973834%3C/proquest_pubme%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2778973834&rft_id=info:pmid/36805623&rfr_iscdi=true