Predicting drug–disease associations by network embedding and biomedical data integration

Purpose The traditional drug development process is costly, time consuming and risky. Using computational methods to discover drug repositioning opportunities is a promising and efficient strategy in the era of big data. The explosive growth of large-scale genomic, phenotypic data and all kinds of “...

Bibliographic details
Published in: Data technologies and applications 2019-06, Vol.53 (2), p.217-229
Main authors: Wei, Xiaomei; Zhang, Yaliang; Huang, Yu; Fang, Yaping
Format: Article
Language: eng
Subjects:
Online access: Full text
container_end_page 229
container_issue 2
container_start_page 217
container_title Data technologies and applications
container_volume 53
creator Wei, Xiaomei
Zhang, Yaliang
Huang, Yu
Fang, Yaping
description Purpose: The traditional drug development process is costly, time-consuming and risky. Using computational methods to discover drug repositioning opportunities is a promising and efficient strategy in the era of big data. The explosive growth of large-scale genomic and phenotypic data and all kinds of “omics” data brings opportunities for developing new computational drug repositioning methods based on big data. The paper aims to discuss this issue. Design/methodology/approach: Here, a new computational strategy is proposed for inferring drug–disease associations from rich biomedical resources toward drug repositioning. First, a network embedding (NE) algorithm is adopted to learn latent feature representations of drugs from multiple biomedical resources. Next, on the basis of the latent drug vectors from the NE module, a binary support vector machine classifier is trained to divide unknown drug–disease pairs into positive and negative instances. Finally, the model is validated on a well-established drug–disease association data set with tenfold cross-validation. Findings: The model achieves an area under the receiver operating characteristic curve of 90.3 percent, which is comparable to that of similar systems. The authors also analyze the performance of the model and validate its effect on predicting new indications for old drugs. Originality/value: This study shows that the authors’ method is predictive, identifying novel drug–disease interactions for drug discovery. The new feature learning methods also contribute positively to heterogeneous data integration.
doi_str_mv 10.1108/DTA-01-2019-0004
format Article
fullrecord <record><control><sourceid>proquest_cross</sourceid><recordid>TN_cdi_proquest_journals_2236107952</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>2236107952</sourcerecordid><originalsourceid>FETCH-LOGICAL-c311t-ea5fdee352e3ded49293f2d67584bd7715e7752bb385e9e041aad4ea035aa58c3</originalsourceid><addsrcrecordid>eNptkD1PwzAQhi0EElXpzmiJOdQfceOMVfmUKsFQJgbrEl8qlyYudirUjf_AP-SXkBAYkJjuhvd57_QQcs7ZJedMT69W84TxRDCeJ4yx9IiMhOJpkkuuj393ofUpmcS46RKCqUxqNSLPjwGtK1vXrKkN-_Xn-4d1ESEihRh96aB1vom0ONAG2zcfXijWBVrbA9BYWjhf9w2wpRZaoK5pcR2-qTNyUsE24uRnjsnTzfVqcZcsH27vF_NlUkrO2wRBVRZRKoHSok1zkctK2FmmdFrYLOMKs0yJougexhxZygFsisCkAlC6lGNyMfTugn_dY2zNxu9D0500QsgZZ1muRJdiQ6oMPsaAldkFV0M4GM5Mb9F0Fg3jprdoeosdMh0QrDHA1v5H_PEuvwDnX3T0</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2236107952</pqid></control><display><type>article</type><title>Predicting drug–disease associations by network embedding and biomedical data integration</title><source>Standard: Emerald eJournal Premier Collection</source><creator>Wei, Xiaomei ; Zhang, Yaliang ; Huang, Yu ; Fang, Yaping</creator><creatorcontrib>Wei, Xiaomei ; Zhang, Yaliang ; Huang, Yu ; Fang, Yaping</creatorcontrib><description>Purpose The traditional drug development process is costly, time consuming and risky. Using computational methods to discover drug repositioning opportunities is a promising and efficient strategy in the era of big data. The explosive growth of large-scale genomic, phenotypic data and all kinds of “omics” data brings opportunities for developing new computational drug repositioning methods based on big data. The paper aims to discuss this issue. Design/methodology/approach Here, a new computational strategy is proposed for inferring drug–disease associations from rich biomedical resources toward drug repositioning. 
First, the network embedding (NE) algorithm is adopted to learn the latent feature representation of drugs from multiple biomedical resources. Furthermore, on the basis of the latent vectors of drugs from the NE module, a binary support vector machine classifier is trained to divide unknown drug–disease pairs into positive and negative instances. Finally, this model is validated on a well-established drug–disease association data set with tenfold cross-validation. Findings This model obtains the performance of an area under the receiver operating characteristic curve of 90.3 percent, which is comparable to those of similar systems. The authors also analyze the performance of the model and validate its effect on predicting the new indications of old drugs. Originality/value This study shows that the authors’ method is predictive, identifying novel drug–disease interactions for drug discovery. The new feature learning methods also positively contribute to the heterogeneous data integration.</description><identifier>ISSN: 2514-9288</identifier><identifier>EISSN: 2514-9318</identifier><identifier>DOI: 10.1108/DTA-01-2019-0004</identifier><language>eng</language><publisher>Bingley: Emerald Publishing Limited</publisher><subject>Algorithms ; Artificial intelligence ; Biomedical data ; Biomedicine ; Computation ; Data integration ; Data management ; Disease ; Drug development ; Drugs ; Embedding ; Feature selection ; Gastroesophageal reflux ; Gene expression ; Learning ; Learning Processes ; Machine learning ; Mathematics ; Methods ; Narcotics ; Neuroses ; Obsessive compulsive disorder ; Pain ; Pharmaceutical industry ; Prior Learning ; Semantics ; Similarity measures ; Social networks ; Support vector machines ; Topology ; Tourette syndrome</subject><ispartof>Data technologies and applications, 2019-06, Vol.53 (2), p.217-229</ispartof><rights>Emerald Publishing Limited</rights><rights>Emerald Publishing Limited 
2019</rights><lds50>peer_reviewed</lds50><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c311t-ea5fdee352e3ded49293f2d67584bd7715e7752bb385e9e041aad4ea035aa58c3</citedby><cites>FETCH-LOGICAL-c311t-ea5fdee352e3ded49293f2d67584bd7715e7752bb385e9e041aad4ea035aa58c3</cites></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://www.emerald.com/insight/content/doi/10.1108/DTA-01-2019-0004/full/html$$EHTML$$P50$$Gemerald$$H</linktohtml><link.rule.ids>314,776,780,21674,27901,27902,53219</link.rule.ids></links><search><creatorcontrib>Wei, Xiaomei</creatorcontrib><creatorcontrib>Zhang, Yaliang</creatorcontrib><creatorcontrib>Huang, Yu</creatorcontrib><creatorcontrib>Fang, Yaping</creatorcontrib><title>Predicting drug–disease associations by network embedding and biomedical data integration</title><title>Data technologies and applications</title><description>Purpose The traditional drug development process is costly, time consuming and risky. Using computational methods to discover drug repositioning opportunities is a promising and efficient strategy in the era of big data. The explosive growth of large-scale genomic, phenotypic data and all kinds of “omics” data brings opportunities for developing new computational drug repositioning methods based on big data. The paper aims to discuss this issue. Design/methodology/approach Here, a new computational strategy is proposed for inferring drug–disease associations from rich biomedical resources toward drug repositioning. First, the network embedding (NE) algorithm is adopted to learn the latent feature representation of drugs from multiple biomedical resources. Furthermore, on the basis of the latent vectors of drugs from the NE module, a binary support vector machine classifier is trained to divide unknown drug–disease pairs into positive and negative instances. 
Finally, this model is validated on a well-established drug–disease association data set with tenfold cross-validation. Findings This model obtains the performance of an area under the receiver operating characteristic curve of 90.3 percent, which is comparable to those of similar systems. The authors also analyze the performance of the model and validate its effect on predicting the new indications of old drugs. Originality/value This study shows that the authors’ method is predictive, identifying novel drug–disease interactions for drug discovery. The new feature learning methods also positively contribute to the heterogeneous data integration.</description><subject>Algorithms</subject><subject>Artificial intelligence</subject><subject>Biomedical data</subject><subject>Biomedicine</subject><subject>Computation</subject><subject>Data integration</subject><subject>Data management</subject><subject>Disease</subject><subject>Drug development</subject><subject>Drugs</subject><subject>Embedding</subject><subject>Feature selection</subject><subject>Gastroesophageal reflux</subject><subject>Gene expression</subject><subject>Learning</subject><subject>Learning Processes</subject><subject>Machine learning</subject><subject>Mathematics</subject><subject>Methods</subject><subject>Narcotics</subject><subject>Neuroses</subject><subject>Obsessive compulsive disorder</subject><subject>Pain</subject><subject>Pharmaceutical industry</subject><subject>Prior Learning</subject><subject>Semantics</subject><subject>Similarity measures</subject><subject>Social networks</subject><subject>Support vector machines</subject><subject>Topology</subject><subject>Tourette 
syndrome</subject><issn>2514-9288</issn><issn>2514-9318</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2019</creationdate><recordtype>article</recordtype><sourceid>BENPR</sourceid><recordid>eNptkD1PwzAQhi0EElXpzmiJOdQfceOMVfmUKsFQJgbrEl8qlyYudirUjf_AP-SXkBAYkJjuhvd57_QQcs7ZJedMT69W84TxRDCeJ4yx9IiMhOJpkkuuj393ofUpmcS46RKCqUxqNSLPjwGtK1vXrKkN-_Xn-4d1ESEihRh96aB1vom0ONAG2zcfXijWBVrbA9BYWjhf9w2wpRZaoK5pcR2-qTNyUsE24uRnjsnTzfVqcZcsH27vF_NlUkrO2wRBVRZRKoHSok1zkctK2FmmdFrYLOMKs0yJougexhxZygFsisCkAlC6lGNyMfTugn_dY2zNxu9D0500QsgZZ1muRJdiQ6oMPsaAldkFV0M4GM5Mb9F0Fg3jprdoeosdMh0QrDHA1v5H_PEuvwDnX3T0</recordid><startdate>20190607</startdate><enddate>20190607</enddate><creator>Wei, Xiaomei</creator><creator>Zhang, Yaliang</creator><creator>Huang, Yu</creator><creator>Fang, Yaping</creator><general>Emerald Publishing Limited</general><general>Emerald Group Publishing Limited</general><scope>AAYXX</scope><scope>CITATION</scope><scope>0-V</scope><scope>7SC</scope><scope>7WY</scope><scope>7WZ</scope><scope>7XB</scope><scope>8FD</scope><scope>8FE</scope><scope>8FG</scope><scope>ABUWG</scope><scope>AFKRA</scope><scope>ALSLI</scope><scope>ARAPS</scope><scope>AZQEC</scope><scope>BENPR</scope><scope>BEZIV</scope><scope>BGLVJ</scope><scope>CCPQU</scope><scope>CJNVE</scope><scope>CNYFK</scope><scope>DWQXO</scope><scope>E3H</scope><scope>F2A</scope><scope>F~G</scope><scope>GNUQQ</scope><scope>HCIFZ</scope><scope>JQ2</scope><scope>K6~</scope><scope>K7-</scope><scope>L.-</scope><scope>L7M</scope><scope>L~C</scope><scope>L~D</scope><scope>M0C</scope><scope>M0N</scope><scope>M0P</scope><scope>M1O</scope><scope>P5Z</scope><scope>P62</scope><scope>PQBIZ</scope><scope>PQEDU</scope><scope>PQEST</scope><scope>PQQKQ</scope><scope>PQUKI</scope><scope>PYYUZ</scope><scope>Q9U</scope></search><sort><creationdate>20190607</creationdate><title>Predicting drug–disease associations by network embedding and biomedical data integration</title><author>Wei, Xiaomei ; Zhang, Yaliang ; 
Huang, Yu ; Fang, Yaping</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c311t-ea5fdee352e3ded49293f2d67584bd7715e7752bb385e9e041aad4ea035aa58c3</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2019</creationdate><topic>Algorithms</topic><topic>Artificial intelligence</topic><topic>Biomedical data</topic><topic>Biomedicine</topic><topic>Computation</topic><topic>Data integration</topic><topic>Data management</topic><topic>Disease</topic><topic>Drug development</topic><topic>Drugs</topic><topic>Embedding</topic><topic>Feature selection</topic><topic>Gastroesophageal reflux</topic><topic>Gene expression</topic><topic>Learning</topic><topic>Learning Processes</topic><topic>Machine learning</topic><topic>Mathematics</topic><topic>Methods</topic><topic>Narcotics</topic><topic>Neuroses</topic><topic>Obsessive compulsive disorder</topic><topic>Pain</topic><topic>Pharmaceutical industry</topic><topic>Prior Learning</topic><topic>Semantics</topic><topic>Similarity measures</topic><topic>Social networks</topic><topic>Support vector machines</topic><topic>Topology</topic><topic>Tourette syndrome</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Wei, Xiaomei</creatorcontrib><creatorcontrib>Zhang, Yaliang</creatorcontrib><creatorcontrib>Huang, Yu</creatorcontrib><creatorcontrib>Fang, Yaping</creatorcontrib><collection>CrossRef</collection><collection>ProQuest Social Sciences Premium Collection</collection><collection>Computer and Information Systems Abstracts</collection><collection>ABI/INFORM Collection</collection><collection>ABI/INFORM Global (PDF only)</collection><collection>ProQuest Central (purchase pre-March 2016)</collection><collection>Technology Research Database</collection><collection>ProQuest SciTech Collection</collection><collection>ProQuest Technology Collection</collection><collection>ProQuest Central (Alumni 
Edition)</collection><collection>ProQuest Central UK/Ireland</collection><collection>Social Science Premium Collection</collection><collection>Advanced Technologies &amp; Aerospace Collection</collection><collection>ProQuest Central Essentials</collection><collection>ProQuest Central</collection><collection>Business Premium Collection</collection><collection>Technology Collection (ProQuest)</collection><collection>ProQuest One Community College</collection><collection>Education Collection</collection><collection>Library &amp; Information Science Collection</collection><collection>ProQuest Central Korea</collection><collection>Library &amp; Information Sciences Abstracts (LISA)</collection><collection>Library &amp; Information Science Abstracts (LISA)</collection><collection>ABI/INFORM Global (Corporate)</collection><collection>ProQuest Central Student</collection><collection>SciTech Premium Collection</collection><collection>ProQuest Computer Science Collection</collection><collection>ProQuest Business Collection</collection><collection>Computer Science Database</collection><collection>ABI/INFORM Professional Advanced</collection><collection>Advanced Technologies Database with Aerospace</collection><collection>Computer and Information Systems Abstracts – Academic</collection><collection>Computer and Information Systems Abstracts Professional</collection><collection>ABI/INFORM Global</collection><collection>Computing Database</collection><collection>Education Database</collection><collection>Library Science Database</collection><collection>Advanced Technologies &amp; Aerospace Database</collection><collection>ProQuest Advanced Technologies &amp; Aerospace Collection</collection><collection>ProQuest One Business</collection><collection>ProQuest One Education</collection><collection>ProQuest One Academic Eastern Edition (DO NOT USE)</collection><collection>ProQuest One Academic</collection><collection>ProQuest One Academic UKI 
Edition</collection><collection>ABI/INFORM Collection China</collection><collection>ProQuest Central Basic</collection><jtitle>Data technologies and applications</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Wei, Xiaomei</au><au>Zhang, Yaliang</au><au>Huang, Yu</au><au>Fang, Yaping</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Predicting drug–disease associations by network embedding and biomedical data integration</atitle><jtitle>Data technologies and applications</jtitle><date>2019-06-07</date><risdate>2019</risdate><volume>53</volume><issue>2</issue><spage>217</spage><epage>229</epage><pages>217-229</pages><issn>2514-9288</issn><eissn>2514-9318</eissn><abstract>Purpose The traditional drug development process is costly, time consuming and risky. Using computational methods to discover drug repositioning opportunities is a promising and efficient strategy in the era of big data. The explosive growth of large-scale genomic, phenotypic data and all kinds of “omics” data brings opportunities for developing new computational drug repositioning methods based on big data. The paper aims to discuss this issue. Design/methodology/approach Here, a new computational strategy is proposed for inferring drug–disease associations from rich biomedical resources toward drug repositioning. First, the network embedding (NE) algorithm is adopted to learn the latent feature representation of drugs from multiple biomedical resources. Furthermore, on the basis of the latent vectors of drugs from the NE module, a binary support vector machine classifier is trained to divide unknown drug–disease pairs into positive and negative instances. Finally, this model is validated on a well-established drug–disease association data set with tenfold cross-validation. 
Findings This model obtains the performance of an area under the receiver operating characteristic curve of 90.3 percent, which is comparable to those of similar systems. The authors also analyze the performance of the model and validate its effect on predicting the new indications of old drugs. Originality/value This study shows that the authors’ method is predictive, identifying novel drug–disease interactions for drug discovery. The new feature learning methods also positively contribute to the heterogeneous data integration.</abstract><cop>Bingley</cop><pub>Emerald Publishing Limited</pub><doi>10.1108/DTA-01-2019-0004</doi><tpages>13</tpages></addata></record>
fulltext fulltext
identifier ISSN: 2514-9288
ispartof Data technologies and applications, 2019-06, Vol.53 (2), p.217-229
issn 2514-9288
2514-9318
language eng
recordid cdi_proquest_journals_2236107952
source Standard: Emerald eJournal Premier Collection
subjects Algorithms
Artificial intelligence
Biomedical data
Biomedicine
Computation
Data integration
Data management
Disease
Drug development
Drugs
Embedding
Feature selection
Gastroesophageal reflux
Gene expression
Learning
Learning Processes
Machine learning
Mathematics
Methods
Narcotics
Neuroses
Obsessive compulsive disorder
Pain
Pharmaceutical industry
Prior Learning
Semantics
Similarity measures
Social networks
Support vector machines
Topology
Tourette syndrome
title Predicting drug–disease associations by network embedding and biomedical data integration
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-31T11%3A57%3A27IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Predicting%20drug%E2%80%93disease%20associations%20by%20network%20embedding%20and%20biomedical%20data%20integration&rft.jtitle=Data%20technologies%20and%20applications&rft.au=Wei,%20Xiaomei&rft.date=2019-06-07&rft.volume=53&rft.issue=2&rft.spage=217&rft.epage=229&rft.pages=217-229&rft.issn=2514-9288&rft.eissn=2514-9318&rft_id=info:doi/10.1108/DTA-01-2019-0004&rft_dat=%3Cproquest_cross%3E2236107952%3C/proquest_cross%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2236107952&rft_id=info:pmid/&rfr_iscdi=true
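
The abstract in this record reports validation by tenfold cross-validation and an area under the ROC curve, but the record gives no implementation details. The following is only a minimal sketch of that evaluation protocol, under loudly stated assumptions: the feature vectors are synthetic stand-ins for learned embeddings, a nearest-centroid scorer replaces the support vector machine named in the abstract, and all function names (`roc_auc`, `k_fold_indices`, `cross_validated_auc`) are hypothetical. It is not the authors' pipeline.

```python
import random


def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative instance")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def k_fold_indices(n, k=10, seed=0):
    """Shuffle indices 0..n-1 and deal them into k disjoint folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def centroid(rows):
    """Component-wise mean of a list of equal-length vectors."""
    return [sum(col) / len(rows) for col in zip(*rows)]


def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def cross_validated_auc(X, y, k=10, seed=0):
    """Score every instance with a model fitted on the other k-1 folds, then pool the scores."""
    scores = [0.0] * len(X)
    for fold in k_fold_indices(len(X), k, seed):
        held_out = set(fold)
        train = [i for i in range(len(X)) if i not in held_out]
        # Stand-in classifier (assumption): nearest class centroid, not an SVM.
        pos_c = centroid([X[i] for i in train if y[i] == 1])
        neg_c = centroid([X[i] for i in train if y[i] == 0])
        for i in fold:
            # Higher score means closer to the positive centroid than to the negative one.
            scores[i] = sq_dist(X[i], neg_c) - sq_dist(X[i], pos_c)
    return roc_auc(y, scores)


# Synthetic stand-in data (assumption): 50 positive and 50 negative 2-D vectors.
rng = random.Random(1)
X = [[rng.gauss(1.0, 1.0), rng.gauss(1.0, 1.0)] for _ in range(50)]
X += [[rng.gauss(-1.0, 1.0), rng.gauss(-1.0, 1.0)] for _ in range(50)]
y = [1] * 50 + [0] * 50

auc = cross_validated_auc(X, y)
print(f"pooled tenfold AUC: {auc:.3f}")
```

Pooling the held-out scores before computing one AUC is a design choice of this sketch; averaging per-fold AUCs is an equally common convention, and the record does not say which the article uses.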