Predicting drug–disease associations by network embedding and biomedical data integration
Purpose The traditional drug development process is costly, time consuming and risky. Using computational methods to discover drug repositioning opportunities is a promising and efficient strategy in the era of big data. The explosive growth of large-scale genomic, phenotypic data and all kinds of “...
Published in: | Data technologies and applications, 2019-06, Vol.53 (2), p.217-229 |
---|---|
Main authors: | Wei, Xiaomei ; Zhang, Yaliang ; Huang, Yu ; Fang, Yaping |
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Full text |
container_end_page | 229 |
---|---|
container_issue | 2 |
container_start_page | 217 |
container_title | Data technologies and applications |
container_volume | 53 |
creator | Wei, Xiaomei ; Zhang, Yaliang ; Huang, Yu ; Fang, Yaping |
description | Purpose
The traditional drug development process is costly, time-consuming and risky. Using computational methods to discover drug repositioning opportunities is a promising and efficient strategy in the era of big data. The explosive growth of large-scale genomic and phenotypic data, together with all kinds of “omics” data, creates opportunities for developing new computational drug repositioning methods based on big data. This paper proposes one such method.
Design/methodology/approach
Here, a new computational strategy is proposed for inferring drug–disease associations from rich biomedical resources, with the goal of drug repositioning. First, a network embedding (NE) algorithm is used to learn latent feature representations of drugs from multiple biomedical resources. Next, using the latent drug vectors produced by the NE module, a binary support vector machine classifier is trained to classify unknown drug–disease pairs as positive or negative instances. Finally, the model is validated on a well-established drug–disease association data set with tenfold cross-validation.
Findings
The model achieves an area under the receiver operating characteristic curve of 90.3 percent, which is comparable to that of similar systems. The authors also analyze the model's performance and validate its ability to predict new indications for old drugs.
Originality/value
This study shows that the authors’ method is predictive and can identify novel drug–disease interactions for drug discovery. The new feature learning methods also contribute positively to heterogeneous data integration. |
doi_str_mv | 10.1108/DTA-01-2019-0004 |
format | Article |
fullrecord | <record><control><sourceid>proquest_cross</sourceid><recordid>TN_cdi_proquest_journals_2236107952</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>2236107952</sourcerecordid><sourcetype>Aggregation Database</sourcetype><recordtype>article</recordtype><pqid>2236107952</pqid></control><display><type>article</type><title>Predicting drug–disease associations by network embedding and biomedical data integration</title><source>Standard: Emerald eJournal Premier Collection</source><creator>Wei, Xiaomei ; Zhang, Yaliang ; Huang, Yu ; Fang, Yaping</creator><identifier>ISSN: 2514-9288</identifier><identifier>EISSN: 2514-9318</identifier><identifier>DOI: 10.1108/DTA-01-2019-0004</identifier><language>eng</language><publisher>Bingley: Emerald Publishing Limited</publisher><ispartof>Data technologies and applications, 2019-06, Vol.53 (2), p.217-229</ispartof><rights>Emerald Publishing Limited 2019</rights><lds50>peer_reviewed</lds50></display><addata><date>2019-06-07</date><tpages>13</tpages></addata></record> |
fulltext | fulltext |
identifier | ISSN: 2514-9288 |
ispartof | Data technologies and applications, 2019-06, Vol.53 (2), p.217-229 |
issn | 2514-9288 2514-9318 |
language | eng |
recordid | cdi_proquest_journals_2236107952 |
source | Standard: Emerald eJournal Premier Collection |
subjects | Algorithms ; Artificial intelligence ; Biomedical data ; Biomedicine ; Computation ; Data integration ; Data management ; Disease ; Drug development ; Drugs ; Embedding ; Feature selection ; Gastroesophageal reflux ; Gene expression ; Learning ; Learning Processes ; Machine learning ; Mathematics ; Methods ; Narcotics ; Neuroses ; Obsessive compulsive disorder ; Pain ; Pharmaceutical industry ; Prior Learning ; Semantics ; Similarity measures ; Social networks ; Support vector machines ; Topology ; Tourette syndrome |
title | Predicting drug–disease associations by network embedding and biomedical data integration |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-31T11%3A57%3A27IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Predicting%20drug%E2%80%93disease%20associations%20by%20network%20embedding%20and%20biomedical%20data%20integration&rft.jtitle=Data%20technologies%20and%20applications&rft.au=Wei,%20Xiaomei&rft.date=2019-06-07&rft.volume=53&rft.issue=2&rft.spage=217&rft.epage=229&rft.pages=217-229&rft.issn=2514-9288&rft.eissn=2514-9318&rft_id=info:doi/10.1108/DTA-01-2019-0004&rft_dat=%3Cproquest_cross%3E2236107952%3C/proquest_cross%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2236107952&rft_id=info:pmid/&rfr_iscdi=true |
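
The evaluation protocol named in the description (tenfold cross-validation scored by the area under the ROC curve) can be illustrated with a minimal, standard-library-only Python sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the feature vectors are synthetic stand-ins for embedded drug–disease pairs, and a simple nearest-centroid scorer is swapped in where the authors use a support vector machine. The helper names (`auc`, `stratified_folds`, `cross_validated_auc`) are hypothetical.

```python
import random


def auc(scores, labels):
    """Area under the ROC curve, computed as the fraction of
    (positive, negative) pairs ranked correctly; ties count as half."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def stratified_folds(labels, k, rng):
    """Split indices into k folds, dealing each class out round-robin
    so every fold keeps roughly the overall class balance."""
    folds = [[] for _ in range(k)]
    for cls in (0, 1):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        for j, i in enumerate(idx):
            folds[j % k].append(i)
    return folds


def centroid(rows):
    """Component-wise mean of a list of equal-length vectors."""
    return [sum(col) / len(rows) for col in zip(*rows)]


def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def cross_validated_auc(X, y, k=10, seed=0):
    """Mean AUC over k folds. The scorer is a nearest-centroid stand-in
    (an assumption), NOT the SVM used in the paper."""
    folds = stratified_folds(y, k, random.Random(seed))
    fold_aucs = []
    for test_idx in folds:
        held_out = set(test_idx)
        train = [i for i in range(len(X)) if i not in held_out]
        c_pos = centroid([X[i] for i in train if y[i] == 1])
        c_neg = centroid([X[i] for i in train if y[i] == 0])
        # Higher score means closer to the positive centroid.
        scores = [sq_dist(X[i], c_neg) - sq_dist(X[i], c_pos) for i in test_idx]
        fold_aucs.append(auc(scores, [y[i] for i in test_idx]))
    return sum(fold_aucs) / k


# Synthetic stand-in for pair feature vectors (hypothetical data).
_rng = random.Random(42)
X = [[_rng.gauss(1.0, 1.0) for _ in range(8)] for _ in range(100)]
X += [[_rng.gauss(-1.0, 1.0) for _ in range(8)] for _ in range(100)]
y = [1] * 100 + [0] * 100
mean_auc = cross_validated_auc(X, y)
```

Averaging per-fold AUC values is one common convention; pooling all held-out scores before computing a single AUC is another, and the record does not say which one the authors used.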