How can natural language processing help model informed drug development?: a review
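Published in: | JAMIA open 2022-07, Vol.5 (2), p.ooac043-ooac043 |
---|---|
Main authors: | Bhatnagar, Roopal; Sardar, Sakshi; Beheshti, Maedeh; Podichetty, Jagdeep T |
Format: | Article |
Language: | eng |
Subjects: | Computational linguistics; Drug discovery; Language processing; Machine learning; Natural language interfaces; Product development; Review |
Online access: | Full text |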
container_end_page | ooac043 |
---|---|
container_issue | 2 |
container_start_page | ooac043 |
container_title | JAMIA open |
container_volume | 5 |
creator | Bhatnagar, Roopal; Sardar, Sakshi; Beheshti, Maedeh; Podichetty, Jagdeep T |
description | Objective
To summarize applications of natural language processing (NLP) in model informed drug development (MIDD) and identify potential areas of improvement.
Materials and Methods
Publications found on PubMed and Google Scholar, websites and GitHub repositories for NLP libraries and models. Publications describing applications of NLP in MIDD were reviewed. The applications were stratified into 3 stages: drug discovery, clinical trials, and pharmacovigilance. Key NLP functionalities used for these applications were assessed. Programming libraries and open-source resources for the implementation of NLP functionalities in MIDD were identified.
Results
NLP has been utilized to aid various processes in the drug development lifecycle, such as gene-disease mapping, biomarker discovery, patient-trial matching, and adverse drug event detection. These applications commonly use the NLP functionalities of named entity recognition, word embeddings, entity resolution, assertion status detection, relation extraction, and topic modeling. The current state of the art for implementing these functionalities in MIDD applications is transformer models that utilize transfer learning for enhanced performance. Various libraries in Python, R, and Java, such as huggingface, sparkNLP, and KoRpus, as well as open-source platforms such as DisGeNet, DeepEnroll, and Transmol, have enabled convenient implementation of NLP models in MIDD applications.
Discussion
Challenges such as reproducibility, explainability, fairness, limited data, limited language support, and security need to be overcome to ensure wider adoption of NLP in the MIDD landscape. There are opportunities to improve the performance of existing models and expand the use of NLP in newer areas of MIDD.
Conclusions
This review provides an overview of the potential and pitfalls of current NLP approaches in MIDD.
Lay Summary
One of the biggest problems in healthcare fields is that a large amount of medical data remains unstructured (eg, text, image, signal, etc.) and untapped after it is created. Natural language processing (NLP) has been leveraged in recent years to extract relevant information out of unstructured data. NLP is an artificial intelligence technique to process and analyze human-generated spoken or written data. This review focuses on current NLP applications in the field of drug discovery and development. It provides a comprehensive overview of NLP in model informed drug development (MIDD) which involves quantitative models for decision-making in drug development. Researchers utilize NLP to mine data from previously untapped sources. This aims to increase the efficiency of the drug development process. We also highlight the technical aspects of various tools utilized to develop the currently existing NLP models. We provide information on various easily accessible resources which can be deployed to develop an NLP model for MIDD applications. Lastly, this article gives insights into potential opportunities that currently exist to expand and carry NLP in MIDD forward. |
doi_str_mv | 10.1093/jamiaopen/ooac043 |
eissn | 2574-2531 |
format | Article |
oa | free_for_read |
peer_reviewed | true |
pmid | 35702625 |
publisher | United States: Oxford University Press |
rights | The Author(s) 2022. Published by Oxford University Press on behalf of the American Medical Informatics Association. COPYRIGHT 2022 Oxford University Press |
fulltext | fulltext |
identifier | ISSN: 2574-2531 |
ispartof | JAMIA open, 2022-07, Vol.5 (2), p.ooac043-ooac043 |
issn | 2574-2531 2574-2531 |
language | eng |
recordid | cdi_pubmedcentral_primary_oai_pubmedcentral_nih_gov_9188322 |
source | DOAJ Directory of Open Access Journals; Oxford Journals Open Access Collection; EZB-FREE-00999 freely available EZB journals; PubMed Central |
subjects | Computational linguistics; Drug discovery; Language processing; Machine learning; Natural language interfaces; Product development; Review |
title | How can natural language processing help model informed drug development?: a review |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-25T18%3A08%3A19IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-gale_pubme&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=How%20can%20natural%20language%20processing%20help%20model%20informed%20drug%20development?:%20a%20review&rft.jtitle=JAMIA%20open&rft.au=Bhatnagar,%20Roopal&rft.date=2022-07-01&rft.volume=5&rft.issue=2&rft.spage=ooac043&rft.epage=ooac043&rft.pages=ooac043-ooac043&rft.issn=2574-2531&rft.eissn=2574-2531&rft_id=info:doi/10.1093/jamiaopen/ooac043&rft_dat=%3Cgale_pubme%3EA777499897%3C/gale_pubme%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2676921988&rft_id=info:pmid/35702625&rft_galeid=A777499897&rft_oup_id=10.1093/jamiaopen/ooac043&rfr_iscdi=true |
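
The abstract in this record names named entity recognition as one of the NLP functionalities used for tasks such as adverse drug event detection. As a rough illustration only, the sketch below shows a dictionary-based (gazetteer) form of named entity recognition using the Python standard library. It is not the transformer-based approach the review describes as state of the art, and everything in it is an assumption made for illustration: the `Entity` type, the `find_entities` function, the `LEXICON` terms, and the sample sentence are hypothetical and do not come from the article.

```python
import re
from typing import NamedTuple


class Entity(NamedTuple):
    """One matched mention: surface text, label, and character offsets."""
    text: str
    label: str
    start: int
    end: int


# Hypothetical toy lexicon (an assumption for illustration, not from the article).
LEXICON = {
    "DRUG": ["aspirin", "metformin", "warfarin"],
    "ADVERSE_EVENT": ["nausea", "bleeding", "headache"],
}


def find_entities(text, lexicon=LEXICON):
    """Return lexicon terms found in text as Entity tuples, ordered by position."""
    entities = []
    for label, terms in lexicon.items():
        for term in terms:
            # Whole-word, case-insensitive match of each lexicon term.
            pattern = r"\b" + re.escape(term) + r"\b"
            for match in re.finditer(pattern, text, flags=re.IGNORECASE):
                entities.append(
                    Entity(match.group(0), label, match.start(), match.end())
                )
    return sorted(entities, key=lambda entity: entity.start)


if __name__ == "__main__":
    note = "Patient on Warfarin reported bleeding and mild nausea."
    for entity in find_entities(note):
        print(entity)
```

A lexicon lookup like this only finds exact terms it already knows, which is one reason learned models are generally preferred for clinical text. The sketch is meant to show the shape of the task (text in, labeled spans out), not a production method.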