An Algorithm of Query Expansion for Chinese EMR Retrieval by Improving Expansion Term Weights and Retrieval Scores
Query expansion (QE) has been widely used in electronic medical record (EMR) retrieval for assisted diagnosis and clinical research. However, existing QE algorithms haven't achieved satisfactory performance in Chinese EMR retrieval, and one noticeable problem is that the weights of expansion te...
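Published in: | IEEE access 2020, Vol.8, p.200063-200072 |
---|---|
Main authors: | Yang, Songchun; Zheng, Xiangwen; Yin, Xiangfei; Mao, Huajian; Zhao, Dongsheng |
Format: | Article |
Language: | eng |
Subjects: | Algorithms; BM25; co-occurrence; Drift; Electronic health records; Electronic medical record; Electronic medical records; Medical research; Performance enhancement; Queries; Query expansion; Retrieval; Semantics; Solids; word2Vec |
Online access: | Full text |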
container_end_page | 200072 |
---|---|
container_issue | |
container_start_page | 200063 |
container_title | IEEE access |
container_volume | 8 |
creator | Yang, Songchun; Zheng, Xiangwen; Yin, Xiangfei; Mao, Huajian; Zhao, Dongsheng |
description | Query expansion (QE) has been widely used in electronic medical record (EMR) retrieval for assisted diagnosis and clinical research. However, existing QE algorithms have not achieved satisfactory performance in Chinese EMR retrieval. One noticeable problem is that the expansion term weights and retrieval scores they compute include unreasonable factors, because clinical needs are not adequately considered. Here we propose a QE algorithm for Chinese EMR retrieval that improves expansion term weights and retrieval scores. First, the weights of expansion terms are assigned using semantic similarities, category weights, and co-occurrence frequencies between expansion terms and multiple query terms. Then the retrieval scores contributed by expansion terms are limited to reduce the query drift caused by high-frequency expansion terms. Experimental results show that our method achieves a 33.3% increase in precision at top 10, a 90.4% increase in recall, and a 13.2% increase in MAP compared with four baselines. These results indicate that the improvement scheme makes expansion term weights more accurate and reduces the query drift caused by QE, which substantially improves the performance of Chinese EMR retrieval. |
doi_str_mv | 10.1109/ACCESS.2020.3033017 |
format | Article |
fullrecord | <record><control><sourceid>proquest_ieee_</sourceid><recordid>TN_cdi_proquest_journals_2460159029</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><ieee_id>9250729</ieee_id><doaj_id>oai_doaj_org_article_3603858544014416b0ecaa74799f653e</doaj_id><sourcerecordid>2460159029</sourcerecordid><originalsourceid>FETCH-LOGICAL-c408t-b216f35c50525ca4e9644cdfc493986a2cecabdd53c82e985f3f122690f2563</originalsourceid><addsrcrecordid>eNpNkU1rGzEQhpeSQEKaX5CLoGc7o89dHc3iNoaUkjiQo5C1I1vGXrnSOsT_vko3BOsiMczzzIi3qu4oTCkFfT9r2_lyOWXAYMqBc6D1t-qaUaUnXHJ1cfa-qm5z3kI5TSnJ-rpKs57MduuYwrDZk-jJ0xHTiczfD7bPIfbEx0TaTegxI5n_fibPOKSAb3ZHViey2B9SfAv9-gx4wbQnrxjWmyET23dnxNLFhPl7dentLuPt531TLX_OX9qHyeOfX4t29jhxApphsipbey6dBMmkswK1EsJ13gnNdaMsc-jsquskdw1D3UjPPWVMafBMKn5TLUZrF-3WHFLY23Qy0QbzvxDT2tg0BLdDwxXwRjZSCKBCULWCYra1qLX2SnIsrh-jq_z27xHzYLbxmPqyvGFCAZUamC5dfOxyKeac0H9NpWA-kjJjUuYjKfOZVKHuRiog4hehmYS6OP8BBziOHQ</addsrcrecordid><sourcetype>Open Website</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2460159029</pqid></control><display><type>article</type><title>An Algorithm of Query Expansion for Chinese EMR Retrieval by Improving Expansion Term Weights and Retrieval Scores</title><source>IEEE Open Access Journals</source><source>DOAJ Directory of Open Access Journals</source><source>EZB-FREE-00999 freely available EZB journals</source><creator>Yang, Songchun ; Zheng, Xiangwen ; Yin, Xiangfei ; Mao, Huajian ; Zhao, Dongsheng</creator><creatorcontrib>Yang, Songchun ; Zheng, Xiangwen ; Yin, Xiangfei ; Mao, Huajian ; Zhao, Dongsheng</creatorcontrib><description>Query expansion (QE) has been widely used in electronic medical record (EMR) retrieval for assisted diagnosis and clinical research. 
However, existing QE algorithms haven't achieved satisfactory performance in Chinese EMR retrieval, and one noticeable problem is that the weights of expansion terms and retrieval scores have unreasonable factors for lack of the solid consideration of clinical needs. Here we propose an algorithm of QE for Chinese EMR retrieval by improving expansion term weights and retrieval scores. First, the weights of expansion terms are assigned with semantic similarities, category weights and co-occurrence frequencies between expansion terms and multiple query terms. Then the retrieval scores calculated by expansion terms are limited to reduce the query drift caused by high-frequency expansion terms. Experiment results show that our method gets a 33.3% increase in the precision at top 10, a 90.4% increase in the recall, and a 13.2% increase in MAP compared with four baselines. It proves that our improvement scheme can ensure the accuracy of expansion term weights and decrease the query drift caused by QE, which substantially improves the performance of Chinese EMR retrieval.</description><identifier>ISSN: 2169-3536</identifier><identifier>EISSN: 2169-3536</identifier><identifier>DOI: 10.1109/ACCESS.2020.3033017</identifier><identifier>CODEN: IAECCG</identifier><language>eng</language><publisher>Piscataway: IEEE</publisher><subject>Algorithms ; BM25 ; co-occurrence ; Drift ; Electronic health records ; Electronic medical record ; Electronic medical records ; Medical research ; Performance enhancement ; Queries ; Query expansion ; Retrieval ; Semantics ; Solids ; word2Vec</subject><ispartof>IEEE access, 2020, Vol.8, p.200063-200072</ispartof><rights>Copyright The Institute of Electrical and Electronics Engineers, Inc. 
(IEEE) 2020</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c408t-b216f35c50525ca4e9644cdfc493986a2cecabdd53c82e985f3f122690f2563</citedby><cites>FETCH-LOGICAL-c408t-b216f35c50525ca4e9644cdfc493986a2cecabdd53c82e985f3f122690f2563</cites><orcidid>0000-0001-7940-0514 ; 0000-0002-1139-7889 ; 0000-0002-5609-6270 ; 0000-0003-2616-8891 ; 0000-0002-8424-0372</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://ieeexplore.ieee.org/document/9250729$$EHTML$$P50$$Gieee$$Hfree_for_read</linktohtml><link.rule.ids>314,780,784,864,2102,4024,27633,27923,27924,27925,54933</link.rule.ids></links><search><creatorcontrib>Yang, Songchun</creatorcontrib><creatorcontrib>Zheng, Xiangwen</creatorcontrib><creatorcontrib>Yin, Xiangfei</creatorcontrib><creatorcontrib>Mao, Huajian</creatorcontrib><creatorcontrib>Zhao, Dongsheng</creatorcontrib><title>An Algorithm of Query Expansion for Chinese EMR Retrieval by Improving Expansion Term Weights and Retrieval Scores</title><title>IEEE access</title><addtitle>Access</addtitle><description>Query expansion (QE) has been widely used in electronic medical record (EMR) retrieval for assisted diagnosis and clinical research. However, existing QE algorithms haven't achieved satisfactory performance in Chinese EMR retrieval, and one noticeable problem is that the weights of expansion terms and retrieval scores have unreasonable factors for lack of the solid consideration of clinical needs. Here we propose an algorithm of QE for Chinese EMR retrieval by improving expansion term weights and retrieval scores. First, the weights of expansion terms are assigned with semantic similarities, category weights and co-occurrence frequencies between expansion terms and multiple query terms. 
Then the retrieval scores calculated by expansion terms are limited to reduce the query drift caused by high-frequency expansion terms. Experiment results show that our method gets a 33.3% increase in the precision at top 10, a 90.4% increase in the recall, and a 13.2% increase in MAP compared with four baselines. It proves that our improvement scheme can ensure the accuracy of expansion term weights and decrease the query drift caused by QE, which substantially improves the performance of Chinese EMR retrieval.</description><subject>Algorithms</subject><subject>BM25</subject><subject>co-occurrence</subject><subject>Drift</subject><subject>Electronic health records</subject><subject>Electronic medical record</subject><subject>Electronic medical records</subject><subject>Medical research</subject><subject>Performance enhancement</subject><subject>Queries</subject><subject>Query expansion</subject><subject>Retrieval</subject><subject>Semantics</subject><subject>Solids</subject><subject>word2Vec</subject><issn>2169-3536</issn><issn>2169-3536</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2020</creationdate><recordtype>article</recordtype><sourceid>ESBDL</sourceid><sourceid>RIE</sourceid><sourceid>DOA</sourceid><recordid>eNpNkU1rGzEQhpeSQEKaX5CLoGc7o89dHc3iNoaUkjiQo5C1I1vGXrnSOsT_vko3BOsiMczzzIi3qu4oTCkFfT9r2_lyOWXAYMqBc6D1t-qaUaUnXHJ1cfa-qm5z3kI5TSnJ-rpKs57MduuYwrDZk-jJ0xHTiczfD7bPIfbEx0TaTegxI5n_fibPOKSAb3ZHViey2B9SfAv9-gx4wbQnrxjWmyET23dnxNLFhPl7dentLuPt531TLX_OX9qHyeOfX4t29jhxApphsipbey6dBMmkswK1EsJ13gnNdaMsc-jsquskdw1D3UjPPWVMafBMKn5TLUZrF-3WHFLY23Qy0QbzvxDT2tg0BLdDwxXwRjZSCKBCULWCYra1qLX2SnIsrh-jq_z27xHzYLbxmPqyvGFCAZUamC5dfOxyKeac0H9NpWA-kjJjUuYjKfOZVKHuRiog4hehmYS6OP8BBziOHQ</recordid><startdate>2020</startdate><enddate>2020</enddate><creator>Yang, Songchun</creator><creator>Zheng, Xiangwen</creator><creator>Yin, Xiangfei</creator><creator>Mao, Huajian</creator><creator>Zhao, Dongsheng</creator><general>IEEE</general><general>The 
Institute of Electrical and Electronics Engineers, Inc. (IEEE)</general><scope>97E</scope><scope>ESBDL</scope><scope>RIA</scope><scope>RIE</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>7SC</scope><scope>7SP</scope><scope>7SR</scope><scope>8BQ</scope><scope>8FD</scope><scope>JG9</scope><scope>JQ2</scope><scope>L7M</scope><scope>L~C</scope><scope>L~D</scope><scope>DOA</scope><orcidid>https://orcid.org/0000-0001-7940-0514</orcidid><orcidid>https://orcid.org/0000-0002-1139-7889</orcidid><orcidid>https://orcid.org/0000-0002-5609-6270</orcidid><orcidid>https://orcid.org/0000-0003-2616-8891</orcidid><orcidid>https://orcid.org/0000-0002-8424-0372</orcidid></search><sort><creationdate>2020</creationdate><title>An Algorithm of Query Expansion for Chinese EMR Retrieval by Improving Expansion Term Weights and Retrieval Scores</title><author>Yang, Songchun ; Zheng, Xiangwen ; Yin, Xiangfei ; Mao, Huajian ; Zhao, Dongsheng</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c408t-b216f35c50525ca4e9644cdfc493986a2cecabdd53c82e985f3f122690f2563</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2020</creationdate><topic>Algorithms</topic><topic>BM25</topic><topic>co-occurrence</topic><topic>Drift</topic><topic>Electronic health records</topic><topic>Electronic medical record</topic><topic>Electronic medical records</topic><topic>Medical research</topic><topic>Performance enhancement</topic><topic>Queries</topic><topic>Query expansion</topic><topic>Retrieval</topic><topic>Semantics</topic><topic>Solids</topic><topic>word2Vec</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Yang, Songchun</creatorcontrib><creatorcontrib>Zheng, Xiangwen</creatorcontrib><creatorcontrib>Yin, Xiangfei</creatorcontrib><creatorcontrib>Mao, Huajian</creatorcontrib><creatorcontrib>Zhao, Dongsheng</creatorcontrib><collection>IEEE All-Society Periodicals Package (ASPP) 
2005-present</collection><collection>IEEE Open Access Journals</collection><collection>IEEE All-Society Periodicals Package (ASPP) 1998-Present</collection><collection>IEEE Electronic Library (IEL)</collection><collection>CrossRef</collection><collection>Computer and Information Systems Abstracts</collection><collection>Electronics & Communications Abstracts</collection><collection>Engineered Materials Abstracts</collection><collection>METADEX</collection><collection>Technology Research Database</collection><collection>Materials Research Database</collection><collection>ProQuest Computer Science Collection</collection><collection>Advanced Technologies Database with Aerospace</collection><collection>Computer and Information Systems Abstracts Academic</collection><collection>Computer and Information Systems Abstracts Professional</collection><collection>DOAJ Directory of Open Access Journals</collection><jtitle>IEEE access</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Yang, Songchun</au><au>Zheng, Xiangwen</au><au>Yin, Xiangfei</au><au>Mao, Huajian</au><au>Zhao, Dongsheng</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>An Algorithm of Query Expansion for Chinese EMR Retrieval by Improving Expansion Term Weights and Retrieval Scores</atitle><jtitle>IEEE access</jtitle><stitle>Access</stitle><date>2020</date><risdate>2020</risdate><volume>8</volume><spage>200063</spage><epage>200072</epage><pages>200063-200072</pages><issn>2169-3536</issn><eissn>2169-3536</eissn><coden>IAECCG</coden><abstract>Query expansion (QE) has been widely used in electronic medical record (EMR) retrieval for assisted diagnosis and clinical research. 
However, existing QE algorithms haven't achieved satisfactory performance in Chinese EMR retrieval, and one noticeable problem is that the weights of expansion terms and retrieval scores have unreasonable factors for lack of the solid consideration of clinical needs. Here we propose an algorithm of QE for Chinese EMR retrieval by improving expansion term weights and retrieval scores. First, the weights of expansion terms are assigned with semantic similarities, category weights and co-occurrence frequencies between expansion terms and multiple query terms. Then the retrieval scores calculated by expansion terms are limited to reduce the query drift caused by high-frequency expansion terms. Experiment results show that our method gets a 33.3% increase in the precision at top 10, a 90.4% increase in the recall, and a 13.2% increase in MAP compared with four baselines. It proves that our improvement scheme can ensure the accuracy of expansion term weights and decrease the query drift caused by QE, which substantially improves the performance of Chinese EMR retrieval.</abstract><cop>Piscataway</cop><pub>IEEE</pub><doi>10.1109/ACCESS.2020.3033017</doi><tpages>10</tpages><orcidid>https://orcid.org/0000-0001-7940-0514</orcidid><orcidid>https://orcid.org/0000-0002-1139-7889</orcidid><orcidid>https://orcid.org/0000-0002-5609-6270</orcidid><orcidid>https://orcid.org/0000-0003-2616-8891</orcidid><orcidid>https://orcid.org/0000-0002-8424-0372</orcidid><oa>free_for_read</oa></addata></record> |
fulltext | fulltext |
identifier | ISSN: 2169-3536 |
ispartof | IEEE access, 2020, Vol.8, p.200063-200072 |
issn | 2169-3536 2169-3536 |
language | eng |
recordid | cdi_proquest_journals_2460159029 |
source | IEEE Open Access Journals; DOAJ Directory of Open Access Journals; EZB-FREE-00999 freely available EZB journals |
subjects | Algorithms; BM25; co-occurrence; Drift; Electronic health records; Electronic medical record; Electronic medical records; Medical research; Performance enhancement; Queries; Query expansion; Retrieval; Semantics; Solids; word2Vec |
title | An Algorithm of Query Expansion for Chinese EMR Retrieval by Improving Expansion Term Weights and Retrieval Scores |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-28T02%3A09%3A26IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_ieee_&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=An%20Algorithm%20of%20Query%20Expansion%20for%20Chinese%20EMR%20Retrieval%20by%20Improving%20Expansion%20Term%20Weights%20and%20Retrieval%20Scores&rft.jtitle=IEEE%20access&rft.au=Yang,%20Songchun&rft.date=2020&rft.volume=8&rft.spage=200063&rft.epage=200072&rft.pages=200063-200072&rft.issn=2169-3536&rft.eissn=2169-3536&rft.coden=IAECCG&rft_id=info:doi/10.1109/ACCESS.2020.3033017&rft_dat=%3Cproquest_ieee_%3E2460159029%3C/proquest_ieee_%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2460159029&rft_id=info:pmid/&rft_ieee_id=9250729&rft_doaj_id=oai_doaj_org_article_3603858544014416b0ecaa74799f653e&rfr_iscdi=true |
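
The description in this record names the ingredients of the method (semantic similarity, category weight, co-occurrence frequency, and a limit on the score contributed by expansion terms) but gives no formulas. The following is a minimal illustrative sketch of how such signals could be combined, not the authors' implementation. The product used in `expansion_weight`, the `cap_ratio` parameter, and all class and function names are assumptions introduced here for illustration only.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class ExpansionTerm:
    """Hypothetical container for the three signals named in the abstract."""
    term: str
    similarity: float       # semantic similarity to the query terms, assumed in [0, 1]
    category_weight: float  # assumed weight of the term's clinical category
    cooccurrence: float     # assumed normalized co-occurrence with the query terms


def expansion_weight(t: ExpansionTerm) -> float:
    # Assumption: combine the signals by multiplication.
    # The record does not state the combination actually used in the paper.
    return t.similarity * t.category_weight * t.cooccurrence


def capped_score(original_score: float,
                 expansion_scores: Sequence[float],
                 cap_ratio: float = 0.5) -> float:
    # Limit the total contribution of expansion terms so that frequent
    # expansion terms cannot dominate the score of the original query terms.
    # cap_ratio is a hypothetical parameter, not a value from the paper.
    expansion_total = sum(expansion_scores)
    return original_score + min(expansion_total, cap_ratio * original_score)
```

With these assumptions, a document whose original query terms score 10.0 can gain at most 5.0 from expansion terms, however often those terms occur, which is one simple way to bound query drift.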