Combining structured and unstructured data to identify a cohort of ICU patients who received dialysis


Detailed description

Bibliographic details
Published in: Journal of the American Medical Informatics Association : JAMIA 2014-09, Vol.21 (5), p.801-807
Main authors: Abhyankar, Swapna, Demner-Fushman, Dina, Callaghan, Fiona M, McDonald, Clement J
Format: Article
Language: eng
Online access: Full text
container_end_page 807
container_issue 5
container_start_page 801
container_title Journal of the American Medical Informatics Association : JAMIA
container_volume 21
creator Abhyankar, Swapna
Demner-Fushman, Dina
Callaghan, Fiona M
McDonald, Clement J
description The objective was to develop a generalizable method for identifying patient cohorts from electronic health record (EHR) data (in this case, patients who had dialysis) that uses simple information retrieval (IR) tools. We used the coded data and clinical notes from the 24,506 adult patients in the Multiparameter Intelligent Monitoring in Intensive Care database to identify patients who had dialysis. We used SQL queries to search the procedure, diagnosis, and coded nursing observations tables based on ICD-9 and local codes. We used a domain-specific search engine to find clinical notes containing terms related to dialysis. We manually validated the available records for a 10% random sample of patients who potentially had dialysis and a random sample of 200 patients who were not identified as having dialysis based on any of the sources. We identified 1844 patients who potentially had dialysis: 1481 from the three coded sources and 1624 from the clinical notes. Precision for identifying dialysis patients based on available data was estimated to be 78.4% (95% CI 71.9% to 84.2%) and recall was 100% (95% CI 86% to 100%). Combining structured EHR data with information from clinical notes using simple queries increases the utility of both types of data for cohort identification. Patients identified by more than one source are more likely to meet the inclusion criteria; however, including patients found in any of the sources increases recall. This method is attractive because it is available to researchers with access to EHR data and off-the-shelf IR tools.
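The combination step described in the abstract can be sketched as a set union over per-source patient lists. This is a minimal illustration, not the authors' code: the source names, the patient IDs, and the helper names `combine_sources` and `precision` are all hypothetical, and the counts are toy values rather than results from the study.

```python
from collections import Counter


def combine_sources(sources):
    """Map each patient ID to the number of sources that flagged it."""
    counts = Counter()
    for ids in sources.values():
        # set() guards against a patient appearing twice within one source
        counts.update(set(ids))
    return dict(counts)


def precision(true_positives, false_positives):
    """Fraction of flagged patients confirmed by manual review."""
    return true_positives / (true_positives + false_positives)


# Hypothetical toy data: three coded sources plus the clinical notes.
sources = {
    "procedures": {1, 2, 3},
    "diagnoses": {2, 3, 4},
    "nursing_observations": {3},
    "clinical_notes": {3, 4, 5},
}

flagged = combine_sources(sources)

# Patients found in ANY source: the broadest candidate cohort (favors recall).
any_source = sorted(flagged)

# Patients found in MORE THAN ONE source: a narrower cohort that, per the
# abstract, is more likely to meet the inclusion criteria.
multi_source = sorted(p for p, n in flagged.items() if n > 1)
```

Taking the union keeps every candidate for manual validation, while the per-patient source count gives a simple way to rank or filter candidates when higher precision matters more than recall.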
doi_str_mv 10.1136/amiajnl-2013-001915
format Article
publisher England: BMJ Publishing Group
addtitle J Am Med Inform Assoc
date 2014-09-01
tpages 7
pmid 24384230
orcid https://orcid.org/0000-0002-4361-5799
rights Published by the BMJ Publishing Group Limited. For permission to use (where not already granted under a licence) please go to http://group.bmj.com/group/rights-licensing/permissions.
oa free_for_read
lds50 peer_reviewed
linktohtml https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4147606/
linktopdf https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4147606/pdf/
backlink https://www.ncbi.nlm.nih.gov/pubmed/24384230
fulltext fulltext
identifier ISSN: 1067-5027
ispartof Journal of the American Medical Informatics Association : JAMIA, 2014-09, Vol.21 (5), p.801-807
issn 1067-5027
1527-974X
language eng
recordid cdi_pubmedcentral_primary_oai_pubmedcentral_nih_gov_4147606
source MEDLINE; Oxford University Press Journals All Titles (1996-Current); EZB-FREE-00999 freely available EZB journals; PubMed Central
subjects Adult
Electronic Health Records
Focus on Biomedical Natural Language Processing and Data Modeling
Humans
Information Storage and Retrieval - methods
International Classification of Diseases
Kidney Failure, Chronic - therapy
Programming Languages
Renal Dialysis - statistics & numerical data
title Combining structured and unstructured data to identify a cohort of ICU patients who received dialysis
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-05T18%3A53%3A09IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_pubme&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Combining%20structured%20and%20unstructured%20data%20to%20identify%20a%20cohort%20of%20ICU%20patients%20who%20received%20dialysis&rft.jtitle=Journal%20of%20the%20American%20Medical%20Informatics%20Association%20:%20JAMIA&rft.au=Abhyankar,%20Swapna&rft.date=2014-09-01&rft.volume=21&rft.issue=5&rft.spage=801&rft.epage=807&rft.pages=801-807&rft.issn=1067-5027&rft.eissn=1527-974X&rft_id=info:doi/10.1136/amiajnl-2013-001915&rft_dat=%3Cproquest_pubme%3E1553321574%3C/proquest_pubme%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=1553321574&rft_id=info:pmid/24384230&rfr_iscdi=true