Human intelligence versus Chat-GPT: who performs better in correctly classifying patients in triage?
Published in: | The American journal of emergency medicine 2024-05, Vol.79, p.44-47 |
---|---|
Main authors: | Zaboli, Arian; Brigo, Francesco; Sibilio, Serena; Mian, Michael; Turcato, Gianni |
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Full text |
container_end_page | 47 |
---|---|
container_issue | |
container_start_page | 44 |
container_title | The American journal of emergency medicine |
container_volume | 79 |
creator | Zaboli, Arian ; Brigo, Francesco ; Sibilio, Serena ; Mian, Michael ; Turcato, Gianni |
description | Chat-GPT is rapidly emerging as a promising and potentially revolutionary tool in medicine. One of its possible applications is the stratification of patients according to the severity of clinical conditions and prognosis during the triage evaluation in the emergency department (ED).
Using a randomly selected sample of 30 vignettes recreated from real clinical cases, we compared the concordance in risk stratification of ED patients between healthcare personnel and Chat-GPT. The concordance was assessed with Cohen's kappa, and the performance was evaluated with the area under the receiver operating characteristic curve (AUROC). Among the outcomes, we considered mortality within 72 h, the need for hospitalization, and the presence of a severe or time-dependent condition.
The concordance in triage code assignment between triage nurses and Chat-GPT was 0.278 (unweighted Cohen's kappa; 95% confidence intervals: 0.231–0.388). For all outcomes, the ROC values were higher for the triage nurses. The most relevant difference was found in 72-h mortality, where triage nurses showed an AUROC of 0.910 (0.757–1.000) compared to only 0.669 (0.153–1.000) for Chat-GPT.
The current level of Chat-GPT reliability is insufficient to make it a valid substitute for the expertise of triage nurses in prioritizing ED patients. Further developments are required to enhance the safety and effectiveness of AI for risk stratification of ED patients. |
doi_str_mv | 10.1016/j.ajem.2024.02.008 |
format | Article |
fullrecord | <record><control><sourceid>proquest_cross</sourceid><recordid>TN_cdi_proquest_miscellaneous_2925484195</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><els_id>S0735675724000664</els_id><sourcerecordid>2925484195</sourcerecordid><originalsourceid>FETCH-LOGICAL-c384t-ee646190d1df5560345457a84d6819ca5e733c5061b2e3c1a99f4eb97180f70b3</originalsourceid><addsrcrecordid>eNp9kUuLFDEUhYMoTjv6B1xIwI2bKvNOSoRBGp0RBnQxrkMqdasnRT3aJDXS_94UPbpw4epuvnM49xyEXlNSU0LV-6F2A0w1I0zUhNWEmCdoRyVnlaGaPkU7ormslJb6Ar1IaSCEUiHFc3TBDRe0afgOdTfr5GYc5gzjGA4we8APENOa8P7e5er6-90H_Ot-wUeI_RKnhFvIGWJRYL_ECD6PJ-xHl1LoT2E-4KPLAeacNiLH4A5w9RI9692Y4NXjvUQ_vny-299Ut9-uv-4_3VaeG5ErACUUbUhHu15KRXgJK7UzolOGNt5J0Jx7SRRtGXBPXdP0AtpGU0N6TVp-id6dfY9x-blCynYKyZfH3AzLmixrmBSmfC4L-vYfdFjWOJd0lhOuDRFKmUKxM-XjklKE3h5jmFw8WUrstoEd7LaB3TawhNmyQRG9ebRe2wm6v5I_pRfg4xmA0sVDgGiTD1vzXdj6tN0S_uf_G8-clxw</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>3037804668</pqid></control><display><type>article</type><title>Human intelligence versus Chat-GPT: who performs better in correctly classifying patients in triage?</title><source>MEDLINE</source><source>Elsevier ScienceDirect Journals</source><creator>Zaboli, Arian ; Brigo, Francesco ; Sibilio, Serena ; Mian, Michael ; Turcato, Gianni</creator><creatorcontrib>Zaboli, Arian ; Brigo, Francesco ; Sibilio, Serena ; Mian, Michael ; Turcato, Gianni</creatorcontrib><description>Chat-GPT is rapidly emerging as a promising and potentially revolutionary tool in medicine. One of its possible applications is the stratification of patients according to the severity of clinical conditions and prognosis during the triage evaluation in the emergency department (ED).
Using a randomly selected sample of 30 vignettes recreated from real clinical cases, we compared the concordance in risk stratification of ED patients between healthcare personnel and Chat-GPT. The concordance was assessed with Cohen's kappa, and the performance was evaluated with the area under the receiver operating characteristic curve (AUROC) curves. Among the outcomes, we considered mortality within 72 h, the need for hospitalization, and the presence of a severe or time-dependent condition.
The concordance in triage code assignment between triage nurses and Chat-GPT was 0.278 (unweighted Cohen's kappa; 95% confidence intervals: 0.231–0.388). For all outcomes, the ROC values were higher for the triage nurses. The most relevant difference was found in 72-h mortality, where triage nurses showed an AUROC of 0.910 (0.757–1.000) compared to only 0.669 (0.153–1.000) for Chat-GPT.
The current level of Chat-GPT reliability is insufficient to make it a valid substitute for the expertise of triage nurses in prioritizing ED patients. Further developments are required to enhance the safety and effectiveness of AI for risk stratification of ED patients.</description><identifier>ISSN: 0735-6757</identifier><identifier>ISSN: 1532-8171</identifier><identifier>EISSN: 1532-8171</identifier><identifier>DOI: 10.1016/j.ajem.2024.02.008</identifier><identifier>PMID: 38341993</identifier><language>eng</language><publisher>United States: Elsevier Inc</publisher><subject>Advanced nurse practice ; Artificial intelligence ; ChatGPT ; Codes ; Emergency medical care ; Emergency Service, Hospital ; Hospitalization ; Humans ; Kappa coefficient ; Manchester triage system ; Medical personnel ; Mortality ; Nurses ; Nursing ; Patient safety ; Patients ; Performance evaluation ; Physicians ; Reproducibility of Results ; Triage</subject><ispartof>The American journal of emergency medicine, 2024-05, Vol.79, p.44-47</ispartof><rights>2024 Elsevier Inc.</rights><rights>Copyright © 2024 Elsevier Inc. All rights reserved.</rights><rights>2024. 
Elsevier Inc.</rights><lds50>peer_reviewed</lds50><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c384t-ee646190d1df5560345457a84d6819ca5e733c5061b2e3c1a99f4eb97180f70b3</citedby><cites>FETCH-LOGICAL-c384t-ee646190d1df5560345457a84d6819ca5e733c5061b2e3c1a99f4eb97180f70b3</cites></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://www.sciencedirect.com/science/article/pii/S0735675724000664$$EHTML$$P50$$Gelsevier$$H</linktohtml><link.rule.ids>314,776,780,3537,27901,27902,65306</link.rule.ids><backlink>$$Uhttps://www.ncbi.nlm.nih.gov/pubmed/38341993$$D View this record in MEDLINE/PubMed$$Hfree_for_read</backlink></links><search><creatorcontrib>Zaboli, Arian</creatorcontrib><creatorcontrib>Brigo, Francesco</creatorcontrib><creatorcontrib>Sibilio, Serena</creatorcontrib><creatorcontrib>Mian, Michael</creatorcontrib><creatorcontrib>Turcato, Gianni</creatorcontrib><title>Human intelligence versus Chat-GPT: who performs better in correctly classifying patients in triage?</title><title>The American journal of emergency medicine</title><addtitle>Am J Emerg Med</addtitle><description>Chat-GPT is rapidly emerging as a promising and potentially revolutionary tool in medicine. One of its possible applications is the stratification of patients according to the severity of clinical conditions and prognosis during the triage evaluation in the emergency department (ED).
Using a randomly selected sample of 30 vignettes recreated from real clinical cases, we compared the concordance in risk stratification of ED patients between healthcare personnel and Chat-GPT. The concordance was assessed with Cohen's kappa, and the performance was evaluated with the area under the receiver operating characteristic curve (AUROC) curves. Among the outcomes, we considered mortality within 72 h, the need for hospitalization, and the presence of a severe or time-dependent condition.
The concordance in triage code assignment between triage nurses and Chat-GPT was 0.278 (unweighted Cohen's kappa; 95% confidence intervals: 0.231–0.388). For all outcomes, the ROC values were higher for the triage nurses. The most relevant difference was found in 72-h mortality, where triage nurses showed an AUROC of 0.910 (0.757–1.000) compared to only 0.669 (0.153–1.000) for Chat-GPT.
The current level of Chat-GPT reliability is insufficient to make it a valid substitute for the expertise of triage nurses in prioritizing ED patients. Further developments are required to enhance the safety and effectiveness of AI for risk stratification of ED patients.</description><subject>Advanced nurse practice</subject><subject>Artificial intelligence</subject><subject>ChatGPT</subject><subject>Codes</subject><subject>Emergency medical care</subject><subject>Emergency Service, Hospital</subject><subject>Hospitalization</subject><subject>Humans</subject><subject>Kappa coefficient</subject><subject>Manchester triage system</subject><subject>Medical personnel</subject><subject>Mortality</subject><subject>Nurses</subject><subject>Nursing</subject><subject>Patient safety</subject><subject>Patients</subject><subject>Performance evaluation</subject><subject>Physicians</subject><subject>Reproducibility of Results</subject><subject>Triage</subject><issn>0735-6757</issn><issn>1532-8171</issn><issn>1532-8171</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2024</creationdate><recordtype>article</recordtype><sourceid>EIF</sourceid><sourceid>8G5</sourceid><sourceid>BENPR</sourceid><sourceid>GUQSH</sourceid><sourceid>M2O</sourceid><recordid>eNp9kUuLFDEUhYMoTjv6B1xIwI2bKvNOSoRBGp0RBnQxrkMqdasnRT3aJDXS_94UPbpw4epuvnM49xyEXlNSU0LV-6F2A0w1I0zUhNWEmCdoRyVnlaGaPkU7ormslJb6Ar1IaSCEUiHFc3TBDRe0afgOdTfr5GYc5gzjGA4we8APENOa8P7e5er6-90H_Ot-wUeI_RKnhFvIGWJRYL_ECD6PJ-xHl1LoT2E-4KPLAeacNiLH4A5w9RI9692Y4NXjvUQ_vny-299Ut9-uv-4_3VaeG5ErACUUbUhHu15KRXgJK7UzolOGNt5J0Jx7SRRtGXBPXdP0AtpGU0N6TVp-id6dfY9x-blCynYKyZfH3AzLmixrmBSmfC4L-vYfdFjWOJd0lhOuDRFKmUKxM-XjklKE3h5jmFw8WUrstoEd7LaB3TawhNmyQRG9ebRe2wm6v5I_pRfg4xmA0sVDgGiTD1vzXdj6tN0S_uf_G8-clxw</recordid><startdate>202405</startdate><enddate>202405</enddate><creator>Zaboli, Arian</creator><creator>Brigo, Francesco</creator><creator>Sibilio, Serena</creator><creator>Mian, Michael</creator><creator>Turcato, 
Gianni</creator><general>Elsevier Inc</general><general>Elsevier Limited</general><scope>CGR</scope><scope>CUY</scope><scope>CVF</scope><scope>ECM</scope><scope>EIF</scope><scope>NPM</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>3V.</scope><scope>7RV</scope><scope>7T5</scope><scope>7X7</scope><scope>7XB</scope><scope>88E</scope><scope>8FI</scope><scope>8FJ</scope><scope>8FK</scope><scope>8G5</scope><scope>ABUWG</scope><scope>AEUYN</scope><scope>AFKRA</scope><scope>AZQEC</scope><scope>BENPR</scope><scope>CCPQU</scope><scope>DWQXO</scope><scope>FYUFA</scope><scope>GHDGH</scope><scope>GNUQQ</scope><scope>GUQSH</scope><scope>H94</scope><scope>K9.</scope><scope>KB0</scope><scope>M0S</scope><scope>M1P</scope><scope>M2O</scope><scope>MBDVC</scope><scope>NAPCQ</scope><scope>PQEST</scope><scope>PQQKQ</scope><scope>PQUKI</scope><scope>Q9U</scope><scope>7X8</scope></search><sort><creationdate>202405</creationdate><title>Human intelligence versus Chat-GPT: who performs better in correctly classifying patients in triage?</title><author>Zaboli, Arian ; Brigo, Francesco ; Sibilio, Serena ; Mian, Michael ; Turcato, Gianni</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c384t-ee646190d1df5560345457a84d6819ca5e733c5061b2e3c1a99f4eb97180f70b3</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2024</creationdate><topic>Advanced nurse practice</topic><topic>Artificial intelligence</topic><topic>ChatGPT</topic><topic>Codes</topic><topic>Emergency medical care</topic><topic>Emergency Service, Hospital</topic><topic>Hospitalization</topic><topic>Humans</topic><topic>Kappa coefficient</topic><topic>Manchester triage system</topic><topic>Medical personnel</topic><topic>Mortality</topic><topic>Nurses</topic><topic>Nursing</topic><topic>Patient safety</topic><topic>Patients</topic><topic>Performance evaluation</topic><topic>Physicians</topic><topic>Reproducibility of 
Results</topic><topic>Triage</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Zaboli, Arian</creatorcontrib><creatorcontrib>Brigo, Francesco</creatorcontrib><creatorcontrib>Sibilio, Serena</creatorcontrib><creatorcontrib>Mian, Michael</creatorcontrib><creatorcontrib>Turcato, Gianni</creatorcontrib><collection>Medline</collection><collection>MEDLINE</collection><collection>MEDLINE (Ovid)</collection><collection>MEDLINE</collection><collection>MEDLINE</collection><collection>PubMed</collection><collection>CrossRef</collection><collection>ProQuest Central (Corporate)</collection><collection>Nursing & Allied Health Database</collection><collection>Immunology Abstracts</collection><collection>Health & Medical Collection</collection><collection>ProQuest Central (purchase pre-March 2016)</collection><collection>Medical Database (Alumni Edition)</collection><collection>Hospital Premium Collection</collection><collection>Hospital Premium Collection (Alumni Edition)</collection><collection>ProQuest Central (Alumni) (purchase pre-March 2016)</collection><collection>Research Library (Alumni Edition)</collection><collection>ProQuest Central (Alumni Edition)</collection><collection>ProQuest One Sustainability</collection><collection>ProQuest Central UK/Ireland</collection><collection>ProQuest Central Essentials</collection><collection>ProQuest Central</collection><collection>ProQuest One Community College</collection><collection>ProQuest Central Korea</collection><collection>Health Research Premium Collection</collection><collection>Health Research Premium Collection (Alumni)</collection><collection>ProQuest Central Student</collection><collection>Research Library Prep</collection><collection>AIDS and Cancer Research Abstracts</collection><collection>ProQuest Health & Medical Complete (Alumni)</collection><collection>Nursing & Allied Health Database (Alumni Edition)</collection><collection>Health & Medical Collection (Alumni 
Edition)</collection><collection>Medical Database</collection><collection>Research Library</collection><collection>Research Library (Corporate)</collection><collection>Nursing & Allied Health Premium</collection><collection>ProQuest One Academic Eastern Edition (DO NOT USE)</collection><collection>ProQuest One Academic</collection><collection>ProQuest One Academic UKI Edition</collection><collection>ProQuest Central Basic</collection><collection>MEDLINE - Academic</collection><jtitle>The American journal of emergency medicine</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Zaboli, Arian</au><au>Brigo, Francesco</au><au>Sibilio, Serena</au><au>Mian, Michael</au><au>Turcato, Gianni</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Human intelligence versus Chat-GPT: who performs better in correctly classifying patients in triage?</atitle><jtitle>The American journal of emergency medicine</jtitle><addtitle>Am J Emerg Med</addtitle><date>2024-05</date><risdate>2024</risdate><volume>79</volume><spage>44</spage><epage>47</epage><pages>44-47</pages><issn>0735-6757</issn><issn>1532-8171</issn><eissn>1532-8171</eissn><abstract>Chat-GPT is rapidly emerging as a promising and potentially revolutionary tool in medicine. One of its possible applications is the stratification of patients according to the severity of clinical conditions and prognosis during the triage evaluation in the emergency department (ED).
Using a randomly selected sample of 30 vignettes recreated from real clinical cases, we compared the concordance in risk stratification of ED patients between healthcare personnel and Chat-GPT. The concordance was assessed with Cohen's kappa, and the performance was evaluated with the area under the receiver operating characteristic curve (AUROC) curves. Among the outcomes, we considered mortality within 72 h, the need for hospitalization, and the presence of a severe or time-dependent condition.
The concordance in triage code assignment between triage nurses and Chat-GPT was 0.278 (unweighted Cohen's kappa; 95% confidence intervals: 0.231–0.388). For all outcomes, the ROC values were higher for the triage nurses. The most relevant difference was found in 72-h mortality, where triage nurses showed an AUROC of 0.910 (0.757–1.000) compared to only 0.669 (0.153–1.000) for Chat-GPT.
The current level of Chat-GPT reliability is insufficient to make it a valid substitute for the expertise of triage nurses in prioritizing ED patients. Further developments are required to enhance the safety and effectiveness of AI for risk stratification of ED patients.</abstract><cop>United States</cop><pub>Elsevier Inc</pub><pmid>38341993</pmid><doi>10.1016/j.ajem.2024.02.008</doi><tpages>4</tpages></addata></record> |
fulltext | fulltext |
identifier | ISSN: 0735-6757 |
ispartof | The American journal of emergency medicine, 2024-05, Vol.79, p.44-47 |
issn | 0735-6757 ; 1532-8171 |
language | eng |
recordid | cdi_proquest_miscellaneous_2925484195 |
source | MEDLINE; Elsevier ScienceDirect Journals |
subjects | Advanced nurse practice ; Artificial intelligence ; ChatGPT ; Codes ; Emergency medical care ; Emergency Service, Hospital ; Hospitalization ; Humans ; Kappa coefficient ; Manchester triage system ; Medical personnel ; Mortality ; Nurses ; Nursing ; Patient safety ; Patients ; Performance evaluation ; Physicians ; Reproducibility of Results ; Triage |
title | Human intelligence versus Chat-GPT: who performs better in correctly classifying patients in triage? |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-13T00%3A55%3A47IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Human%20intelligence%20versus%20Chat-GPT:%20who%20performs%20better%20in%20correctly%20classifying%20patients%20in%20triage?&rft.jtitle=The%20American%20journal%20of%20emergency%20medicine&rft.au=Zaboli,%20Arian&rft.date=2024-05&rft.volume=79&rft.spage=44&rft.epage=47&rft.pages=44-47&rft.issn=0735-6757&rft.eissn=1532-8171&rft_id=info:doi/10.1016/j.ajem.2024.02.008&rft_dat=%3Cproquest_cross%3E2925484195%3C/proquest_cross%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=3037804668&rft_id=info:pmid/38341993&rft_els_id=S0735675724000664&rfr_iscdi=true |
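
The abstract in this record names two statistics: unweighted Cohen's kappa for concordance between triage nurses and Chat-GPT, and AUROC for how well each rater's triage code separates patients with and without an outcome. The sketch below shows one way such values could be computed. It is not the authors' analysis code: the function names `cohen_kappa` and `auroc` are hypothetical, and every number in the toy lists is invented for illustration, with no relation to the study's 30 vignettes.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Unweighted Cohen's kappa for two equally long lists of category labels."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(rater_a)
    # Observed agreement: share of cases where both raters chose the same code.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of the two raters' marginal frequencies per code.
    count_a, count_b = Counter(rater_a), Counter(rater_b)
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used one identical code throughout
    return (observed - expected) / (1.0 - expected)


def auroc(scores, outcomes):
    """AUROC as the probability that a positive case outranks a negative one."""
    positives = [s for s, y in zip(scores, outcomes) if y == 1]
    negatives = [s for s, y in zip(scores, outcomes) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative outcome")
    # Count concordant pairs; ties between a positive and a negative count as 0.5.
    wins = sum(
        1.0 if p > q else 0.5 if p == q else 0.0
        for p in positives
        for q in negatives
    )
    return wins / (len(positives) * len(negatives))


# Invented toy data (NOT from the study): triage codes for six vignettes.
nurse_codes = [1, 2, 2, 3, 3, 4]
model_codes = [1, 2, 3, 3, 4, 4]
# Invented urgency scores (higher = more urgent) and a binary outcome.
urgency = [4, 3, 3, 2, 1, 1]
outcome = [1, 1, 0, 0, 0, 0]

print(cohen_kappa(nurse_codes, model_codes))
print(auroc(urgency, outcome))
```

The pairwise formulation of AUROC is chosen here because it stays readable for a small sample of vignettes; for larger data sets a rank-based computation or an established statistics library would be the more usual choice. Confidence intervals such as those reported in the abstract are not computed by this sketch.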