Evaluation of an Artificial Intelligence Chatbot for Delivery of IR Patient Education Material: A Comparison with Societal Website Content

Bibliographic Details
Published in: Journal of Vascular and Interventional Radiology, 2023-10, Vol. 34 (10), p. 1760-1768.e32
Main authors: McCarthy, Colin J.; Berkowitz, Seth; Ramalingam, Vijay; Ahmed, Muneeb
Format: Article
Language: English
Online access: Full text

Abstract: To assess the accuracy, completeness, and readability of patient educational material produced by a machine learning model and to compare the output with that provided by a societal website. Content from the Society of Interventional Radiology Patient Center website was retrieved, categorized, and organized into discrete questions. These questions were entered into the ChatGPT platform, and the output was analyzed for word and sentence counts, readability using multiple validated scales, factual correctness, and suitability for patient education using the Patient Education Materials Assessment Tool for Printable Materials (PEMAT-P) instrument. A total of 21,154 words were analyzed, including 7,917 words from the website and 13,377 words representing the total output of the ChatGPT platform across 22 text passages. Compared with the societal website, output from the ChatGPT platform was longer and more difficult to read on 4 of 5 readability scales. The ChatGPT output was incorrect for 12 (11.5%) of 104 questions. When reviewed using the PEMAT-P tool, the ChatGPT content scored lower than the website material. Content from both the website and ChatGPT was significantly above the recommended fifth- or sixth-grade level for patient education, with a mean Flesch-Kincaid grade level of 11.1 (±1.3) for the website and 11.9 (±1.6) for the ChatGPT content. The ChatGPT platform may produce incomplete or inaccurate patient educational content, and providers should be familiar with the limitations of the system in its current form. Opportunities may exist to fine-tune existing large language models, which could be optimized for the delivery of patient educational content.
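
One of the validated readability scales the abstract cites, the Flesch-Kincaid grade level, is a simple function of average sentence length and average syllables per word: FKGL = 0.39 × (words/sentences) + 11.8 × (syllables/words) − 15.59. Below is a minimal Python sketch of that computation; the syllable counter is a crude vowel-group heuristic for illustration only, not the validated tooling the study used, and the sample sentence is a hypothetical patient-education snippet.

```python
# Minimal sketch of the Flesch-Kincaid grade-level (FKGL) score.
import re

def count_syllables(word: str) -> int:
    """Approximate syllable count as the number of vowel groups (heuristic)."""
    count = len(re.findall(r"[aeiouy]+", word.lower()))
    # Roughly discount a silent trailing 'e' (e.g., "device", "scale").
    if word.lower().endswith("e") and count > 1:
        count -= 1
    return max(count, 1)

def flesch_kincaid_grade(text: str) -> float:
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    syllables = sum(count_syllables(w) for w in words)
    # Standard FKGL formula: 0.39*(words/sentence) + 11.8*(syllables/word) - 15.59
    return (0.39 * (len(words) / len(sentences))
            + 11.8 * (syllables / len(words))
            - 15.59)

sample = ("An inferior vena cava filter is a small device placed in a "
          "large vein to stop blood clots from reaching the lungs.")
print(round(flesch_kincaid_grade(sample), 1))
```
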
DOI: 10.1016/j.jvir.2023.05.037
ISSN: 1051-0443
EISSN: 1535-7732
Publisher: Elsevier Inc
Source: Elsevier ScienceDirect Journals