Communicative competence of generative artificial intelligence in responding to patient queries about colorectal cancer surgery

Bibliographic Details
Published in: International Journal of Colorectal Disease, 2024-06, Vol. 39 (1), p. 94, Article 94
Authors: Jo, Min Hyeong; Kim, Min-Jun; Oh, Heung-Kwon; Choi, Mi Jeong; Shin, Hye-Rim; Lee, Tae-Gyun; Ahn, Hong-min; Kim, Duck-Woo; Kang, Sung-Bum
Format: Article
Language: English
Online access: Full text
Description
Abstract:
Purpose: To examine the ability of generative artificial intelligence (GAI) to answer patients' questions regarding colorectal cancer (CRC).
Methods: Ten clinically relevant questions about CRC were selected from top-rated hospitals' websites and patient surveys and presented to three GAI tools (Chatbot Generative Pre-Trained Transformer [GPT-4], Google Bard, and CLOVA X). Their responses were compared with answers from the CRC information book. Responses were evaluated by two groups of five evaluators each: healthcare professionals (HCPs) and patients. Each question was scored on a 1-5 Likert scale against four evaluation criteria, for a maximum score of 20 points per question.
Results: Among HCPs only, the information book scored 11.8 ± 1.2, GPT-4 scored 13.5 ± 1.1, Google Bard scored 11.5 ± 0.7, and CLOVA X scored 12.2 ± 1.4 (P = 0.001). GPT-4 scored significantly higher than the information book (P = 0.020) and Google Bard (P = 0.001). Among patients only, the information book scored 14.1 ± 1.4, GPT-4 scored 15.2 ± 1.8, Google Bard scored 15.5 ± 1.8, and CLOVA X scored 14.4 ± 1.8, with no significant differences (P = 0.234). With both evaluator groups combined, the information book scored 13.0 ± 0.9, GPT-4 scored 14.4 ± 1.2, Google Bard scored 13.5 ± 1.0, and CLOVA X scored 13.3 ± 1.5 (P = 0.070).
Conclusion: The three GAIs demonstrated communicative competence similar to or better than the information book for questions related to CRC surgery in Korean. If high-quality medical information generated by GAI is properly supervised by HCPs and published as an information book, it could help patients obtain accurate information and make informed decisions.
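The abstract reports group means with standard deviations, an omnibus P-value, and pairwise follow-up comparisons, but it does not name the statistical tests used. The sketch below is a minimal, hypothetical illustration of that kind of analysis pipeline, assuming a one-way ANOVA for the omnibus test and Welch's t-tests for the pairwise follow-ups; the score arrays are invented placeholders, not the study's data.

```python
# Hypothetical sketch of the scoring analysis described in the abstract.
# The study does not name its tests; one-way ANOVA and Welch's t-tests
# are assumed here purely for illustration. Scores are placeholders.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

def total_score(n_questions=10, n_criteria=4):
    """Simulate one evaluator's per-question totals.

    Each of the 10 questions is scored on four 1-5 Likert criteria,
    so a question's total ranges from 4 to 20 points.
    """
    return rng.integers(1, 6, size=(n_questions, n_criteria)).sum(axis=1)

# Placeholder per-question mean scores for the four information sources,
# pooled over five hypothetical evaluators each.
sources = {
    "information book": np.mean([total_score() for _ in range(5)], axis=0),
    "GPT-4":            np.mean([total_score() for _ in range(5)], axis=0),
    "Google Bard":      np.mean([total_score() for _ in range(5)], axis=0),
    "CLOVA X":          np.mean([total_score() for _ in range(5)], axis=0),
}

# Mean +/- SD per source, the format the abstract reports.
for name, scores in sources.items():
    print(f"{name}: {scores.mean():.1f} +/- {scores.std(ddof=1):.1f}")

# Omnibus comparison across the four sources (assumed one-way ANOVA).
f_stat, p_omnibus = stats.f_oneway(*sources.values())
print(f"omnibus P = {p_omnibus:.3f}")

# Pairwise follow-up, e.g. GPT-4 vs. the information book
# (assumed Welch's t-test; the paper's post-hoc method is unspecified).
t_stat, p_pair = stats.ttest_ind(
    sources["GPT-4"], sources["information book"], equal_var=False
)
print(f"GPT-4 vs. information book: P = {p_pair:.3f}")
```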
ISSN: 1432-1262, 0179-1958
DOI: 10.1007/s00384-024-04670-3