Communicative competence of generative artificial intelligence in responding to patient queries about colorectal cancer surgery
Published in: International Journal of Colorectal Disease, 2024-06, Vol. 39 (1), p. 94, Article 94
Format: Article
Language: English
Abstract
Purpose
To examine the ability of generative artificial intelligence (GAI) to answer patients’ questions regarding colorectal cancer (CRC).
Methods
Ten clinically relevant questions about CRC were selected from top-rated hospitals’ websites and patient surveys and presented to three GAI tools (Chatbot Generative Pre-Trained Transformer [GPT-4], Google Bard, and CLOVA X). Their responses were compared with answers from the CRC information book. Responses were evaluated by two groups of five evaluators each: healthcare professionals (HCPs) and patients. Each question was scored on a 1–5 Likert scale across four evaluation criteria (maximum score, 20 points per question).
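As a minimal illustration of this scoring scheme (not taken from the study's materials), the Python sketch below sums one evaluator's four 1–5 criterion ratings into a per-question score capped at 20 points and averages those totals per information source; all ratings shown are hypothetical placeholders.

```python
import statistics

CRITERIA = 4           # four evaluation criteria per question
MAX_PER_QUESTION = 20  # 4 criteria x 5-point Likert scale

def question_score(ratings):
    """Sum one evaluator's four 1-5 ratings for a single question."""
    assert len(ratings) == CRITERIA and all(1 <= r <= 5 for r in ratings)
    return sum(ratings)

# Hypothetical ratings: {source: one evaluator's (c1, c2, c3, c4) per question}
ratings_by_source = {
    "GPT-4":       [(4, 3, 4, 3), (3, 4, 3, 4)],
    "Google Bard": [(3, 3, 3, 3), (3, 3, 4, 3)],
}

for source, per_question in ratings_by_source.items():
    totals = [question_score(r) for r in per_question]
    print(source, round(statistics.mean(totals), 1), "/", MAX_PER_QUESTION)
```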
Results
In the analysis including only HCPs, the information book scored 11.8 ± 1.2, GPT-4 scored 13.5 ± 1.1, Google Bard scored 11.5 ± 0.7, and CLOVA X scored 12.2 ± 1.4 (P = 0.001). The score of GPT-4 was significantly higher than those of the information book (P = 0.020) and Google Bard (P = 0.001). In the analysis including only patients, the information book scored 14.1 ± 1.4, GPT-4 scored 15.2 ± 1.8, Google Bard scored 15.5 ± 1.8, and CLOVA X scored 14.4 ± 1.8, without significant differences (P = 0.234). When both groups of evaluators were included, the information book scored 13.0 ± 0.9, GPT-4 scored 14.4 ± 1.2, Google Bard scored 13.5 ± 1.0, and CLOVA X scored 13.3 ± 1.5 (P = 0.070).
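The abstract does not state which statistical test produced these P values; the sketch below is a hedged illustration assuming a one-way ANOVA with a pairwise follow-up test on per-question scores, run on simulated numbers that merely mimic the reported HCP-only means and standard deviations.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
# Simulated per-question totals (10 questions) per source; not real study data.
scores = {
    "Information book": rng.normal(11.8, 1.2, 10),
    "GPT-4":            rng.normal(13.5, 1.1, 10),
    "Google Bard":      rng.normal(11.5, 0.7, 10),
    "CLOVA X":          rng.normal(12.2, 1.4, 10),
}

# Overall comparison across the four information sources.
f_stat, p_overall = stats.f_oneway(*scores.values())
print(f"One-way ANOVA: F = {f_stat:.2f}, P = {p_overall:.3f}")

# Example pairwise follow-up: GPT-4 vs. the information book.
t_stat, p_pair = stats.ttest_ind(scores["GPT-4"], scores["Information book"])
print(f"GPT-4 vs. information book: t = {t_stat:.2f}, P = {p_pair:.3f}")
```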
Conclusion
The three GAIs demonstrated communicative competence similar to or better than that of the information book for questions related to CRC surgery in Korean. If high-quality medical information generated by GAI is properly supervised by HCPs and published as an information book, it could help patients obtain accurate information and make informed decisions.
ISSN: 0179-1958 (print); 1432-1262 (electronic)
DOI: 10.1007/s00384-024-04670-3