Evaluation of the Usability of ChatGPT-4 and Google Gemini in Patient Education About Rhinosinusitis

Bibliographic details
Published in:	Clinical otolaryngology 2025-01
Main authors:	Becerik, Çağrı; Yıldız, Selçuk; Tepe Karaca, Çiğdem; Toros, Sema Zer
Format:	Article
Language:	eng
Online access:	Full text

Description
Summary:	Chatbots based on artificial intelligence (AI) are increasingly used for patient education about common diseases, in health care as in every other field. This study aims to evaluate and compare patient education materials on rhinosinusitis generated by two frequently used chatbots, ChatGPT-4 and Google Gemini. One hundred nine questions taken from patient information websites were divided into four categories (general knowledge, diagnosis, treatment, and surgery and complications) and then put to the chatbots. The answers were evaluated by two expert otolaryngologists; where their scores differed, a third, more experienced otolaryngologist finalised the evaluation. Answers were scored from 1 to 4: (1) comprehensive/correct, (2) incomplete/partially correct, (3) accurate and inaccurate data, potentially misleading and (4) completely inaccurate/irrelevant. For ChatGPT-4, all answers in the diagnosis category were rated comprehensive/correct. For Google Gemini, answers rated completely inaccurate/irrelevant were statistically significantly more frequent in the treatment category, and answers rated incomplete/partially correct were statistically significantly more frequent in the surgery and complications category. In the comparison between the two chatbots, ChatGPT-4 had a statistically significantly higher rate of correct evaluations than Google Gemini in the treatment category. The answers given by ChatGPT-4 and Google Gemini regarding rhinosinusitis were evaluated as sufficient and informative.
ISSN:	1749-4478, 1749-4486
DOI:	10.1111/coa.14273
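
The scoring procedure described in the summary (two raters per answer, with a third rater finalising the score when they disagree) can be illustrated with a minimal sketch. The function names `final_score` and `tally` are hypothetical, and the assumption that the third rater's score simply replaces the two disagreeing scores is an interpretation of the summary, not a detail confirmed by the study.

```python
from collections import Counter
from typing import Dict, Iterable, Optional

# The four-point scale as listed in the summary.
SCORE_LABELS: Dict[int, str] = {
    1: "comprehensive/correct",
    2: "incomplete/partially correct",
    3: "accurate and inaccurate data, potentially misleading",
    4: "completely inaccurate/irrelevant",
}


def final_score(rater_a: int, rater_b: int, adjudicator: Optional[int] = None) -> int:
    """Return the final score for one answer.

    Assumption: if the two raters agree, their score stands; otherwise the
    third rater's score is taken as final.
    """
    for score in (rater_a, rater_b):
        if score not in SCORE_LABELS:
            raise ValueError(f"score must be 1-4, got {score}")
    if rater_a == rater_b:
        return rater_a
    if adjudicator is None:
        raise ValueError("raters disagree; a third rater's score is required")
    if adjudicator not in SCORE_LABELS:
        raise ValueError(f"score must be 1-4, got {adjudicator}")
    return adjudicator


def tally(scores: Iterable[int]) -> Dict[str, int]:
    """Count how many answers received each label (zero counts included)."""
    counts = Counter(scores)
    return {label: counts.get(score, 0) for score, label in SCORE_LABELS.items()}
```

For example, `final_score(2, 3, adjudicator=3)` resolves a disagreement in favour of the third rater, and `tally` turns a list of final scores for one category into per-label counts that could then be compared between chatbots.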