ChatGPT4's proficiency in addressing patients' questions on systemic lupus erythematosus: a blinded comparative study with specialists

Bibliographic Details
Published in: Rheumatology (Oxford, England), 2024-09, Vol. 63 (9), p. 2450-2456
Main authors: Xu, Dan; Zhao, Jinxia; Liu, Rui; Dai, Yijun; Sun, Kai; Wong, Priscilla; Ming, Samuel Lee Shang; Wearn, Koh Li; Wang, Jiangyuan; Xie, Shasha; Zeng, Lin; Mu, Rong; Xu, Chuanhui
Format: Article
Language: English
Online access: Full text
Description
Abstract: The efficacy of artificial intelligence (AI)-driven chatbots like ChatGPT4 in specialized medical consultations, particularly in rheumatology, remains underexplored. This study compares the proficiency of ChatGPT4's responses with those of practicing rheumatologists to inquiries from patients with SLE. In this cross-sectional study, we curated 95 frequently asked questions (FAQs), including 55 in Chinese and 40 in English. Responses to the FAQs from ChatGPT4 and from five rheumatologists were scored separately by a panel of rheumatologists and a group of patients with SLE across six domains (scientific validity, logical consistency, comprehensibility, completeness, satisfaction level and empathy) on a 0-10 scale (a score of 0 indicates entirely incorrect responses, while 10 indicates accurate and comprehensive answers). Rheumatologists' scoring revealed that ChatGPT4-generated responses outperformed those from rheumatologists in satisfaction level and empathy, with mean differences of 0.537 (95% CI, 0.252-0.823; P
ISSN: 1462-0324
1462-0332
DOI: 10.1093/rheumatology/keae238