Applicability of Online Chat-Based Artificial Intelligence Models to Colorectal Cancer Screening

Bibliographic Details
Published in: Digestive Diseases and Sciences, 2024-03, Vol. 69 (3), p. 791-797
Main Authors: Atarere, Joseph; Naqvi, Haider; Haas, Christopher; Adewunmi, Comfort; Bandaru, Sumanth; Allamneni, Rakesh; Ugonabo, Onyinye; Egbo, Olachi; Umoren, Mfoniso; Kanth, Priyanka
Format: Article
Language: English
Online Access: Full text
Description
Summary:

Background: Over the past year, studies have shown potential for the application of ChatGPT in various medical specialties, including cardiology and oncology. However, the application of ChatGPT and other online chat-based artificial intelligence (AI) models to patient education and patient-physician communication on colorectal cancer (CRC) screening has not been critically evaluated, which is what we aimed to do in this study.

Methods: We posed 15 questions on important CRC screening concepts and 5 common questions asked by patients to the 3 most commonly used freely available AI models. The responses provided by the AI models were graded for appropriateness and reliability using American College of Gastroenterology guidelines. The responses to each question provided by an AI model were graded as reliably appropriate (RA), reliably inappropriate (RI), or unreliable. Grader assessments were validated by the joint probability of agreement for two raters.

Results: ChatGPT and YouChat™ provided RA responses to the questions posed more often than BingChat. There were two questions to which more than one AI model provided unreliable responses. ChatGPT did not provide references. BingChat misinterpreted some of the information it referenced. The recommended age to begin CRC screening provided by YouChat™ was not consistently up-to-date. Inter-rater reliability for the two raters was 89.2%.

Conclusion: Most responses provided by the AI models on CRC screening were appropriate. However, some limitations exist in their ability to correctly interpret medical literature and to provide up-to-date information. Patients should consult their physicians for context on the recommendations made by these AI models.
ISSN: 0163-2116 (print); 1573-2568 (electronic)
DOI: 10.1007/s10620-024-08274-3
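
The validation metric named in the abstract, the joint probability of agreement for two raters, is simply the fraction of items that both raters graded identically. A minimal sketch of that computation follows, using the study's RA/RI/unreliable grading scheme; the function name and the example grade vectors are illustrative assumptions, not data from the paper.

```python
# Minimal sketch (illustrative, not from the study): joint probability of
# agreement for two raters, i.e., the fraction of items on which both
# raters assigned the same grade.

def joint_agreement(rater_a: list[str], rater_b: list[str]) -> float:
    """Return the share of items on which both raters agree."""
    if len(rater_a) != len(rater_b):
        raise ValueError("both raters must grade the same items")
    matches = sum(a == b for a, b in zip(rater_a, rater_b))
    return matches / len(rater_a)

# Hypothetical grades for the 20 questions (15 screening-concept questions
# plus 5 common patient questions). These values are made up for the example.
rater_1 = ["RA"] * 17 + ["RI", "unreliable", "RA"]
rater_2 = ["RA"] * 16 + ["RI", "RI", "unreliable", "RA"]

print(f"Joint probability of agreement: {joint_agreement(rater_1, rater_2):.1%}")
```

With the hypothetical vectors above this prints 95.0%; the paper reports 89.2% agreement across its graded responses.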