ChatGPT Generated Otorhinolaryngology Multiple‐Choice Questions: Quality, Psychometric Properties, and Suitability for Assessments

Bibliographic Details
Published in:	OTO open : the official open access journal of the American Academy of Otolaryngology--Head and Neck Surgery Foundation, 2024-07, Vol. 8 (3), p. e70018
Main Authors:	Lotto, Cecilia; Sheppard, Sean C.; Anschuetz, Wilma; Stricker, Daniel; Molinari, Giulia; Huwendiek, Sören; Anschuetz, Lukas
Format:	Article
Language:	English
Subjects:	artificial intelligence; ChatGPT; large language model; multiple choice question; Original Research; otolaryngology
Online Access:	Full text

Description
Summary:	Objective: To explore Chat Generative Pretrained Transformer's (ChatGPT's) capability to create multiple‐choice questions about otorhinolaryngology (ORL). Study Design: Experimental question generation and exam simulation. Setting: Tertiary academic center. Methods: ChatGPT 3.5 was prompted: “Can you please create a challenging 20‐question multiple‐choice questionnaire about clinical cases in otolaryngology, offering five answer options?” The generated questionnaire was sent to medical students, residents, and consultants. The questions were assessed against quality criteria. Answers were anonymized, and the resulting data were analyzed in terms of difficulty and internal consistency. Results: ChatGPT 3.5 generated 20 exam questions, of which 1 was considered off‐topic, 3 had a false answer, and 3 had multiple correct answers. The subspecialty distribution was as follows: 5 questions on otology, 5 on rhinology, and 10 on head and neck. Focus and relevance were good, while vignette and distractor quality was low. The level of difficulty was suitable for undergraduate medical students (n = 24), but too easy for residents (n = 30) or consultants (n = 10) in ORL. Cronbach's α was highest (.69) with 15 selected questions using students' results. Conclusion: ChatGPT 3.5 is able to generate grammatically correct, simple ORL multiple‐choice questions at a medical student level. However, the overall quality of the questions was average, and they need thorough review and revision by a medical expert to ensure suitability for future exams.
ISSN:	2473-974X
DOI:	10.1002/oto2.70018
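
Note on the psychometric terms in the summary: the abstract reports item difficulty and internal consistency (Cronbach's α) but does not say how they were computed. The following is a minimal sketch of the standard textbook definitions, assuming each answer is scored 1 (correct) or 0 (incorrect). The function names and the toy score matrix are illustrative assumptions and are not taken from the study.

```python
from statistics import variance


def item_difficulty(scores):
    """Proportion of respondents who answered each item correctly.

    `scores` is a list of rows, one per respondent, each row holding
    0/1 values, one per question. Higher values mean easier items.
    """
    n_respondents = len(scores)
    n_items = len(scores[0])
    return [sum(row[j] for row in scores) / n_respondents
            for j in range(n_items)]


def cronbach_alpha(scores):
    """Cronbach's alpha: k/(k-1) * (1 - sum(item variances) / variance(total scores))."""
    n_items = len(scores[0])
    item_variances = [variance([row[j] for row in scores])
                      for j in range(n_items)]
    total_variance = variance([sum(row) for row in scores])
    return n_items / (n_items - 1) * (1 - sum(item_variances) / total_variance)


# Toy data, invented for illustration: 4 respondents x 3 questions.
toy_scores = [
    [1, 1, 1],
    [1, 1, 0],
    [1, 0, 0],
    [0, 0, 0],
]

print(item_difficulty(toy_scores))
print(round(cronbach_alpha(toy_scores), 2))
```

In this kind of analysis, dropping items that correlate poorly with the total score can raise α, which is consistent with the summary reporting its highest value for a 15‐question subset rather than for all 20 questions.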