Leveraging large language models to construct feedback from medical multiple-choice Questions

Bibliographic details
Published in: Scientific reports 2024-11, Vol.14 (1), p.27910-14, Article 27910
Main authors: Tomova, Mihaela; Roselló Atanet, Iván; Sehy, Victoria; Sieg, Miriam; März, Maren; Mäder, Patrick
Format: Article
Language: English
Online access: Full text
Description
Summary: Exams like the formative Progress Test Medizin can enhance their effectiveness by offering feedback beyond numerical scores. Content-based feedback, which encompasses relevant information from exam questions, can be valuable for students by offering them insight into their performance on the current exam, as well as serving as study aids and tools for revision. Our goal was to utilize Large Language Models (LLMs) in preparing content-based feedback for the Progress Test Medizin and evaluate their effectiveness in this task. We utilize two popular LLMs and conduct a comparative assessment by performing textual similarity on the generated outputs. Furthermore, we study via a survey how medical practitioners and medical educators assess the capabilities of LLMs and perceive the usage of LLMs for the task of generating content-based feedback for PTM exams. Our findings show that both examined LLMs performed similarly. Both have their own advantages and disadvantages. Our survey results indicate that one LLM produces slightly better outputs; however, this comes at a cost since it is a paid service, while the other is free to use. Overall, medical practitioners and educators who participated in the survey find the generated feedback relevant and useful, and they are open to using LLMs for such tasks in the future. We conclude that while the content-based feedback generated by the LLM may not be perfect, it nevertheless can be considered a valuable addition to the numerical feedback currently provided.
ISSN: 2045-2322
DOI: 10.1038/s41598-024-79245-x