Very‐short‐answer questions: reliability, discrimination and acceptability
Published in: | Medical education 2018-04, Vol.52 (4), p.447-455 |
---|---|
Main authors: | , , , , , , , |
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Full text |
Abstract: | Context
Single‐best‐answer questions (SBAQs) have been widely used to test knowledge because they are easy to mark and demonstrate high reliability. However, SBAQs have been criticised for being subject to cueing.
Objectives
We used a novel assessment tool that facilitates efficient marking of open‐ended very‐short‐answer questions (VSAQs). We compared VSAQs with SBAQs with regard to reliability, discrimination and student performance, and evaluated the acceptability of VSAQs.
Methods
Medical students were randomised to sit a 60‐question assessment administered in either VSAQ and then SBAQ format (Group 1, n = 155) or the reverse (Group 2, n = 144). The VSAQs were delivered on a tablet; responses were computer‐marked and subsequently reviewed by two examiners. The standard error of measurement (SEM) across the ability spectrum was estimated using item response theory.
Results
The review of machine‐marked questions took an average of 1 minute, 36 seconds per question for all students. The VSAQs had high reliability (alpha: 0.91), a significantly lower SEM than the SBAQs (p < 0.001) and higher mean item–total point biserial correlations (p < 0.001). The VSAQ scores were significantly lower than the SBAQ scores (p < 0.001). The difference in scores between VSAQs and SBAQs was attenuated in Group 2. Although 80.4% of students found the VSAQs more difficult, 69.2% found them more authentic.
Conclusions
The VSAQ format demonstrated high reliability and discrimination and items were perceived as more authentic. The SBAQ format was associated with significant cueing. The present results suggest the VSAQ format has a higher degree of validity.
The authors present a novel method of marking short‐answer questions and demonstrate high reliability, discrimination and acceptability. |
---|---|
ISSN: | 0308-0110 1365-2923 |
DOI: | 10.1111/medu.13504 |
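
The abstract reports two classical test statistics: Cronbach's alpha (reliability) and item–total point biserial correlations (discrimination). The following is a minimal sketch of how such statistics are commonly computed from a matrix of dichotomous item marks. It is not the authors' code, the function names and the toy score matrix are hypothetical, and the item response theory estimate of the SEM used in the study is not covered here.

```python
from math import sqrt
from statistics import mean, pvariance


def cronbach_alpha(scores):
    """Cronbach's alpha for a list of rows (one row per student, one 0/1 mark per item)."""
    k = len(scores[0])
    # Variance of each item across students.
    item_vars = [pvariance([row[i] for row in scores]) for i in range(k)]
    # Variance of the students' total scores.
    total_var = pvariance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


def point_biserial(scores, item):
    """Pearson correlation between one 0/1 item and the (uncorrected) total score."""
    xs = [row[item] for row in scores]
    ys = [sum(row) for row in scores]
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


# Hypothetical toy data: 4 students, 3 items (illustration only, not study data).
toy_scores = [
    [1, 1, 1],
    [1, 1, 0],
    [1, 0, 0],
    [0, 0, 0],
]

alpha = cronbach_alpha(toy_scores)
r_item1 = point_biserial(toy_scores, 1)
```

In this sketch the item is included in the total score it is correlated with. A corrected variant would subtract the item from the total first, and the abstract does not say which variant the study used.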