Interjudge Reliability and Decision Reproducibility


Bibliographic Details
Published in: Educational and Psychological Measurement, 1994-12, Vol. 54 (4), pp. 913-925
Main authors: Lunz, Mary E.; Stahl, John A.; Wright, Benjamin D.
Format: Article
Language: English
Description
Summary: The purpose of this article is to discuss the importance of decision reproducibility for performance assessments. When decisions from two judges about a student's performance using comparable tasks correlate, decisions have been considered reproducible. However, when judges differ in expectations and tasks differ in difficulty, decisions may not be independent of the particular judges or tasks encountered unless appropriate adjustments for the observable differences are made. In this study, data were analyzed with the Facets model and provided evidence that judges grade differently, whether or not the scores given correlate well. This outcome suggests that adjustments for differences among judge severities should be made before student measures are estimated to produce reproducible decisions for certification, achievement, or promotion.
ISSN: 0013-1644, 1552-3888
DOI: 10.1177/0013164494054004007
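
Illustration (not from the article): the summary's central point, that two judges' scores can correlate well while their decisions still differ, can be shown with a minimal Python sketch. The ratings, the pass mark CUT, and all variable names are hypothetical. The mean-shift adjustment at the end is a deliberately crude stand-in for a severity correction; it is not the Facets model used in the study, and it only makes sense here because both hypothetical judges rate the same five students.

```python
from statistics import mean


def pearson(x, y):
    """Pearson correlation between two equal-length score lists."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


# Hypothetical ratings of the same five students by two judges.
# The severe judge gives every student exactly 2 points less.
lenient = [5, 6, 7, 8, 9]
severe = [s - 2 for s in lenient]

CUT = 6  # hypothetical pass mark, applied to raw scores

# The scores correlate perfectly ...
r = pearson(lenient, severe)

# ... yet the pass/fail decisions depend on which judge a student met.
pass_lenient = [s >= CUT for s in lenient]
pass_severe = [s >= CUT for s in severe]
disagreements = sum(a != b for a, b in zip(pass_lenient, pass_severe))

# Crude severity adjustment (an assumption of this sketch, not the
# article's method): shift the severe judge's scores by the gap
# between the two judges' mean ratings, then re-apply the pass mark.
severity_gap = mean(lenient) - mean(severe)
adjusted_severe = [s + severity_gap for s in severe]
pass_adjusted = [s >= CUT for s in adjusted_severe]

print(f"correlation: {r:.2f}")
print(f"decisions that differ before adjustment: {disagreements}")
print(f"decisions agree after adjustment: {pass_adjusted == pass_lenient}")
```

In this toy data, correlation alone would suggest the judges are interchangeable, while the raw pass/fail decisions show otherwise; removing the severity gap before applying the pass mark restores agreement.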