Managing What We Can Measure: Quantifying the Susceptibility of Automated Scoring Systems to Gaming Behavior
Published in: | Educational Measurement: Issues and Practice, 2014-09, Vol.33 (3), p.36-46 |
---|---|
Main authors: | , |
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Full text |
Abstract: | As methods for automated scoring of constructed‐response items become more widely adopted in state assessments, and are used in more consequential operational configurations, it is critical that their susceptibility to gaming behavior be investigated and managed. This article provides a review of research relevant to how construct‐irrelevant response behavior may affect automated constructed‐response scoring, and aims to address a gap in that literature: the need to assess the degree of risk before operational launch. A general framework is proposed for evaluating susceptibility to gaming, and an initial empirical demonstration is presented using the open‐source short‐answer scoring engines from the Automated Student Assessment Prize (ASAP) Challenge. |
ISSN: | 0731-1745, 1745-3992 |
DOI: | 10.1111/emip.12036 |