Managing What We Can Measure: Quantifying the Susceptibility of Automated Scoring Systems to Gaming Behavior

Bibliographic details
Published in:	Educational measurement, issues and practice, 2014-09, Vol.33 (3), p.36-46
Main authors:	Higgins, Derrick; Heilman, Michael
Format:	Article
Language:	eng
Subjects:	artificial intelligence; automated scoring; Automation; Behavior; constructed response; Educational evaluation; Evaluation Research; Games; Literature Reviews; machine learning; Responses; Risk; Scoring; simulation; Test Wiseness
Online access:	Full text

Description
Summary:	As methods for automated scoring of constructed‐response items become more widely adopted in state assessments, and are used in more consequential operational configurations, it is critical that their susceptibility to gaming behavior be investigated and managed. This article provides a review of research relevant to how construct‐irrelevant response behavior may affect automated constructed‐response scoring, and aims to address a gap in that literature: the need to assess the degree of risk before operational launch. A general framework is proposed for evaluating susceptibility to gaming, and an initial empirical demonstration is presented using the open‐source short‐answer scoring engines from the Automated Student Assessment Prize (ASAP) Challenge.
ISSN:	0731-1745 1745-3992
DOI:	10.1111/emip.12036