A Graphical Method for Displaying the Model Fit of Item Response Theory Trace Lines

Bibliographic details
Published in: Educational and Psychological Measurement, 2019-12, Vol. 79 (6), pp. 1064-1074
Author: Kalinowski, Steven T.
Format: Article
Language: English
Description
Summary: Item response theory (IRT) is a statistical paradigm for developing educational tests and assessing students. IRT, however, currently lacks an established graphical method for examining model fit for the three-parameter logistic model, the most flexible and popular IRT model in educational testing. A method is presented here to do this. The graph, which is referred to herein as a “bin plot,” is the IRT equivalent of a scatterplot for linear regression. Bin plots display a conventional IRT trace line (with ability on the horizontal axis and probability correct on the vertical axis). Students are binned according to how well they performed on the entire test, and the proportion of students in each bin who answered the focal question correctly is displayed on the graph as points above or below the trace line. With this arrangement, the difference between each point and the trace line is the residual for the bin. Confidence intervals can be added to the observed proportions in order to display uncertainty. Computer simulations were used to test four alternative ways for binning students. These simulations showed that binning students according to number of questions they answered correctly on the entire test works best. Simulations also showed confidence intervals for bin plots had coverage probabilities close to nominal values for common testing scenarios, but that there are scenarios in which confidence intervals had inflated error rates.
ISSN: 0013-1644, 1552-3888
DOI: 10.1177/0013164419846234
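
The bin-plot construction described in the summary can be sketched roughly as follows. This is a minimal illustration, not the article's implementation: the function names are hypothetical, ability estimates and item parameters are assumed to be available already, each bin's point is placed at the mean ability estimate of its members (an assumption, since the summary does not say where points sit on the horizontal axis), and the Wilson score interval is an assumed choice of confidence interval.

```python
import math
from collections import defaultdict


def trace_line(theta, a, b, c):
    """Three-parameter logistic probability of a correct answer at ability theta."""
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))


def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a binomial proportion (assumed CI method)."""
    p = successes / n
    denom = 1.0 + z * z / n
    centre = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return centre - half, centre + half


def bin_plot_points(responses, abilities, item, a, b, c):
    """Compute one bin-plot point per total-score bin for the focal item.

    responses: per-student lists of 0/1 item scores
    abilities: per-student ability estimates (assumed given)
    item:      index of the focal question
    a, b, c:   3PL parameters of the focal question (assumed given)
    """
    # Bin students by the number of questions answered correctly on the whole test.
    bins = defaultdict(list)
    for row, theta in zip(responses, abilities):
        bins[sum(row)].append((theta, row[item]))

    points = []
    for score in sorted(bins):
        members = bins[score]
        n = len(members)
        correct = sum(x for _, x in members)
        mean_theta = sum(t for t, _ in members) / n  # assumed x-position
        observed = correct / n
        expected = trace_line(mean_theta, a, b, c)
        low, high = wilson_interval(correct, n)
        points.append({
            "score": score,
            "n": n,
            "theta": mean_theta,
            "observed": observed,
            "residual": observed - expected,  # point minus trace line
            "ci": (low, high),
        })
    return points
```

Plotting `theta` against `observed` with `ci` as error bars, on top of `trace_line` evaluated over a grid of abilities, would give a graph of the kind the summary describes.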