Linear Model to Assess the Scale's Validity of a Test

Bibliographic details
Main authors: Tristan, Agustin; Vidal, Rafael
Format: Report
Language: English
Description
Summary: Wright and Stone proposed three features for assessing the quality of the distribution of item difficulties in a test on the so-called "most probable response map": line, stack, and gap. Once a line is accepted as the design model for a test, gaps and stacks are practically eliminated, which provides evidence of the "scale validity" of the test. Based on experience with several real tests, this report proposes a linear model that makes Wright and Stone's idea for assessing test quality concrete. The model is the "test design line", which distributes the items of the test uniformly, centered on 0 logits. The "test design model" is related to the "mean absolute difference", a single parameter useful for determining (a) the distribution of the items, (b) the lack of bias of the scale, and (c) the test width, three main characteristics of the validity of a test. Some results and applications of the model are shown, including remarks on test design, test analysis, and item calibration. The model has been used successfully since 2001 to identify the "scale validity" of tests in Mexico, El Salvador, and Colombia, from preschool to the professional level. The test design line is a straightforward tool for improving the quality of a test: it provides a means of comparing different forms of a test and represents a simple procedure for maintaining the metric stability of a test over the years. Other applications have been found regarding standard error, reliability, and construct validity. (Contains 2 tables and 8 figures.)
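The abstract does not give formulas, so the following Python sketch is only an illustration under stated assumptions, not the authors' procedure. It assumes that a "test design line" can be represented as item difficulties spaced evenly over a chosen width and centered on 0 logits, and that the "mean absolute difference" is the usual statistic: the average absolute difference over all distinct pairs of values. The function names and the `width` parameter are hypothetical.

```python
from itertools import combinations


def design_line(n_items, width):
    """Evenly spaced item difficulties (in logits), centered on 0.

    Assumption: the design line spans `width` logits in total,
    from -width/2 to +width/2.
    """
    if n_items < 2:
        return [0.0] * n_items
    step = width / (n_items - 1)
    return [-width / 2 + i * step for i in range(n_items)]


def mean_absolute_difference(values):
    """Average |a - b| over all distinct pairs (assumed definition)."""
    pairs = list(combinations(values, 2))
    if not pairs:
        raise ValueError("need at least two values")
    return sum(abs(a - b) for a, b in pairs) / len(pairs)


# Illustrative, made-up difficulties: one stack near 0 and one gap above it.
observed = [-2.0, -0.1, 0.0, 0.1, 2.0]
ideal = design_line(len(observed), width=4.0)

mad_observed = mean_absolute_difference(observed)
mad_ideal = mean_absolute_difference(ideal)
```

Under these assumptions, a set of items with the same range as the design line but with stacks and gaps yields a different mean absolute difference than the evenly spaced line, which is one way a single number could reflect how the items are distributed.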