Test bias in a cognitive test: differential item functioning in the CASI
Published in: | Statistics in medicine 2004-01, Vol.23 (2), p.241-256 |
---|---|
Main authors: | , , |
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Full text |
Abstract: | Assessment of test bias is important to establish the construct validity of tests. Assessment of differential item functioning (DIF) is an important first step in this process. DIF is present when examinees from different groups have differing probabilities of success on an item, after controlling for overall ability level. Here, we present analysis of DIF in the Cognitive Assessment Screening Instrument (CASI) using data from a large cohort study of elderly adults. We developed an ordinal logistic regression modelling technique to assess test items for DIF. Estimates of cognitive ability were obtained in two ways based on responses to CASI items: using traditional CASI scoring according to the original test instructions as well as using item response theory (IRT) scoring. Several demographic characteristics were examined for potential DIF, including ethnicity and gender (entered into the model as dichotomous variables), and years of education and age (entered as continuous variables). We found that a disappointingly large number of items had DIF with respect to at least one of these demographic variables. More items were found to have DIF with traditional CASI scoring than with IRT scoring. This study demonstrates a powerful technique for the evaluation of DIF in psychometric tests. The finding that so many CASI items had DIF suggests that previous findings of differences between groups in cognitive functioning as measured by the CASI may be due to biased test items rather than true differences between groups. The finding that IRT scoring diminished the impact of DIF is discussed. Some preliminary suggestions for how to deal with items found to have DIF in cognitive tests are made. The advantages of the DIF detection techniques we developed are discussed in relation to other techniques for the evaluation of DIF. Copyright © 2004 John Wiley & Sons, Ltd. |
---|---|
ISSN: | 0277-6715 1097-0258 |
DOI: | 10.1002/sim.1713 |
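
The abstract defines DIF as a difference between groups in the probability of success on an item after controlling for overall ability. The sketch below illustrates that definition with a generic likelihood-ratio screen; it is not the authors' procedure. It simplifies in several ways that are assumptions of this example: the item is binary rather than ordinal, the data are simulated, the simulated ability is used directly as the matching variable (in practice it would be estimated from test responses), and only a constant group shift ("uniform" DIF) is tested. All function and variable names are hypothetical.

```python
import math
import random


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[piv] = M[piv], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(M[r][k] * x[k] for k in range(r + 1, n))
        x[r] = (M[r][n] - tail) / M[r][r]
    return x


def prob(beta, xi):
    """Logistic success probability, with the linear predictor clamped for safety."""
    eta = sum(b * v for b, v in zip(beta, xi))
    eta = max(-30.0, min(30.0, eta))
    return 1.0 / (1.0 + math.exp(-eta))


def fit_logistic(X, y, iters=25):
    """Fit logistic regression by Newton-Raphson; return (coefficients, log-likelihood)."""
    p = len(X[0])
    beta = [0.0] * p
    for _ in range(iters):
        grad = [0.0] * p
        hess = [[0.0] * p for _ in range(p)]
        for xi, yi in zip(X, y):
            mu = prob(beta, xi)
            w = mu * (1.0 - mu)
            for j in range(p):
                grad[j] += (yi - mu) * xi[j]
                for k in range(p):
                    hess[j][k] += w * xi[j] * xi[k]
        beta = [b + s for b, s in zip(beta, solve(hess, grad))]
    loglik = 0.0
    for xi, yi in zip(X, y):
        mu = prob(beta, xi)
        loglik += yi * math.log(mu) + (1.0 - yi) * math.log(1.0 - mu)
    return beta, loglik


def lr_dif_test(ability, group, item):
    """Compare 'ability only' with 'ability + group' using a likelihood-ratio test (1 df)."""
    reduced = [[1.0, a] for a in ability]
    full = [[1.0, a, g] for a, g in zip(ability, group)]
    _, ll_reduced = fit_logistic(reduced, item)
    beta_full, ll_full = fit_logistic(full, item)
    stat = max(0.0, 2.0 * (ll_full - ll_reduced))
    p_value = math.erfc(math.sqrt(stat / 2.0))  # chi-square upper tail, 1 df
    return stat, p_value, beta_full[2]


# Simulated data (an assumption of this sketch, not CASI data).
rng = random.Random(0)
n = 2000
ability = [rng.gauss(0.0, 1.0) for _ in range(n)]
group = [1.0 if rng.random() < 0.5 else 0.0 for _ in range(n)]

# The item is harder for group 1 at the same ability level: a group shift of -0.8 logits.
item = [
    1.0 if rng.random() < 1.0 / (1.0 + math.exp(-(1.2 * a - 0.8 * g))) else 0.0
    for a, g in zip(ability, group)
]

stat, p_value, group_coef = lr_dif_test(ability, group, item)
print(f"LR statistic = {stat:.2f}, p = {p_value:.3g}, group coefficient = {group_coef:.2f}")
```

A small p-value here indicates that group membership predicts the item response even after conditioning on ability, which is the situation the abstract calls DIF. An ordinal version would replace the binary model with a proportional-odds model, and an ability-by-group interaction term could be added to probe non-uniform DIF.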