State of the psychometric methods: comments on the ISOQOL SIG psychometric papers
Published in: | Journal of Patient-Reported Outcomes 2019-07, Vol.3 (1), p.49-11, Article 49 |
---|---|
Main author: | |
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Full text |
Abstract: | Background
Psychometric analyses of patient-reported outcomes typically use either classical test theory (CTT), item response theory (IRT), or Rasch measurement theory (RMT). The three papers from the ISOQOL Psychometrics SIG examined the same data set using the three different approaches. By comparing the results from these papers, the current paper aims to examine the extent to which conclusions about the validity and reliability of a PRO tool depend on the selected psychometric approach.
Main text
Regarding the basic statistical model, IRT and RMT are relatively similar but differ notably from CTT. However, modern applications of CTT diminish these differences. In analyses of item discrimination, CTT and IRT gave very similar results, while RMT requires equal discrimination and therefore suggested exclusion of items deviating too much from this requirement. Thus, fewer items fitted the Rasch model. In analyses of item thresholds (difficulty), IRT and RMT provided fairly similar results. Item thresholds are typically not evaluated in CTT. Analyses of local dependence showed only moderate agreement between methods, partly due to different thresholds for important local dependence. Analyses of differential item functioning (DIF) showed good agreement between IRT and RMT. Agreement might be further improved by adjusting the thresholds for important DIF. Analyses of measurement precision across the score range showed high agreement between IRT and RMT methods. CTT assumes constant measurement precision throughout the score range and thus gave different results. Category orderings were examined in RMT analyses by checking for reversed thresholds. However, this approach is controversial within the RMT community. The same issue can be examined with the nominal categories IRT model.
Conclusions
While there are well-known differences between CTT, IRT and RMT, the comparison between three actual analyses revealed a great deal of agreement between the results from the methods. If the undogmatic attitude of the three current papers is maintained, the field will be well served. |
---|---|
ISSN: | 2509-8020 |
DOI: | 10.1186/s41687-019-0134-1 |
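
Two contrasts in the abstract lend themselves to a small numerical illustration: the Rasch model's requirement of equal item discrimination, and the constant measurement precision assumed by CTT versus the score-dependent precision of IRT and RMT. The following is a minimal sketch, not taken from the papers. It assumes dichotomous items and the standard two-parameter logistic (2PL) form for simplicity, whereas PRO items are often polytomous. The item parameters, the function names, and the score standard deviation and reliability are all hypothetical values chosen for illustration.

```python
import math


def p_2pl(theta, a, b):
    """Probability of endorsing an item under the 2PL IRT model.

    theta: person location, a: item discrimination, b: item difficulty.
    """
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


def p_rasch(theta, b):
    """Rasch model: the same curve with discrimination fixed at 1 for all items."""
    return p_2pl(theta, 1.0, b)


def item_information(theta, a, b):
    """Fisher information of a 2PL item at theta: a^2 * P * (1 - P)."""
    p = p_2pl(theta, a, b)
    return a * a * p * (1.0 - p)


def sem_irt(theta, items):
    """Standard error of measurement at theta: 1 / sqrt(test information)."""
    info = sum(item_information(theta, a, b) for a, b in items)
    return 1.0 / math.sqrt(info)


def sem_ctt(score_sd, reliability):
    """CTT standard error of measurement: one value for the whole score range."""
    return score_sd * math.sqrt(1.0 - reliability)


# Hypothetical (discrimination, difficulty) pairs, for illustration only.
items = [(0.8, -1.0), (1.0, 0.0), (1.6, 0.5), (1.2, 1.0)]

# IRT precision depends on where the respondent sits on the latent scale.
for theta in (-3.0, 0.0, 3.0):
    print(f"theta={theta:+.1f}  IRT SEM={sem_irt(theta, items):.2f}")

# CTT reports a single SEM (hypothetical SD and reliability, raw-score units).
print(f"CTT SEM={sem_ctt(10.0, 0.84):.2f}")
```

In this sketch the IRT standard error is smallest near the middle of the item difficulties and grows toward both extremes, while the CTT value does not vary with the score. The two are expressed on different scales (latent units versus raw-score units), so only the shape of the precision profile is comparable, not the magnitudes.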