State of the psychometric methods: comments on the ISOQOL SIG psychometric papers

Background Psychometric analyses of patient reported outcomes typically use either classical test theory (CTT), item response theory (IRT), or Rasch measurement theory (RTM). The three papers from the ISOQOL Psychometrics SIG examined the same data set using the tree different approaches. By compari...

Ausführliche Beschreibung

Gespeichert in:
Bibliographische Detailangaben
Veröffentlicht in:Journal of Patient-Reported Outcomes 2019-07, Vol.3 (1), p.49-11, Article 49
1. Verfasser: Bjorner, Jakob B.
Format: Artikel
Sprache:eng
Schlagworte:
Online-Zugang:Volltext
Tags: Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
Beschreibung
Zusammenfassung:Background Psychometric analyses of patient reported outcomes typically use either classical test theory (CTT), item response theory (IRT), or Rasch measurement theory (RTM). The three papers from the ISOQOL Psychometrics SIG examined the same data set using the tree different approaches. By comparing the results from these papers, the current paper aims to examine the extent to which conclusions about the validity and reliability of a PRO tool depends on the selected psychometric approach. Main text Regarding the basic statistical model , IRT and RTM are relatively similar but differ notably from CTT. However, modern applications of CTT diminish these differences. In analyses of item discrimination , CTT and IRT gave very similar results, while RTM requires equal discrimination and therefore suggested exclusion of items deviating too much from this requirement. Thus, fewer items fitted the Rasch model. In analyses of item thresholds (difficulty) , IRT and RMT provided fairly similar results. Item thresholds are typically not evaluated in CTT. Analyses of local dependence showed only moderate agreement between methods, partly due to different thresholds for important local dependence. Analyses of differential item function (DIF) showed good agreement between IRT and RMT. Agreement might be further improved by adjusting the thresholds for important DIF. Analyses of measurement precision across the score range showed high agreement between IRT and RMT methods. CTT assumes constant measurement precision throughout the score range and thus gave different results. Category orderings were examined in RMT analyses by checking for reversed thresholds. However, this approach is controversial within the RMT society. The same issue can be examined by the nominal categories IRT model. Conclusions While there are well-known differences between CTT, IRT and RMT, the comparison between three actual analyses revealed a great deal of agreement between the results from the methods. If the undogmatic attitude of the three current papers is maintained, the field will be well served.
ISSN:2509-8020
2509-8020
DOI:10.1186/s41687-019-0134-1