Using Item Response Theory to Calibrate the Headache Impact Test (HIT™) to the Metric of Traditional Headache Scales

Bibliographic details
Published in: Quality of Life Research, 2003-12, Vol. 12 (8), p. 981-1002
Main authors: Bjorner, Jakob B., Kosinski, Mark, Ware, John E.
Format: Article
Language: English
Description
Abstract:
Background: Item response theory (IRT) scoring of health status questionnaires offers many advantages. However, to ensure 'backwards comparability' and to facilitate interpretation of results, we need the ability to express the IRT score in the metrics of the traditional scales.
Objectives: To develop procedures to calibrate IRT-based scores on the Headache Impact Test (HIT) into the metrics of the traditional headache scales. To assess the degree to which the calibrated HIT scores agree with the observed traditional scores and lead to the same conclusions in group comparisons.
Methods: We used telephone interview data (n = 1016) and Internet data (n = 1103) from general population surveys of recent headache sufferers. Analyses were conducted in four steps: (1) develop IRT models for all items, (2) for each IRT score level, calculate the expected score on each of the traditional scales (calibration), (3) adjust this calibrated score for measurement error in the IRT score, (4) for each of the traditional scales, assess agreement between calibrated HIT scores and observed scores using intraclass correlation (ICC) and evaluate the agreement of mean scores and the relative validity (RV) in discriminating among groups differing in migraine diagnosis, headache severity, and change in impact over time.
Results: For the traditional categorical questionnaire items (the Migraine Specific Questionnaire (MSQ) and the Headache Disability Inventory (HDI)), the calibrated HIT agreed with the observed traditional scores: ICCs were between 0.80 and 0.94. In RV analyses, the maximum mean difference between the observed and expected scores was 1.7 points on a 0-100 scale for comparisons at one point in time. Analyses of change over time and analyses calibrating scores from the fixed-form HIT-6 to the metric of other questionnaires were also satisfactory, although less precise. Analysis of non-standard questionnaire items (e.g. 'On how many days in the past 3 months did you have a headache', from the HIMQ and the MIDAS) required special IRT models. Agreement was weaker: ICCs were between 0.56 and 0.61, and the maximum mean differences were 2.9 (on a 0-270 scale) and 3.8 (on a 0-450 scale) in RV analyses at one point in time. The ability of the calibrated scale scores to discriminate between groups was at least as good as the ability of the observed sum scales and often remarkably better.
Conclusion: The theoretical advantage of IRT models in scale calibration is supported by
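Illustration (not part of the catalogued abstract): step (2) of the Methods, calculating the expected traditional-scale score at each IRT score level, could be sketched roughly as follows. The abstract does not name the IRT model, so this sketch assumes a generalized partial credit model; the function names, the item parameters in `ITEMS`, and the 0-100 rescaling are all hypothetical and chosen only for illustration. The measurement-error adjustment of step (3) is not covered.

```python
import math


def gpcm_probs(theta, slope, thresholds):
    """Category probabilities for one item under an (assumed) generalized
    partial credit model with len(thresholds) + 1 response categories."""
    # Category 0 has logit 0; category k adds slope * (theta - b_k).
    logits = [0.0]
    for b in thresholds:
        logits.append(logits[-1] + slope * (theta - b))
    m = max(logits)  # subtract the maximum for numerical stability
    weights = [math.exp(z - m) for z in logits]
    total = sum(weights)
    return [w / total for w in weights]


def expected_scale_score(theta, items):
    """Expected raw sum score at theta, rescaled to a 0-100 metric."""
    expected = 0.0
    max_raw = 0
    for slope, thresholds in items:
        probs = gpcm_probs(theta, slope, thresholds)
        # Expected item score: sum over categories of k * P(X = k | theta).
        expected += sum(k * p for k, p in enumerate(probs))
        max_raw += len(thresholds)
    return 100.0 * expected / max_raw


# Hypothetical (slope, thresholds) pairs; NOT taken from the paper.
ITEMS = [
    (1.2, [-1.0, 0.0, 1.0]),
    (0.8, [-0.5, 0.5, 1.5]),
    (1.5, [-1.5, -0.5, 0.5]),
]

# Calibration table: expected traditional-scale score per IRT score level.
CALIBRATION = {
    theta: expected_scale_score(theta, ITEMS)
    for theta in (-2.0, -1.0, 0.0, 1.0, 2.0)
}
```

Under these assumptions, looking up a respondent's IRT score in `CALIBRATION` (or calling `expected_scale_score` directly) gives the score that respondent would be expected to obtain on the traditional scale, which is the quantity the abstract compares against observed scores.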
ISSN: 0962-9343, 1573-2649
DOI: 10.1023/A:1026123400242