Statistical match of the March 1996 Current Population Survey and the 1995 National Health Interview Survey
Published in: | Vital and health statistics. Series 2. Data evaluation and methods research 2008-01 (144), p.1-50 |
---|---|
Main authors: | , , , |
Format: | Article |
Language: | English |
Subjects: | |
Online access: | Full text |
Summary: | Statistical matching is a method used to combine two files when it is unlikely that individuals on one file are also on the other file. The objectives of this report are to document and evaluate statistical matches of the March 1996 Current Population Survey (CPS) and the 1995 National Health Interview Survey (NHIS) and give recommendations for improving future matches. The CPS-NHIS match was motivated by the need for a data set with data on health measures and family resources for use in policy analyses.
Three statistical matches between the March 1996 CPS and the 1995 NHIS are described in this report. All three matches used person-level constrained matching with partitioning and a predictive mean matching algorithm to link records on the two files. For two of the matches, the CPS served as the Host file and the NHIS served as the Donor file; for the third match, the NHIS was the Host file and the CPS was the Donor file.
The results suggest that the constrained predictive mean matches of the March 1996 CPS and the 1995 NHIS successfully combined some of the information on the two files, but that relationships among some Host and Donor variables on the matched file may be distorted. The evaluation of the matches suggested that the variables used to partition the Host and Donor files prior to matching and the variables involved in the predictive mean matching play an important role in determining whether relationships among variables on the matched file correctly represent relationships among those variables in the population. The evaluation also indicated that estimates for small subgroups may be especially subject to error. The results reinforce the need to proceed cautiously when exploring relationships among Host and Donor variables on a statistically matched file. |
---|---|
ISSN: | 0083-2057 |