Identifying outliers and implausible values in growth trajectory data

Bibliographic details
Published in: Annals of epidemiology 2016-01, Vol.26 (1), p.77-80.e2
Main authors: Yang, Seungmi, PhD; Hutcheon, Jennifer A., PhD
Format: Article
Language: English
Description
Summary: Purpose: To illustrate how conditional growth percentiles can be adapted to systematically identify implausible measurements in growth trajectory data. Methods: The use of conditional growth percentiles as a tool to assess serial weight data was reviewed. The approach was applied to 86,427 weight measurements (kg) taken between birth and age 6.5 years in 8217 girls participating in the Promotion of Breast Feeding Intervention Trial in Belarus. A conditional mean and variance were calculated for each weight measurement, reflecting the expected weight at the current visit given the girl's previous weights. Measurements were flagged as outliers if they were more than 4 standard deviations (SD) above or below the expected (conditional) weight. Results: The method identified 234 weight measurements (0.3%) from 216 girls as potential outliers. Review of these trajectories confirmed the implausibility of the flagged measurements and showed that the approach identified observations that would not have been identified using a conventional cross-sectional approach (±4 SD of the population mean) for identifying implausible values. Stata code to implement the approach is provided. Conclusions: Conditional growth percentiles can be used to systematically identify implausible values in growth trajectory data and may be particularly useful for large data sets where the high number of trajectories makes ad hoc approaches unfeasible.
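The flagging rule described in the summary can be illustrated with a minimal sketch. This is not the authors' Stata code, and the record does not specify their model; as a stand-in, the sketch assumes a plain ordinary least squares regression of each visit's weight on all earlier weights, a complete wide table (one row per child, one column per visit, no missing values), and synthetic data. The function name `flag_conditional_outliers` and all variable names are hypothetical.

```python
import numpy as np


def flag_conditional_outliers(weights, threshold=4.0):
    """Flag weights far from their expected value given earlier weights.

    weights: array of shape (n_children, n_visits), complete cases only.
    Returns a boolean array of the same shape. The first visit is never
    flagged because it has no earlier measurements to condition on.
    """
    weights = np.asarray(weights, dtype=float)
    n_children, n_visits = weights.shape
    flags = np.zeros((n_children, n_visits), dtype=bool)

    for visit in range(1, n_visits):
        # Design matrix: intercept plus every earlier weight (assumed model).
        X = np.column_stack([np.ones(n_children), weights[:, :visit]])
        y = weights[:, visit]

        # Conditional mean: fitted value from the least-squares regression.
        coef, *_ = np.linalg.lstsq(X, y, rcond=None)
        resid = y - X @ coef

        # Conditional SD: residual standard deviation of that regression.
        dof = n_children - X.shape[1]
        cond_sd = np.sqrt(np.sum(resid ** 2) / dof)

        flags[:, visit] = np.abs(resid) > threshold * cond_sd
    return flags


# Synthetic trajectories (kg) for 500 children at 4 visits; values are made up.
rng = np.random.default_rng(0)
n = 500
birth = rng.normal(3.4, 0.5, n)
w1 = birth + 3.0 + rng.normal(0, 0.3, n)
w2 = w1 + 3.0 + rng.normal(0, 0.3, n)
w3 = w2 + 2.0 + rng.normal(0, 0.3, n)
weights = np.column_stack([birth, w1, w2, w3])

# Give child 0 an average trajectory, then add a 2 kg error at the third visit.
weights[0] = [3.4, 6.4, 9.4, 11.4]
weights[0, 2] += 2.0

flags = flag_conditional_outliers(weights)

# Conventional cross-sectional z-scores (distance from the population mean).
cross_z = (weights - weights.mean(axis=0)) / weights.std(axis=0, ddof=1)

print("conditional flag for child 0, visit 2:", flags[0, 2])
print("cross-sectional |z| > 4 for child 0, visit 2:", abs(cross_z[0, 2]) > 4)
```

In this synthetic setup the injected error lies far from what the child's earlier weights predict, yet within 4 SD of the population mean at that visit, which is the kind of measurement the summary says a cross-sectional rule would miss. An erroneous value also becomes a predictor for later visits, so flagged trajectories would still need manual review before any measurement is excluded.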
ISSN: 1047-2797, 1873-2585
DOI: 10.1016/j.annepidem.2015.10.002