Detection of Outliers in Reference Distributions: Performance of Horn's Algorithm
Medical laboratory reference data may be contaminated with outliers that should be eliminated before estimation of the reference interval. A statistical test for outliers has been proposed by Paul S. Horn and coworkers (Clin Chem 2001;47:2137-45). The algorithm operates in 2 steps: (a) mathematicall...
Gespeichert in:
Veröffentlicht in: | Clinical chemistry (Baltimore, Md.) Md.), 2005-12, Vol.51 (12), p.2326-2332 |
---|---|
Hauptverfasser: | , |
Format: | Artikel |
Sprache: | eng |
Schlagworte: | |
Online-Zugang: | Volltext |
Tags: |
Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
|
Zusammenfassung: | Medical laboratory reference data may be contaminated with outliers that should be eliminated before estimation of the reference interval. A statistical test for outliers has been proposed by Paul S. Horn and coworkers (Clin Chem 2001;47:2137-45). The algorithm operates in 2 steps: (a) mathematically transform the original data to approximate a gaussian distribution; and (b) establish detection limits (Tukey fences) based on the central part of the transformed distribution.
We studied the specificity of Horn's test algorithm (probability of false detection of outliers), using Monte Carlo computer simulations performed on 13 types of probability distributions covering a wide range of positive and negative skewness. Distributions with 3% of the original observations replaced by random outliers were used to also examine the sensitivity of the test (probability of detection of true outliers). Three data transformations were used: the Box and Cox function (used in the original Horn's test), the Manly exponential function, and the John and Draper modulus function.
For many of the probability distributions, the specificity of Horn's algorithm was rather poor compared with the theoretical expectation. The cause for such poor performance was at least partially related to remaining nongaussian kurtosis (peakedness). The sensitivity showed great variation, dependent on both the type of underlying distribution and the location of the outliers (upper and/or lower tail).
Although Horn's algorithm undoubtedly is an improvement compared with older methods for outlier detection, reliable statistical identification of outliers in reference data remains a challenge. |
---|---|
ISSN: | 0009-9147 1530-8561 |
DOI: | 10.1373/clinchem.2005.058339 |