An investigation of error sources and their impact in estimating the time to the most recent ancestor of spatially and temporally distributed HIV sequences
This is an investigation of significant error sources and their impact in estimating the time to the most recent common ancestor (MRCA) of spatially and temporally distributed human immunodeficiency virus (HIV) sequences. We simulate an HIV epidemic under a range of assumptions with known time to th...
Gespeichert in:
Veröffentlicht in: | Statistics in medicine 2003-05, Vol.22 (9), p.1495-1516 |
---|---|
Hauptverfasser: | , , |
Format: | Artikel |
Sprache: | eng |
Schlagworte: | |
Online-Zugang: | Volltext |
Tags: |
Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
|
Zusammenfassung: | This is an investigation of significant error sources and their impact in estimating the time to the most recent common ancestor (MRCA) of spatially and temporally distributed human immunodeficiency virus (HIV) sequences. We simulate an HIV epidemic under a range of assumptions with known time to the MRCA (tMRCA). We then apply a range of baseline (known) evolutionary models to generate sequence data. We next estimate or assume one of several misspecified models and use the chosen model to estimate the time to the MRCA. Random effects and the extent of model misspecification determine the magnitude of error sources that could include: neglected heterogeneity in substitution rates across lineages and DNA sites; uncertainty in HIV isolation times; uncertain magnitude and type of population subdivision; uncertain impacts of host/viral transmission dynamics, and unavoidable model estimation errors. Our results suggest that confidence intervals will rarely have the nominal coverage probability for tMRCA. Neglected effects lead to errors that are unaccounted for in most analyses, resulting in optimistically narrow confidence intervals (CI). Using real HIV sequences having approximately known isolation times and locations, we present possible confidence intervals for several sets of assumptions. In general, we cannot be certain how much to broaden a stated confidence interval for tMRCA. However, we describe the impact of candidate error sources on CI width. We also determine which error sources have the most impact on CI width and demonstrate that the standard bootstrap method will underestimate the CI width. Copyright © 2003 John Wiley & Sons, Ltd. |
---|---|
ISSN: | 0277-6715 1097-0258 |
DOI: | 10.1002/sim.1508 |