An Exact Nonparametric Method for Inferring Mosaic Structure in Sequence Triplets

Statistical tests for detecting mosaic structure or recombination among nucleotide sequences usually rely on identifying a pattern or a signal that would be unlikely to appear under clonal reproduction. Dozens of such tests have been described, but many are hampered by long running times, confoundin...

Ausführliche Beschreibung

Gespeichert in:
Bibliographische Detailangaben
Veröffentlicht in:Genetics (Austin) 2007-06, Vol.176 (2), p.1035-1047
Hauptverfasser: Boni, Maciej F, Posada, David, Feldman, Marcus W
Format: Artikel
Sprache:eng
Schlagworte:
Online-Zugang:Volltext
Tags: Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
Beschreibung
Zusammenfassung:Statistical tests for detecting mosaic structure or recombination among nucleotide sequences usually rely on identifying a pattern or a signal that would be unlikely to appear under clonal reproduction. Dozens of such tests have been described, but many are hampered by long running times, confounding of selection and recombination, and/or inability to isolate the mosaic-producing event. We introduce a test that is exact, nonparametric, rapidly computable, free of the infinite-sites assumption, able to distinguish between recombination and variation in mutation/fixation rates, and able to identify the breakpoints and sequences involved in the mosaic-producing event. Our test considers three sequences at a time: two parent sequences that may have recombined, with one or two breakpoints, to form the third sequence (the child sequence). Excess similarity of the child sequence to a candidate recombinant of the parents is a sign of recombination; we take the maximum value of this excess similarity as our test statistic Delta(m,n,b). We present a method for rapidly calculating the distribution of Delta(m,n,b) and demonstrate that it has comparable power to and a much improved running time over previous methods, especially in detecting recombination in large data sets.
ISSN:0016-6731
1943-2631
1943-2631
DOI:10.1534/genetics.106.068874