The geometry of admixture in population genetics: the blessing of dimensionality

Bibliographic details
Published in: Genetics (Austin) 2024-10, Vol.228 (2)
Main authors: Oteo, José-Angel, Oteo-García, Gonzalo
Format: Article
Language: English
Description
Summary: We present a geometry-based interpretation of the f-statistics framework, commonly used in population genetics to estimate phylogenetic relationships from genomic data. The focus is on the determination of the mixing coefficients in population admixture events subject to post-admixture drift. The interpretation takes advantage of the high dimension of the dataset and analyzes the problem as a dimensional reduction issue. We show that it is possible to think of the f-statistics technique as an implicit transformation of the genomic data from a phase space into a subspace where the mapped data structure is more similar to the ancestral admixture configuration. The 2-way mixing coefficient is, as a matter of fact, carried out implicitly in this subspace. In addition, we propose the admixture test to be evaluated in the subspace because the comparison with the conventional one provides an important assessment of the admixture model. The overarching geometric framework provides slightly more general formulas than the f-formalism by using a different rationale as a starting point. Explicitly addressed are 2- and 3-way admixtures. The mixture proportions are provided by suitable linear fits, in 2 or 3 dimensions, that can be easily visualized. The difficulties encountered with introgression and gene flow are also addressed. The developments and findings are illustrated with numerical simulations and real-world cases.
ISSN: 1943-2631
DOI: 10.1093/genetics/iyae134