A deep population reference panel of tandem repeat variation

Bibliographic details
Published in: Nature Communications 2023-10, Vol. 14 (1), p. 6711-15, Article 6711
Main authors: Ziaei Jam, Helyaneh; Li, Yang; DeVito, Ross; Mousavi, Nima; Ma, Nichole; Lujumba, Ibra; Adam, Yagoub; Maksimov, Mikhail; Huang, Bonnie; Dolzhenko, Egor; Qiu, Yunjiang; Kakembo, Fredrick Elishama; Joseph, Habi; Onyido, Blessing; Adeyemi, Jumoke; Bakhtiari, Mehrdad; Park, Jonghun; Javadzadeh, Sara; Jjingo, Daudi; Adebiyi, Ezekiel; Bafna, Vineet; Gymrek, Melissa
Format: Article
Language: English
Description
Summary: Tandem repeats (TRs) represent one of the largest sources of genetic variation in humans and are implicated in a range of phenotypes. Here we present a deep characterization of TR variation based on high-coverage whole-genome sequencing from 3,550 diverse individuals from the 1000 Genomes Project and H3Africa cohorts. We develop a method, EnsembleTR, to integrate genotypes from four separate methods, resulting in high-quality genotypes at more than 1.7 million TR loci. Our catalog reveals novel sequence features influencing TR heterozygosity, identifies population-specific trinucleotide expansions, and finds hundreds of novel eQTL signals. Finally, we generate a phased haplotype panel that can be used to impute most TRs from nearby single nucleotide polymorphisms (SNPs) with high accuracy. Overall, the TR genotypes and reference haplotype panel generated here will serve as valuable resources for future genome-wide and population-wide studies of TRs and their role in human phenotypes.
Tandem repeats (TRs) comprise some of the most polymorphic regions of the human genome but are difficult to study. Here, the authors develop an ensemble-based genotyping method and characterize 1.7 million TRs across 3,550 humans from diverse populations.
ISSN: 2041-1723
DOI: 10.1038/s41467-023-42278-3