Comparative evaluation of the heterozygous variant standard deviation as a quality measure for next-generation sequencing


Bibliographic details
Published in: Journal of biomedical informatics 2022-11, Vol.135, p.104234-104234, Article 104234
Main authors: Høy Hansen, Marcus, Steensboe Lang, Cecilie, Abildgaard, Niels, Nyvold, Charlotte Guldborg
Format: Article
Language: English
Description
Abstract:
•A need for additional next-generation sequencing quality checks is identified.
•Standard deviation of heterozygous allele frequencies is a crucial quality metric.
•Variant allele frequencies can be modeled from coverage using the gamma distribution.
•The effect of increasing high coverage becomes negligible in terms of quality gain.

Next-generation sequencing holds unprecedented throughput in terms of informational content to cost. The technology has entered the scene in laboratory diagnostics and offers flexible workflows in biomedical research. However, the rapid acquisition of genomic data also gives rise to a substantial fraction of sequencing artifacts, causing the detection of false-positive germline variants or erroneous somatic mutations. Consequently, there is a pressing need for efficient and practical quality assessment in sequencing projects. In this study, we investigate using heterozygous variant allele frequency (VAF) standard deviation (σ) for supplementary quality control. Whereas several proposed quality metrics are based on empirical assessments, the dispersion of the allele frequencies reflects a direct approximation of the inherent and discrete features of a diploid genome. Consequently, homologous chromosomes display heterozygous VAF of approximately 1/2. Based on the meta-analysis of 152 whole-exome sequencing data sets, we found that σ reflects both sequencing coverage and noise and can be effectively modeled. It is concluded that the relative comparison of heterozygous VAF σ provides a practical handle for quality assessment, even for samples afflicted with copy-number alterations. The approach can be implemented when performing whole-exome, whole-genome, or targeted panel sequencing and helps identify problematic samples, such as those retrieved from archived formalin-fixed paraffin-embedded tissue.
ISSN: 1532-0464, 1532-0480
DOI:10.1016/j.jbi.2022.104234
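
Illustrative sketch (not part of the catalog record or the article): the abstract describes using the standard deviation σ of heterozygous VAFs as a quality measure that reflects both coverage and noise. The Python below shows one way such a σ could be computed from per-variant read counts and compared with a simple binomial sampling floor. The function name `het_vaf_sigma`, the 0.2 to 0.8 heterozygous window, and the minimum depth of 10 are assumptions made for this example. The binomial floor is a stand-in for the idea of relating σ to coverage; it is not the gamma-distribution model mentioned in the highlights.

```python
import math
import statistics


def het_vaf_sigma(variants, lo=0.2, hi=0.8, min_depth=10):
    """Return (sigma, binomial_floor, n_used) for heterozygous variants.

    variants: iterable of (ref_count, alt_count) read counts per site.
    lo, hi, min_depth: hypothetical filters chosen for this sketch,
    not thresholds taken from the article.
    """
    vafs = []
    depths = []
    for ref, alt in variants:
        depth = ref + alt
        if depth < min_depth:
            continue  # too few reads for a meaningful VAF
        vaf = alt / depth
        if lo <= vaf <= hi:  # crude heterozygous window around 1/2
            vafs.append(vaf)
            depths.append(depth)

    if len(vafs) < 2:
        raise ValueError("need at least two heterozygous variants")

    # Observed dispersion of heterozygous VAFs (sample standard deviation).
    sigma = statistics.stdev(vafs)

    # Dispersion expected from read sampling alone: at a site with depth n
    # and true VAF 1/2, Var(VAF) = 0.25 / n, averaged here over all sites.
    binomial_floor = math.sqrt(sum(0.25 / n for n in depths) / len(depths))

    return sigma, binomial_floor, len(vafs)


if __name__ == "__main__":
    # Made-up counts for demonstration only.
    calls = [(50, 50), (40, 60), (60, 40), (100, 0), (2, 3)]
    sigma, floor, n_used = het_vaf_sigma(calls)
    print(f"sigma={sigma:.4f} binomial_floor={floor:.4f} n={n_used}")
```

Under these assumptions, a σ well above the sampling floor would point to noise beyond what coverage alone explains. In line with the abstract, such values are best compared across samples processed the same way and not read as absolute thresholds.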