Mock microbial community meta‐analysis using different trimming of amplicon read lengths

Bibliographic details
Published in: Environmental microbiology 2024-01, Vol.26 (1), p.e16566-n/a
Main authors: Haider, Diana; Hall, Michael W.; LaRoche, Julie; Beiko, Robert G.
Format: Article
Language: English
Online access: Full text
Description
Summary: Trimming of sequencing reads is a pre‐processing step that aims to discard sequence segments such as primers, adapters and low quality nucleotides that will interfere with clustering and classification steps. We evaluated the impact of trimming length of paired‐end 16S and 18S rRNA amplicon reads on the ability to reconstruct the taxonomic composition and relative abundances of communities with a known composition in both even and uneven proportions. We found that maximizing read retention maximizes recall but reduces precision by increasing false positives. The presence of expected taxa was accurately predicted across broad trim length ranges, but recovering the original relative proportions remains difficult. We show that parameters that maximize taxonomic recovery do not simultaneously maximize relative abundance accuracy. Trim length is one of several experimental parameters that have a non‐uniform impact across microbial clades, which makes it difficult to optimize. This study offers insights and guidelines, and helps researchers assess the significance of their decisions when trimming raw reads in a microbiome analysis based on overlapping or non‐overlapping paired‐end amplicons. The choice of tools and parameters can influence the accuracy of microbial community analysis. We examined this effect using eukaryotic and prokaryotic artificial communities and a range of DNA sequence read trim lengths. We found that taxon occurrence patterns are predicted more accurately than proportions and that maximizing read retention, although common practice in many studies, inflates false positives. We recommend varying trimming lengths to assess the stability and robustness of microbial community analysis.
ISSN: 1462-2912, 1462-2920
DOI: 10.1111/1462-2920.16566
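
The summary contrasts recall, precision, and relative abundance accuracy when a mock community of known composition is reconstructed at different trim lengths. The following is a minimal Python sketch of how such a comparison could be scored. It is not the authors' pipeline: the function name `evaluate_mock_community`, the taxon labels, and the example profiles are hypothetical, and Bray-Curtis dissimilarity is an assumed choice of abundance-error measure that the record does not specify.

```python
def evaluate_mock_community(expected, observed):
    """Score an observed profile against a known mock community.

    Both arguments map taxon name -> relative abundance (each summing to ~1).
    Returns precision, recall, and Bray-Curtis dissimilarity (an assumed
    measure of relative abundance error; 0 means identical profiles).
    """
    expected_taxa = {t for t, a in expected.items() if a > 0}
    observed_taxa = {t for t, a in observed.items() if a > 0}

    true_pos = len(expected_taxa & observed_taxa)   # expected and detected
    false_pos = len(observed_taxa - expected_taxa)  # detected but not expected
    false_neg = len(expected_taxa - observed_taxa)  # expected but missed

    precision = true_pos / (true_pos + false_pos) if observed_taxa else 0.0
    recall = true_pos / (true_pos + false_neg) if expected_taxa else 0.0

    all_taxa = expected_taxa | observed_taxa
    diff = sum(abs(expected.get(t, 0.0) - observed.get(t, 0.0)) for t in all_taxa)
    total = sum(expected.get(t, 0.0) + observed.get(t, 0.0) for t in all_taxa)
    bray_curtis = diff / total if total else 0.0

    return {"precision": precision, "recall": recall, "bray_curtis": bray_curtis}


# Hypothetical even mock community of four taxa.
expected = {"A": 0.25, "B": 0.25, "C": 0.25, "D": 0.25}

# Hypothetical profiles from two trim settings (illustrative numbers only).
profiles = {
    "high read retention": {"A": 0.4, "B": 0.3, "C": 0.1, "D": 0.1, "X": 0.1},
    "stricter trimming": {"A": 0.3, "B": 0.3, "C": 0.4},
}

for label, observed in profiles.items():
    print(label, evaluate_mock_community(expected, observed))
```

Scoring the same mock community across a range of trim lengths in this way would show whether the setting that gives the best recall is also the one that gives the best precision or the lowest abundance error, which is the comparison the summary recommends.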