RESCUE: a validated Nanopore pipeline to classify bacteria through long-read, 16S-ITS-23S rRNA sequencing

Bibliographic details
Published in: Frontiers in Microbiology 2023-07, Vol. 14, p. 1201064
Main authors: Petrone, Joseph R; Rios Glusberger, Paula; George, Christian D; Milletich, Patricia L; Ahrens, Angelica P; Roesch, Luiz Fernando Wurdig; Triplett, Eric W
Format: Article
Language: English
Description
Abstract: Despite the advent of third-generation sequencing technologies, modern bacterial ecology studies still use Illumina to sequence small (~400 bp) hypervariable regions of the 16S rRNA SSU for phylogenetic classification. By sequencing a larger region of the rRNA gene operons, the limitations and biases of sequencing small portions can be removed, allowing for more accurate classification with deeper taxonomic resolution. With Nanopore sequencing now providing raw simplex reads with quality scores above Q20 using the kit 12 chemistry, the ease, cost, and portability of Nanopore give it a leading role in differential bacterial abundance analysis. Sequencing the near-complete rRNA operon of bacteria and archaea makes it possible to use this universally conserved operon, with its evolutionary polymorphisms, for taxonomic resolution. Here, a reproducible and validated pipeline, RRN-operon Enabled Species-level Classification Using EMU (RESCUE), was developed to facilitate the sequencing of bacterial operons and to support import into phyloseq. Benchmarking RESCUE showed that fully processed reads now match or exceed the quality of Sanger sequencing, with median quality scores of approximately Q20+, using R10.4 flow cells and Guppy SUP basecalling. The pipeline was validated with two complex mock samples, multiple sample types, actual Illumina data, and four databases. RESCUE sequencing is shown to drastically improve classification to the species level for most taxa and to resolve erroneous taxa caused by using short reads such as those from Illumina.
ISSN: 1664-302X
DOI: 10.3389/fmicb.2023.1201064
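
The abstract reports read quality as Phred scores (Q20 and above). As a reading aid, the sketch below shows the standard Phred relationship Q = -10 log10(P), where P is the per-base error probability, so Q20 corresponds to a 1% error rate. It is not taken from the RESCUE pipeline: the function names are hypothetical, and the ASCII offset of 33 assumes the usual FASTQ quality encoding.

```python
import math


def phred_to_error_prob(q):
    """Convert a Phred quality score to a per-base error probability."""
    return 10 ** (-q / 10)


def error_prob_to_phred(p):
    """Convert a per-base error probability back to a Phred score."""
    return -10 * math.log10(p)


def decode_quality_string(qual, offset=33):
    """Decode a FASTQ quality string into integer Phred scores.

    offset=33 is an assumption (the common Phred+33 encoding).
    """
    return [ord(ch) - offset for ch in qual]


def read_mean_quality(qual, offset=33):
    """Mean read quality, averaged in probability space.

    Averaging error probabilities (rather than Q values) avoids
    overstating quality when a few bases are much worse than the rest.
    """
    probs = [phred_to_error_prob(q) for q in decode_quality_string(qual, offset)]
    return error_prob_to_phred(sum(probs) / len(probs))


# Q20 means one expected error per 100 bases.
print(f"{phred_to_error_prob(20):.4f}")  # 0.0100

# One Q20 base ('5') and one Q40 base ('I'): the probability-space mean
# is lower than the arithmetic mean of the two Q values (30).
print(f"{read_mean_quality('5I'):.1f}")
```

Averaging in probability space is a design choice made here for illustration; the record does not say how the pipeline summarizes quality per read.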