Fast and sensitive taxonomic classification for metagenomics with Kaiju

Bibliographic details
Published in: Nature Communications 2016-04, Vol. 7 (1), p. 11257-11257, Article 11257
Main authors: Menzel, Peter; Ng, Kim Lee; Krogh, Anders
Format: Article
Language: English
Description
Abstract: Metagenomics emerged as an important field of research not only in microbial ecology but also for human health and disease, and metagenomic studies are performed on increasingly larger scales. While recent taxonomic classification programs achieve high speed by comparing genomic k-mers, they often lack sensitivity for overcoming evolutionary divergence, so that large fractions of the metagenomic reads remain unclassified. Here we present the novel metagenome classifier Kaiju, which finds maximum (in-)exact matches on the protein-level using the Burrows–Wheeler transform. We show in a genome exclusion benchmark that Kaiju classifies reads with higher sensitivity and similar precision compared with current k-mer-based classifiers, especially in genera that are underrepresented in reference databases. We also demonstrate that Kaiju classifies up to 10 times more reads in real metagenomes. Kaiju can process millions of reads per minute and can run on a standard PC. Source code and web server are available at http://kaiju.binf.ku.dk.

Editorial summary: Here, Anders Krogh and colleagues describe Kaiju, a metagenome taxonomic classification program that uses maximum (in-)exact matches on the protein-level to account for evolutionary divergence. The authors show that Kaiju performs faster and is more sensitive compared with existing algorithms and can be used on a standard computer.
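
The protein-level matching described in the abstract can be illustrated with a minimal Python sketch. It is not Kaiju's implementation. It assumes that a read is translated in all six reading frames and split at stop codons, and it replaces the Burrows–Wheeler-transform index with a naive substring search over a toy reference set. The function names, the `min_len` parameter, the example read, and the two reference proteins are hypothetical, and the sketch covers exact matches only.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def reverse_complement(dna):
    """Return the reverse complement of an upper-case DNA string."""
    return dna.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def six_frame_fragments(read):
    """Translate a read in all six frames and split each frame at stop codons."""
    fragments = []
    for strand in (read, reverse_complement(read)):
        for offset in range(3):
            protein = "".join(
                CODON_TABLE.get(strand[i:i + 3], "X")  # "X" for ambiguous codons
                for i in range(offset, len(strand) - 2, 3)
            )
            fragments.extend(f for f in protein.split("*") if f)
    return fragments


def longest_exact_match(read, references, min_len=4):
    """Return (taxon, peptide) for the longest exact match, or (None, "").

    Naive substring search stands in for an index-based search here;
    `references` maps a hypothetical taxon label to a protein sequence.
    """
    best = (None, "")
    for fragment in six_frame_fragments(read):
        for start in range(len(fragment)):
            for end in range(len(fragment), start, -1):
                peptide = fragment[start:end]
                # Stop once the candidate cannot beat the current best match.
                if len(peptide) < max(min_len, len(best[1]) + 1):
                    break
                hit = next((t for t, p in references.items() if peptide in p), None)
                if hit is not None:
                    best = (hit, peptide)
                    break
    return best


# Toy data (hypothetical): two reference proteins differing at one residue.
references = {"taxon_A": "MKTAYIAKQR", "taxon_B": "MKTAYLAKQR"}
read = "ATGAAAACCGCTTATATTGCGAAA"  # encodes MKTAYIAK in the first forward frame
print(longest_exact_match(read, references))
```

With this toy input the read is assigned to `taxon_A`, because the eight-residue peptide `MKTAYIAK` occurs only in that reference. A production classifier would also need a rule for matches shared by several taxa, which this sketch leaves out.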
ISSN: 2041-1723
DOI: 10.1038/ncomms11257