De novo haplotype reconstruction in viral quasispecies using paired-end read guided path finding
Published in: | Bioinformatics 2018-09, Vol.34 (17), p.2927-2935 |
---|---|
Main authors: | , , |
Format: | Article |
Language: | eng |
Summary: | Abstract
Motivation
RNA virus populations contain different but genetically related strains, all infecting an individual host. Reconstruction of the viral haplotypes is a fundamental step to characterize the virus population, predict their viral phenotypes and finally provide important information for clinical treatment and prevention. Advances in next-generation sequencing technologies open up new opportunities to assemble full-length haplotypes. However, error-prone short reads, high similarities between related strains and an unknown number of haplotypes pose computational challenges for reference-free haplotype reconstruction. There is still much room to improve the performance of existing haplotype assembly tools.
Results
In this work, we developed a de novo haplotype reconstruction tool named PEHaplo, which employs paired-end reads to distinguish highly similar strains in viral quasispecies data. It was applied to both simulated and real quasispecies data, and the results were benchmarked against several recently published de novo haplotype reconstruction tools. The comparison shows that PEHaplo outperforms the benchmarked tools on a comprehensive set of metrics.
Availability and implementation
The source code and the documentation of PEHaplo are available at https://github.com/chjiao/PEHaplo.
Supplementary information
Supplementary data are available at Bioinformatics online. |
---|---|
ISSN: | 1367-4803 1460-2059 1367-4811 |
DOI: | 10.1093/bioinformatics/bty202 |