RetroSnake: A modular pipeline to detect human endogenous retroviruses in genome sequencing data

Bibliographic details
Published in: iScience 2022-11, Vol.25 (11), p.105289, Article 105289
Main authors: Kabiljo, Renata; Bowles, Harry; Marriott, Heather; Jones, Ashley R.; Bouton, Clement R.; Dobson, Richard J.B.; Quinn, John P.; Al Khleifat, Ahmad; Swanson, Chad M.; Al-Chalabi, Ammar; Iacoangeli, Alfredo
Format: Article
Language: English
Description
Summary: Human endogenous retroviruses (HERVs) integrated into the human genome as a result of ancient exogenous infections and currently comprise ∼8% of our genome. The members of the most recently acquired HERV family, HERV-Ks, still retain the potential to produce viral molecules and have been linked to a wide range of diseases, including cancer and neurodegeneration. Although a range of tools for HERV detection in NGS data exists, most of them lack wet-lab validation and do not cover all steps of the analysis. Here, we describe RetroSnake, an end-to-end, modular, computationally efficient, and customizable pipeline for the discovery of HERVs in short-read NGS data. RetroSnake is based on an extensively wet-lab-validated protocol, covers all steps of the analysis from raw data to the generation of annotated results presented as an interactive HTML file, and can be used easily by life scientists without substantial computational training.
Availability and implementation: The pipeline and extensive documentation are available on GitHub.
Highlights:
• RetroSnake is an end-to-end pipeline for detection of HERV-K insertions
• Modular and computationally efficient (∼4 h per genome)
• Easy setup and installation with Snakemake
• Can be installed and used by users with limited computational experience
Subject areas: Bioinformatics; Biocomputational method; Sequence analysis
ISSN:2589-0042
DOI:10.1016/j.isci.2022.105289