RatesTools: a Nextflow pipeline for detecting de novo germline mutations in pedigree sequence data

Here, we introduce RatesTools, an automated pipeline to infer de novo mutation rates from parent-offspring trio data of diploid organisms. By providing a reference genome and high-coverage, whole-genome resequencing data of a minimum of three individuals (sire, dam and offspring), RatesTools provide...

Detailed description

Bibliographic details
Published in: Bioinformatics (Oxford, England), 2023-01, Vol.39 (1)
Main authors: Armstrong, Ellie E; Campana, Michael G
Format: Article
Language: eng
Online access: Full text
container_end_page
container_issue 1
container_start_page
container_title Bioinformatics (Oxford, England)
container_volume 39
creator Armstrong, Ellie E
Campana, Michael G
description Here, we introduce RatesTools, an automated pipeline to infer de novo mutation rates from parent-offspring trio data of diploid organisms. By providing a reference genome and high-coverage, whole-genome resequencing data of a minimum of three individuals (sire, dam and offspring), RatesTools provides a list of candidate de novo mutations and calculates a putative mutation rate. RatesTools uses several quality filtering steps, such as discarding sites with low mappability and highly repetitive regions, as well as sites with low genotype and mapping qualities to find potential de novo mutations. In addition, RatesTools implements several optional filters based on post hoc assumptions of the heterozygosity and mutation rate of the organism. Filters are highly customizable to user specifications in order to maximize utility across a wide range of applications. RatesTools is freely available at https://github.com/campanam/RatesTools under a Creative Commons Zero (CC0) license. The pipeline is implemented in Nextflow (Di Tommaso et al., 2017), Ruby (http://www.ruby-lang.org), Bash (https://www.gnu.org/software/bash/) and R (R Core Team, 2020) with reliance upon several other freely available tools. RatesTools is compatible with macOS and Linux operating systems. Supplementary data are available at Bioinformatics online.
doi_str_mv 10.1093/bioinformatics/btac784
format Article
fullrecord <record><control><sourceid>proquest_cross</sourceid><recordid>TN_cdi_proquest_miscellaneous_2747003070</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>2747003070</sourcerecordid><originalsourceid>FETCH-LOGICAL-c359t-a55fe6fc2e6fcfa5bbaea9e45ac4c0fdfa005d706262159995732e018759f0073</originalsourceid><addsrcrecordid>eNpVUMtOwzAQtBCIlsIvVD5yKbXj2K65oYqXVIGEyjlynHVllMTBdnj8PSktCC67I-3MzmgQmlJyQYli89J511ofGp2cifMyaSMX-QEaUybkLF9QevgHj9BJjC-EEE64OEYjJnKhWCbHqHzSCeLa-zpeYo0f4CPZ2r_jznVQuxbw4IErSGCSazcDwq1_83gDofk-N30aEvg2YtfiDiq3CQA4wmsPrQFc6aRP0ZHVdYSz_Z6g55vr9fJutnq8vV9erWaGcZVmmnMLwppsO6zmZalBK8i5NrkhtrJ6iF9JIjKRUa6U4pJlQOhCcmUJkWyCznd_u-AH-5iKxkUDda1b8H0sMplLQhiRZKCKHdUEH2MAW3TBNTp8FpQU236L__0W-34H4XTv0ZcNVL-yn0LZF3s2foA</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2747003070</pqid></control><display><type>article</type><title>RatesTools: a Nextflow pipeline for detecting de novo germline mutations in pedigree sequence data</title><source>MEDLINE</source><source>DOAJ Directory of Open Access Journals</source><source>Oxford Journals Open Access Collection</source><source>EZB-FREE-00999 freely available EZB journals</source><source>PubMed Central</source><source>Alma/SFX Local Collection</source><creator>Armstrong, Ellie E ; Campana, Michael G</creator><contributor>Birol, Inanc</contributor><creatorcontrib>Armstrong, Ellie E ; Campana, Michael G ; Birol, Inanc</creatorcontrib><description>Here, we introduce RatesTools, an automated pipeline to infer de novo mutation rates from parent-offspring trio data of diploid organisms. By providing a reference genome and high-coverage, whole-genome resequencing data of a minimum of three individuals (sire, dam and offspring), RatesTools provides a list of candidate de novo mutations and calculates a putative mutation rate. 
RatesTools uses several quality filtering steps, such as discarding sites with low mappability and highly repetitive regions, as well as sites with low genotype and mapping qualities to find potential de novo mutations. In addition, RatesTools implements several optional filters based on post hoc assumptions of the heterozygosity and mutation rate of the organism. Filters are highly customizable to user specifications in order to maximize utility across a wide range of applications. RatesTools is freely available at https://github.com/campanam/RatesTools under a Creative Commons Zero (CC0) license. The pipeline is implemented in Nextflow (Di Tommaso et al., 2017), Ruby (http://www.ruby-lang.org), Bash (https://www.gnu.org/software/bash/) and R (R Core Team, 2020) with reliance upon several other freely available tools. RatesTools is compatible with macOS and Linux operating systems. Supplementary data are available at Bioinformatics online.</description><identifier>ISSN: 1367-4811</identifier><identifier>EISSN: 1367-4811</identifier><identifier>DOI: 10.1093/bioinformatics/btac784</identifier><identifier>PMID: 36469327</identifier><language>eng</language><publisher>England</publisher><subject>Genome ; Germ-Line Mutation ; Humans ; Pedigree ; Sequence Analysis, DNA ; Software</subject><ispartof>Bioinformatics (Oxford, England), 2023-01, Vol.39 (1)</ispartof><rights>Published by Oxford University Press 2022.</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c359t-a55fe6fc2e6fcfa5bbaea9e45ac4c0fdfa005d706262159995732e018759f0073</citedby><cites>FETCH-LOGICAL-c359t-a55fe6fc2e6fcfa5bbaea9e45ac4c0fdfa005d706262159995732e018759f0073</cites><orcidid>0000-0001-7107-6318 ; 
0000-0003-0461-6462</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><link.rule.ids>314,776,780,860,27903,27904</link.rule.ids><backlink>$$Uhttps://www.ncbi.nlm.nih.gov/pubmed/36469327$$D View this record in MEDLINE/PubMed$$Hfree_for_read</backlink></links><search><contributor>Birol, Inanc</contributor><creatorcontrib>Armstrong, Ellie E</creatorcontrib><creatorcontrib>Campana, Michael G</creatorcontrib><title>RatesTools: a Nextflow pipeline for detecting de novo germline mutations in pedigree sequence data</title><title>Bioinformatics (Oxford, England)</title><addtitle>Bioinformatics</addtitle><description>Here, we introduce RatesTools, an automated pipeline to infer de novo mutation rates from parent-offspring trio data of diploid organisms. By providing a reference genome and high-coverage, whole-genome resequencing data of a minimum of three individuals (sire, dam and offspring), RatesTools provides a list of candidate de novo mutations and calculates a putative mutation rate. RatesTools uses several quality filtering steps, such as discarding sites with low mappability and highly repetitive regions, as well as sites with low genotype and mapping qualities to find potential de novo mutations. In addition, RatesTools implements several optional filters based on post hoc assumptions of the heterozygosity and mutation rate of the organism. Filters are highly customizable to user specifications in order to maximize utility across a wide range of applications. RatesTools is freely available at https://github.com/campanam/RatesTools under a Creative Commons Zero (CC0) license. The pipeline is implemented in Nextflow (Di Tommaso et al., 2017), Ruby (http://www.ruby-lang.org), Bash (https://www.gnu.org/software/bash/) and R (R Core Team, 2020) with reliance upon several other freely available tools. 
RatesTools is compatible with macOS and Linux operating systems. Supplementary data are available at Bioinformatics online.</description><subject>Genome</subject><subject>Germ-Line Mutation</subject><subject>Humans</subject><subject>Pedigree</subject><subject>Sequence Analysis, DNA</subject><subject>Software</subject><issn>1367-4811</issn><issn>1367-4811</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2023</creationdate><recordtype>article</recordtype><sourceid>EIF</sourceid><recordid>eNpVUMtOwzAQtBCIlsIvVD5yKbXj2K65oYqXVIGEyjlynHVllMTBdnj8PSktCC67I-3MzmgQmlJyQYli89J511ofGp2cifMyaSMX-QEaUybkLF9QevgHj9BJjC-EEE64OEYjJnKhWCbHqHzSCeLa-zpeYo0f4CPZ2r_jznVQuxbw4IErSGCSazcDwq1_83gDofk-N30aEvg2YtfiDiq3CQA4wmsPrQFc6aRP0ZHVdYSz_Z6g55vr9fJutnq8vV9erWaGcZVmmnMLwppsO6zmZalBK8i5NrkhtrJ6iF9JIjKRUa6U4pJlQOhCcmUJkWyCznd_u-AH-5iKxkUDda1b8H0sMplLQhiRZKCKHdUEH2MAW3TBNTp8FpQU236L__0W-34H4XTv0ZcNVL-yn0LZF3s2foA</recordid><startdate>20230101</startdate><enddate>20230101</enddate><creator>Armstrong, Ellie E</creator><creator>Campana, Michael G</creator><scope>CGR</scope><scope>CUY</scope><scope>CVF</scope><scope>ECM</scope><scope>EIF</scope><scope>NPM</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>7X8</scope><orcidid>https://orcid.org/0000-0001-7107-6318</orcidid><orcidid>https://orcid.org/0000-0003-0461-6462</orcidid></search><sort><creationdate>20230101</creationdate><title>RatesTools: a Nextflow pipeline for detecting de novo germline mutations in pedigree sequence data</title><author>Armstrong, Ellie E ; Campana, Michael G</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c359t-a55fe6fc2e6fcfa5bbaea9e45ac4c0fdfa005d706262159995732e018759f0073</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2023</creationdate><topic>Genome</topic><topic>Germ-Line Mutation</topic><topic>Humans</topic><topic>Pedigree</topic><topic>Sequence Analysis, 
DNA</topic><topic>Software</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Armstrong, Ellie E</creatorcontrib><creatorcontrib>Campana, Michael G</creatorcontrib><collection>Medline</collection><collection>MEDLINE</collection><collection>MEDLINE (Ovid)</collection><collection>MEDLINE</collection><collection>MEDLINE</collection><collection>PubMed</collection><collection>CrossRef</collection><collection>MEDLINE - Academic</collection><jtitle>Bioinformatics (Oxford, England)</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Armstrong, Ellie E</au><au>Campana, Michael G</au><au>Birol, Inanc</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>RatesTools: a Nextflow pipeline for detecting de novo germline mutations in pedigree sequence data</atitle><jtitle>Bioinformatics (Oxford, England)</jtitle><addtitle>Bioinformatics</addtitle><date>2023-01-01</date><risdate>2023</risdate><volume>39</volume><issue>1</issue><issn>1367-4811</issn><eissn>1367-4811</eissn><abstract>Here, we introduce RatesTools, an automated pipeline to infer de novo mutation rates from parent-offspring trio data of diploid organisms. By providing a reference genome and high-coverage, whole-genome resequencing data of a minimum of three individuals (sire, dam and offspring), RatesTools provides a list of candidate de novo mutations and calculates a putative mutation rate. RatesTools uses several quality filtering steps, such as discarding sites with low mappability and highly repetitive regions, as well as sites with low genotype and mapping qualities to find potential de novo mutations. In addition, RatesTools implements several optional filters based on post hoc assumptions of the heterozygosity and mutation rate of the organism. Filters are highly customizable to user specifications in order to maximize utility across a wide range of applications. 
RatesTools is freely available at https://github.com/campanam/RatesTools under a Creative Commons Zero (CC0) license. The pipeline is implemented in Nextflow (Di Tommaso et al., 2017), Ruby (http://www.ruby-lang.org), Bash (https://www.gnu.org/software/bash/) and R (R Core Team, 2020) with reliance upon several other freely available tools. RatesTools is compatible with macOS and Linux operating systems. Supplementary data are available at Bioinformatics online.</abstract><cop>England</cop><pmid>36469327</pmid><doi>10.1093/bioinformatics/btac784</doi><orcidid>https://orcid.org/0000-0001-7107-6318</orcidid><orcidid>https://orcid.org/0000-0003-0461-6462</orcidid><oa>free_for_read</oa></addata></record>
fulltext fulltext
identifier ISSN: 1367-4811
ispartof Bioinformatics (Oxford, England), 2023-01, Vol.39 (1)
issn 1367-4811
1367-4811
language eng
recordid cdi_proquest_miscellaneous_2747003070
source MEDLINE; DOAJ Directory of Open Access Journals; Oxford Journals Open Access Collection; EZB-FREE-00999 freely available EZB journals; PubMed Central; Alma/SFX Local Collection
subjects Genome
Germ-Line Mutation
Humans
Pedigree
Sequence Analysis, DNA
Software
title RatesTools: a Nextflow pipeline for detecting de novo germline mutations in pedigree sequence data
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-21T14%3A53%3A42IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=RatesTools:%20a%20Nextflow%20pipeline%20for%20detecting%20de%20novo%20germline%20mutations%20in%20pedigree%20sequence%20data&rft.jtitle=Bioinformatics%20(Oxford,%20England)&rft.au=Armstrong,%20Ellie%20E&rft.date=2023-01-01&rft.volume=39&rft.issue=1&rft.issn=1367-4811&rft.eissn=1367-4811&rft_id=info:doi/10.1093/bioinformatics/btac784&rft_dat=%3Cproquest_cross%3E2747003070%3C/proquest_cross%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2747003070&rft_id=info:pmid/36469327&rfr_iscdi=true
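
The description field in this record outlines the general approach: filter sites by genotype and mapping quality, flag offspring genotypes that the parents cannot explain, and convert the surviving candidates into a rate. The following is a minimal Python sketch of that idea only. It is not RatesTools code (the pipeline itself is written in Nextflow, Ruby, Bash and R), and the function names, the threshold defaults, and the estimator (candidates divided by twice the number of callable sites, a commonly used per-site, per-generation form for a single diploid offspring) are all assumptions made for illustration.

```python
from typing import Tuple

# Unphased diploid genotype, e.g. ("A", "T"). Hypothetical representation.
Genotype = Tuple[str, str]


def passes_quality(gq: float, mq: float,
                   min_gq: float = 65.0, min_mq: float = 40.0) -> bool:
    """Keep a site only if genotype quality and mapping quality are high enough.

    The default thresholds are illustrative placeholders, not values
    taken from RatesTools.
    """
    return gq >= min_gq and mq >= min_mq


def is_candidate_dnm(sire: Genotype, dam: Genotype, offspring: Genotype) -> bool:
    """Return True if the offspring genotype violates Mendelian inheritance.

    A genotype is Mendelian-consistent when one allele can come from the
    sire and the other from the dam.
    """
    a, b = offspring
    mendelian = (a in sire and b in dam) or (b in sire and a in dam)
    return not mendelian


def mutation_rate(n_candidates: int, callable_sites: int) -> float:
    """Per-site, per-generation rate for one diploid offspring.

    Assumed estimator: candidates / (2 * callable sites), where the
    factor 2 accounts for the two haploid genome copies.
    """
    if callable_sites <= 0:
        raise ValueError("callable_sites must be positive")
    return n_candidates / (2 * callable_sites)


if __name__ == "__main__":
    # Both parents are homozygous A/A, so a T in the offspring is unexplained.
    print(is_candidate_dnm(("A", "A"), ("A", "A"), ("A", "T")))
    print(mutation_rate(2, 1_000_000))
```

In practice the number of callable sites depends on the same filters applied to the candidates, so the denominator should count only sites that would have passed them.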