Extracting ‘legacy loci’ from an invertebrate sequence capture data set

Bibliographic details
Published in: Zoologica scripta 2022-01, Vol. 51 (1), p. 14-31
Main authors: Miller, Caroline D., Forthman, Michael, Miller, Christine W., Kimball, Rebecca T.
Format: Article
Language: English
Description
Abstract: Sequence capture studies result in rich data sets comprising hundreds to thousands of targeted genomic regions that are superseding Sanger‐based data sets comprised of a few well‐known loci with historical uses in phylogenetics (‘legacy loci’). However, integrating sequence capture and Sanger‐based data sets is of interest as legacy loci can include different types of loci (e.g. mitochondrial and nuclear) across a potentially larger sample of species from past studies. Sequence capture data sets include nontargeted sequences, and there has been recent interest in extracting legacy loci from invertebrate data sets. Here, we use published legacy data from leaf‐footed bugs (Hemiptera: Coreoidea) to recover 15 mitochondrial and seven nuclear legacy loci from off‐target sequences in a sequence capture data set, explore approaches to improve legacy locus recovery, and combine these loci with sequence capture data for phylogenetic analysis. Two nuclear loci were determined to already be targeted by sequence capture baits. Most of the remaining loci were successfully recovered from off‐target sequences, but this recovery varied greatly. Additionally, complementing complete mitogenomes with additional reference mitochondrial sequences from a genetic depository did not offer improvement for most of our taxa; however, supplementing these reference sequences with extracted legacy loci offered ≥6% improvement across taxa for a given mitochondrial locus (negligible improvement for nuclear loci). Phylogenetic analysis of legacy and sequence capture data produced a topology generally congruent with recent studies, but support was lower. Thus, future studies may employ the approaches used in this study to integrate legacy data with newly generated sequence capture data sets without added expenses.
ISSN: 0300-3256, 1463-6409
DOI: 10.1111/zsc.12513