Next-generation phylogenomics using a Target Restricted Assembly Method

Bibliographic details
Published in: Molecular Phylogenetics and Evolution, 2013-01, Vol. 66 (1), p. 417-422
Main authors: Johnson, Kevin P.; Walden, Kimberly K.O.; Robertson, Hugh M.
Format: Article
Language: English
Description
Summary:
► We develop a Target Restricted Assembly Method for obtaining phylogenomic data from next-generation sequencing reads.
► This method uses BLAST searches of reads from a single Illumina lane to identify matching reads for target genes.
► Matching reads are assembled locally to produce complete single-copy nuclear protein-coding sequences.
► In an example with 20 species of lice (Psocodea) we recover sequences for 10 genes and perform a phylogenetic analysis.

Next-generation sequencing technologies are revolutionizing the field of phylogenetics by making genome-scale data available for a fraction of the cost of traditional targeted sequencing. One challenge will be to make use of these genomic-level data without necessarily resorting to full-scale genome assembly and annotation, which is often time- and labor-intensive. Here we describe a technique, the Target Restricted Assembly Method (TRAM), in which the typical process of genome assembly and annotation is in essence reversed. Protein sequences of phylogenetically useful genes from a species within the group of interest are used as targets in tblastn searches of a data set from a lane of Illumina reads for a related species. The resulting BLAST hits are then assembled locally into contigs, and these contigs are aligned against the reference “cDNA” sequence to remove the portions of the sequences that include introns. We illustrate the Target Restricted Assembly Method using genomic-scale data sets for 20 species of lice (Insecta: Psocodea) to produce a test phylogenetic data set of 10 nuclear protein-coding gene sequences. Given the advantages of using DNA instead of RNA, this technique is very cost-effective and feasible with current technologies.
ISSN: 1055-7903
EISSN: 1095-9513
DOI: 10.1016/j.ympev.2012.09.007
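
The first step described in the summary, using target protein sequences in tblastn searches against a set of Illumina reads and keeping the matching reads, can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function names, file names, and the e-value cutoff of 1e-5 are assumptions made for the example, and the later steps (local assembly of the hits and alignment of contigs to the reference to remove introns) are not shown. It assumes the reads have already been formatted as a BLAST+ nucleotide database and that results are written in the standard 12-column tabular format (`-outfmt 6`).

```python
import csv
import io


def tblastn_command(protein_fasta, reads_db, out_path, evalue=1e-5):
    """Build an argument list for BLAST+ tblastn (protein query vs. nucleotide reads).

    The e-value cutoff is an illustrative assumption, not a value from the article.
    """
    return [
        "tblastn",
        "-query", protein_fasta,   # target protein sequences (hypothetical file name)
        "-db", reads_db,           # BLAST database built from the Illumina reads
        "-evalue", str(evalue),
        "-outfmt", "6",            # 12-column tab-separated output
        "-out", out_path,
    ]


def matching_read_ids(tabular_text, max_evalue=1e-5):
    """Return the set of read IDs that hit a target in outfmt-6 output.

    Default outfmt-6 columns: qseqid, sseqid, pident, length, mismatch, gapopen,
    qstart, qend, sstart, send, evalue, bitscore. With a protein query and a
    read database, sseqid (column 2) is the read identifier.
    """
    ids = set()
    for row in csv.reader(io.StringIO(tabular_text), delimiter="\t"):
        if len(row) < 12 or row[0].startswith("#"):
            continue  # skip blank, comment, or malformed lines
        read_id, evalue = row[1], float(row[10])
        if evalue <= max_evalue:
            ids.add(read_id)
    return ids
```

The command list could be passed to `subprocess.run` on a machine where BLAST+ is installed; the read IDs returned by `matching_read_ids` would then be used to pull the corresponding reads for local assembly with an assembler of choice.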