SA-SSR: a suffix array-based algorithm for exhaustive and efficient SSR discovery in large genetic sequences


Bibliographic Details
Published in:	Bioinformatics (Oxford, England), 2016-09, Vol.32 (17), p.2707-2709
Main Authors:	Pickett, B D, Karlinsey, S M, Penrod, C E, Cormier, M J, Ebbert, M T W, Shiozawa, D K, Whipple, C J, Ridge, P G
Format:	Article
Language:	eng
Subjects:	Algorithms; Applications Notes; Databases, Nucleic Acid; Genetic Markers; Microsatellite Repeats; Sequence Analysis, DNA - methods; Software
Online Access:	Full text

Description
Summary:	Simple Sequence Repeats (SSRs) are used to address a variety of research questions in a variety of fields (e.g. population genetics, phylogenetics, forensics, etc.), due to their high mutability within and between species. Here, we present an innovative algorithm, SA-SSR, based on suffix and longest common prefix arrays for efficiently detecting SSRs in large sets of sequences. Existing SSR detection applications are hampered by one or more limitations (i.e. speed, accuracy, ease-of-use, etc.). Our algorithm addresses these challenges while being the most comprehensive and correct SSR detection software available. SA-SSR is 100% accurate and detected >1000 more SSRs than the second best algorithm, while offering greater control to the user than any existing software. SA-SSR is freely available at http://github.com/ridgelab/SA-SSR. Contact: perry.ridge@byu.edu. Supplementary data are available at Bioinformatics online.
ISSN:	1367-4803 1367-4811
DOI:	10.1093/bioinformatics/btw298
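
Illustrative note: the summary states only that SA-SSR is built on suffix and longest common prefix (LCP) arrays. The following is a minimal sketch of how such arrays can expose tandem repeats, not the authors' implementation. It relies on one observation: if two suffixes start p positions apart and share a prefix of at least p characters, the sequence has period p at that location. All function names and the parameters min_repeats and max_period are hypothetical, the suffix array is built naively, and only suffixes that are adjacent in the suffix array are compared, so this simplified check can miss repeats and makes no claim to be exhaustive.

```python
def suffix_array(s):
    # Naive construction by sorting suffixes; acceptable only for short inputs.
    return sorted(range(len(s)), key=lambda i: s[i:])


def lcp_array(s, sa):
    # Kasai's algorithm: lcp[r] is the length of the common prefix of the
    # suffixes starting at sa[r - 1] and sa[r]; lcp[0] is left at 0.
    n = len(s)
    rank = [0] * n
    for r, i in enumerate(sa):
        rank[i] = r
    lcp = [0] * n
    h = 0
    for i in range(n):
        if rank[i] > 0:
            j = sa[rank[i] - 1]
            while i + h < n and j + h < n and s[i + h] == s[j + h]:
                h += 1
            lcp[rank[i]] = h
            if h > 0:
                h -= 1
        else:
            h = 0
    return lcp


def find_ssrs(s, min_repeats=2, max_period=6):
    # Hypothetical helper: returns sorted (start, unit, repeats) tuples.
    sa = suffix_array(s)
    lcp = lcp_array(s, sa)
    hits = set()
    for r in range(1, len(s)):
        a, b = sa[r - 1], sa[r]
        period = abs(a - b)
        # Suffixes `period` apart sharing >= `period` characters imply a
        # tandem repeat with that period starting at the earlier position.
        if period <= max_period and lcp[r] >= period:
            start = min(a, b)
            # Skip hits that merely continue a repeat starting further left.
            if start > 0 and s[start - 1] == s[start - 1 + period]:
                continue
            repeats = lcp[r] // period + 1
            if repeats >= min_repeats:
                hits.add((start, s[start:start + period], repeats))
    return sorted(hits)


print(find_ssrs("GGACACACTT", min_repeats=3))
```

In this sketch, raising min_repeats or lowering max_period narrows the reported repeats; how SA-SSR itself exposes such controls is described in the full text and the repository documentation.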