Beyond tandem repeats: complex pattern structures and distant regions of similarity
Tandem repeats (TRs) are associated with human disease, play a role in evolution and are important in regulatory processes. Despite their importance, locating and characterizing these patterns within anonymous DNA sequences remains a challenge. In part, the difficulty is due to imperfect conservatio...
Gespeichert in:
Veröffentlicht in: | Bioinformatics 2002-07, Vol.18 Suppl 1 (1), p.S31-S37 |
---|---|
Hauptverfasser: | , |
Format: | Artikel |
Sprache: | eng |
Schlagworte: | |
Online-Zugang: | Volltext |
Tags: |
Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
|
Zusammenfassung: | Tandem repeats (TRs) are associated with human disease, play a role in evolution and are important in regulatory processes. Despite their importance, locating and characterizing these patterns within anonymous DNA sequences remains a challenge. In part, the difficulty is due to imperfect conservation of patterns and complex pattern structures. We study recognition algorithms for two complex pattern structures: variable length tandem repeats (VLTRs) and multi-period tandem repeats (MPTRs).
We extend previous algorithmic research to a class of regular tandem repeats (RegTRs). We formally define RegTRs, as well as two important subclasses: VLTRs and MPTRs. We present algorithms for identification of TRs in these classes. Furthermore, our algorithms identify degenerate VLTRs and MPTRs: repeats containing substitutions, insertions and deletions. To illustrate our work, we present results of our analysis for two difficult regions in cattle and human data which reflect practical occurrences of these subclasses in GenBank sequence data. In addition, we show the applicability of our algorithmic techniques for identifying Alu sequences, gene clusters and other distant regions of similarity. We illustrate this with an example from yeast chromosome I. |
---|---|
ISSN: | 1367-4803 1367-4811 1460-2059 |
DOI: | 10.1093/bioinformatics/18.suppl_1.S31 |