Genetic code symmetry and efficient design of GC-constrained coding sequences

Bibliographic details
Published in: Bioinformatics 2007-01, Vol. 23 (2), p. e57-e63
Main authors: Gavish, Matan; Peled, Amnon; Chor, Benny
Format: Article
Language: English
Description
Abstract:
Motivation: Cloning of long DNA sequences (40–60 bases) into phage display libraries using polymerase chain reaction (PCR) is a low-efficiency process, in which PCR is used to incorporate a DNA insert, coding for a certain peptide, into the amplified sequence. The PCR efficiency in this process is strongly affected by the distribution of G–C bases in the amplified sequence. As any DNA insert coding for the target peptide may be attempted, there is flexibility in choosing part of the amplified sequence. Since the number of inserts coding for the same peptide is exponential in the peptide length, a computational problem naturally arises: that of efficiently finding an insert whose parameters are optimal for PCR cloning.

Results: The GC distribution requirements are formulated as a search problem. We developed an efficient, linear-time 'one pass' algorithm for this problem. Our algorithm relies strongly on an interesting symmetry, which we observed in the standard genetic code. Most non-standard genetic codes examined possess this symmetry as well, yet some do not. We generalize the search problem and consider the case of a non-standard, or arbitrary, genetic code where this symmetry does not necessarily hold. We solve the generalized problem in polynomial, but nonlinear, time.

Availability: An implementation of the proposed algorithm is available upon request from the authors.

Contact: benny@cs.tau.ac.il
ISSN: 1367-4803, 1460-2059, 1367-4811
DOI: 10.1093/bioinformatics/btl317
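
The abstract's claim that the number of inserts coding for one peptide grows exponentially with peptide length can be made concrete with a small sketch. The Python below is not the authors' algorithm, whose details the abstract does not give. It only counts the synonymous coding sequences of a peptide under the standard genetic code and computes the lowest and highest total G/C count any of them can have. The function names (`codons_for`, `count_encodings`, `gc_range`) are hypothetical and chosen for illustration.

```python
from itertools import product
from math import prod

# Standard genetic code, codons enumerated in T, C, A, G order
# (first base varies slowest, third base fastest); '*' marks a stop codon.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def codons_for(amino_acid):
    """Return all codons that translate to the given one-letter amino acid."""
    return [codon for codon, aa in CODON_TABLE.items() if aa == amino_acid]


def count_encodings(peptide):
    """Number of distinct DNA sequences that code for the peptide."""
    return prod(len(codons_for(aa)) for aa in peptide)


def gc_count(codon):
    """Number of G or C bases in a codon."""
    return sum(base in "GC" for base in codon)


def gc_range(peptide):
    """Smallest and largest total G/C count over all encodings.

    Codons are chosen independently per residue, so the extremes of the
    total are the sums of the per-residue extremes; one pass is enough.
    """
    low = sum(min(gc_count(c) for c in codons_for(aa)) for aa in peptide)
    high = sum(max(gc_count(c) for c in codons_for(aa)) for aa in peptide)
    return low, high
```

For example, `count_encodings("L" * 15)` equals `6 ** 15`, since leucine has six codons; this is why enumerating every candidate insert quickly becomes impractical. The sketch bounds only the total G/C count; it does not address how G and C bases are distributed along the sequence, which is the requirement the paper formulates as a search problem.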