Have we seen all structures corresponding to short protein fragments in the Protein Data Bank? An update


Bibliographic details
Published in: Protein engineering 2003-06, Vol.16 (6), p.407-414
Main authors: Du, Peicheng, Andrec, Michael, Levy, Ronald M.
Format: Article
Language: English
Online access: Full text
Description
Summary: Assembling short fragments from known structures has been a widely used approach to construct novel protein structures. To what extent there exist structurally similar fragments in the database of known structures for short fragments of a novel protein is a question that is fundamental to this approach. This work addresses that question for seven‐, nine‐ and 15‐residue fragments. For each fragment size, two databases, a query database and a template database of fragments from high‐quality protein structures in SCOP20 and SCOP90, respectively, were constructed. For each fragment in the query database, the template database was scanned to find the lowest r.m.s.d. fragment among non‐homologous structures. For seven‐residue fragments, there is a 99% probability that there exists such a fragment within 0.7 Å r.m.s.d. for each loop fragment. For nine‐residue fragments there is a 96% probability of a fragment within 1 Å r.m.s.d., while for 15‐residue fragments there is a 91% probability of a fragment within 2 Å r.m.s.d. These results, which update previous studies, show that there exists sufficient coverage to model even a novel fold using fragments from the Protein Data Bank, as the current database of known structures has increased enormously in the last few years. We have also explored the use of a grid search method for loop homology modeling and make some observations about the use of a grid search compared with a database search for the loop modeling problem.
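The database scan described in the summary can be illustrated with a minimal sketch. It rests on assumptions that the record does not confirm: each fragment is taken to be an array of Cα coordinates, r.m.s.d. is computed after optimal superposition with the Kabsch algorithm, and homologous structures are supplied as a set of identifiers to skip. The function names (`kabsch_rmsd`, `windows`, `best_match`, `coverage`) are hypothetical and are not taken from the paper.

```python
import numpy as np


def kabsch_rmsd(a, b):
    """R.m.s.d. between two (n, 3) coordinate sets after optimal superposition.

    Assumption: the Kabsch (SVD) superposition is used here for illustration;
    the catalog record does not say which method the authors used.
    """
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    # Remove translation by centring both fragments on their centroids.
    a = a - a.mean(axis=0)
    b = b - b.mean(axis=0)
    # Singular values of the covariance matrix give the best achievable overlap.
    u, s, vt = np.linalg.svd(a.T @ b)
    # Flip the smallest singular value if the optimum would be a reflection.
    if np.linalg.det(u @ vt) < 0:
        s[-1] = -s[-1]
    e0 = (a ** 2).sum() + (b ** 2).sum()
    msd = max((e0 - 2.0 * s.sum()) / len(a), 0.0)  # clamp rounding noise
    return float(np.sqrt(msd))


def windows(coords, length):
    """All contiguous fragments of `length` residues from one chain."""
    coords = np.asarray(coords, dtype=float)
    return [coords[i:i + length] for i in range(len(coords) - length + 1)]


def best_match(query, templates, excluded=frozenset()):
    """Lowest-r.m.s.d. template fragment for one query fragment.

    `templates` maps a chain identifier to an (n, 3) coordinate array.
    `excluded` holds identifiers treated as homologous to the query
    (hypothetical input; how homology is decided is outside this sketch).
    Returns (rmsd, chain_id, start_index).
    """
    best = (float("inf"), None, None)
    for chain_id, coords in templates.items():
        if chain_id in excluded:
            continue
        for start, frag in enumerate(windows(coords, len(query))):
            r = kabsch_rmsd(query, frag)
            if r < best[0]:
                best = (r, chain_id, start)
    return best


def coverage(best_rmsds, cutoff):
    """Fraction of query fragments whose best match lies within `cutoff` (Å)."""
    return float(np.mean(np.asarray(best_rmsds, dtype=float) <= cutoff))
```

Under these assumptions, a probability such as the one quoted for nine‐residue fragments would correspond to `coverage(best_rmsds, 1.0)`, where `best_rmsds` collects the first element of `best_match` over all query fragments. The brute-force scan is quadratic in database size, so a real study would likely need a faster search strategy.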
ISSN: 0269-2139, 1741-0126, 1460-213X, 1741-0134
DOI: 10.1093/protein/gzg052