Exploring the limits of nearest neighbour secondary structure prediction

Bibliographic details
Published in: Protein Engineering, 1997-07, Vol. 10 (7), p. 771-776
First author: Levin, J M
Format: Article
Language: English
Description
Summary: This paper presents a simple and robust secondary structure prediction scheme (SIMPA96) based on an updated version of the nearest neighbour method. Using a larger database of known structures, the Blosum 62 substitution matrix and a regularization algorithm, the three state prediction accuracy is increased by 4.7 percentage points to 67.7% for a single sequence and up to 72.8% when using multiple alignments. The increase in prediction accuracy with respect to the previous version can be almost entirely ascribed to the sevenfold increase in the size of the database. A more detailed analysis of the results shows that badly predicted regions of a protein sequence are randomly distributed throughout the database and that the goal of perfect secondary structure predictions by methods which use only local sequence information is illusory.
ISSN: 0269-2139, 1741-0126, 1741-0134
DOI: 10.1093/protein/10.7.771
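
The three state prediction accuracy quoted in the summary is conventionally the percentage of residues whose predicted state matches the observed one (often called Q3). The following is a minimal sketch of that measure, not the SIMPA96 implementation: the function name, the H/E/C state letters and the toy strings are assumptions made for illustration, and the baseline figure is only the value implied by the summary's "increased by 4.7 percentage points to 67.7%".

```python
def three_state_accuracy(predicted: str, observed: str) -> float:
    """Percentage of residues whose predicted state equals the observed state.

    Both arguments are strings over an assumed three-letter alphabet,
    e.g. H (helix), E (strand), C (coil), one letter per residue.
    """
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed strings must have equal length")
    if not observed:
        raise ValueError("sequences must be non-empty")
    matches = sum(p == o for p, o in zip(predicted, observed))
    return 100.0 * matches / len(observed)


# Hypothetical toy strings, not data from the paper.
observed = "CHHHHHCCEEEEC"
predicted = "CHHHHCCCEEECC"
accuracy = three_state_accuracy(predicted, observed)

# Single-sequence accuracy of the previous version, as implied by the
# summary: 67.7% reached after a gain of 4.7 percentage points.
implied_previous_accuracy = round(67.7 - 4.7, 1)
```

A gain in percentage points is an absolute difference between two accuracies, which is why the implied baseline is obtained by subtraction and not by scaling.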