Predicting protein structural classes from amino acid composition: application of fuzzy clustering

Bibliographic details
Published in: Protein engineering 1995-05, Vol. 8 (5), p. 425-435
Main authors: Zhang, Chun-Ting; Chou, Kuo-Chen; Maggiora, G. M.
Format: Article
Language: English
Description
Summary: Most globular proteins can be classified into one of four structural classes (all-α, all-β, α+β and α/β), depending upon the type, amount and arrangement of secondary structures present. In this work a new method, based upon fuzzy clustering, is proposed for predicting the structural class of a protein from its amino acid composition. Here, each of the structural classes is described by a fuzzy cluster and each protein is characterized by its membership degree, a number between zero and one, in each of the four clusters, with the constraint that the sum of the membership degrees equals unity. A given protein is then classified as belonging to the structural class corresponding to the fuzzy cluster with maximum membership degree. Calculation of membership degrees is carried out using the fuzzy c-means algorithm on a training set of 64 proteins. Results obtained for the training set show that the fuzzy clustering approach produces results comparable with or better than those obtained by other methods. A test set of 27 proteins also produced results comparable to those obtained with the training set. The success of the present preliminary work on protein structure class prediction suggests that further refinements of the method may lead to improved predictions, and this is currently being investigated.
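
The membership idea in the summary can be illustrated with a minimal sketch of the standard fuzzy c-means iteration. This is not the authors' implementation: the abstract does not state the distance measure, the fuzzifier or the initialization, so Euclidean distance, a fuzzifier `m = 2` and random starting memberships are assumptions here. The function name `fuzzy_c_means` is hypothetical, and the input is synthetic 20-component composition vectors, not the 64-protein training set.

```python
import numpy as np


def fuzzy_c_means(X, c=4, m=2.0, n_iter=100, tol=1e-6, seed=0):
    """Standard fuzzy c-means (assumed settings, see lead-in).

    X is (n_samples, n_features). Returns (centers, U), where U has shape
    (n_samples, c) and every row of U sums to 1.
    """
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    # Random initial memberships, each row normalised to sum to one.
    U = rng.random((n, c))
    U /= U.sum(axis=1, keepdims=True)
    for _ in range(n_iter):
        W = U ** m
        # Cluster centres: membership-weighted means of the samples.
        centers = (W.T @ X) / W.sum(axis=0)[:, None]
        # Euclidean distance from every sample to every centre, shape (n, c).
        d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        d = np.fmax(d, 1e-12)  # guard against division by zero
        # Membership update: u_ik proportional to d_ik^(-2/(m-1)), rows renormalised.
        inv = d ** (-2.0 / (m - 1.0))
        U_new = inv / inv.sum(axis=1, keepdims=True)
        converged = np.abs(U_new - U).max() < tol
        U = U_new
        if converged:
            break
    return centers, U


# Synthetic stand-in data: four groups of 20-component vectors whose entries
# are fractions summing to one, like an amino acid composition. Not real proteins.
rng = np.random.default_rng(1)
alphas = rng.random((4, 20)) * 50 + 1
X = np.vstack([rng.dirichlet(a, size=15) for a in alphas])

centers, U = fuzzy_c_means(X, c=4)
# Assign each sample to the cluster with maximum membership degree.
labels = U.argmax(axis=1)
```

Each row of `U` holds the four membership degrees for one sample, and `labels` applies the maximum-membership rule described in the summary. Which cluster index corresponds to which structural class is not determined by this unsupervised sketch.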
ISSN: 1741-0126, 0269-2139, 1741-0134
DOI: 10.1093/protein/8.5.425