Clustering algorithms for identifying core atom sets and for assessing the precision of protein structure ensembles

An important open question in the field of NMR‐based biomolecular structure determination is how best to characterize the precision of the resulting ensemble of structures. Typically, the RMSD, as minimized in superimposing the ensemble of structures, is the preferred measure of precision. However,...

Ausführliche Beschreibung

Gespeichert in:
Bibliographische Detailangaben
Veröffentlicht in:Proteins, structure, function, and bioinformatics structure, function, and bioinformatics, 2005-06, Vol.59 (4), p.673-686
Hauptverfasser: Snyder, David A., Montelione, Gaetano T.
Format: Artikel
Sprache:eng
Schlagworte:
Online-Zugang:Volltext
Tags: Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
Beschreibung
Zusammenfassung:An important open question in the field of NMR‐based biomolecular structure determination is how best to characterize the precision of the resulting ensemble of structures. Typically, the RMSD, as minimized in superimposing the ensemble of structures, is the preferred measure of precision. However, the presence of poorly determined atomic coordinates and multiple “RMSD‐stable domains”—locally well‐defined regions that are not aligned in global superimpositions—complicate RMSD calculations. In this paper, we present a method, based on a novel, structurally defined order parameter, for identifying a set of core atoms to use in determining superimpositions for RMSD calculations. In addition we present a method for deciding whether to partition that core atom set into “RMSD‐stable domains” and, if so, how to determine partitioning of the core atom set. We demonstrate our algorithm and its application in calculating statistically sound RMSD values by applying it to a set of NMR‐derived structural ensembles, superimposing each RMSD‐stable domain (or the entire core atom set, where appropriate) found in each protein structure under consideration. A parameter calculated by our algorithm using a novel, kurtosis‐based criterion, the ϵ‐value, is a measure of precision of the superimposition that complements the RMSD. In addition, we compare our algorithm with previously described algorithms for determining core atom sets. The methods presented in this paper for biomolecular structure superimposition are quite general, and have application in many areas of structural bioinformatics and structural biology. Proteins 2005. © 2005 Wiley‐Liss, Inc.
ISSN:0887-3585
1097-0134
DOI:10.1002/prot.20402