Clustering algorithms for identifying core atom sets and for assessing the precision of protein structure ensembles
Published in: | Proteins, structure, function, and bioinformatics, 2005-06, Vol.59 (4), p.673-686 |
---|---|
Main authors: | , |
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Full text |
Abstract: | An important open question in the field of NMR-based biomolecular structure determination is how best to characterize the precision of the resulting ensemble of structures. Typically, the RMSD, as minimized in superimposing the ensemble of structures, is the preferred measure of precision. However, the presence of poorly determined atomic coordinates and of multiple “RMSD-stable domains” (locally well-defined regions that are not aligned in global superimpositions) complicates RMSD calculations. In this paper, we present a method, based on a novel, structurally defined order parameter, for identifying a set of core atoms to use in determining superimpositions for RMSD calculations. In addition, we present a method for deciding whether to partition that core atom set into “RMSD-stable domains” and, if so, how to determine the partitioning of the core atom set. We demonstrate our algorithm and its application in calculating statistically sound RMSD values by applying it to a set of NMR-derived structural ensembles, superimposing each RMSD-stable domain (or the entire core atom set, where appropriate) found in each protein structure under consideration. A parameter calculated by our algorithm using a novel, kurtosis-based criterion, the ϵ-value, is a measure of precision of the superimposition that complements the RMSD. In addition, we compare our algorithm with previously described algorithms for determining core atom sets. The methods presented in this paper for biomolecular structure superimposition are quite general and have application in many areas of structural bioinformatics and structural biology. Proteins 2005. © 2005 Wiley-Liss, Inc. |
---|---|
ISSN: | 0887-3585 1097-0134 |
DOI: | 10.1002/prot.20402 |