Probabilistic interaction network of evidence algorithm and its application to complete labeling of peak lists from protein NMR spectroscopy

The process of assigning a finite set of tags or labels to a collection of observations, subject to side conditions, is notable for its computational complexity. This labeling paradigm is of theoretical and practical relevance to a wide range of biological applications, including the analysis of dat...

Ausführliche Beschreibung

Gespeichert in:
Bibliographische Detailangaben
Veröffentlicht in:PLoS computational biology 2009-03, Vol.5 (3), p.e1000307-e1000307
Hauptverfasser: Bahrami, Arash, Assadi, Amir H, Markley, John L, Eghbalnia, Hamid R
Format: Artikel
Sprache:eng
Schlagworte:
Online-Zugang:Volltext
Tags: Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
Beschreibung
Zusammenfassung:The process of assigning a finite set of tags or labels to a collection of observations, subject to side conditions, is notable for its computational complexity. This labeling paradigm is of theoretical and practical relevance to a wide range of biological applications, including the analysis of data from DNA microarrays, metabolomics experiments, and biomolecular nuclear magnetic resonance (NMR) spectroscopy. We present a novel algorithm, called Probabilistic Interaction Network of Evidence (PINE), that achieves robust, unsupervised probabilistic labeling of data. The computational core of PINE uses estimates of evidence derived from empirical distributions of previously observed data, along with consistency measures, to drive a fictitious system M with Hamiltonian H to a quasi-stationary state that produces probabilistic label assignments for relevant subsets of the data. We demonstrate the successful application of PINE to a key task in protein NMR spectroscopy: that of converting peak lists extracted from various NMR experiments into assignments associated with probabilities for their correctness. This application, called PINE-NMR, is available from a freely accessible computer server (http://pine.nmrfam.wisc.edu). The PINE-NMR server accepts as input the sequence of the protein plus user-specified combinations of data corresponding to an extensive list of NMR experiments; it provides as output a probabilistic assignment of NMR signals (chemical shifts) to sequence-specific backbone and aliphatic side chain atoms plus a probabilistic determination of the protein secondary structure. PINE-NMR can accommodate prior information about assignments or stable isotope labeling schemes. As part of the analysis, PINE-NMR identifies, verifies, and rectifies problems related to chemical shift referencing or erroneous input data. PINE-NMR achieves robust and consistent results that have been shown to be effective in subsequent steps of NMR structure determination.
ISSN:1553-7358
1553-734X
1553-7358
DOI:10.1371/journal.pcbi.1000307