Using physical potentials and learned models to distinguish native binding interfaces from de novo designed interfaces that do not bind

Bibliographic details
Published in: Proteins: Structure, Function, and Bioinformatics, 2013-11, Vol. 81 (11), p. 1919-1930
Main authors: Demerdash, Omar N. A.; Mitchell, Julie C.
Format: Article
Language: English
Description
Abstract: Protein–protein interactions are a fundamental aspect of many biological processes. The advent of recombinant protein and computational techniques has allowed for the rational design of proteins with novel binding capabilities. It is therefore desirable to predict which designed proteins are capable of binding in vitro. To this end, we have developed a learned classification model that combines energetic and non‐energetic features. Our feature set is adapted from specialized potentials for aromatic interactions, hydrogen bonds, electrostatics, shape, and desolvation. A binding model built on these features was initially developed for CAPRI Round 21, achieving top results in the independent assessment. Here, we present a more thoroughly trained and validated model, and compare various support‐vector machine kernels. The Gaussian kernel model classified both high‐resolution complexes and designed nonbinders with 79–86% accuracy on independent test data. We also observe that multiple physical potentials for dielectric‐dependent electrostatics and hydrogen bonding contribute to the enhanced predictive accuracy, suggesting that their combined information is much greater than that of any single energetics model. We also study the change in predictive performance as the model features or training data are varied, observing unusual patterns of prediction in designed interfaces as compared with other data types. Proteins 2013; 81:1919–1930. © 2013 Wiley Periodicals, Inc.
ISSN: 0887-3585, 1097-0134
DOI: 10.1002/prot.24337
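
Illustrative note: the abstract mentions comparing several support-vector machine kernels, with the Gaussian kernel performing best. As a minimal sketch of what such kernels compute, the Python below evaluates three standard kernel functions on two feature vectors. The feature values, the vector names, and the parameters `gamma`, `degree`, and `coef0` are assumptions chosen for illustration; they are not taken from the article, and this is not the authors' implementation.

```python
import math


def linear_kernel(x, y):
    """Linear kernel: the plain dot product of two feature vectors."""
    return sum(a * b for a, b in zip(x, y))


def polynomial_kernel(x, y, degree=3, coef0=1.0):
    """Polynomial kernel: (x . y + coef0) ** degree."""
    return (linear_kernel(x, y) + coef0) ** degree


def gaussian_kernel(x, y, gamma=0.5):
    """Gaussian (RBF) kernel: exp(-gamma * ||x - y||^2)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)


# Hypothetical, made-up feature vectors standing in for interface
# descriptors (e.g. hydrogen-bond, electrostatic, desolvation terms).
native_interface = [0.8, -1.2, 0.5]
designed_interface = [0.1, -0.3, 0.9]

for name, kernel in [
    ("linear", linear_kernel),
    ("polynomial", polynomial_kernel),
    ("gaussian", gaussian_kernel),
]:
    print(name, kernel(native_interface, designed_interface))
```

An SVM uses only these pairwise similarity values, so swapping the kernel changes the shape of the decision boundary without changing the feature set. The Gaussian kernel always returns a value in (0, 1], with 1 meaning the two vectors are identical.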