The CaspBase: a curated database for evolutionary biochemical studies of caspase functional divergence and ancestral sequence inference

Bibliographic details
Published in: Protein Science 2018-10, Vol. 27 (10), p. 1857-1870
Main authors: Grinshpon, Robert D.; Williford, Anna; Titus‐McQuillan, James; Clark, A. Clay
Format: Article
Language: English
Description
Summary: Sequence databases are powerful tools in the contemporary scientist’s toolkit. However, most functional annotations in public databases are determined computationally and are not verified by a human expert. While hypotheses generated from computational studies are now amenable to experimentation, the quality of the results relies on the quality of input data. We developed the CaspBase to expedite high‐quality dataset compilation of annotated caspase sequences, to maximize phylogenetic signal, and to reduce the noise contributed from public databanks. We describe our methods of curation for the CaspBase and how researchers can acquire sequences from CaspBase.org. Our immediate goal for developing the CaspBase was to optimize the ancestral protein reconstruction (APR) of caspases, and we demonstrate the utility of the CaspBase in APR studies. We also developed the Common Position (CP) system for comparing human caspase family paralogs and suggest the CP system as an update to current reporting methods of caspase amino acid positions. We present a standardized multiple sequence alignment (MSA) for the CP system and show the advantage of using large databases such as the CaspBase in defining structural positions in proteins. Although the results described here pertain to caspase evolution and structure–function studies, the methods can be adapted to any gene family.
ISSN: 0961-8368, 1469-896X
DOI: 10.1002/pro.3494