Context-specific amino acid substitution matrices and their use in the detection of protein homologs

Bibliographic details
Published in: Proteins: Structure, Function, and Bioinformatics, 2008-05, Vol. 71 (2), p. 910-919
Main authors: Goonesekere, Nalin C. W.; Lee, Byungkook
Format: Article
Language: English
Online access: Full text
Description
Summary: Sequence homology detection relies on score matrices, which reflect the frequency of amino acid substitutions observed in a dataset of homologous sequences. The substitution matrices in popular use today are usually constructed without consideration of the structural context in which the substitution takes place. Here, we present amino acid substitution matrices specific to the particular polar–nonpolar environment of the amino acid. As expected, these matrices [context‐specific substitution matrices (CSSMs)] show striking differences from the popular BLOSUM62 matrix, which does not include structural information. When incorporated into BLAST and PSI‐BLAST, CSSMs outperformed BLOSUM matrices as assessed by ROC curve analyses of the number of true and false hits and by the accuracy of the sequence alignments to the hit sequences. These findings are also of relevance to profile–profile‐based methods of homology detection, since CSSMs may help build a better profile. Profiles generated for protein sequences in PDB using CSSM‐PSI‐BLAST will be made available for searching via RPSBLAST through our web site http://lmbbi.nci.nih.gov/. Proteins 2008; 71:910–919. Published 2007 Wiley‐Liss, Inc.
ISSN: 0887-3585, 1097-0134
DOI: 10.1002/prot.21775
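
The summary describes score matrices that reflect substitution frequencies, built separately for each structural environment. The following is a minimal sketch of that general idea, not the authors' procedure: it uses the standard BLOSUM-style log-odds formula and groups aligned residue pairs by an environment label. The function names, the labels "buried" and "exposed", the scale factor, and the toy counts are all assumptions made for illustration; the record does not specify how the paper defines environments or which data it uses.

```python
import math
from collections import Counter, defaultdict


def log_odds_matrix(aligned_pairs, scale=2.0):
    """Return {(a, b): score} from aligned residue pairs.

    score = round(scale * log2(observed / expected)), the usual
    BLOSUM-style log-odds form. Pairs that never occur get no entry.
    """
    pair_counts = Counter()
    for a, b in aligned_pairs:
        pair_counts[tuple(sorted((a, b)))] += 1  # unordered pairs
    total = sum(pair_counts.values())

    # Marginal residue frequencies implied by the pair counts.
    freq = Counter()
    for (a, b), n in pair_counts.items():
        freq[a] += n / (2 * total)
        freq[b] += n / (2 * total)

    scores = {}
    for (a, b), n in pair_counts.items():
        observed = n / total
        expected = freq[a] * freq[b] if a == b else 2 * freq[a] * freq[b]
        score = round(scale * math.log2(observed / expected))
        scores[(a, b)] = scores[(b, a)] = score  # keep the table symmetric
    return scores


def context_specific_matrices(observations, scale=2.0):
    """Build one log-odds table per environment label.

    observations: iterable of (environment, residue_a, residue_b).
    """
    by_env = defaultdict(list)
    for env, a, b in observations:
        by_env[env].append((a, b))
    return {env: log_odds_matrix(pairs, scale) for env, pairs in by_env.items()}


# Toy, invented counts: hypothetical environment labels, not real data.
observations = (
    [("buried", "L", "L")] * 4
    + [("buried", "L", "V")] * 4
    + [("buried", "V", "V")] * 2
    + [("exposed", "L", "L")] * 1
    + [("exposed", "L", "K")] * 1
    + [("exposed", "K", "K")] * 8
)
matrices = context_specific_matrices(observations)
print(matrices["buried"][("L", "V")], matrices["exposed"][("L", "K")])
```

In this toy data the same L-to-L match scores differently in the two environments, which is the kind of context dependence a single matrix such as BLOSUM62 cannot express.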