Proteome analysis based on motif statistics

Motivation: Even for the amino acid motifs collected in the Prosite database there may be chance occurences as opposed to those occurences where the motif is involved in fold or function of a protein. With recent mathematical advances in assessing the significance of observing such a motif a particu...

Ausführliche Beschreibung

Gespeichert in:
Bibliographische Detailangaben
Veröffentlicht in:Bioinformatics 2002-10, Vol.18 (suppl-2), p.S161-S171
Hauptverfasser: Nicodème, P., Doerks, T., Vingron, M.
Format: Artikel
Sprache:eng
Schlagworte:
Online-Zugang:Volltext
Tags: Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
Beschreibung
Zusammenfassung:Motivation: Even for the amino acid motifs collected in the Prosite database there may be chance occurences as opposed to those occurences where the motif is involved in fold or function of a protein. With recent mathematical advances in assessing the significance of observing such a motif a particular number of times, we can now study the over- or under-representation of particular motifs in a complete genome and attempt to make functional deductions. Results: We demonstrate that statistical over- or under-representation of motifs in complete proteomes may be an indicator of whether, in that organism, we are looking at chance occurrences of the motif or whether the occurrences are sufficiently numerous to suggest a systematic, and thus functionally important occurrence. This has important implications on databank annotations. Availability: The complete dataset comprising the plotted statistics of 266 Prosite motifs on 42 proteomes is available at http://algo.inria.fr/nicodeme/proteomes/proteocomp.html. The software used to compute this data has been described by Nicodème (2000, 2001). They are available either by web access as mentioned in these articles or by direct request from Pierre Nicodème. Contact: nicodeme@genopole.cnrs.fr
ISSN:1367-4803
1460-2059
1367-4811
DOI:10.1093/bioinformatics/18.suppl_2.S161