Genome-wide operon prediction in Staphylococcus aureus

Bibliographic details
Published in: Nucleic acids research 2004, Vol.32 (12), p.3689-3702
Main authors: Wang, Liangsu; Trawick, John D.; Yamamoto, Robert; Zamudio, Carlos
Format: Article
Language: English
Online access: Full text
Description
Summary: Identification of operon structure is critical to understanding gene regulation and function, and pathogenesis, and for identifying targets towards the development of new antibiotics in bacteria. Recently, the complete genome sequences of a large number of important human bacterial pathogens have become available for computational analysis, including the major human Gram-positive pathogen Staphylococcus aureus. By annotating the predicted operon structure of the S. aureus genome, we hope to facilitate the exploration of the unique biology of this organism as well as the comparative genomics across a broad range of bacteria. We have integrated several operon prediction methods and developed a consensus approach to score the likelihood of each adjacent gene pair being co-transcribed. Gene pairs were separated into distinct operons when scores were equal to or below an empirical threshold. Using this approach, we have generated a S. aureus genome map with scores annotated at the intersections of every adjacent gene pair. This approach predicted about 864 monocistronic transcripts and 533 polycistronic operons from the protein-encoding genes in the S. aureus strain Mu50 genome. When compared with a set of experimentally determined S. aureus operons from literature sources, this method successfully predicted at least 91% of gene pairs. At the transcription unit level, this approach correctly identified at least 92% of complete operons in this dataset. This consensus approach has enabled us to predict operons with high accuracy from a genome where limited experimental evidence for operon structure is available.
ISSN: 0305-1048, 1362-4962
DOI: 10.1093/nar/gkh694
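
The summary states that adjacent gene pairs were separated into distinct operons when their score was equal to or below an empirical threshold. The following is a minimal sketch of that splitting step only, not the authors' implementation: the function name `split_into_operons`, the gene names, the scores, and the threshold value are all hypothetical, and the sketch assumes the per-pair consensus scores have already been computed for genes listed in genome order.

```python
from typing import List, Sequence


def split_into_operons(
    genes: Sequence[str],
    pair_scores: Sequence[float],
    threshold: float,
) -> List[List[str]]:
    """Group an ordered gene list into predicted transcription units.

    pair_scores[i] is the (hypothetical) consensus score for the adjacent
    pair genes[i], genes[i + 1]. A pair is split into separate units when
    its score is equal to or below the threshold.
    """
    if not genes:
        return []
    if len(pair_scores) != len(genes) - 1:
        raise ValueError("need exactly one score per adjacent gene pair")

    operons = [[genes[0]]]
    for gene, score in zip(genes[1:], pair_scores):
        if score <= threshold:
            operons.append([gene])    # boundary: start a new unit
        else:
            operons[-1].append(gene)  # extend the current unit
    return operons


# Illustrative input only; these are not values from the article.
genes = ["geneA", "geneB", "geneC", "geneD", "geneE"]
scores = [5.0, 1.0, 2.0, 4.0]
operons = split_into_operons(genes, scores, threshold=2.0)

monocistronic = [unit for unit in operons if len(unit) == 1]
polycistronic = [unit for unit in operons if len(unit) > 1]
print(operons)
print(len(monocistronic), "monocistronic,", len(polycistronic), "polycistronic")
```

Units of length one correspond to monocistronic transcripts and longer units to polycistronic operons, which is how totals such as those quoted in the summary could be tallied. How the individual prediction methods are combined into a single score is not described in this record and is left out of the sketch.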