OMA, A Comprehensive, Automated Project for the Identification of Orthologs from Complete Genome Data: Introduction and First Achievements

Bibliographic details
Main authors: Dessimoz, Christophe; Cannarozzi, Gina; Gil, Manuel; Margadant, Daniel; Roth, Alexander; Schneider, Adrian; Gonnet, Gaston H.
Format: Book chapter
Language: English
Description
Summary: The OMA project is a large-scale effort to identify groups of orthologs from complete genome data, currently 150 species. The algorithm relies solely on protein sequence information and does not require any human supervision. It has several original features, in particular a verification step that detects paralogs and prevents them from being clustered together. Consistency checks and verification are performed throughout the process. The resulting groups, whenever a comparison could be made, are highly consistent with both EC assignments and assignments from the manually curated database HAMAP. A highly accurate set of orthologous sequences constitutes the basis for several other investigations, including phylogenetic analysis and protein classification.
ISSN: 0302-9743, 1611-3349
DOI: 10.1007/11554714_6