Fusion/fission protein family identification in Archaea

Bibliographic details
Published in: mSystems 2024-06, Vol. 9 (6), p. e0094823
Main authors: Padalko, Anastasiia; Nair, Govind; Sousa, Filipa L
Format: Article
Language: English
Description
Abstract: The majority of newly discovered archaeal lineages remain without a cultivated representative, but scarce experimental data from the cultivated organisms show that they harbor distinct functional repertoires. To unveil the ecological as well as evolutionary impact of Archaea from metagenomics, new computational methods need to be developed, followed by in-depth analysis. Among them is the genome-wide protein fusion screening performed here. Natural fusions and fissions of genes not only contribute to microbial evolution but also complicate the correct identification and functional annotation of sequences. The products of these processes can be defined as fusion (or composite) proteins, which consist of two or more domains originally encoded by different genes, and split proteins, which originate from the separation of a gene into two (fission). Fusion identifications are required for proper phylogenetic reconstructions and metabolic pathway completeness assessments, while mappings between fused and unfused proteins can fill some of the existing gaps in metabolic models. In the archaeal genome-wide screening, more than 1,900 fusion/fission protein clusters were identified, belonging to both newly sequenced and well-studied lineages. These protein families are mainly associated with different types of metabolic, genetic, and cellular processes. Moreover, 162 of the identified fusion/fission protein families are archaeal specific, having no identified fused homolog within the bacterial domain. Our approach was validated by the identification of experimentally characterized fusion/fission cases. However, around 25% of the identified fusion/fission families lack functional annotations for both composite and split states, showing the need for experimental characterization in Archaea.

IMPORTANCE: Genome-wide fusion screening has never been performed in Archaea on a broad taxonomic scale. The overlay of multiple computational techniques allows the detection of a fine-grained set of predicted fusion/fission families, instead of rough estimations based on conserved domain annotations only. The exhaustive mapping of fused proteins to bacterial organisms allows us to capture fusion/fission families that are specific to archaeal biology, as well as to identify links between bacterial and archaeal lineages based on co-occurrence of taxonomically restricted proteins and their sequence features. Furthermore, the identification of poorly characterized lineage-specific fusion prot
ISSN: 2379-5077
DOI: 10.1128/msystems.00948-23