Visualizing genomic data: The mixing perspective
Published in: | BioSystems 2023-02, Vol.224, p.104839-104839, Article 104839 |
---|---|
Main authors: | , , , , |
Format: | Article |
Language: | English |
Subjects: | |
Online access: | Full text |
Abstract: | We report on a novel way to visualize genomic data. By considering genome coding sequences, cds, as sets of the N=61 non-stop codons, one obtains a partition of the total number of codons in each cds. Partitions exhibit a statistical property known as mixing character, which characterizes how mixed the partition is. Mixing characters have been shown mathematically to exhibit a partial order known as majorization (Ruch, 1975). In previous work (Seitz and Kirwan, 2022) we developed an approach that combines mixing and entropy and is visualized as a scatter plot. If we consider all 1,121,505 partitions of 61 codons, this produces a plot we call the theoretical mixing space, TGMS. A normalization procedure is developed here and applied to real genomic data to produce the genome mixing signature, GMS. Example GMSs of 19 species, including Homo sapiens, are shown and discussed. |
---|---|
ISSN: | 0303-2647 1872-8324 |
DOI: | 10.1016/j.biosystems.2023.104839 |
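
The quantities named in the abstract (a codon-count partition, its entropy, and the majorization order between partitions) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names, the use of the standard genetic code's three stop codons, Shannon entropy in bits, and the descending sort convention are all assumptions made for illustration, and the paper's normalization procedure is not reproduced. The last function only checks the abstract's count of 1,121,505 by counting the partitions of the integer 61.

```python
from collections import Counter
from math import log2

# Assumption: standard genetic code, DNA alphabet, three stop codons.
STOP_CODONS = {"TAA", "TAG", "TGA"}


def codon_partition(cds):
    """Return the non-stop codon counts of a CDS, sorted in descending order."""
    usable = len(cds) - len(cds) % 3  # ignore a trailing incomplete codon
    codons = (cds[i:i + 3] for i in range(0, usable, 3))
    counts = Counter(c for c in codons if c not in STOP_CODONS)
    return sorted(counts.values(), reverse=True)


def shannon_entropy(partition):
    """Shannon entropy (in bits) of the relative frequencies in a partition."""
    total = sum(partition)
    return -sum((n / total) * log2(n / total) for n in partition if n > 0)


def majorizes(a, b):
    """Return True if partition a majorizes partition b.

    Both partitions must have the same total. a majorizes b when every
    partial sum of a (sorted descending) is at least the matching partial
    sum of b; b is then at least as mixed as a.
    """
    if sum(a) != sum(b):
        return False
    size = max(len(a), len(b))
    a = sorted(a, reverse=True) + [0] * (size - len(a))
    b = sorted(b, reverse=True) + [0] * (size - len(b))
    sum_a = sum_b = 0
    for x, y in zip(a, b):
        sum_a += x
        sum_b += y
        if sum_a < sum_b:
            return False
    return True


def count_partitions(n):
    """Count the integer partitions of n with a standard dynamic program."""
    ways = [1] + [0] * n
    for part in range(1, n + 1):
        for total in range(part, n + 1):
            ways[total] += ways[total - part]
    return ways[n]
```

For example, `codon_partition("ATGGCTGCTTAA")` yields the partition `[2, 1]`, and `majorizes([3, 1], [2, 2])` holds because the even split `[2, 2]` is the more mixed of the two. Because majorization is only a partial order, some pairs of partitions are incomparable in both directions, which is one reason to pair it with a scalar such as entropy in a scatter plot.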