BASiNETEntropy: an alignment-free method for classification of biological sequences through complex networks and entropy maximization

The discovery of nucleic acids and the structure of DNA have brought considerable advances in the understanding of life. The development of next-generation sequencing technologies has led to a large-scale generation of data, for which computational methods have become essential for analysis and know...

Ausführliche Beschreibung

Gespeichert in:
Bibliographische Detailangaben
Hauptverfasser: Breve, Murilo Montanini, Pimenta-Zanon, Matheus Henrique, Lopes, Fabrício Martins
Format: Artikel
Sprache:eng
Schlagworte:
Online-Zugang:Volltext bestellen
Tags: Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
Beschreibung
Zusammenfassung:The discovery of nucleic acids and the structure of DNA have brought considerable advances in the understanding of life. The development of next-generation sequencing technologies has led to a large-scale generation of data, for which computational methods have become essential for analysis and knowledge discovery. In particular, RNAs have received much attention because of the diversity of their functionalities in the organism and the discoveries of different classes with different functions in many biological processes. Therefore, the correct identification of RNA sequences is increasingly important to provide relevant information to understand the functioning of organisms. This work addresses this context by presenting a new method for the classification of biological sequences through complex networks and entropy maximization. The maximum entropy principle is proposed to identify the most informative edges about the RNA class, generating a filtered complex network. The proposed method was evaluated in the classification of different RNA classes from 13 species. The proposed method was compared to PLEK, CPC2 and BASiNET methods, outperforming all compared methods. BASiNETEntropy classified all RNA sequences with high accuracy and low standard deviation in results, showing assertiveness and robustness. The proposed method is implemented in an open source in R language and is freely available at https://cran.r-project.org/web/packages/BASiNETEntropy.
DOI:10.48550/arxiv.2203.15635