EmbedCaps-DBP: Predicting DNA-Binding Proteins Using Protein Sequence Embedding and Capsule Network

Bibliographic details
Published in: IEEE Access, 2023, Vol. 11, pp. 121256-121268
Main authors: Naim, Muhammad Khaerul; Mengko, Tati Rajab; Hertadi, Rukman; Purwarianti, Ayu; Susanty, Meredita
Format: Article
Language: English
Online access: Full text
Description
Summary: DNA-binding interactions are an essential biological activity with important functions, such as DNA replication, transcription, repair, and recombination. DNA-binding proteins (DBPs) have been strongly associated with various human diseases, such as asthma, cancer, and HIV/AIDS. Therefore, some DBPs are used in the pharmaceutical industry to produce antibiotics, anticancer drugs, and anti-inflammatory drugs. Most previous methods have used evolutionary information to predict DBPs. However, these methods have high computing costs and produce unsatisfactory results. This study presents EmbedCaps-DBP, a new method for improving DBP prediction. First, we used three protein sequence embeddings (ProtT5, ESM-1b, and ESM-2) to extract learned feature representations from protein sequences. These embedding methods can capture important information about amino acids, such as biophysics, biochemistry, structure, and domains, that has not been fully utilized in protein annotation tasks. Then, we used a 1D-capsule network (CapsNet) as a classifier. EmbedCaps-DBP significantly outperformed all existing classifiers on the training and independent datasets. On two independent datasets, EmbedCaps-DBP (ProtT5) achieved 12.65% and 0.33% higher accuracies than a recent predictor on PDB2272 and PDB186, respectively. These results indicate that our proposed method is a promising predictor of DBPs.
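The two stages named in the summary (sequence embedding, then a capsule-network classifier) can be illustrated with a minimal sketch. This is not the authors' implementation: the record does not say how per-residue embeddings are reduced to a fixed size, so mean pooling is assumed here, and `squash` is the nonlinearity commonly used in capsule networks, which may differ from the 1D-CapsNet in the paper. The function names `mean_pool`, `squash`, and `capsule_length` are hypothetical, and the toy vectors stand in for real ProtT5 or ESM outputs.

```python
import math


def mean_pool(residue_embeddings):
    """Average per-residue vectors (L x D) into one fixed-length vector (D).

    Assumption: the record does not state the pooling strategy; mean pooling
    is used here only for illustration.
    """
    length = len(residue_embeddings)
    dim = len(residue_embeddings[0])
    return [sum(row[j] for row in residue_embeddings) / length for j in range(dim)]


def squash(s, eps=1e-9):
    """Capsule 'squash' nonlinearity: keeps direction, maps the norm into [0, 1).

    v = (|s|^2 / (1 + |s|^2)) * (s / |s|)
    """
    sq_norm = sum(x * x for x in s)
    norm = math.sqrt(sq_norm)
    scale = sq_norm / (1.0 + sq_norm) / (norm + eps)  # eps avoids division by zero
    return [scale * x for x in s]


def capsule_length(v):
    """Length of a capsule output vector, read as a class score below 1."""
    return math.sqrt(sum(x * x for x in v))


# Toy stand-in for a per-residue embedding matrix (2 residues, 2 dimensions).
toy_embedding = [[1.0, 2.0], [3.0, 4.0]]
pooled = mean_pool(toy_embedding)
score = capsule_length(squash(pooled))
```

In a capsule classifier, the length of each class capsule's output vector is typically interpreted as the score for that class, which is why `squash` bounds the norm below 1.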
ISSN: 2169-3536
DOI: 10.1109/ACCESS.2023.3328960