W2V-repeated index: Prediction of enhancers and their strength based on repeated fragments

Enhancers are crucial in gene expression regulation, dictating the specificity and timing of transcriptional activity, which highlights the importance of their identification for unravelling the intricacies of genetic regulation. Therefore, it is critical to identify enhancers and their strengths. R...

Ausführliche Beschreibung

Gespeichert in:
Bibliographische Detailangaben
Veröffentlicht in:Genomics (San Diego, Calif.) Calif.), 2024-09, Vol.116 (5), p.110906, Article 110906
Hauptverfasser: Xie, Weiming, Yao, Zhaomin, Yuan, Yizhe, Too, Jingwei, Li, Fei, Wang, Hongyu, Zhan, Ying, Wu, Xiaodan, Wang, Zhiguo, Zhang, Guoxu
Format: Artikel
Sprache:eng
Schlagworte:
Online-Zugang:Volltext
Tags: Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
Beschreibung
Zusammenfassung:Enhancers are crucial in gene expression regulation, dictating the specificity and timing of transcriptional activity, which highlights the importance of their identification for unravelling the intricacies of genetic regulation. Therefore, it is critical to identify enhancers and their strengths. Repeated sequences in the genome are repeats of the same or symmetrical fragments. There has been a great deal of evidence that repetitive sequences contain enormous amounts of genetic information. Thus, We introduce the W2V-Repeated Index, designed to identify enhancer sequence fragments and evaluates their strength through the analysis of repeated K-mer sequences in enhancer regions. Utilizing the word2vector algorithm for numerical conversion and Manta Ray Foraging Optimization for feature selection, this method effectively captures the frequency and distribution of K-mer sequences. By concentrating on repeated K-mer sequences, it minimizes computational complexity and facilitates the analysis of larger K values. Experiments indicate that our method performs better than all other advanced methods on almost all indicators. •Based on biological properties.•Improving predictive performance.•Reducing computing costs.
ISSN:0888-7543
1089-8646
1089-8646
DOI:10.1016/j.ygeno.2024.110906