Predicting enhancer-promoter interactions by deep learning and matching heuristic

Bibliographic details
Published in:	Briefings in bioinformatics 2021-07, Vol.22 (4)
Main authors:	Min, Xiaoping; Ye, Congmin; Liu, Xiangrong; Zeng, Xiangxiang
Format:	Article
Language:	English
Subjects:	Artificial neural networks; Cell culture; Cell lines; Computer applications; Deep learning; Deoxyribonucleic acid; DNA; Enhancers; Gene regulation; Gene sequencing; Genomes; Heuristic; Learning algorithms; Machine learning; Matching; Neural networks; Nucleotide sequence; Performance prediction; Problem solving; Promoters; Transcription

Description
Summary:	Enhancer-promoter interactions (EPIs) play an important role in transcriptional regulation. Recently, machine learning-based methods have been widely used in the genome-scale identification of EPIs due to their promising predictive performance. In this paper, we propose a novel method, termed EPI-DLMH, for predicting EPIs using DNA sequences only. EPI-DLMH consists of three major steps. First, a two-layer convolutional neural network is used to learn local features, and a bidirectional gated recurrent unit network is used to capture long-range dependencies in the promoter and enhancer sequences. Second, an attention mechanism is used to focus on relatively important features. Finally, a matching heuristic mechanism is introduced to explore the interaction between enhancers and promoters. We use benchmark datasets to evaluate the proposed method and compare it with existing methods. Comparative results show that our model is superior to existing models in multiple cell lines. Specifically, we found that the matching heuristic mechanism introduced into the proposed model is the main contributor to the improvement in overall accuracy. Additionally, compared with existing models, our model is more efficient in terms of computational speed.
ISSN:	1467-5463, 1477-4054
DOI:	10.1093/bib/bbaa254
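
The summary above names three steps: feature extraction, attention, and a matching heuristic. It does not give their formulas. The following is a minimal sketch of the last two steps only, under stated assumptions and not the authors' implementation. The matching heuristic is assumed here to be the form common in sentence-pair modeling (concatenating both vectors with their element-wise difference and product). One-hot encoded bases stand in for the CNN and bidirectional GRU features. The function names and the scoring vector `weights` are hypothetical.

```python
import math


def one_hot(seq):
    """Encode a DNA string as a list of 4-dimensional one-hot vectors (A, C, G, T)."""
    table = {
        "A": [1.0, 0.0, 0.0, 0.0],
        "C": [0.0, 1.0, 0.0, 0.0],
        "G": [0.0, 0.0, 1.0, 0.0],
        "T": [0.0, 0.0, 0.0, 1.0],
    }
    # Unknown bases such as N map to an all-zero vector.
    return [list(table.get(base, [0.0] * 4)) for base in seq.upper()]


def attention_pool(features, weights):
    """Collapse per-position feature vectors into one vector by a softmax-weighted sum.

    `weights` is a hypothetical scoring vector; in a trained model it would be learned.
    """
    scores = [sum(w * x for w, x in zip(weights, f)) for f in features]
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    alphas = [e / total for e in exps]
    dim = len(features[0])
    return [sum(a * f[i] for a, f in zip(alphas, features)) for i in range(dim)]


def matching_heuristic(e, p):
    """Assumed form: [e; p; e - p; e * p], with element-wise difference and product."""
    diff = [a - b for a, b in zip(e, p)]
    prod = [a * b for a, b in zip(e, p)]
    return e + p + diff + prod


# Usage: pool a toy enhancer and a toy promoter, then build the pair representation.
weights = [0.5, -0.5, 0.0, 1.0]  # hypothetical attention scoring vector
enhancer_vec = attention_pool(one_hot("ACGT"), weights)
promoter_vec = attention_pool(one_hot("TTGA"), weights)
pair_vec = matching_heuristic(enhancer_vec, promoter_vec)
```

In a full model, `pair_vec` would presumably feed a classifier that outputs the interaction probability. Its length is four times the feature dimension, so the classifier receives explicit difference and product terms alongside the two sequence representations.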