Prediction of Transcription Factor Binding Sites With an Attention Augmented Convolutional Neural Network


Bibliographic details
Published in: IEEE/ACM transactions on computational biology and bioinformatics, 2022-11, Vol. 19 (6), p. 3614-3623
Main authors: Jing, Fang; Zhang, Shao-Wu; Zhang, Shihua
Format: Article
Language: English
Description
Summary: Identification of transcription factor binding sites (TFBSs) is essential for revealing the rules of protein-DNA binding. Although several computational methods have been presented to predict TFBSs using epigenomic and sequence features, most of them ignore the features that are common across cell types. It remains unclear to what extent these common features help with this task. To this end, we propose a new method, the Attention-augmented Convolutional Neural Network (ACNN), to predict TFBSs. ACNN uses attention-augmented convolutional layers to capture global and local contexts in DNA sequences and employs convolutional layers to capture features of histone modification markers. In addition, ACNN adopts private and shared convolutional neural network (CNN) modules to learn specific and common features, respectively. To encourage the shared CNN module to learn the common features, adversarial training is applied in ACNN. Results on 253 ChIP-seq datasets show that ACNN outperforms other existing methods. The attention-augmented convolutional layers and the adversarial training mechanism in ACNN effectively improve prediction performance. Moreover, when labeled data are limited, ACNN also performs better than a baseline method. We further visualize the convolution kernels as motifs to illustrate the interpretability of ACNN.
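The summary's description of an attention-augmented convolutional layer, which combines local and global context over a DNA sequence, can be illustrated with a small sketch. This is not the authors' implementation: it assumes the general pattern of concatenating the output of a 1D convolution (local context) with the output of single-head self-attention (global context) along the channel axis. All function names, layer sizes, and the random weights are hypothetical and chosen only for illustration.

```python
import numpy as np


def one_hot(seq):
    """Encode a DNA string as an (L, 4) one-hot matrix in A, C, G, T order."""
    index = {"A": 0, "C": 1, "G": 2, "T": 3}
    x = np.zeros((len(seq), 4))
    for i, base in enumerate(seq):
        if base in index:  # unknown bases such as N stay all-zero
            x[i, index[base]] = 1.0
    return x


def conv1d_same(x, kernels):
    """Zero-padded 1D cross-correlation that preserves sequence length.

    x: (L, C_in); kernels: (K, C_in, C_out) with odd K.
    """
    k = kernels.shape[0]
    pad = k // 2
    xp = np.pad(x, ((pad, pad), (0, 0)))
    # One (K, C_in) window per sequence position.
    windows = np.stack([xp[i:i + k] for i in range(x.shape[0])])
    return np.einsum("lkc,kco->lo", windows, kernels)


def self_attention(x, wq, wk, wv):
    """Single-head scaled dot-product self-attention over all positions."""
    q, k, v = x @ wq, x @ wk, x @ wv
    scores = q @ k.T / np.sqrt(q.shape[1])
    scores = scores - scores.max(axis=1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=1, keepdims=True)
    return weights @ v, weights


def attention_augmented_conv(x, kernels, wq, wk, wv):
    """Concatenate local (convolution) and global (attention) features."""
    local = conv1d_same(x, kernels)
    global_context, weights = self_attention(x, wq, wk, wv)
    return np.concatenate([local, global_context], axis=1), weights


# Hypothetical sizes: 8 convolution channels plus 4 attention channels.
rng = np.random.default_rng(0)
x = one_hot("ACGTTGCAACGT")
kernels = rng.normal(size=(5, 4, 8))
wq, wk, wv = (rng.normal(size=(4, 4)) for _ in range(3))
features, weights = attention_augmented_conv(x, kernels, wq, wk, wv)
print(features.shape)
```

In this sketch each of the 12 sequence positions receives 8 convolutional channels and 4 attention channels. The rows of `weights` show how strongly each position attends to every other position, which is the sense in which attention supplies global context. A trained model would learn the kernels and projection matrices rather than draw them at random.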
ISSN: 1545-5963, 1557-9964
DOI: 10.1109/TCBB.2021.3126623