SemanticCAP: Chromatin Accessibility Prediction Enhanced by Features Learning from a Language Model
Main authors: | , , , , |
---|---|
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Order full text |
Summary: | A large number of inorganic and organic compounds are able to bind DNA and
form complexes, among which drug-related molecules are important. Changes in
chromatin accessibility not only directly affect drug-DNA interactions, but also
promote or inhibit the expression of critical genes associated with drug
resistance by affecting the DNA-binding capacity of transcription factors (TFs)
and transcriptional regulators. However, biological experimental techniques for
measuring chromatin accessibility are expensive and time-consuming. In recent
years, several kinds of computational methods have been proposed to identify
accessible regions of the genome, but existing computational models mostly
ignore the contextual information of bases in gene sequences. To address these
issues, we proposed a new solution named SemanticCAP. It introduces a gene
language model that models the context of gene sequences and can therefore
provide an effective representation of a given site in a gene sequence. We
merge the features provided by the gene language model into our chromatin
accessibility model. During this process, we designed several methods to make
the feature fusion smoother. Compared with other systems on public benchmarks,
our model showed better performance. |
---|---|
DOI: | 10.48550/arxiv.2204.02130 |