DeepSTARR predicts enhancer activity from DNA sequence and enables the de novo design of synthetic enhancers
Published in: | Nature genetics 2022-05, Vol.54 (5), p.613-624 |
---|---|
Main authors: | , , , |
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Full text |
Summary: | Enhancer sequences control gene expression and comprise binding sites (motifs) for different transcription factors (TFs). Despite extensive genetic and computational studies, the relationship between DNA sequence and regulatory activity is poorly understood, and de novo enhancer design has been challenging. Here, we built a deep-learning model, DeepSTARR, to quantitatively predict the activities of thousands of developmental and housekeeping enhancers directly from DNA sequence in Drosophila melanogaster S2 cells. The model learned relevant TF motifs and higher-order syntax rules, including functionally nonequivalent instances of the same TF motif that are determined by motif-flanking sequence and intermotif distances. We validated these rules experimentally and demonstrated that they can be generalized to humans by testing more than 40,000 wildtype and mutant Drosophila and human enhancers. Finally, we designed and functionally validated synthetic enhancers with desired activities de novo.
A deep-learning model called DeepSTARR quantitatively predicts enhancer activity on the basis of DNA sequence. The model learns relevant motifs and syntax rules, allowing for the design of synthetic enhancers with specific strengths. |
---|---|
ISSN: | 1061-4036 1546-1718 |
DOI: | 10.1038/s41588-022-01048-5 |