Annotation of uORFs in the OMIM genes allows to reveal pathogenic variants in 5'UTRs

Bibliographic details
Published in: Nucleic acids research 2023-02, Vol.51 (3), p.1229-1244
Main authors: Filatova, Alexandra, Reveguk, Ivan, Piatkova, Maria, Bessonova, Daria, Kuziakova, Olga, Demakova, Victoria, Romanishin, Alexander, Fishman, Veniamin, Imanmalik, Yerzhan, Chekanov, Nikolay, Skitchenko, Rostislav, Barbitoff, Yury, Kardymon, Olga, Skoblov, Mikhail
Format: Article
Language: English
Description
Summary: An increasing number of studies emphasize the role of non-coding variants in the development of hereditary diseases. However, the interpretation of such variants in clinical genetic testing remains a critical challenge because their pathogenicity mechanisms are poorly understood. It was previously shown that variants in 5'-untranslated regions (5'UTRs) can lead to hereditary diseases by disrupting upstream open reading frames (uORFs). Here, we performed a manual annotation of upstream translation initiation sites (TISs) in human disease-associated genes from the OMIM database and identified approximately 4.7 thousand TISs related to uORFs. We compared our TISs with those from previous studies and provided a list of 'high confidence' uORFs. Using a luciferase assay, we experimentally validated the translation of uORFs in the ETFDH, PAX9, MAST1, HTT, TTN, GLI2 and COL2A1 genes, as well as the existence of an N-terminal CDS extension in the ZIC2 gene. In addition, we created a tool to annotate the effects of genetic variants located in uORFs. We identified variants from the HGMD and ClinVar databases that disrupt uORFs and could thereby lead to Mendelian disorders. We also showed that the distribution of uORF-affecting variants differs between pathogenic and population variants. Finally, drawing on manually curated data, we developed a machine-learning algorithm that allows us to predict TISs in other human genes.
ISSN:1362-4962
DOI:10.1093/nar/gkac1247
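
Illustrative note: the uORF concept mentioned in the summary can be made concrete with a minimal Python sketch. This is not the annotation tool or the machine-learning algorithm described in the article. The function name `find_uorfs`, the status labels, and the restriction to ATG start codons are assumptions made for illustration only. The sketch scans a 5'UTR sequence for upstream start codons and reports whether an in-frame stop codon occurs before the end of the 5'UTR.

```python
# Minimal sketch, assuming a DNA 5'UTR string and ATG-only initiation.
# All names here are hypothetical and not taken from the article.
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_uorfs(utr5, start_codon="ATG"):
    """Return (start, end, status) for each upstream start codon.

    Coordinates are 0-based and end-exclusive, relative to the 5'UTR.
    status is "uORF" when an in-frame stop codon lies inside the 5'UTR,
    otherwise "no_stop_in_utr" (the reading frame continues past the
    5'UTR, so the outcome depends on the downstream CDS).
    """
    seq = utr5.upper()
    results = []
    pos = seq.find(start_codon)
    while pos != -1:
        end = None
        # Walk codon by codon downstream of the start codon.
        for i in range(pos + 3, len(seq) - 2, 3):
            if seq[i:i + 3] in STOP_CODONS:
                end = i + 3  # include the stop codon in the interval
                break
        if end is None:
            results.append((pos, len(seq), "no_stop_in_utr"))
        else:
            results.append((pos, end, "uORF"))
        # Continue searching from the next base to catch overlapping starts.
        pos = seq.find(start_codon, pos + 1)
    return results
```

For example, `find_uorfs("GGATGAAATAGCC")` reports one uORF that starts at position 2 and ends after the in-frame TAG stop codon. A variant that removes the ATG or the TAG in such a sequence would change this result, which is the kind of effect the summary refers to as disrupting a uORF.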