Harnessing model organism genomics to underpin the machine learning-based prediction of essential genes in eukaryotes – Biotechnological implications
Published in: | Biotechnology Advances 2022-01, Vol. 54, p. 107822-107822, Article 107822 |
---|---|
Main authors: | , , , , |
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Full text |
Summary: | The availability of high-quality genomes and advances in functional genomics have enabled large-scale studies of essential genes in model eukaryotes, including the ‘elegant worm’ (Caenorhabditis elegans; Nematoda) and the ‘vinegar fly’ (Drosophila melanogaster; Arthropoda). However, this is not the case for other, much less-studied organisms, such as socioeconomically important parasites, for which functional genomic platforms usually do not exist. Thus, there is a need to develop innovative techniques or approaches for the prediction, identification and investigation of essential genes. A key approach that could enable the prediction of such genes is machine learning (ML). Here, we undertake an historical review of experimental and computational approaches employed for the characterisation of essential genes in eukaryotes, with a particular focus on model ecdysozoans (C. elegans and D. melanogaster), and discuss the possible applicability of ML-approaches to organisms such as socioeconomically important parasites. We highlight some recent results showing that high-performance ML, combined with feature engineering, allows a reliable prediction of essential genes from extensive, publicly available ‘omic data sets, with major potential to prioritise such genes (with statistical confidence) for subsequent functional genomic validation. These findings could ‘open the door’ to fundamental and applied research areas. Evidence of some commonality in the essential gene-complement between these two organisms indicates that an ML-engineering approach could find broader applicability to ecdysozoans such as parasitic nematodes or arthropods, provided that suitably large and informative data sets become/are available for proper feature engineering, and for the robust training and validation of algorithms.
This area warrants detailed exploration to, for example, facilitate the identification and characterisation of essential molecules as novel targets for drugs and vaccines against parasitic diseases. This focus is particularly important, given the substantial impact that such diseases have worldwide, and the current challenges associated with their prevention and control and with drug resistance in parasite populations. |
---|---|
ISSN: | 0734-9750, 1873-1899 |
DOI: | 10.1016/j.biotechadv.2021.107822 |