Structured sparsity regularization for analyzing high-dimensional omics data

Abstract The development of new molecular and cell technologies is having a significant impact on the quantity of data generated nowadays. The growth of omics databases is creating a considerable potential for knowledge discovery and, concomitantly, is bringing new challenges to statistical learning...

Ausführliche Beschreibung

Gespeichert in:
Bibliographische Detailangaben
Veröffentlicht in:Briefings in bioinformatics 2021-01, Vol.22 (1), p.77-87
1. Verfasser: Vinga, Susana
Format: Artikel
Sprache:eng
Schlagworte:
Online-Zugang:Volltext bestellen
Tags: Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
Beschreibung
Zusammenfassung:Abstract The development of new molecular and cell technologies is having a significant impact on the quantity of data generated nowadays. The growth of omics databases is creating a considerable potential for knowledge discovery and, concomitantly, is bringing new challenges to statistical learning and computational biology for health applications. Indeed, the high dimensionality of these data may hamper the use of traditional regression methods and parameter estimation algorithms due to the intrinsic non-identifiability of the inherent optimization problem. Regularized optimization has been rising as a promising and useful strategy to solve these ill-posed problems by imposing additional constraints in the solution parameter space. In particular, the field of statistical learning with sparsity has been significantly contributing to building accurate models that also bring interpretability to biological observations and phenomena. Beyond the now-classic elastic net, one of the best-known methods that combine lasso with ridge penalizations, we briefly overview recent literature on structured regularizers and penalty functions that have been applied in biomedical data to build parsimonious models in a variety of underlying contexts, from survival to generalized linear models. These methods include functions of $\ell _k$-norms and network-based penalties that take into account the inherent relationships between the features. The successful application to omics data illustrates the potential of sparse structured regularization for identifying disease’s molecular signatures and for creating high-performance clinical decision support systems towards more personalized healthcare. Supplementary information: Supplementary data are available at Briefings in Bioinformatics online.
ISSN:1467-5463
1477-4054
DOI:10.1093/bib/bbaa122