New approaches in omics data modelling

Bibliographic details
Main author: Nonell Mazelon, Lara
Format: Dissertation
Language: English
Description
Summary: Technological breakthroughs have made it possible to generate large amounts of so-called omics data. Analysing and integrating this type of data with advanced statistical and bioinformatics methods should improve the management of diseases. The diversity and complexity of omics data have encouraged the development of hundreds of new statistical methods to meet this objective. It is therefore essential to have appropriate methods that accommodate different data distributions and model complex data structures. This thesis presents advances in three directions. First, it studies several methods for assessing non-linear associations, which is relevant when assessing the effect of environmental exposures (i.e. the exposome) on complex diseases. The study is accompanied by the development of the R package nlOmicAssoc. Second, the simplex distribution is proposed for analysing methylome data, since this distribution properly fits the beta values generated in this type of study. An extension to generalized linear models with a simplex response is also proposed. Lastly, an R package, HOmics, has been developed to incorporate a priori biological knowledge into association studies by means of Bayesian hierarchical models. It also implements methods to model the dependence between omics data, enabling data integration.