Principled distillation of UK Biobank phenotype data reveals underlying structure in human variation


Bibliographic Details
Published in: Nature Human Behaviour 2024-08, Vol. 8 (8), p. 1599-1615
Main Authors: Carey, Caitlin E., Shafee, Rebecca, Wedow, Robbee, Elliott, Amanda, Palmer, Duncan S., Compitello, John, Kanai, Masahiro, Abbott, Liam, Schultz, Patrick, Karczewski, Konrad J., Bryant, Samuel C., Cusick, Caroline M., Churchhouse, Claire, Howrigan, Daniel P., King, Daniel, Davey Smith, George, Neale, Benjamin M., Walters, Raymond K., Robinson, Elise B.
Format: Article
Language: English
Online Access: Full text
Description
Summary: Data within biobanks capture broad yet detailed indices of human variation, but biobank-wide insights can be difficult to extract due to complexity and scale. Here, using large-scale factor analysis, we distill hundreds of variables (diagnoses, assessments and survey items) into 35 latent constructs, using data from unrelated individuals with predominantly estimated European genetic ancestry in UK Biobank. These factors recapitulate known disease classifications, disentangle elements of socioeconomic status, highlight the relevance of psychiatric constructs to health and improve measurement of pro-health behaviours. We go on to demonstrate the power of this approach to clarify genetic signal, enhance discovery and identify associations between underlying phenotypic structure and health outcomes. In building a deeper understanding of ways in which constructs such as socioeconomic status, trauma, or physical activity are structured in the dataset, we emphasize the importance of considering the interwoven nature of the human phenome when evaluating public health patterns. Carey and colleagues reveal 35 major latent constructs (factors) in the phenotype data of unrelated individuals with predominantly estimated European genetic ancestry from UK Biobank.
ISSN:2397-3374
DOI:10.1038/s41562-024-01909-5