Sensible initialization using expert knowledge for genome-wide analysis of epistasis using genetic programming


Bibliographic details
Main authors: Greene, C.S., White, B.C., Moore, J.H.
Format: Conference proceedings
Language: English
Description
Summary: Biomedical researchers can now measure large numbers of DNA sequence variations across the human genome. Measuring hundreds of thousands of variations is routine, but single variations that consistently predict an individual's risk of common human disease have proven elusive. Rather than single variants determining the risk of common human diseases, it seems more likely that disease risk is best modeled by interactions between biological components. The evolutionary computing challenge now is to explore interactions in these large datasets effectively and to identify combinations of variations that are robust predictors of common human diseases such as bladder cancer. One promising approach to this problem is genetic programming (GP). A GP approach to this problem uses Darwinian-inspired evolution to evolve programs that find and model attribute interactions that predict an individual's risk of common human diseases. The goal of this study is to develop and evaluate two initializers for this domain. We develop a probabilistic initializer, which uses expert knowledge to select attributes, and an enumerative initializer, which maximizes attribute diversity in the generated population. We compare these initializers to a random initializer, which displays no preference for attributes. We show that the expert-knowledge-aware probabilistic initializer significantly outperforms both the random initializer and the enumerative initializer. We discuss the implications of these results for the design of GP strategies that are able to detect and characterize predictors of common human diseases.
ISSN: 1089-778X, 1941-0026
DOI: 10.1109/CEC.2009.4983093
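
The three initialization strategies named in the summary can be illustrated with a minimal Python sketch. This is not the authors' implementation. It assumes that each individual is simplified to a flat list of attribute indices (a real GP system would build program trees), and that expert knowledge arrives as one non-negative weight per attribute. The function names, parameters, and weights are all hypothetical; the record does not say how the expert-knowledge scores are obtained.

```python
import random
from itertools import cycle


def random_initializer(n_attributes, pop_size, n_leaves, rng):
    """Choose every attribute uniformly at random (no preference)."""
    return [[rng.randrange(n_attributes) for _ in range(n_leaves)]
            for _ in range(pop_size)]


def probabilistic_initializer(weights, pop_size, n_leaves, rng):
    """Choose attributes with probability proportional to their weight.

    `weights` stands in for expert-knowledge scores (an assumption).
    """
    attributes = range(len(weights))
    return [rng.choices(attributes, weights=weights, k=n_leaves)
            for _ in range(pop_size)]


def enumerative_initializer(n_attributes, pop_size, n_leaves, rng):
    """Deal attributes from a shuffled, repeating list so that each one
    appears about equally often across the population."""
    order = list(range(n_attributes))
    rng.shuffle(order)
    stream = cycle(order)
    return [[next(stream) for _ in range(n_leaves)]
            for _ in range(pop_size)]


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical scores: attributes 2 and 3 are favored by expert knowledge.
    weights = [0.1, 0.1, 2.0, 2.0, 0.1, 0.1]
    print(random_initializer(6, 3, 2, rng))
    print(probabilistic_initializer(weights, 3, 2, rng))
    print(enumerative_initializer(6, 3, 2, rng))
```

In this sketch the enumerative initializer covers every attribute exactly once when the population holds as many leaves as there are attributes, whereas the probabilistic initializer concentrates the population on highly weighted attributes.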