SHEIB-AGM: A Novel Stochastic Approach for Detecting High-Order Epistatic Interactions Using Bioinformation With Automatic Gene Matrix in Genome-Wide Association Studies
Published in: | IEEE Access 2020, Vol. 8, p. 21676-21693 |
---|---|
Main authors: | , , |
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Full text |
Abstract: | Detecting epistatic interactions in genome-wide association study (GWAS) data is of great significance in studying common and complex diseases; however, the ability to detect high-order epistatic interactions in GWAS data is still insufficient. Existing methods are usually used to identify second-order interactions, and they cannot detect a large number of interactions. In this article, we propose a novel stochastic approach named SHEIB-AGM (stochastic approach for detecting high-order epistatic interactions using bioinformation with automatic gene matrix). SHEIB-AGM uses bioinformation to construct a gene matrix. In each iteration, it randomly generates a high-order SNP combination based on the gene matrix. SHEIB-AGM then applies the K2 score (the Bayesian network scoring criterion) and the G-test to detect epistasis in the generated combination, and automatically updates the gene matrix. We compared SHEIB-AGM with six other methods, i.e., DECMDR, SNPHarvester, MACOED, AntEpiSeeker, HS-MMGKG and SEE, on simulated data comprising 108 epistatic models and 17,600 files. The results demonstrate that SHEIB-AGM greatly outperforms these methods in terms of F-measure and power. We also used SHEIB-AGM (with and without bioinformation) to analyze a real GWAS dataset from the Wellcome Trust Case Control Consortium. The results indicate that SHEIB-AGM with bioinformation can detect 33.94 to 3069.40 times more epistatic interactions than without it. We found numerous genes and gene pairs that may play an important role in seven complex diseases. Some of them are also recorded in the Comparative Toxicogenomics Database (CTD). |
---|---|
ISSN: | 2169-3536 |
DOI: | 10.1109/ACCESS.2020.2969465 |
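
The abstract names two scoring criteria, the K2 score and the G-test, that are applied to each generated SNP combination. The sketch below is a minimal illustration of how such scores can be computed from case/control genotype counts; it is not the authors' implementation. The function names (`contingency`, `g_statistic`, `neg_log_k2`) and the toy data are hypothetical, and the negative-log form of K2 (lower means stronger association) is an assumption, since the abstract does not give the exact formula used.

```python
import math
from collections import Counter


def contingency(genotypes, labels):
    """Count samples per (genotype combination, class label) pair."""
    return Counter(zip(map(tuple, genotypes), labels))


def g_statistic(counts):
    """Return the G-test statistic and its degrees of freedom.

    G = 2 * sum(O * ln(O / E)), with E = row_total * col_total / N.
    Cells with zero observations contribute nothing and are simply absent.
    """
    n = sum(counts.values())
    row, col = Counter(), Counter()
    for (combo, label), c in counts.items():
        row[combo] += c
        col[label] += c
    g = 0.0
    for (combo, label), c in counts.items():
        expected = row[combo] * col[label] / n
        g += 2.0 * c * math.log(c / expected)
    dof = (len(row) - 1) * (len(col) - 1)
    return g, dof


def neg_log_k2(counts, n_classes=2):
    """Negative log of the K2 score (assumed form; lower is better).

    For each observed genotype combination i with total r_i and per-class
    counts r_ij: log((r_i + J - 1)!) - log((J - 1)!) - sum_j log(r_ij!).
    """
    row = Counter()
    for (combo, _label), c in counts.items():
        row[combo] += c
    score = 0.0
    for r_i in row.values():
        score += math.lgamma(r_i + n_classes) - math.lgamma(n_classes)
    for c in counts.values():
        score -= math.lgamma(c + 1)
    return score


# Toy data: one SNP pair per sample, label 1 = case, 0 = control.
genotypes = [(0, 0), (0, 0), (1, 2), (1, 2)]
associated = contingency(genotypes, [0, 0, 1, 1])
unrelated = contingency(genotypes, [0, 1, 0, 1])
```

On the toy data, the combination that separates cases from controls gets a larger G statistic and a smaller negative-log K2 value than the unrelated one. A significance decision would additionally require a chi-square tail probability for the G statistic, which is omitted here.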