CSGDN: Contrastive Signed Graph Diffusion Network for Predicting Crop Gene-phenotype Associations
Main authors: | , , , , , , , |
---|---|
Format: | Article |
Language: | eng |
Subjects: | |
Summary: | Predicting positive and negative associations between genes and phenotypes
helps to elucidate the mechanisms underlying complex traits in organisms. The
transcriptional and regulatory activity of specific genes is adjusted
across different cell types, developmental stages, and physiological
states. Obtaining positive/negative associations between genes and traits
poses two problems: 1) high-throughput DNA/RNA sequencing and phenotyping are
expensive and time-consuming because large sample sizes must be processed;
2) experiments introduce both random and systematic errors, and calculations
or predictions made with software or models may add further
noise. To address these two issues, we propose a Contrastive Signed Graph
Diffusion Network, CSGDN, to learn robust node representations with fewer
training samples to achieve higher link prediction accuracy. CSGDN employs a
signed graph diffusion method to uncover the underlying regulatory associations
between genes and phenotypes. Then, stochastic perturbation strategies are used
to create two views for both the original and the diffusive graphs. Lastly, a
multi-view contrastive learning loss is designed to unify the node
representations learned from the two views, resisting interference and reducing
noise. We conduct experiments to validate the performance of CSGDN on three
crop datasets: Gossypium hirsutum, Brassica napus, and Triticum turgidum. The
results demonstrate that the proposed model outperforms state-of-the-art
methods by up to 9.28% AUC for link sign prediction on the G. hirsutum dataset. |
---|---|
DOI: | 10.48550/arxiv.2410.07511 |
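
The summary mentions stochastic perturbation to create graph views and a contrastive loss that aligns node representations across views. As a rough illustration only, the sketch below shows one common way such pieces are built: random edge dropping on a signed adjacency matrix and a generic InfoNCE-style loss. The function names `perturb_edges` and `info_nce`, the drop probability, and the temperature are assumptions made for this example; the record does not specify the perturbation strategies or the loss actually used in CSGDN.

```python
import numpy as np


def perturb_edges(adj, drop_prob, rng):
    """Create one augmented view by randomly dropping undirected signed edges.

    adj is assumed to be a symmetric matrix with entries in {-1, 0, +1}
    (negative, absent, positive association). Hypothetical helper.
    """
    upper = np.triu(adj, k=1)                      # each undirected edge once
    keep = rng.random(upper.shape) >= drop_prob    # keep with prob 1 - drop_prob
    upper = upper * keep
    return upper + upper.T                         # restore symmetry


def info_nce(z1, z2, temperature=0.5):
    """Generic InfoNCE loss between two views of the same nodes.

    Row i of z1 and row i of z2 are the positive pair; all other rows of z2
    act as negatives. This is a standard formulation, not necessarily the
    loss defined in the paper.
    """
    z1 = z1 / np.linalg.norm(z1, axis=1, keepdims=True)
    z2 = z2 / np.linalg.norm(z2, axis=1, keepdims=True)
    logits = z1 @ z2.T / temperature               # cosine similarity / tau
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return float(-np.mean(np.diag(log_prob)))


# Toy signed graph: +1 = positive association, -1 = negative association.
rng = np.random.default_rng(0)
toy_adj = np.array([[0, 1, -1, 0],
                    [1, 0, 0, -1],
                    [-1, 0, 0, 1],
                    [0, -1, 1, 0]], dtype=float)
view_a = perturb_edges(toy_adj, drop_prob=0.3, rng=rng)
view_b = perturb_edges(toy_adj, drop_prob=0.3, rng=rng)

# Stand-in embeddings: in a real model an encoder would map each view to
# node embeddings; here the adjacency rows plus a self-loop are used as-is.
loss = info_nce(view_a + np.eye(4), view_b + np.eye(4))
```

In a trained model the loss would be minimized with respect to encoder parameters, pulling together the two embeddings of each node while pushing apart embeddings of different nodes.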