k-Skip-n-Gram-RF: A Random Forest Based Method for Alzheimer's Disease Protein Identification

Bibliographic Details
Published in:	Frontiers in genetics 2019-02, Vol.10, p.33
Main authors:	Xu, Lei; Liang, Guangmin; Liao, Changrui; Chen, Gin-Den; Chang, Chi-Chang
Format:	Article
Language:	English
Subjects:	Alzheimer's disease; gene coding; Genetics; n-gram model; random forest; sequence information

Description
Abstract:	This paper proposes a machine-learning-based computational method for identifying Alzheimer's disease genes. Most existing machine-learning-based methods predict Alzheimer's disease genes from structural magnetic resonance imaging (MRI). These methods attain acceptable results, but they are expensive and time consuming. We therefore propose a computational method that identifies Alzheimer's disease genes from the sequence information of proteins and classifies the resulting feature vectors with a random forest. In the proposed method, the protein information is extracted as adaptive k-skip-n-gram features. Experimental results show that the proposed method attains an accuracy of 85.5% on the selected UniProt dataset.
ISSN:	1664-8021
DOI:	10.3389/fgene.2019.00033
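
The abstract describes extracting k-skip-n-gram features from protein sequences before classification. The sketch below illustrates one common reading of that idea and is not the authors' implementation: it assumes that the residues of an n-gram are separated by a uniform gap, that gaps from 0 to k are pooled, and that counts are normalized into frequencies over the 20 standard amino acids. The function names, the default values of k and n, and the normalization are illustrative assumptions, and the "adaptive" part of the method is not reproduced.

```python
from collections import Counter
from itertools import product

# The 20 standard amino acids; n-grams containing other letters are ignored (assumption).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def k_skip_n_gram_counts(seq, k, n):
    """Count n-grams whose consecutive residues are separated by a uniform gap of 0..k."""
    counts = Counter()
    for gap in range(k + 1):
        step = gap + 1            # distance between consecutive residues of the n-gram
        span = (n - 1) * step     # distance from the first to the last residue
        for i in range(len(seq) - span):
            counts[seq[i:i + span + 1:step]] += 1
    return counts


def k_skip_n_gram_vector(seq, k=1, n=2):
    """Return a fixed-length frequency vector (20**n entries) for one sequence."""
    counts = k_skip_n_gram_counts(seq, k, n)
    vocab = ["".join(p) for p in product(AMINO_ACIDS, repeat=n)]
    total = sum(counts[g] for g in vocab)
    return [counts[g] / total if total else 0.0 for g in vocab]


# Hypothetical toy sequence, not taken from the UniProt dataset.
vector = k_skip_n_gram_vector("ACDA", k=1, n=2)
print(len(vector))
```

Each sequence maps to a vector of the same length, so the vectors for a labelled set of proteins could be stacked into a matrix and passed to any random forest implementation, for example `RandomForestClassifier` in scikit-learn.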