A KNN Model Based on Manhattan Distance to Identify the SNARE Proteins

Bibliographic details
Published in: IEEE Access 2020, Vol. 8, p. 112922-112931
Main authors: Gao, Xing; Li, Guilin
Format: Article
Language: English
Description
Summary: SNARE proteins, known as membrane fusion proteins, play a primary role in mediating vesicle fusion. Loss of function of a SNARE protein can lead to a variety of diseases, so a method that accurately identifies SNARE proteins is needed. In this paper, we evaluate combinations of sampling methods (resampling, SMOTE, and no sampling), feature extraction approaches (188D, K-skip-2-gram, and CKSAAP), and distance measures (Chebyshev, Euclidean, Manhattan, and Minkowski) to find a suitable model for identifying SNARE proteins. Through extensive experiments, we construct a KNN model based on Manhattan distance that combines CKSAAP feature extraction with no sampling; this combination achieves the best identification performance among all those tested. Finally, we compare our KNN-based model with a deep-learning-based model, SNARE-CNN, on four metrics (SN, SP, ACC, and MCC); the experimental results show that our model outperforms SNARE-CNN.
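Illustration: the pipeline named in the summary (CKSAAP features, a Manhattan-distance KNN classifier, and the SN/SP/ACC/MCC metrics) can be sketched in plain Python as follows. This is a minimal sketch under stated assumptions, not the authors' implementation: the function names, the `max_gap` and `k` defaults, and the toy sequences are hypothetical, and the record does not state the parameter values used in the paper. SN, SP, ACC, and MCC are taken in their usual sense of sensitivity, specificity, accuracy, and Matthews correlation coefficient.

```python
from collections import Counter
from itertools import product
from math import sqrt

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
PAIRS = ["".join(p) for p in product(AMINO_ACIDS, repeat=2)]  # 400 residue pairs


def cksaap(sequence, max_gap=1):
    """Composition of k-spaced amino acid pairs for gaps 0..max_gap.

    max_gap=1 is an illustrative default, not a value from the paper.
    Returns 400 * (max_gap + 1) pair frequencies.
    """
    features = []
    for gap in range(max_gap + 1):
        step = gap + 1
        n_pairs = max(len(sequence) - step, 0)
        counts = Counter(sequence[i] + sequence[i + step] for i in range(n_pairs))
        total = max(n_pairs, 1)  # avoid division by zero on very short input
        features.extend(counts[p] / total for p in PAIRS)
    return features


def manhattan(u, v):
    """Manhattan (L1) distance: sum of absolute coordinate differences."""
    return sum(abs(a - b) for a, b in zip(u, v))


def knn_predict(train_x, train_y, query, k=3):
    """Majority vote among the k training vectors nearest to the query."""
    ranked = sorted(zip(train_x, train_y), key=lambda item: manhattan(item[0], query))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


def binary_metrics(y_true, y_pred):
    """Return (SN, SP, ACC, MCC) for labels 1 (positive) and 0 (negative)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    sn = tp / (tp + fn) if tp + fn else 0.0
    sp = tn / (tn + fp) if tn + fp else 0.0
    acc = (tp + tn) / len(y_true) if y_true else 0.0
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return sn, sp, acc, mcc


if __name__ == "__main__":
    # Invented toy sequences and labels, for demonstration only (not real data).
    train_seqs = ["ACDEACDE", "ACDEACDF", "WYWYWYWY", "WYWYWYWF"]
    train_y = [1, 1, 0, 0]
    train_x = [cksaap(s) for s in train_seqs]
    print(knn_predict(train_x, train_y, cksaap("ACDEACDA"), k=3))
```

Because no sampling step appears in the sketch, it corresponds to the "no sampling" configuration the summary reports as best; a resampling or SMOTE step would be applied to the training vectors before `knn_predict`.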
ISSN: 2169-3536
DOI: 10.1109/ACCESS.2020.3003086