K1K2NN: A novel multi-label classification approach based on neighbors for predicting COVID-19 drug side effects

Bibliographic details
Published in: Computational Biology and Chemistry, 2024-04, Vol. 110, p. 108066
Main authors: Das, Pranab; Mazumder, Dilwar Hussain
Format: Article
Language: English
Online access: Full text
Description
Abstract: COVID-19, a novel disease, has comparatively few drugs available for its treatment. Side effects (SE) of a COVID-19 drug could cause long-term health issues, so SE prediction is essential in COVID-19 drug development. Efficient models for predicting COVID-19 drug SE are also needed, since most existing research has proposed classifiers that predict SE for diseases other than COVID-19. This work proposes a novel neighbor-based classifier named K1K2 Nearest Neighbors (K1K2NN) to predict the SE of COVID-19 drugs from 17 molecules' descriptors and the chemical 1D structure of the drugs. The model rests on two propositions: chemically similar drugs may be assigned similar SE, and co-occurring SE may be assigned to chemically similar drugs. The K1K2NN model first selects the K1 nearest neighbors of the test drug sample by calculating its similarity with the training drug samples. It then assigns to the test sample the SE labels that receive the majority count among the SE labels of these K1 neighbor drugs, as determined by a voting mechanism. Next, the model calculates the SE-SE similarity from the SE co-occurrence values using the Jaccard similarity measure. Finally, for each SE determined by the K1 neighbor drugs, the model selects the K2 most similar SE neighbors and also assigns these SE to the test drug sample. The proposed K1K2NN model shows promising performance, with a highest accuracy of 97.53% on the chemical 1D drug structure, and outperforms state-of-the-art multi-label classifiers. In addition, the authors demonstrate the successful application of the proposed model to gene expression signature datasets, which helped evaluate its performance and confirm its accuracy and robustness.
ISSN:1476-928X
DOI:10.1016/j.compbiolchem.2024.108066
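
The two-stage procedure described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, and several details are assumptions made only for illustration: drugs are represented as sets of binary feature indices, drug-drug similarity is measured with the Jaccard index (the abstract does not name the drug similarity measure), a label is voted in when more than half of the K1 neighbors carry it, and SE neighbors with zero similarity are skipped. The function names `jaccard` and `k1k2nn_predict` and the toy data are hypothetical.

```python
from typing import List, Set


def jaccard(a: Set, b: Set) -> float:
    """Jaccard similarity |a & b| / |a | b|; defined here as 0.0 if both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def k1k2nn_predict(train_features: List[Set[int]],
                   train_labels: List[Set[str]],
                   test_features: Set[int],
                   k1: int,
                   k2: int) -> Set[str]:
    """Sketch of the two-stage neighbor procedure (assumptions noted in comments)."""
    # Stage 1a: rank training drugs by similarity to the test drug, keep the top k1.
    # Assumption: Jaccard similarity on feature sets; ties broken by index.
    ranked = sorted(range(len(train_features)),
                    key=lambda i: (-jaccard(train_features[i], test_features), i))
    neighbors = ranked[:k1]

    # Stage 1b: majority vote over the neighbors' SE labels.
    # Assumption: a label is kept if more than half of the k1 neighbors have it.
    all_labels = set().union(*train_labels)
    voted = {se for se in all_labels
             if sum(se in train_labels[i] for i in neighbors) > len(neighbors) / 2}

    # Stage 2a: SE-SE Jaccard similarity from co-occurrence in the training set.
    # drugs_with[se] is the set of training drugs annotated with that SE.
    drugs_with = {se: {i for i, labels in enumerate(train_labels) if se in labels}
                  for se in all_labels}

    # Stage 2b: for each voted SE, add its k2 most similar SE neighbors.
    # Assumption: neighbors with zero similarity are ignored.
    predicted = set(voted)
    for se in voted:
        scored = [(jaccard(drugs_with[se], drugs_with[other]), other)
                  for other in all_labels if other != se]
        candidates = sorted(((s, o) for s, o in scored if s > 0),
                            key=lambda pair: (-pair[0], pair[1]))
        predicted.update(o for _, o in candidates[:k2])
    return predicted


# Toy data (hypothetical): feature-index sets and SE label sets for five drugs.
train_features = [{1, 2, 3, 4}, {1, 2, 3, 5}, {1, 2, 6, 7}, {8, 9, 10}, {8, 9, 11}]
train_labels = [{"nausea", "headache"}, {"nausea", "headache"},
                {"nausea", "rash"}, {"dizziness"}, {"dizziness", "rash"}]
test_features = {1, 2, 3, 4, 5}

print(sorted(k1k2nn_predict(train_features, train_labels, test_features, k1=3, k2=2)))
# prints ['headache', 'nausea', 'rash']
```

In this toy run the vote over the three nearest drugs yields "nausea" and "headache"; "rash" is added in the second stage because it co-occurs with "nausea" in the training labels. Setting `k2=0` reduces the sketch to a plain majority-vote multi-label nearest-neighbor classifier.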