A multimodal screening system for elderly neurological diseases based on deep learning

Bibliographic details
Published in:	Scientific reports 2023-11, Vol.13 (1), p.21013-21013, Article 21013
Main authors:	Park, Sangyoung, No, Changho, Kim, Sora, Han, Kyoungmin, Jung, Jin-Man, Kwon, Kyum-Yil, Lee, Minsik
Format:	Article
Language:	English
Subjects:	639/705/117; 692/617/375/1718; 692/617/375/534; Aged; Algorithms; Decision making; Deep Learning; Humanities and Social Sciences; Humans; Movement disorders; multidisciplinary; Neural networks; Neural Networks, Computer; Neurodegenerative diseases; Neurological diseases; Parkinson Disease - diagnosis; Parkinson's disease; Science; Science (multidisciplinary); Stroke
Online access:	Full text

Description
Abstract:	In this paper, we propose a deep-learning-based algorithm for screening neurological diseases. We designed several examination protocols for this purpose and collected data by video-recording participants performing them. We converted the video data into human landmarks, which capture action information at a much lower data dimension. We also used voice data, which are likewise effective indicators of neurological disorders. We designed a subnetwork for each protocol to extract features from landmarks or voice, and a feature aggregator that combines the information extracted from all protocols to make a final decision. Multitask learning was applied to screen for two neurological diseases. To capture meaningful information from the landmarks and voices, we applied pre-trained models to extract preliminary features: the spatiotemporal characteristics of the landmarks are extracted using a pre-trained graph neural network, and voice features are extracted using a pre-trained time-delay neural network. These high-level features are then passed on to the subnetworks and the feature aggregator, which are trained simultaneously. We also used various data augmentation techniques to compensate for the shortage of data. Using a frame-length staticizer that considers the characteristics of the data, we can capture momentary tremors without wasting information. Finally, we examine the effectiveness of different protocols and modalities (different body parts and voice) through extensive experiments. The proposed method achieves AUC scores of 0.802 for stroke and 0.780 for Parkinson’s disease, indicating that it is effective as a screening system.
ISSN:	2045-2322
DOI:	10.1038/s41598-023-48071-y
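
The abstract describes one subnetwork per protocol, a feature aggregator that combines their outputs, and multitask learning over two diseases. The forward pass of such a late-fusion design might look roughly like the sketch below. Everything specific in it is an assumption made for illustration and is not taken from the article: the class name `ScreeningSketch`, the protocol names, the single dense layer per subnetwork, mean pooling as the aggregator, and the random untrained weights. It stands in for the pre-extracted feature vectors only and does not reproduce the pre-trained graph or time-delay networks.

```python
import math
import random


def linear(x, weights, bias):
    """Dense layer: one output value per weight row."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, bias)]


def relu(x):
    return [max(0.0, v) for v in x]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def init_layer(rng, n_in, n_out):
    """Random (untrained) weights and zero biases, for illustration only."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


class ScreeningSketch:
    """Hypothetical late-fusion model: per-protocol subnetworks, one aggregator, two heads."""

    TASKS = ("stroke", "parkinson")

    def __init__(self, protocol_dims, hidden=4, seed=0):
        rng = random.Random(seed)
        # One subnetwork per protocol (assumed here to be a single dense layer).
        self.subnets = {name: init_layer(rng, dim, hidden)
                        for name, dim in protocol_dims.items()}
        # Multitask output: shared fused features, one sigmoid head per disease.
        self.heads = {task: init_layer(rng, hidden, 1) for task in self.TASKS}

    def forward(self, features):
        # Embed each protocol's feature vector with its own subnetwork.
        embedded = [relu(linear(features[name], *layer))
                    for name, layer in self.subnets.items()]
        # Aggregator: element-wise mean over protocols (an assumption, not the paper's design).
        fused = [sum(column) / len(embedded) for column in zip(*embedded)]
        # Each head maps the fused vector to a screening score in (0, 1).
        return {task: sigmoid(linear(fused, *head)[0])
                for task, head in self.heads.items()}


if __name__ == "__main__":
    # Protocol names and feature sizes are placeholders, not the study's protocols.
    dims = {"gait": 6, "finger_tapping": 6, "voice": 8}
    rng = random.Random(1)
    sample = {name: [rng.random() for _ in range(dim)] for name, dim in dims.items()}
    print(ScreeningSketch(dims).forward(sample))
```

Sharing the fused representation across both heads is what makes the sketch multitask: in a trained version, gradients from the stroke and Parkinson's disease losses would both update the subnetworks and the aggregator.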