Multilingual word embeddings for the assessment of narrative speech in mild cognitive impairment

Bibliographic details
Published in: Computer speech & language 2019-01, Vol.53, p.121-139
Main authors: Fraser, Kathleen C., Lundholm Fors, Kristina, Kokkinakis, Dimitrios
Format: Article
Language: English
Description
Summary:
•An analysis of Cookie Theft narratives in English and Swedish is presented.
•Multilingual word embeddings are clustered to generate multilingual topics.
•Features extracted from the topic model help detect mild cognitive impairment.
•Classification accuracy is 63% (English) and 72% (Swedish).
•Multilingual topic models outperform monolingual models in both languages.

We analyze the information content of narrative speech samples from individuals with mild cognitive impairment (MCI), in both English and Swedish, using a combination of supervised and unsupervised learning techniques. We extract information units using topic models trained on word embeddings in monolingual and multilingual spaces, and find that the multilingual approach leads to significantly better classification accuracies than training on the target language alone. In many cases, we find that augmenting the topic model training corpus with additional clinical data from a different language is more effective than training on additional monolingual data from healthy controls. Ultimately, we are able to distinguish MCI speakers from healthy older adults with accuracies of up to 63% (English) and 72% (Swedish) on the basis of information content alone. We also compare our method against previous results measuring information content in Alzheimer’s disease, and report an improvement over other topic-modeling approaches. Furthermore, our results support the hypothesis that subtle differences in language can be detected in narrative speech, even at the very early stages of cognitive decline, when scores on screening tools such as the Mini-Mental State Exam are still in the “normal” range.
ISSN:0885-2308
1095-8363
DOI:10.1016/j.csl.2018.07.005