DEMENTIA: A Hybrid Attention-Based Multimodal and Multi-Task Learning Framework With Expert Knowledge for Alzheimer's Disease Assessment From Speech
The prevalence of Alzheimer's disease (AD) is rising annually, imposing a severe burden on patients and society. Therefore, assisted AD assessment is crucial. The decline in language function and the cognitive impairment it reflects are key external manifestations of AD. Many studies have utilized speech analysis to achieve convenient, non-invasive, and low-cost AD detection. Although state-of-the-art studies achieve high-precision AD detection using multimodal information, they often ignore interactions between modalities and lack explanations for their complex models. To address this, we propose a multi-task learning (MTL) AD assessment model that combines hybrid attention with multimodal representations. The model fuses audio, text, and expert knowledge to fully capture intra- and inter-modal interactions, achieving simultaneous AD detection and cognitive state prediction, along with comprehensive explainability analyses of the model and the individual modalities. Results show that the proposed method is sufficiently sensitive for AD assessment, achieving 89.58% accuracy and 91.67% recall on the classification task and a root mean square error of 4.31 on the regression task, with good generalization performance. Multimodal representations with expert knowledge and MTL both contribute to assessment performance. Explainability analyses indicate that, compared to healthy controls, AD patients exhibit slower speech rates, reduced syntactic complexity, and a greater tendency to use pause fillers and pronouns. Our study thus validates the effectiveness of the proposed method, addressing trust issues in clinical practice for assisted decision-making and further advancing the development of speech as a promising biomarker for early AD screening and cognitive decline monitoring.
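For orientation, the model described in this abstract can be sketched roughly as follows. This is a minimal illustration under assumptions, not the authors' released implementation: the input dimensions, the use of standard `nn.MultiheadAttention` for both the intra-modal (self-) and inter-modal (cross-) attention in place of the paper's hybrid attention, and the mean-pooling fusion are all placeholders chosen for clarity.

```python
import torch
import torch.nn as nn

class HybridAttentionMTL(nn.Module):
    """Sketch: fuse audio, text, and expert-knowledge features with
    intra-/inter-modal attention, then branch into a classification head
    (AD detection) and a regression head (cognitive state prediction)."""

    def __init__(self, d_model=128, n_heads=4):
        super().__init__()
        # Project each modality into a shared space (dims are placeholders).
        self.audio_proj = nn.Linear(768, d_model)   # acoustic embeddings
        self.text_proj = nn.Linear(768, d_model)    # transcript embeddings
        self.expert_proj = nn.Linear(32, d_model)   # handcrafted linguistic features
        self.self_attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.cross_attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.cls_head = nn.Linear(d_model, 2)       # AD vs. healthy control
        self.reg_head = nn.Linear(d_model, 1)       # e.g., a cognitive score

    def forward(self, audio, text, expert):
        a = self.audio_proj(audio)                  # (B, Ta, d)
        t = self.text_proj(text)                    # (B, Tt, d)
        e = self.expert_proj(expert).unsqueeze(1)   # (B, 1, d)
        t, _ = self.self_attn(t, t, t)              # intra-modal interactions
        ctx = torch.cat([a, e], dim=1)              # audio + expert context
        fused, _ = self.cross_attn(t, ctx, ctx)     # inter-modal interactions
        pooled = fused.mean(dim=1)                  # simple pooling
        return self.cls_head(pooled), self.reg_head(pooled).squeeze(-1)

# Multi-task training couples both heads through a joint loss; the paper's
# actual task weighting is not reproduced here.
model = HybridAttentionMTL()
audio, text = torch.randn(4, 50, 768), torch.randn(4, 40, 768)
expert, labels, score = torch.randn(4, 32), torch.randint(0, 2, (4,)), torch.rand(4) * 30
logits, pred = model(audio, text, expert)
loss = nn.functional.cross_entropy(logits, labels) + nn.functional.mse_loss(pred, score)
```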
| Published in: | IEEE journal of biomedical and health informatics 2024-12, p.1-12 |
|---|---|
| Main authors: | Zhang, Zhenglin; Wang, Tengfei; Hu, Zian; Yang, Li-Zhuang; Li, Hai |
| Format: | Article |
| Language: | eng |
| Subjects: | Accuracy; Acoustics; Alzheimer's disease; Artificial intelligence; Bioinformatics; Context modeling; Diseases; expert knowledge; Feature extraction; Hands; hybrid attention; Linguistics; multi-task learning; multimodal fusion; Multitasking; speech analysis |
| Online access: | Order full text |
container_end_page | 12 |
---|---|
container_issue | |
container_start_page | 1 |
container_title | IEEE journal of biomedical and health informatics |
container_volume | |
creator | Zhang, Zhenglin; Wang, Tengfei; Hu, Zian; Yang, Li-Zhuang; Li, Hai |
description | The prevalence of Alzheimer's disease (AD) is rising annually, imposing a severe burden on patients and society. Therefore, assisted AD assessment is crucial. The decline in language function and the cognitive impairment it reflects are key external manifestations of AD. Many studies have utilized speech analysis to achieve convenient, non-invasive, and low-cost AD detection. Although state-of-the-art studies achieve high-precision AD detection using multimodal information, they often ignore interactions between modalities and lack explanations for their complex models. To address this, we propose a multi-task learning (MTL) AD assessment model that combines hybrid attention with multimodal representations. The model fuses audio, text, and expert knowledge to fully capture intra- and inter-modal interactions, achieving simultaneous AD detection and cognitive state prediction, along with comprehensive explainability analyses of the model and the individual modalities. Results show that the proposed method is sufficiently sensitive for AD assessment, achieving 89.58% accuracy and 91.67% recall on the classification task and a root mean square error of 4.31 on the regression task, with good generalization performance. Multimodal representations with expert knowledge and MTL both contribute to assessment performance. Explainability analyses indicate that, compared to healthy controls, AD patients exhibit slower speech rates, reduced syntactic complexity, and a greater tendency to use pause fillers and pronouns. Our study thus validates the effectiveness of the proposed method, addressing trust issues in clinical practice for assisted decision-making and further advancing the development of speech as a promising biomarker for early AD screening and cognitive decline monitoring. |
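The expert-knowledge markers reported in the explainability analysis (speech rate, pause fillers, pronoun use) can be computed directly from a timed transcript. The sketch below is a hypothetical feature extractor, not the paper's feature set; the filler and pronoun inventories are illustrative assumptions.

```python
import re

# Hypothetical inventories; the paper's actual definitions may differ.
PAUSE_FILLERS = {"uh", "um", "er", "eh", "hmm"}
PRONOUNS = {"i", "you", "he", "she", "it", "we", "they", "this", "that"}

def speech_features(transcript: str, duration_sec: float) -> dict:
    """Simple expert-knowledge features from a timed transcript:
    speech rate (words/sec), pause-filler rate, and pronoun ratio."""
    words = re.findall(r"[a-z']+", transcript.lower())
    n = max(len(words), 1)  # guard against empty transcripts
    return {
        "speech_rate": len(words) / duration_sec,
        "filler_rate": sum(w in PAUSE_FILLERS for w in words) / n,
        "pronoun_ratio": sum(w in PRONOUNS for w in words) / n,
    }

# AD speech would be expected to show a lower speech_rate and higher
# filler/pronoun proportions than healthy-control speech.
print(speech_features("um i think uh it is um that thing", duration_sec=8.0))
```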
doi_str_mv | 10.1109/JBHI.2024.3509620 |
format | Article |
fulltext | fulltext_linktorsrc |
identifier | ISSN: 2168-2194 |
ispartof | IEEE journal of biomedical and health informatics, 2024-12, p.1-12 |
issn | 2168-2194 |
language | eng |
recordid | cdi_ieee_primary_10806741 |
source | IEEE Electronic Library (IEL) |
subjects | Accuracy; Acoustics; Alzheimer's disease; Artificial intelligence; Bioinformatics; Context modeling; Diseases; expert knowledge; Feature extraction; Hands; hybrid attention; Linguistics; multi-task learning; multimodal fusion; Multitasking; speech analysis |
title | DEMENTIA: A Hybrid Attention-Based Multimodal and Multi-Task Learning Framework With Expert Knowledge for Alzheimer's Disease Assessment From Speech |