Validation of Machine Learning-Based Assessment of Major Depressive Disorder from Paralinguistic Speech Characteristics in Routine Care

New developments in machine learning-based analysis of speech can be hypothesized to facilitate the long-term monitoring of major depressive disorder (MDD) during and after treatment. To test this hypothesis, we collected 550 speech samples from telephone-based clinical interviews with 267 individua...

Bibliographic details
Published in: Depression and anxiety 2024-04, Vol.2024, p.1-12
Main authors: Bauer, Jonathan F., Gerczuk, Maurice, Schindler-Gmelch, Lena, Amiriparian, Shahin, Ebert, David Daniel, Krajewski, Jarek, Schuller, Björn, Berking, Matthias
Format: Article
Language: eng
Online access: Full text
container_end_page 12
container_issue
container_start_page 1
container_title Depression and anxiety
container_volume 2024
creator Bauer, Jonathan F.
Gerczuk, Maurice
Schindler-Gmelch, Lena
Amiriparian, Shahin
Ebert, David Daniel
Krajewski, Jarek
Schuller, Björn
Berking, Matthias
description New developments in machine learning-based analysis of speech can be hypothesized to facilitate the long-term monitoring of major depressive disorder (MDD) during and after treatment. To test this hypothesis, we collected 550 speech samples from telephone-based clinical interviews with 267 individuals in routine care. With this data, we trained and evaluated a machine learning system to identify the absence/presence of a MDD diagnosis (as assessed with the Structured Clinical Interview for DSM-IV) from paralinguistic speech characteristics. Our system classified diagnostic status of MDD with an accuracy of 66% (sensitivity: 70%, specificity: 62%). Permutation tests indicated that the machine learning system classified MDD significantly better than chance. However, deriving diagnoses from cut-off scores of common depression scales was superior to the machine learning system with an accuracy of 73% for the Hamilton Rating Scale for Depression (HRSD), 74% for the Quick Inventory of Depressive Symptomatology–Clinician version (QIDS-C), and 73% for the depression module of the Patient Health Questionnaire (PHQ-9). Moreover, training a machine learning system that incorporated both speech analysis and depression scales resulted in accuracies between 73 and 76%. Thus, while findings of the present study demonstrate that automated speech analysis shows the potential of identifying patterns of depressed speech, it does not substantially improve the validity of classifications from common depression scales. In conclusion, speech analysis may not yet be able to replace common depression scales in clinical practice, since it cannot yet provide the necessary accuracy in depression detection. This trial is registered with DRKS00023670.
doi_str_mv 10.1155/2024/9667377
format Article
fullrecord <record><control><sourceid>crossref_hinda</sourceid><recordid>TN_cdi_crossref_primary_10_1155_2024_9667377</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>10_1155_2024_9667377</sourcerecordid><originalsourceid>FETCH-LOGICAL-c266t-ee306044681bad29a4877ff8854162332c052da2f256ed39cc65f136853b93983</originalsourceid><addsrcrecordid>eNp9kMlOwzAURS0EEqWw4wO8h1APsWMvS8okFYGYtpHrPLeu2riyUxBfwG-T0K5Zvauro_ukg9A5JVeUCjFihOUjLWXBi-IADahgJJNc54ddJppmOZP6GJ2ktCSEKK3IAP18mJWvTetDg4PDj8YufAN4CiY2vpln1yZBjccpQUpraNodtAwRT2ATu9J_Ap74FGINEbsY1vjZxG6zmW99ar3FrxsAu8DloqttC_GvTdg3-CVs2_5ZaSKcoiNnVgnO9neI3m9v3sr7bPp091COp5llUrYZACeS5LlUdGZqpk2uisI5pUROJeOcWSJYbZhjQkLNtbVSOMqlEnymuVZ8iC53uzaGlCK4ahP92sTvipKql1j1Equ9xA6_2OGdldp8-f_pX7rhcsk</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype></control><display><type>article</type><title>Validation of Machine Learning-Based Assessment of Major Depressive Disorder from Paralinguistic Speech Characteristics in Routine Care</title><source>Access via Wiley Online Library</source><source>Wiley Online Library (Open Access Collection)</source><source>Alma/SFX Local Collection</source><creator>Bauer, Jonathan F. ; Gerczuk, Maurice ; Schindler-Gmelch, Lena ; Amiriparian, Shahin ; Ebert, David Daniel ; Krajewski, Jarek ; Schuller, Björn ; Berking, Matthias</creator><contributor>Landi, Giulia</contributor><creatorcontrib>Bauer, Jonathan F. ; Gerczuk, Maurice ; Schindler-Gmelch, Lena ; Amiriparian, Shahin ; Ebert, David Daniel ; Krajewski, Jarek ; Schuller, Björn ; Berking, Matthias ; Landi, Giulia</creatorcontrib><description>New developments in machine learning-based analysis of speech can be hypothesized to facilitate the long-term monitoring of major depressive disorder (MDD) during and after treatment. To test this hypothesis, we collected 550 speech samples from telephone-based clinical interviews with 267 individuals in routine care. 
With this data, we trained and evaluated a machine learning system to identify the absence/presence of a MDD diagnosis (as assessed with the Structured Clinical Interview for DSM-IV) from paralinguistic speech characteristics. Our system classified diagnostic status of MDD with an accuracy of 66% (sensitivity: 70%, specificity: 62%). Permutation tests indicated that the machine learning system classified MDD significantly better than chance. However, deriving diagnoses from cut-off scores of common depression scales was superior to the machine learning system with an accuracy of 73% for the Hamilton Rating Scale for Depression (HRSD), 74% for the Quick Inventory of Depressive Symptomatology–Clinician version (QIDS-C), and 73% for the depression module of the Patient Health Questionnaire (PHQ-9). Moreover, training a machine learning system that incorporated both speech analysis and depression scales resulted in accuracies between 73 and 76%. Thus, while findings of the present study demonstrate that automated speech analysis shows the potential of identifying patterns of depressed speech, it does not substantially improve the validity of classifications from common depression scales. In conclusion, speech analysis may not yet be able to replace common depression scales in clinical practice, since it cannot yet provide the necessary accuracy in depression detection. This trial is registered with DRKS00023670.</description><identifier>ISSN: 1091-4269</identifier><identifier>EISSN: 1520-6394</identifier><identifier>DOI: 10.1155/2024/9667377</identifier><language>eng</language><publisher>Hindawi</publisher><ispartof>Depression and anxiety, 2024-04, Vol.2024, p.1-12</ispartof><rights>Copyright © 2024 Jonathan F. 
Bauer et al.</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><cites>FETCH-LOGICAL-c266t-ee306044681bad29a4877ff8854162332c052da2f256ed39cc65f136853b93983</cites><orcidid>0000-0002-8355-1603 ; 0000-0001-8293-6635 ; 0000-0002-6478-8699 ; 0000-0002-1129-8223 ; 0000-0001-6820-0146 ; 0000-0001-8936-7105 ; 0000-0002-1549-6534 ; 0000-0001-5903-4748</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><link.rule.ids>315,782,786,27933,27934</link.rule.ids></links><search><contributor>Landi, Giulia</contributor><creatorcontrib>Bauer, Jonathan F.</creatorcontrib><creatorcontrib>Gerczuk, Maurice</creatorcontrib><creatorcontrib>Schindler-Gmelch, Lena</creatorcontrib><creatorcontrib>Amiriparian, Shahin</creatorcontrib><creatorcontrib>Ebert, David Daniel</creatorcontrib><creatorcontrib>Krajewski, Jarek</creatorcontrib><creatorcontrib>Schuller, Björn</creatorcontrib><creatorcontrib>Berking, Matthias</creatorcontrib><title>Validation of Machine Learning-Based Assessment of Major Depressive Disorder from Paralinguistic Speech Characteristics in Routine Care</title><title>Depression and anxiety</title><description>New developments in machine learning-based analysis of speech can be hypothesized to facilitate the long-term monitoring of major depressive disorder (MDD) during and after treatment. To test this hypothesis, we collected 550 speech samples from telephone-based clinical interviews with 267 individuals in routine care. With this data, we trained and evaluated a machine learning system to identify the absence/presence of a MDD diagnosis (as assessed with the Structured Clinical Interview for DSM-IV) from paralinguistic speech characteristics. Our system classified diagnostic status of MDD with an accuracy of 66% (sensitivity: 70%, specificity: 62%). 
Permutation tests indicated that the machine learning system classified MDD significantly better than chance. However, deriving diagnoses from cut-off scores of common depression scales was superior to the machine learning system with an accuracy of 73% for the Hamilton Rating Scale for Depression (HRSD), 74% for the Quick Inventory of Depressive Symptomatology–Clinician version (QIDS-C), and 73% for the depression module of the Patient Health Questionnaire (PHQ-9). Moreover, training a machine learning system that incorporated both speech analysis and depression scales resulted in accuracies between 73 and 76%. Thus, while findings of the present study demonstrate that automated speech analysis shows the potential of identifying patterns of depressed speech, it does not substantially improve the validity of classifications from common depression scales. In conclusion, speech analysis may not yet be able to replace common depression scales in clinical practice, since it cannot yet provide the necessary accuracy in depression detection. 
This trial is registered with DRKS00023670.</description><issn>1091-4269</issn><issn>1520-6394</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2024</creationdate><recordtype>article</recordtype><sourceid>RHX</sourceid><recordid>eNp9kMlOwzAURS0EEqWw4wO8h1APsWMvS8okFYGYtpHrPLeu2riyUxBfwG-T0K5Zvauro_ukg9A5JVeUCjFihOUjLWXBi-IADahgJJNc54ddJppmOZP6GJ2ktCSEKK3IAP18mJWvTetDg4PDj8YufAN4CiY2vpln1yZBjccpQUpraNodtAwRT2ATu9J_Ap74FGINEbsY1vjZxG6zmW99ar3FrxsAu8DloqttC_GvTdg3-CVs2_5ZaSKcoiNnVgnO9neI3m9v3sr7bPp091COp5llUrYZACeS5LlUdGZqpk2uisI5pUROJeOcWSJYbZhjQkLNtbVSOMqlEnymuVZ8iC53uzaGlCK4ahP92sTvipKql1j1Equ9xA6_2OGdldp8-f_pX7rhcsk</recordid><startdate>20240409</startdate><enddate>20240409</enddate><creator>Bauer, Jonathan F.</creator><creator>Gerczuk, Maurice</creator><creator>Schindler-Gmelch, Lena</creator><creator>Amiriparian, Shahin</creator><creator>Ebert, David Daniel</creator><creator>Krajewski, Jarek</creator><creator>Schuller, Björn</creator><creator>Berking, Matthias</creator><general>Hindawi</general><scope>RHU</scope><scope>RHW</scope><scope>RHX</scope><scope>AAYXX</scope><scope>CITATION</scope><orcidid>https://orcid.org/0000-0002-8355-1603</orcidid><orcidid>https://orcid.org/0000-0001-8293-6635</orcidid><orcidid>https://orcid.org/0000-0002-6478-8699</orcidid><orcidid>https://orcid.org/0000-0002-1129-8223</orcidid><orcidid>https://orcid.org/0000-0001-6820-0146</orcidid><orcidid>https://orcid.org/0000-0001-8936-7105</orcidid><orcidid>https://orcid.org/0000-0002-1549-6534</orcidid><orcidid>https://orcid.org/0000-0001-5903-4748</orcidid></search><sort><creationdate>20240409</creationdate><title>Validation of Machine Learning-Based Assessment of Major Depressive Disorder from Paralinguistic Speech Characteristics in Routine Care</title><author>Bauer, Jonathan F. 
; Gerczuk, Maurice ; Schindler-Gmelch, Lena ; Amiriparian, Shahin ; Ebert, David Daniel ; Krajewski, Jarek ; Schuller, Björn ; Berking, Matthias</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c266t-ee306044681bad29a4877ff8854162332c052da2f256ed39cc65f136853b93983</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2024</creationdate><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Bauer, Jonathan F.</creatorcontrib><creatorcontrib>Gerczuk, Maurice</creatorcontrib><creatorcontrib>Schindler-Gmelch, Lena</creatorcontrib><creatorcontrib>Amiriparian, Shahin</creatorcontrib><creatorcontrib>Ebert, David Daniel</creatorcontrib><creatorcontrib>Krajewski, Jarek</creatorcontrib><creatorcontrib>Schuller, Björn</creatorcontrib><creatorcontrib>Berking, Matthias</creatorcontrib><collection>Hindawi Publishing Complete</collection><collection>Hindawi Publishing Subscription Journals</collection><collection>Hindawi Publishing Open Access Journals</collection><collection>CrossRef</collection><jtitle>Depression and anxiety</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Bauer, Jonathan F.</au><au>Gerczuk, Maurice</au><au>Schindler-Gmelch, Lena</au><au>Amiriparian, Shahin</au><au>Ebert, David Daniel</au><au>Krajewski, Jarek</au><au>Schuller, Björn</au><au>Berking, Matthias</au><au>Landi, Giulia</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Validation of Machine Learning-Based Assessment of Major Depressive Disorder from Paralinguistic Speech Characteristics in Routine Care</atitle><jtitle>Depression and anxiety</jtitle><date>2024-04-09</date><risdate>2024</risdate><volume>2024</volume><spage>1</spage><epage>12</epage><pages>1-12</pages><issn>1091-4269</issn><eissn>1520-6394</eissn><abstract>New developments in machine learning-based analysis of 
speech can be hypothesized to facilitate the long-term monitoring of major depressive disorder (MDD) during and after treatment. To test this hypothesis, we collected 550 speech samples from telephone-based clinical interviews with 267 individuals in routine care. With this data, we trained and evaluated a machine learning system to identify the absence/presence of a MDD diagnosis (as assessed with the Structured Clinical Interview for DSM-IV) from paralinguistic speech characteristics. Our system classified diagnostic status of MDD with an accuracy of 66% (sensitivity: 70%, specificity: 62%). Permutation tests indicated that the machine learning system classified MDD significantly better than chance. However, deriving diagnoses from cut-off scores of common depression scales was superior to the machine learning system with an accuracy of 73% for the Hamilton Rating Scale for Depression (HRSD), 74% for the Quick Inventory of Depressive Symptomatology–Clinician version (QIDS-C), and 73% for the depression module of the Patient Health Questionnaire (PHQ-9). Moreover, training a machine learning system that incorporated both speech analysis and depression scales resulted in accuracies between 73 and 76%. Thus, while findings of the present study demonstrate that automated speech analysis shows the potential of identifying patterns of depressed speech, it does not substantially improve the validity of classifications from common depression scales. In conclusion, speech analysis may not yet be able to replace common depression scales in clinical practice, since it cannot yet provide the necessary accuracy in depression detection. 
This trial is registered with DRKS00023670.</abstract><pub>Hindawi</pub><doi>10.1155/2024/9667377</doi><tpages>12</tpages><orcidid>https://orcid.org/0000-0002-8355-1603</orcidid><orcidid>https://orcid.org/0000-0001-8293-6635</orcidid><orcidid>https://orcid.org/0000-0002-6478-8699</orcidid><orcidid>https://orcid.org/0000-0002-1129-8223</orcidid><orcidid>https://orcid.org/0000-0001-6820-0146</orcidid><orcidid>https://orcid.org/0000-0001-8936-7105</orcidid><orcidid>https://orcid.org/0000-0002-1549-6534</orcidid><orcidid>https://orcid.org/0000-0001-5903-4748</orcidid><oa>free_for_read</oa></addata></record>
fulltext fulltext
identifier ISSN: 1091-4269
ispartof Depression and anxiety, 2024-04, Vol.2024, p.1-12
issn 1091-4269
1520-6394
language eng
recordid cdi_crossref_primary_10_1155_2024_9667377
source Access via Wiley Online Library; Wiley Online Library (Open Access Collection); Alma/SFX Local Collection
title Validation of Machine Learning-Based Assessment of Major Depressive Disorder from Paralinguistic Speech Characteristics in Routine Care
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-02T01%3A51%3A25IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-crossref_hinda&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Validation%20of%20Machine%20Learning-Based%20Assessment%20of%20Major%20Depressive%20Disorder%20from%20Paralinguistic%20Speech%20Characteristics%20in%20Routine%20Care&rft.jtitle=Depression%20and%20anxiety&rft.au=Bauer,%20Jonathan%20F.&rft.date=2024-04-09&rft.volume=2024&rft.spage=1&rft.epage=12&rft.pages=1-12&rft.issn=1091-4269&rft.eissn=1520-6394&rft_id=info:doi/10.1155/2024/9667377&rft_dat=%3Ccrossref_hinda%3E10_1155_2024_9667377%3C/crossref_hinda%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rfr_iscdi=true