Conversational assessment using artificial intelligence is as clinically useful as depression scales and preferred by users

Bibliographic details
Published in:	Journal of affective disorders 2024-04, Vol.351, p.489-498
Main authors:	Weisenburger, Rachel L.; Mullarkey, Michael C.; Labrada, Jocelyn; Labrousse, Daniel; Yang, Michelle Y.; MacPherson, Allison Huff; Hsu, Kean J.; Ugail, Hassan; Shumake, Jason; Beevers, Christopher G.
Format:	Article
Language:	English
Subjects:	Adult; Artificial Intelligence; Communication; Depression; Depression - diagnosis; Ethnicity; Humans; Internet; Machine learning; Mental health screening
Online access:	Full text

Description
Summary:	Depression is prevalent, chronic, and burdensome. Due to limited screening access, depression often remains undiagnosed. Artificial intelligence (AI) models based on spoken responses to interview questions may offer an effective, efficient alternative to other screening methods. The primary aim was to use a demographically diverse sample to validate an AI model, previously trained on human-administered interviews, on novel bot-administered interviews, and to check for algorithmic biases related to age, sex, race, and ethnicity. Using the Aiberry app, adults recruited via social media (N = 393) completed a brief bot-administered interview and a depression self-report form. An AI model was used to predict form scores based on interview responses alone. For all meaningful discrepancies between model inference and form score, clinicians performed a masked review to determine which one they preferred. There was strong concurrent validity between the model predictions and raw self-report scores (r = 0.73, MAE = 3.3). Overall, 90% of AI predictions agreed either with self-report or, when the AI contradicted self-report, with clinical expert opinion. There was no differential model performance across age, sex, race, or ethnicity. Limitations include access restrictions (English-speaking ability and access to a smartphone or computer with broadband internet) and potential self-selection of participants more favorably predisposed toward AI technology. The Aiberry model made accurate predictions of depression severity based on remotely collected spoken responses to a bot-administered interview. This study shows promising results for the use of AI as a mental health screening tool on par with self-report measures.
• Depression is a prevalent disorder but goes undiagnosed due to limited screening.
• Artificial intelligence models based on audiovisual features may be an alternative.
• The model predicted depression severity based on brief, bot-administered interviews.
• Clinicians rated discrepant model predictions and self-report as equally plausible.
• The model performed independently of age, sex, race, or ethnicity.
ISSN:	0165-0327 1573-2517
DOI:	10.1016/j.jad.2024.01.212