Learning to Automate Follow-up Question Generation using Process Knowledge for Depression Triage on Reddit Posts

Conversational Agents (CAs) powered with deep language models (DLMs) have shown tremendous promise in the domain of mental health. Prominently, the CAs have been used to provide informational or therapeutic services to patients. However, the utility of CAs to assist in mental health triaging has not...

Bibliographic details
Published in:	arXiv.org 2022-05
Main authors:	Gupta, Shrey; Agarwal, Anmol; Gaur, Manas; Kaushik, Roy; Narayanan, Vignesh; Kumaraguru, Ponnurangam; Sheth, Amit
Format:	Article
Language:	English
Subjects:	Annotations; Computer Science - Artificial Intelligence; Computer Science - Computation and Language; Datasets; Mental disorders; Mental health; Questions
Online access:	Full text

container_end_page
container_issue
container_start_page
container_title	arXiv.org
container_volume
creator	Gupta, Shrey ; Agarwal, Anmol ; Gaur, Manas ; Kaushik, Roy ; Narayanan, Vignesh ; Kumaraguru, Ponnurangam ; Sheth, Amit
description	Conversational Agents (CAs) powered with deep language models (DLMs) have shown tremendous promise in the domain of mental health. Prominently, the CAs have been used to provide informational or therapeutic services to patients. However, the utility of CAs to assist in mental health triaging has not been explored in the existing work as it requires a controlled generation of follow-up questions (FQs), which are often initiated and guided by the mental health professionals (MHPs) in clinical settings. In the context of depression, our experiments show that DLMs coupled with process knowledge in a mental health questionnaire generate 12.54% and 9.37% better FQs based on similarity and longest common subsequence matches to questions in the PHQ-9 dataset respectively, when compared with DLMs without process knowledge support. Despite coupling with process knowledge, we find that DLMs are still prone to hallucination, i.e., generating redundant, irrelevant, and unsafe FQs. We demonstrate the challenge of using existing datasets to train a DLM for generating FQs that adhere to clinical process knowledge. To address this limitation, we prepared an extended PHQ-9 based dataset, PRIMATE, in collaboration with MHPs. PRIMATE contains annotations regarding whether a particular question in the PHQ-9 dataset has already been answered in the user's initial description of the mental health condition. We used PRIMATE to train a DLM in a supervised setting to identify which of the PHQ-9 questions can be answered directly from the user's post and which ones would require more information from the user. Using performance analysis based on MCC scores, we show that PRIMATE is appropriate for identifying questions in PHQ-9 that could guide generative DLMs towards controlled FQ generation suitable for aiding triaging. Dataset created as a part of this research: https://github.com/primate-mh/Primate2022
doi_str_mv	10.48550/arxiv.2205.13884
format	Article
fullrecord	<record><control><sourceid>proquest_arxiv</sourceid><recordid>TN_cdi_arxiv_primary_2205_13884</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>2671444847</sourcerecordid><originalsourceid>FETCH-LOGICAL-a954-8c74e1e15da18357ada1bb13219318984b45bc2a1d9ee1e074cf190b0b63ca093</originalsourceid><addsrcrecordid>eNotkE1Lw0AQhhdBsNT-AE8ueE7dz2ZzLNVWMWCV3MMmmZQtaTbuJlb_vZvU0_syPAwzD0J3lCyFkpI8avdjvpeMEbmkXClxhWaMcxopwdgNWnh_JISwVcyk5DPUpaBda9oD7i1eD7096R7w1jaNPUdDhz8G8L2xLd5BC05PdfAjv3e2BO_xW2vPDVQHwLV1-Ak6F6YjljmjwzS0T6gq0-O99b2_Rde1bjws_nOOsu1ztnmJ0vfd62adRjqRIlJlLIAClZWmistYhywKyhlNOFWJEoWQRck0rRIIHIlFWdOEFKRY8VKThM_R_WXtpCPvnDlp95uPWvJJSyAeLkTn7Nf4ZX60g2vDTXmwQ4UQSsT8D5AJZn0</addsrcrecordid><sourcetype>Open Access Repository</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2671444847</pqid></control><display><type>article</type><title>Learning to Automate Follow-up Question Generation using Process Knowledge for Depression Triage on Reddit Posts</title><source>arXiv.org</source><source>Free E- Journals</source><creator>Gupta, Shrey ; Agarwal, Anmol ; Gaur, Manas ; Kaushik, Roy ; Narayanan, Vignesh ; Kumaraguru, Ponnurangam ; Sheth, Amit</creator><creatorcontrib>Gupta, Shrey ; Agarwal, Anmol ; Gaur, Manas ; Kaushik, Roy ; Narayanan, Vignesh ; Kumaraguru, Ponnurangam ; Sheth, Amit</creatorcontrib><description>Conversational Agents (CAs) powered with deep language models (DLMs) have shown tremendous promise in the domain of mental health. Prominently, the CAs have been used to provide informational or therapeutic services to patients. However, the utility of CAs to assist in mental health triaging has not been explored in the existing work as it requires a controlled generation of follow-up questions (FQs), which are often initiated and guided by the mental health professionals (MHPs) in clinical settings. 
In the context of depression, our experiments show that DLMs coupled with process knowledge in a mental health questionnaire generate 12.54% and 9.37% better FQs based on similarity and longest common subsequence matches to questions in the PHQ-9 dataset respectively, when compared with DLMs without process knowledge support. Despite coupling with process knowledge, we find that DLMs are still prone to hallucination, i.e., generating redundant, irrelevant, and unsafe FQs. We demonstrate the challenge of using existing datasets to train a DLM for generating FQs that adhere to clinical process knowledge. To address this limitation, we prepared an extended PHQ-9 based dataset, PRIMATE, in collaboration with MHPs. PRIMATE contains annotations regarding whether a particular question in the PHQ-9 dataset has already been answered in the user's initial description of the mental health condition. We used PRIMATE to train a DLM in a supervised setting to identify which of the PHQ-9 questions can be answered directly from the user's post and which ones would require more information from the user. Using performance analysis based on MCC scores, we show that PRIMATE is appropriate for identifying questions in PHQ-9 that could guide generative DLMs towards controlled FQ generation suitable for aiding triaging. Dataset created as a part of this research: https://github.com/primate-mh/Primate2022</description><identifier>EISSN: 2331-8422</identifier><identifier>DOI: 10.48550/arxiv.2205.13884</identifier><language>eng</language><publisher>Ithaca: Cornell University Library, arXiv.org</publisher><subject>Annotations ; Computer Science - Artificial Intelligence ; Computer Science - Computation and Language ; Datasets ; Mental disorders ; Mental health ; Questions</subject><ispartof>arXiv.org, 2022-05</ispartof><rights>2022. This work is published under http://creativecommons.org/licenses/by-sa/4.0/ (the “License”). 
Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.</rights><rights>http://creativecommons.org/licenses/by-sa/4.0</rights><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><link.rule.ids>228,230,776,780,881,27902</link.rule.ids><backlink>$$Uhttps://doi.org/10.18653/v1/2022.clpsych-1.12$$DView published paper (Access to full text may be restricted)$$Hfree_for_read</backlink><backlink>$$Uhttps://doi.org/10.48550/arXiv.2205.13884$$DView paper in arXiv$$Hfree_for_read</backlink></links><search><creatorcontrib>Gupta, Shrey</creatorcontrib><creatorcontrib>Agarwal, Anmol</creatorcontrib><creatorcontrib>Gaur, Manas</creatorcontrib><creatorcontrib>Kaushik, Roy</creatorcontrib><creatorcontrib>Narayanan, Vignesh</creatorcontrib><creatorcontrib>Kumaraguru, Ponnurangam</creatorcontrib><creatorcontrib>Sheth, Amit</creatorcontrib><title>Learning to Automate Follow-up Question Generation using Process Knowledge for Depression Triage on Reddit Posts</title><title>arXiv.org</title><description>Conversational Agents (CAs) powered with deep language models (DLMs) have shown tremendous promise in the domain of mental health. Prominently, the CAs have been used to provide informational or therapeutic services to patients. However, the utility of CAs to assist in mental health triaging has not been explored in the existing work as it requires a controlled generation of follow-up questions (FQs), which are often initiated and guided by the mental health professionals (MHPs) in clinical settings. 
In the context of depression, our experiments show that DLMs coupled with process knowledge in a mental health questionnaire generate 12.54% and 9.37% better FQs based on similarity and longest common subsequence matches to questions in the PHQ-9 dataset respectively, when compared with DLMs without process knowledge support. Despite coupling with process knowledge, we find that DLMs are still prone to hallucination, i.e., generating redundant, irrelevant, and unsafe FQs. We demonstrate the challenge of using existing datasets to train a DLM for generating FQs that adhere to clinical process knowledge. To address this limitation, we prepared an extended PHQ-9 based dataset, PRIMATE, in collaboration with MHPs. PRIMATE contains annotations regarding whether a particular question in the PHQ-9 dataset has already been answered in the user's initial description of the mental health condition. We used PRIMATE to train a DLM in a supervised setting to identify which of the PHQ-9 questions can be answered directly from the user's post and which ones would require more information from the user. Using performance analysis based on MCC scores, we show that PRIMATE is appropriate for identifying questions in PHQ-9 that could guide generative DLMs towards controlled FQ generation suitable for aiding triaging. 
Dataset created as a part of this research: https://github.com/primate-mh/Primate2022</description><subject>Annotations</subject><subject>Computer Science - Artificial Intelligence</subject><subject>Computer Science - Computation and Language</subject><subject>Datasets</subject><subject>Mental disorders</subject><subject>Mental health</subject><subject>Questions</subject><issn>2331-8422</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2022</creationdate><recordtype>article</recordtype><sourceid>BENPR</sourceid><sourceid>GOX</sourceid><recordid>eNotkE1Lw0AQhhdBsNT-AE8ueE7dz2ZzLNVWMWCV3MMmmZQtaTbuJlb_vZvU0_syPAwzD0J3lCyFkpI8avdjvpeMEbmkXClxhWaMcxopwdgNWnh_JISwVcyk5DPUpaBda9oD7i1eD7096R7w1jaNPUdDhz8G8L2xLd5BC05PdfAjv3e2BO_xW2vPDVQHwLV1-Ak6F6YjljmjwzS0T6gq0-O99b2_Rde1bjws_nOOsu1ztnmJ0vfd62adRjqRIlJlLIAClZWmistYhywKyhlNOFWJEoWQRck0rRIIHIlFWdOEFKRY8VKThM_R_WXtpCPvnDlp95uPWvJJSyAeLkTn7Nf4ZX60g2vDTXmwQ4UQSsT8D5AJZn0</recordid><startdate>20220527</startdate><enddate>20220527</enddate><creator>Gupta, Shrey</creator><creator>Agarwal, Anmol</creator><creator>Gaur, Manas</creator><creator>Kaushik, Roy</creator><creator>Narayanan, Vignesh</creator><creator>Kumaraguru, Ponnurangam</creator><creator>Sheth, Amit</creator><general>Cornell University Library, arXiv.org</general><scope>8FE</scope><scope>8FG</scope><scope>ABJCF</scope><scope>ABUWG</scope><scope>AFKRA</scope><scope>AZQEC</scope><scope>BENPR</scope><scope>BGLVJ</scope><scope>CCPQU</scope><scope>DWQXO</scope><scope>HCIFZ</scope><scope>L6V</scope><scope>M7S</scope><scope>PIMPY</scope><scope>PQEST</scope><scope>PQQKQ</scope><scope>PQUKI</scope><scope>PRINS</scope><scope>PTHSS</scope><scope>AKY</scope><scope>GOX</scope></search><sort><creationdate>20220527</creationdate><title>Learning to Automate Follow-up Question Generation using Process Knowledge for Depression Triage on Reddit Posts</title><author>Gupta, Shrey ; Agarwal, Anmol ; Gaur, Manas ; Kaushik, Roy ; Narayanan, Vignesh ; Kumaraguru, 
Ponnurangam ; Sheth, Amit</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-a954-8c74e1e15da18357ada1bb13219318984b45bc2a1d9ee1e074cf190b0b63ca093</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2022</creationdate><topic>Annotations</topic><topic>Computer Science - Artificial Intelligence</topic><topic>Computer Science - Computation and Language</topic><topic>Datasets</topic><topic>Mental disorders</topic><topic>Mental health</topic><topic>Questions</topic><toplevel>online_resources</toplevel><creatorcontrib>Gupta, Shrey</creatorcontrib><creatorcontrib>Agarwal, Anmol</creatorcontrib><creatorcontrib>Gaur, Manas</creatorcontrib><creatorcontrib>Kaushik, Roy</creatorcontrib><creatorcontrib>Narayanan, Vignesh</creatorcontrib><creatorcontrib>Kumaraguru, Ponnurangam</creatorcontrib><creatorcontrib>Sheth, Amit</creatorcontrib><collection>ProQuest SciTech Collection</collection><collection>ProQuest Technology Collection</collection><collection>Materials Science & Engineering Collection</collection><collection>ProQuest Central (Alumni Edition)</collection><collection>ProQuest Central UK/Ireland</collection><collection>ProQuest Central Essentials</collection><collection>ProQuest Central</collection><collection>Technology Collection</collection><collection>ProQuest One Community College</collection><collection>ProQuest Central Korea</collection><collection>SciTech Premium Collection</collection><collection>ProQuest Engineering Collection</collection><collection>Engineering Database</collection><collection>Publicly Available Content Database</collection><collection>ProQuest One Academic Eastern Edition (DO NOT USE)</collection><collection>ProQuest One Academic</collection><collection>ProQuest One Academic UKI Edition</collection><collection>ProQuest Central China</collection><collection>Engineering Collection</collection><collection>arXiv Computer 
Science</collection><collection>arXiv.org</collection><jtitle>arXiv.org</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Gupta, Shrey</au><au>Agarwal, Anmol</au><au>Gaur, Manas</au><au>Kaushik, Roy</au><au>Narayanan, Vignesh</au><au>Kumaraguru, Ponnurangam</au><au>Sheth, Amit</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Learning to Automate Follow-up Question Generation using Process Knowledge for Depression Triage on Reddit Posts</atitle><jtitle>arXiv.org</jtitle><date>2022-05-27</date><risdate>2022</risdate><eissn>2331-8422</eissn><abstract>Conversational Agents (CAs) powered with deep language models (DLMs) have shown tremendous promise in the domain of mental health. Prominently, the CAs have been used to provide informational or therapeutic services to patients. However, the utility of CAs to assist in mental health triaging has not been explored in the existing work as it requires a controlled generation of follow-up questions (FQs), which are often initiated and guided by the mental health professionals (MHPs) in clinical settings. In the context of depression, our experiments show that DLMs coupled with process knowledge in a mental health questionnaire generate 12.54% and 9.37% better FQs based on similarity and longest common subsequence matches to questions in the PHQ-9 dataset respectively, when compared with DLMs without process knowledge support. Despite coupling with process knowledge, we find that DLMs are still prone to hallucination, i.e., generating redundant, irrelevant, and unsafe FQs. We demonstrate the challenge of using existing datasets to train a DLM for generating FQs that adhere to clinical process knowledge. To address this limitation, we prepared an extended PHQ-9 based dataset, PRIMATE, in collaboration with MHPs. 
PRIMATE contains annotations regarding whether a particular question in the PHQ-9 dataset has already been answered in the user's initial description of the mental health condition. We used PRIMATE to train a DLM in a supervised setting to identify which of the PHQ-9 questions can be answered directly from the user's post and which ones would require more information from the user. Using performance analysis based on MCC scores, we show that PRIMATE is appropriate for identifying questions in PHQ-9 that could guide generative DLMs towards controlled FQ generation suitable for aiding triaging. Dataset created as a part of this research: https://github.com/primate-mh/Primate2022</abstract><cop>Ithaca</cop><pub>Cornell University Library, arXiv.org</pub><doi>10.48550/arxiv.2205.13884</doi><oa>free_for_read</oa></addata></record>
fulltext	fulltext
identifier	EISSN: 2331-8422
ispartof	arXiv.org, 2022-05
issn	2331-8422
language	eng
recordid	cdi_arxiv_primary_2205_13884
source	arXiv.org; Free E- Journals
subjects	Annotations ; Computer Science - Artificial Intelligence ; Computer Science - Computation and Language ; Datasets ; Mental disorders ; Mental health ; Questions
title	Learning to Automate Follow-up Question Generation using Process Knowledge for Depression Triage on Reddit Posts
url	https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-06T20%3A46%3A24IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_arxiv&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Learning%20to%20Automate%20Follow-up%20Question%20Generation%20using%20Process%20Knowledge%20for%20Depression%20Triage%20on%20Reddit%20Posts&rft.jtitle=arXiv.org&rft.au=Gupta,%20Shrey&rft.date=2022-05-27&rft.eissn=2331-8422&rft_id=info:doi/10.48550/arxiv.2205.13884&rft_dat=%3Cproquest_arxiv%3E2671444847%3C/proquest_arxiv%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2671444847&rft_id=info:pmid/&rfr_iscdi=true
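
Note on the evaluation metric named in the description: the abstract reports performance "based on MCC scores" for deciding which PHQ-9 questions a post already answers. The following is a minimal sketch of the Matthews correlation coefficient for binary labels, not the authors' evaluation code. The function name `mcc`, the example label vectors, and the convention of returning 0.0 for a zero denominator are assumptions made for illustration only.

```python
import math


def mcc(y_true, y_pred):
    """Matthews correlation coefficient for binary (0/1) labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Assumed convention: report 0.0 when a row or column of the
    # confusion matrix is empty and the ratio is undefined.
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom


# Hypothetical example: one post, nine PHQ-9 questions.
# 1 = "answered in the post", 0 = "needs a follow-up question".
gold = [1, 0, 0, 1, 1, 0, 0, 1, 0]
pred = [1, 0, 1, 1, 0, 0, 0, 1, 0]

print(round(mcc(gold, pred), 2))  # 0.55
```

MCC uses all four cells of the confusion matrix, so it stays informative when "answered" and "not answered" labels are unevenly distributed, which is one plausible reason to prefer it over plain accuracy in this setting.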