Long Dialogue Emotion Detection Based on Commonsense Knowledge Graph Guidance
Dialogue emotion detection is challenging due to human subjectivity and the randomness of dialogue content. In a conversation, each person's emotion typically develops through a cumulative process that can be influenced by many sources of uncertainty. Commonsense knowledge, such as experiential or habitual knowledge, influences people's emotions imperceptibly, and during a conversation it can be used to enrich the semantic information of each utterance and improve the accuracy of emotion recognition. In this paper, we propose a growing graph model for dialogue emotion detection that retrieves commonsense knowledge from the external knowledge atlas ATOMIC from both local and global perspectives; the graph represents the dialogue as a process variable evolving over the sequence while also capturing the correlations among utterances. In particular: 1) we introduce a commonsense knowledge graph that links the knowledge retrieved from ATOMIC, adding auxiliary information that improves the representation of each utterance; 2) we propose a novel self-supervised learning method for extracting the latent topic of each dialogue, together with an optimization mechanism that makes the latent-topic representations (embeddings) more clearly separated for the subsequent operations; 3) finally, a cross-attention module combines the utterance features with the latent conversation-topic information, using the topic information to supplement the utterance representations and improve recognition performance. The model is tested on three popular dialogue emotion detection datasets and empirically outperforms state-of-the-art approaches. To further demonstrate its performance, we also build a long-dialogue dataset in which the average conversation length exceeds 50 utterances; the experimental results on this dataset likewise demonstrate the superiority of our approach.
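The record contains no code, so the following is a minimal, hypothetical PyTorch sketch of the fusion step described in point 3 of the abstract: utterance features attend over latent topic embeddings via cross-attention, and the attended topic context supplements the utterance representations. All module names, dimensions, and the residual/normalization details are illustrative assumptions, not the authors' implementation.

```python
# Hypothetical sketch of the cross-attention fusion described in the abstract:
# utterance features (queries) attend over latent topic embeddings (keys/values),
# and the attended topic context supplements each utterance representation.
# Names and dimensions are illustrative, not taken from the paper.
import torch
import torch.nn as nn

class TopicCrossAttention(nn.Module):
    def __init__(self, dim: int = 256, num_heads: int = 4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)

    def forward(self, utterances: torch.Tensor, topics: torch.Tensor) -> torch.Tensor:
        # utterances: (batch, num_utterances, dim) -- per-utterance features,
        #   e.g. already enriched with retrieved commonsense knowledge
        # topics: (batch, num_topics, dim) -- latent topic embeddings
        topic_context, _ = self.attn(query=utterances, key=topics, value=topics)
        # Residual connection: topic information supplements, rather than
        # replaces, the utterance representation.
        return self.norm(utterances + topic_context)

# Usage example with random tensors standing in for real encoder outputs.
fusion = TopicCrossAttention(dim=256, num_heads=4)
utt = torch.randn(2, 50, 256)   # a long dialogue: 50 utterances per sample
top = torch.randn(2, 8, 256)    # 8 latent topic embeddings per dialogue
fused = fusion(utt, top)        # (2, 50, 256), ready for emotion classification
print(fused.shape)
```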
Saved in:
Published in: | IEEE transactions on multimedia, 2024, Vol. 26, pp. 514-528
---|---
Main authors: | Nie, Weizhi; Bao, Yuru; Zhao, Yue; Liu, Anan
Format: | Article
Language: | English
DOI: | 10.1109/TMM.2023.3267295
ISSN: | 1520-9210; EISSN: 1941-0077
Subjects: | commonsense knowledge graph; Commonsense reasoning; Correlation; Datasets; Design optimization; Emotion detection; Emotion recognition; Emotions; Graphical representations; growing graph; Knowledge; Knowledge representation; Oral communication; Performance enhancement; Process variables; Self-supervised learning; Social networking (online); Speech recognition; topic module; Transformers
Source: | IEEE Electronic Library (IEL)
Online access: | Order full text