Long Dialogue Emotion Detection Based on Commonsense Knowledge Graph Guidance
Dialogue emotion detection is challenging due to human subjectivity and the randomness of dialogue content. In a conversation, each person's emotion typically develops through a cumulative process that can be influenced by many sources of uncertainty. Commonsense knowledge, such as experiential or habitual knowledge, influences people's emotions imperceptibly, and during a conversation it can be used to enrich the semantic information of each utterance and improve the accuracy of emotion recognition. In this paper, we propose a growing graph model for dialogue emotion detection that retrieves commonsense knowledge from the external knowledge atlas ATOMIC from both local and global perspectives; the graph represents the dialogue as a process variable evolving over the sequence while also capturing the correlations among utterances. In particular: 1) we introduce a commonsense knowledge graph that links the knowledge retrieved from ATOMIC, adding auxiliary information that improves the representation of each utterance; 2) we propose a novel self-supervised learning method for extracting the latent topic of each dialogue, together with an optimization mechanism that makes the latent-topic representations (embeddings) more clearly separated for the subsequent operations; 3) finally, a cross-attention module combines the utterance features with the latent conversation-topic information, using the topic information to supplement the utterance representations and improve recognition performance. The model is tested on three popular dialogue emotion detection datasets and empirically outperforms state-of-the-art approaches. To further demonstrate its performance, we also build a long-dialogue dataset in which the average conversation length exceeds 50 utterances; the experimental results on this dataset likewise demonstrate the superiority of our approach.
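The record contains no code, so the following is a minimal, hypothetical PyTorch sketch of the fusion step described in point 3 of the abstract: utterance features attend over latent topic embeddings via cross-attention, and the attended topic context supplements the utterance representations. All module names, dimensions, and the residual/normalization details are illustrative assumptions, not the authors' implementation.

```python
# Hypothetical sketch of the cross-attention fusion described in the abstract:
# utterance features (queries) attend over latent topic embeddings (keys/values),
# and the attended topic context supplements each utterance representation.
# Names and dimensions are illustrative, not taken from the paper.
import torch
import torch.nn as nn

class TopicCrossAttention(nn.Module):
    def __init__(self, dim: int = 256, num_heads: int = 4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)

    def forward(self, utterances: torch.Tensor, topics: torch.Tensor) -> torch.Tensor:
        # utterances: (batch, num_utterances, dim) -- per-utterance features,
        #   e.g. already enriched with retrieved commonsense knowledge
        # topics: (batch, num_topics, dim) -- latent topic embeddings
        topic_context, _ = self.attn(query=utterances, key=topics, value=topics)
        # Residual connection: topic information supplements, rather than
        # replaces, the utterance representation.
        return self.norm(utterances + topic_context)

# Usage example with random tensors standing in for real encoder outputs.
fusion = TopicCrossAttention(dim=256, num_heads=4)
utt = torch.randn(2, 50, 256)   # a long dialogue: 50 utterances per sample
top = torch.randn(2, 8, 256)    # 8 latent topic embeddings per dialogue
fused = fusion(utt, top)        # (2, 50, 256), ready for emotion classification
print(fused.shape)
```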
Saved in:
Published in: | IEEE transactions on multimedia, 2024, Vol. 26, pp. 514-528
---|---
Main authors: | Nie, Weizhi; Bao, Yuru; Zhao, Yue; Liu, Anan
Format: | Article
Language: | English
DOI: | 10.1109/TMM.2023.3267295
ISSN: | 1520-9210; EISSN: 1941-0077
Subjects: | commonsense knowledge graph; Commonsense reasoning; Correlation; Datasets; Design optimization; Emotion detection; Emotion recognition; Emotions; Graphical representations; growing graph; Knowledge; Knowledge representation; Oral communication; Performance enhancement; Process variables; Self-supervised learning; Social networking (online); Speech recognition; topic module; Transformers
Source: | IEEE Electronic Library (IEL)
Online access: | Order full text