Multi-modal emotion recognition method and device

The invention relates to the technical field of emotion recognition, and in particular to a multi-modal emotion recognition method and device. The method comprises the steps of: pre-segmenting long-sequence audio and video information, inputting the audio and video feature codes, and extracting audio and video segment-level feature sequences; concatenating the audio and video segment-level feature sequences and mapping them through a fully connected layer to obtain a segment-level emotion similarity feature sequence; using each element of the segment-level emotion similarity feature sequence as a query and each audio and video segment-level feature sequence as keys and values, and outputting audio and video segment-level emotion-weighted feature sequences through a multi-head attention mechanism; and respectively calculating an audio and video weighted center vector and a center vector of the emotion similarity information by utilizing ...
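The fusion steps named in the abstract (concatenation plus a fully connected mapping into an emotion-similarity sequence, then multi-head attention with the similarity features as queries and each modality's segment features as keys and values) can be sketched roughly as follows. This is a minimal illustration in PyTorch; the class and parameter names (SegmentLevelFusion, audio_dim, video_dim, fused_dim, num_heads) and all dimensions are assumptions, not the patent's actual architecture.

```python
# Minimal sketch of the fusion steps named in the abstract. All names and
# dimensions are illustrative assumptions, not the patent's implementation.
import torch
import torch.nn as nn

class SegmentLevelFusion(nn.Module):
    def __init__(self, audio_dim=128, video_dim=256, fused_dim=128, num_heads=4):
        super().__init__()
        # Fully connected layer mapping concatenated audio/video segment
        # features to the segment-level emotion similarity feature sequence.
        self.similarity_fc = nn.Linear(audio_dim + video_dim, fused_dim)
        # Projections into a shared dimension so each modality can serve as
        # keys/values against the similarity queries (an assumed detail).
        self.audio_proj = nn.Linear(audio_dim, fused_dim)
        self.video_proj = nn.Linear(video_dim, fused_dim)
        # One multi-head attention block per modality: queries come from the
        # similarity sequence, keys and values from that modality's features.
        self.audio_attn = nn.MultiheadAttention(fused_dim, num_heads, batch_first=True)
        self.video_attn = nn.MultiheadAttention(fused_dim, num_heads, batch_first=True)

    def forward(self, audio_seq, video_seq):
        # audio_seq: (batch, segments, audio_dim); video_seq: (batch, segments, video_dim)
        similarity_seq = self.similarity_fc(torch.cat([audio_seq, video_seq], dim=-1))
        audio_kv = self.audio_proj(audio_seq)
        video_kv = self.video_proj(video_seq)
        # Audio and video segment-level emotion-weighted feature sequences.
        audio_weighted, _ = self.audio_attn(similarity_seq, audio_kv, audio_kv)
        video_weighted, _ = self.video_attn(similarity_seq, video_kv, video_kv)
        return audio_weighted, video_weighted, similarity_seq

# Example usage on random segment-level features: a batch of 2 clips,
# each pre-segmented into 10 segments.
if __name__ == "__main__":
    model = SegmentLevelFusion()
    audio = torch.randn(2, 10, 128)
    video = torch.randn(2, 10, 256)
    a_w, v_w, s = model(audio, video)
    print(a_w.shape, v_w.shape, s.shape)  # each torch.Size([2, 10, 128])
```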

Bibliographic details
Main authors: CHEN XUEQIN, SHI CHANGWEN
Format: Patent
Language: Chinese; English
Online access: Order full text
creator CHEN XUEQIN ; SHI CHANGWEN
description The invention relates to the technical field of emotion recognition, and in particular to a multi-modal emotion recognition method and device. The method comprises the steps of: pre-segmenting long-sequence audio and video information, inputting the audio and video feature codes, and extracting audio and video segment-level feature sequences; concatenating the audio and video segment-level feature sequences and mapping them through a fully connected layer to obtain a segment-level emotion similarity feature sequence; using each element of the segment-level emotion similarity feature sequence as a query and each audio and video segment-level feature sequence as keys and values, and outputting audio and video segment-level emotion-weighted feature sequences through a multi-head attention mechanism; and respectively calculating an audio and video weighted center vector and a center vector of the emotion similarity information by utilizing ...
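The description breaks off at the center-vector step. A plausible minimal reading is that the emotion-weighted sequences and the similarity sequence are each pooled over the segment axis into one center vector per modality; the sketch below assumes simple mean pooling and reuses the hypothetical outputs of the fusion sketch above. The patent's actual weighting scheme is not disclosed in this record.

```python
# Assumed center-vector computation: mean pooling over the segment axis.
import torch

def segment_center(seq: torch.Tensor) -> torch.Tensor:
    """Collapse a (batch, segments, dim) sequence into a (batch, dim) center vector."""
    return seq.mean(dim=1)

# With the outputs of the fusion sketch above:
#   audio_center = segment_center(audio_weighted)
#   video_center = segment_center(video_weighted)
#   similarity_center = segment_center(similarity_seq)
```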
format Patent
fulltext fulltext_linktorsrc
language chi ; eng
recordid cdi_epo_espacenet_CN117763446A
source esp@cenet
subjects ACOUSTICS
CALCULATING
COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
COMPUTING
COUNTING
ELECTRIC DIGITAL DATA PROCESSING
MUSICAL INSTRUMENTS
PHYSICS
SPEECH ANALYSIS OR SYNTHESIS
SPEECH OR AUDIO CODING OR DECODING
SPEECH OR VOICE PROCESSING
SPEECH RECOGNITION
title Multi-modal emotion recognition method and device
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-05T10%3A49%3A49IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-epo_EVB&rft_val_fmt=info:ofi/fmt:kev:mtx:patent&rft.genre=patent&rft.au=CHEN%20XUEQIN&rft.date=2024-03-26&rft_id=info:doi/&rft_dat=%3Cepo_EVB%3ECN117763446A%3C/epo_EVB%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rfr_iscdi=true