Task-independent Recognition of Communication Skills in Group Interaction Using Time-series Modeling
Case studies of group discussions are considered an effective way to assess communication skills (CS). This method can help researchers evaluate participants’ engagement with each other in a specific realistic context. In this article, multimodal analysis was performed to estimate CS indices using a...
Published in: | ACM transactions on multimedia computing communications and applications 2021-11, Vol.17 (4), p.1-27 |
---|---|
Main authors: | Mawalim, Candy Olivia ; Okada, Shogo ; Nakano, Yukiko I. |
Format: | Article |
Language: | eng |
Online access: | Full text |
container_end_page | 27 |
---|---|
container_issue | 4 |
container_start_page | 1 |
container_title | ACM transactions on multimedia computing communications and applications |
container_volume | 17 |
creator | Mawalim, Candy Olivia ; Okada, Shogo ; Nakano, Yukiko I. |
description | Case studies of group discussions are considered an effective way to assess communication skills (CS). This method can help researchers evaluate participants’ engagement with each other in a specific realistic context. In this article, multimodal analysis was performed to estimate CS indices using a three-task-type group discussion dataset, the MATRICS corpus. The current research investigated the effectiveness of engaging both static and time-series modeling, especially in task-independent settings. This investigation aimed to understand three main points: first, the effectiveness of time-series modeling compared to nonsequential modeling; second, multimodal analysis in a task-independent setting; and third, important differences to consider when dealing with task-dependent and task-independent settings, specifically in terms of modalities and prediction models. Several modalities were extracted (e.g., acoustics, speaking turns, linguistic-related movement, dialog tags, head motions, and face feature sets) for inferring the CS indices as a regression task. Three predictive models, including support vector regression (SVR), long short-term memory (LSTM), and an enhanced time-series model (an LSTM model with a combination of static and time-series features), were taken into account in this study. Our evaluation was conducted by using the $R^2$ score in a cross-validation scheme. The experimental results suggested that time-series modeling can improve the performance of multimodal analysis significantly in the task-dependent setting (with the best $R^2$ = 0.797 for the total CS index), with word2vec being the most prominent feature. Unfortunately, highly context-related features did not fit well with the task-independent setting. Thus, we propose an enhanced LSTM model for dealing with task-independent settings, and we successfully obtained better performance with the enhanced model than with the conventional SVR and LSTM models (the best $R^2$ = 0.602 for the total CS index). In other words, our study shows that a particular time-series modeling can outperform traditional nonsequential modeling for automatically estimating the CS indices of a participant in a group discussion with regard to task dependency. |
doi_str_mv | 10.1145/3450283 |
format | Article |
fullrecord | <record><control><sourceid>crossref</sourceid><recordid>TN_cdi_crossref_primary_10_1145_3450283</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>10_1145_3450283</sourcerecordid><originalsourceid>FETCH-LOGICAL-c253t-f73662c354a7f8cfe450a39f3b7cad72571b3b3eb1b869481d098eb6afde6ba53</originalsourceid><addsrcrecordid>eNo9UF9LwzAcDKLgnOJXyJtP1aZpkvZRim6DiaDdc8mfX0Zcm5Ske_DbW-fw5e64g4M7hO5J_khIyZ5oyfKiohdoQRgjGa84u_zXTFyjm5S-8pxyVvIFMq1Mh8x5AyPM4Cf8ATrsvZtc8DhY3IRhOHqn5cn4PLi-T9h5vIrhOOKNnyBKfcp2yfk9bt0AWYLoIOG3YKCfzVt0ZWWf4O7MS7R7fWmbdbZ9X22a522mC0anzArKeaEpK6WwlbYwL5G0tlQJLY0omCCKKgqKqIrXZUVMXleguLQGuJKMLtHDX6-OIaUIthujG2T87kje_Z7Tnc-hP9AyWEI</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype></control><display><type>article</type><title>Task-independent Recognition of Communication Skills in Group Interaction Using Time-series Modeling</title><source>ACM Digital Library</source><creator>Mawalim, Candy Olivia ; Okada, Shogo ; Nakano, Yukiko I.</creator><creatorcontrib>Mawalim, Candy Olivia ; Okada, Shogo ; Nakano, Yukiko I.</creatorcontrib><description>Case studies of group discussions are considered an effective way to assess communication skills (CS). This method can help researchers evaluate participants’ engagement with each other in a specific realistic context. In this article, multimodal analysis was performed to estimate CS indices using a three-task-type group discussion dataset, the MATRICS corpus. The current research investigated the effectiveness of engaging both static and time-series modeling, especially in task-independent settings. 
This investigation aimed to understand three main points: first, the effectiveness of time-series modeling compared to nonsequential modeling; second, multimodal analysis in a task-independent setting; and third, important differences to consider when dealing with task-dependent and task-independent settings, specifically in terms of modalities and prediction models. Several modalities were extracted (e.g., acoustics, speaking turns, linguistic-related movement, dialog tags, head motions, and face feature sets) for inferring the CS indices as a regression task. Three predictive models, including support vector regression (SVR), long short-term memory (LSTM), and an enhanced time-series model (an LSTM model with a combination of static and time-series features), were taken into account in this study. Our evaluation was conducted by using the
$R^2$ score in a cross-validation scheme. The experimental results suggested that time-series modeling can improve the performance of multimodal analysis significantly in the task-dependent setting (with the best $R^2$ = 0.797 for the total CS index), with word2vec being the most prominent feature. Unfortunately, highly context-related features did not fit well with the task-independent setting. Thus, we propose an enhanced LSTM model for dealing with task-independent settings, and we successfully obtained better performance with the enhanced model than with the conventional SVR and LSTM models (the best $R^2$
= 0.602 for the total CS index). In other words, our study shows that a particular time-series modeling can outperform traditional nonsequential modeling for automatically estimating the CS indices of a participant in a group discussion with regard to task dependency.</description><identifier>ISSN: 1551-6857</identifier><identifier>EISSN: 1551-6865</identifier><identifier>DOI: 10.1145/3450283</identifier><language>eng</language><ispartof>ACM transactions on multimedia computing communications and applications, 2021-11, Vol.17 (4), p.1-27</ispartof><lds50>peer_reviewed</lds50><woscitedreferencessubscribed>false</woscitedreferencessubscribed><cites>FETCH-LOGICAL-c253t-f73662c354a7f8cfe450a39f3b7cad72571b3b3eb1b869481d098eb6afde6ba53</cites></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><link.rule.ids>314,777,781,27905,27906</link.rule.ids></links><search><creatorcontrib>Mawalim, Candy Olivia</creatorcontrib><creatorcontrib>Okada, Shogo</creatorcontrib><creatorcontrib>Nakano, Yukiko I.</creatorcontrib><title>Task-independent Recognition of Communication Skills in Group Interaction Using Time-series Modeling</title><title>ACM transactions on multimedia computing communications and applications</title><description>Case studies of group discussions are considered an effective way to assess communication skills (CS). This method can help researchers evaluate participants’ engagement with each other in a specific realistic context. In this article, multimodal analysis was performed to estimate CS indices using a three-task-type group discussion dataset, the MATRICS corpus. The current research investigated the effectiveness of engaging both static and time-series modeling, especially in task-independent settings. 
This investigation aimed to understand three main points: first, the effectiveness of time-series modeling compared to nonsequential modeling; second, multimodal analysis in a task-independent setting; and third, important differences to consider when dealing with task-dependent and task-independent settings, specifically in terms of modalities and prediction models. Several modalities were extracted (e.g., acoustics, speaking turns, linguistic-related movement, dialog tags, head motions, and face feature sets) for inferring the CS indices as a regression task. Three predictive models, including support vector regression (SVR), long short-term memory (LSTM), and an enhanced time-series model (an LSTM model with a combination of static and time-series features), were taken into account in this study. Our evaluation was conducted by using the
$R^2$ score in a cross-validation scheme. The experimental results suggested that time-series modeling can improve the performance of multimodal analysis significantly in the task-dependent setting (with the best $R^2$ = 0.797 for the total CS index), with word2vec being the most prominent feature. Unfortunately, highly context-related features did not fit well with the task-independent setting. Thus, we propose an enhanced LSTM model for dealing with task-independent settings, and we successfully obtained better performance with the enhanced model than with the conventional SVR and LSTM models (the best $R^2$
= 0.602 for the total CS index). In other words, our study shows that a particular time-series modeling can outperform traditional nonsequential modeling for automatically estimating the CS indices of a participant in a group discussion with regard to task dependency.</description><issn>1551-6857</issn><issn>1551-6865</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2021</creationdate><recordtype>article</recordtype><recordid>eNo9UF9LwzAcDKLgnOJXyJtP1aZpkvZRim6DiaDdc8mfX0Zcm5Ske_DbW-fw5e64g4M7hO5J_khIyZ5oyfKiohdoQRgjGa84u_zXTFyjm5S-8pxyVvIFMq1Mh8x5AyPM4Cf8ATrsvZtc8DhY3IRhOHqn5cn4PLi-T9h5vIrhOOKNnyBKfcp2yfk9bt0AWYLoIOG3YKCfzVt0ZWWf4O7MS7R7fWmbdbZ9X22a522mC0anzArKeaEpK6WwlbYwL5G0tlQJLY0omCCKKgqKqIrXZUVMXleguLQGuJKMLtHDX6-OIaUIthujG2T87kje_Z7Tnc-hP9AyWEI</recordid><startdate>20211101</startdate><enddate>20211101</enddate><creator>Mawalim, Candy Olivia</creator><creator>Okada, Shogo</creator><creator>Nakano, Yukiko I.</creator><scope>AAYXX</scope><scope>CITATION</scope></search><sort><creationdate>20211101</creationdate><title>Task-independent Recognition of Communication Skills in Group Interaction Using Time-series Modeling</title><author>Mawalim, Candy Olivia ; Okada, Shogo ; Nakano, Yukiko I.</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c253t-f73662c354a7f8cfe450a39f3b7cad72571b3b3eb1b869481d098eb6afde6ba53</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2021</creationdate><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Mawalim, Candy Olivia</creatorcontrib><creatorcontrib>Okada, Shogo</creatorcontrib><creatorcontrib>Nakano, Yukiko I.</creatorcontrib><collection>CrossRef</collection><jtitle>ACM transactions on multimedia computing communications and applications</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Mawalim, Candy 
Olivia</au><au>Okada, Shogo</au><au>Nakano, Yukiko I.</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Task-independent Recognition of Communication Skills in Group Interaction Using Time-series Modeling</atitle><jtitle>ACM transactions on multimedia computing communications and applications</jtitle><date>2021-11-01</date><risdate>2021</risdate><volume>17</volume><issue>4</issue><spage>1</spage><epage>27</epage><pages>1-27</pages><issn>1551-6857</issn><eissn>1551-6865</eissn><abstract>Case studies of group discussions are considered an effective way to assess communication skills (CS). This method can help researchers evaluate participants’ engagement with each other in a specific realistic context. In this article, multimodal analysis was performed to estimate CS indices using a three-task-type group discussion dataset, the MATRICS corpus. The current research investigated the effectiveness of engaging both static and time-series modeling, especially in task-independent settings. This investigation aimed to understand three main points: first, the effectiveness of time-series modeling compared to nonsequential modeling; second, multimodal analysis in a task-independent setting; and third, important differences to consider when dealing with task-dependent and task-independent settings, specifically in terms of modalities and prediction models. Several modalities were extracted (e.g., acoustics, speaking turns, linguistic-related movement, dialog tags, head motions, and face feature sets) for inferring the CS indices as a regression task. Three predictive models, including support vector regression (SVR), long short-term memory (LSTM), and an enhanced time-series model (an LSTM model with a combination of static and time-series features), were taken into account in this study. Our evaluation was conducted by using the
$R^2$ score in a cross-validation scheme. The experimental results suggested that time-series modeling can improve the performance of multimodal analysis significantly in the task-dependent setting (with the best $R^2$ = 0.797 for the total CS index), with word2vec being the most prominent feature. Unfortunately, highly context-related features did not fit well with the task-independent setting. Thus, we propose an enhanced LSTM model for dealing with task-independent settings, and we successfully obtained better performance with the enhanced model than with the conventional SVR and LSTM models (the best $R^2$
= 0.602 for the total CS index). In other words, our study shows that a particular time-series modeling can outperform traditional nonsequential modeling for automatically estimating the CS indices of a participant in a group discussion with regard to task dependency.</abstract><doi>10.1145/3450283</doi><tpages>27</tpages></addata></record> |
fulltext | fulltext |
identifier | ISSN: 1551-6857 |
ispartof | ACM transactions on multimedia computing communications and applications, 2021-11, Vol.17 (4), p.1-27 |
issn | 1551-6857 1551-6865 |
language | eng |
recordid | cdi_crossref_primary_10_1145_3450283 |
source | ACM Digital Library |
title | Task-independent Recognition of Communication Skills in Group Interaction Using Time-series Modeling |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-19T20%3A50%3A34IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-crossref&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Task-independent%20Recognition%20of%20Communication%20Skills%20in%20Group%20Interaction%20Using%20Time-series%20Modeling&rft.jtitle=ACM%20transactions%20on%20multimedia%20computing%20communications%20and%20applications&rft.au=Mawalim,%20Candy%20Olivia&rft.date=2021-11-01&rft.volume=17&rft.issue=4&rft.spage=1&rft.epage=27&rft.pages=1-27&rft.issn=1551-6857&rft.eissn=1551-6865&rft_id=info:doi/10.1145/3450283&rft_dat=%3Ccrossref%3E10_1145_3450283%3C/crossref%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rfr_iscdi=true |
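
The abstract reports results as $R^2$ scores (0.797 and 0.602 for the total CS index). As a reading aid, the sketch below computes the standard coefficient of determination, assuming that is the metric meant; the record itself does not spell out the formula. The function name `r2_score` and the sample values are illustrative and are not taken from the article.

```python
from statistics import fmean


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot.

    Assumption: this is the usual textbook definition; the catalog record
    does not state which variant the authors used.
    """
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("y_true and y_pred must be non-empty and equally long")
    mean_true = fmean(y_true)
    # Residual sum of squares: error left over after prediction.
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    # Total sum of squares: spread of the targets around their mean.
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when all targets are identical")
    return 1.0 - ss_res / ss_tot


if __name__ == "__main__":
    # Hypothetical annotated scores and model predictions, for illustration only.
    annotated = [1.0, 2.0, 3.0, 4.0]
    predicted = [1.0, 2.0, 3.0, 5.0]
    print(r2_score(annotated, predicted))
```

A score of 1 means the predictions match the targets exactly, and a score of 0 means the model does no better than always predicting the mean of the targets.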