KEMoS: A knowledge-enhanced multi-modal summarizing framework for Chinese online meetings
The demand for “online meetings” and “collaborative office work” keeps surging recently, producing an abundant amount of relevant data. How to provide participants with accurate and fast summarizing service has attracted extensive attention. Existing meeting summarizing models overlook the utilizati...
Gespeichert in:
Veröffentlicht in: | Neural networks 2024-10, Vol.178, p.106417, Article 106417 |
---|---|
Hauptverfasser: | , , , |
Format: | Artikel |
Sprache: | eng |
Schlagworte: | |
Online-Zugang: | Volltext |
Tags: |
Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
|
Zusammenfassung: | The demand for “online meetings” and “collaborative office work” keeps surging recently, producing an abundant amount of relevant data. How to provide participants with accurate and fast summarizing service has attracted extensive attention. Existing meeting summarizing models overlook the utilization of multi-modal information and the information offsetting during summarizing. In this paper, we develop a knowledge-enhanced multi-modal summarizing framework. Firstly, we construct a three-layer multi-modal meeting knowledge graph, including basic, knowledge, and multi-modal layer, to integrate meeting information thoroughly. Then, we raise a topic-based hierarchical clustering approach, which considers information entropy and difference simultaneously, to capture the semantic evolution of meetings. Next, we devise a multi-modal enhanced encoding strategy, including a sentence-level cross-modal encoder, a joint loss function, and a knowledge graph embedding module, to learn the meeting and topic-level presentations. Finally, when generating summaries, we design a topic-enhanced decoding strategy for the Transformer decoder which mitigates semantic offsetting with the aid of topic information. Extensive experiments show that our proposed work consistently outperforms state-of-the-art solutions on the Chinese meeting dataset, where the ROUGE-1, ROUGE-2, and ROUGE-L are 49.98%, 21.03%, and 32.03% respectively. |
---|---|
ISSN: | 0893-6080 1879-2782 1879-2782 |
DOI: | 10.1016/j.neunet.2024.106417 |