DMSeqNet-mBART: A state-of-the-art Adaptive-DropMessage enhanced mBART architecture for superior Chinese short news text summarization


Detailed Description

Saved in:
Bibliographic Details
Published in: Expert Systems with Applications 2024-12, Vol. 257, p. 125095, Article 125095
Main authors: Cao, Kangjie, Cheng, Weijun, Hao, Yiya, Gan, Yichao, Gao, Ruihuan, Zhu, Junxu, Wu, Jinyao
Format: Article
Language: English
Subjects:
Online access: Full text
Description
Abstract: Mandarin Chinese, a globally prevalent language, boasts an abundance of regularly refreshed short news texts accessible online. Consequently, devising concise summaries of these texts has emerged as a pivotal challenge for enhancing information dissemination and comprehension efficiency. To tackle this issue, we introduce DMSeqNet-mBART, an innovative model grounded in the mBART framework, positioning it as a state-of-the-art solution for Chinese short news summarization. DMSeqNet-mBART incorporates the Adaptive-DropMessage technique, an innovative methodology that intelligently discards or retains information contingent upon the attention mechanism's output. Furthermore, the model integrates several enhanced technologies, such as dynamic convolutional layers, gated residual connections, customized feed-forward networks enhanced with batch normalization, self-attention, and cross-attention, all aimed at bolstering the performance and robustness of Chinese short news summarization. Rigorous comparative experiments, conducted across six recognized Chinese short news summary datasets, demonstrate that DMSeqNet-mBART significantly surpasses industry-leading models like T5, MLC, PLCC, and GPT-4 in terms of fluency, completeness, robustness, and accuracy. These results, as validated by benchmarks including BERTScore, BLEU, and ROUGE metrics, underscore the model's superiority across diverse evaluation standards.
Highlights:
• Introduces DMSeqNet-mBART for Chinese text summarization.
• Integrates Adaptive-DropMessage, enhancing key information extraction.
• Achieves state-of-the-art results on six key Chinese news summarization datasets.
• Sheds light on complex Chinese text processing challenges.
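The abstract describes Adaptive-DropMessage as discarding or retaining information based on the attention mechanism's output. The idea can be illustrated with a minimal sketch; note that the function name, the linear drop-rate schedule, and the inverted-dropout rescaling below are assumptions for illustration, not the paper's actual formulation:

```python
import numpy as np

def adaptive_drop(messages, attention_scores, base_rate=0.1):
    """Illustrative sketch of attention-conditioned dropping (hypothetical):
    each message row is dropped with a probability that shrinks as its
    attention score grows, so highly attended information is retained.

    messages:         array of shape (n, d), one message vector per row
    attention_scores: array of shape (n,), one score per message
    base_rate:        maximum drop probability (for the least-attended message)
    """
    # Min-max normalize attention scores to [0, 1].
    scores = attention_scores - attention_scores.min()
    rng = scores.max()
    if rng > 0:
        scores = scores / rng
    # Higher attention -> lower drop probability (assumed linear schedule).
    drop_prob = base_rate * (1.0 - scores)
    mask = np.random.rand(len(messages)) >= drop_prob
    # Rescale retained rows so the expected magnitude is preserved
    # (inverted-dropout style).
    keep_prob = 1.0 - drop_prob
    return messages * mask[:, None] / keep_prob[:, None]
```

In this sketch the most-attended message has drop probability zero and always survives, while weakly attended messages are thinned out; the paper's adaptive variant presumably learns or tunes this schedule rather than fixing it linearly.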
ISSN:0957-4174
DOI:10.1016/j.eswa.2024.125095