Short text classification with Soft Knowledgeable Prompt-tuning

Bibliographic Details
Published in: Expert Systems with Applications, 2024-07, Vol. 246, Article 123248
Authors: Zhu, Yi; Wang, Ye; Mu, Jianyuan; Li, Yun; Qiang, Jipeng; Yuan, Yunhao; Wu, Xindong
Format: Article
Language: English
Online access: Full text
Description
Abstract: Over the past few decades, short text classification has emerged as a critical downstream task in natural language processing (NLP). A crucial research issue is how to advance semantic understanding given the short length, feature sparsity, and high ambiguity of short texts. Recently, prompt-tuning has been proposed to insert a template into the input and convert text classification tasks into equivalent cloze-style tasks. However, among most previous approaches, either the hand-crafted template methods are time-consuming and labor-intensive, or the automatic prompt generation methods cannot achieve satisfactory performance. In this paper, we introduce a novel approach called Soft Knowledgeable Prompt-tuning for short text classification. Our method considers both template generation and classification performance in constructing prompts for label prediction. We employ five different strategies to expand the label word space for modifying soft prompts, and the integration of these strategies is used as the final verbalizer. Despite being automatic, experimental results show that our method achieves better performance even than the hand-crafted template methods, outperforming the state of the art by more than 14 accuracy points on four well-known benchmarks.
•Both automatic prompt generation and strong text classification performance are achieved.
•Five different strategies are used to construct the verbalizer effectively and efficiently.
•Comprehensive experiments evaluate the effectiveness of the proposed method.
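To illustrate the cloze-style setup the abstract describes, the following is a minimal sketch (not the authors' actual method): a template turns the short text into a fill-in-the-blank input, a masked language model scores vocabulary words at the [MASK] position, and a verbalizer maps expanded sets of label words to classes. The probabilities, label words, and class names below are hypothetical placeholders.

```python
# Sketch of cloze-style classification with a verbalizer.
# Assumes we already have the masked LM's probability for each vocabulary
# word at the [MASK] position; all values here are illustrative.

def cloze_predict(mask_word_probs, verbalizer):
    """Score each class by averaging the probabilities of its label words,
    then return the highest-scoring class."""
    scores = {}
    for label, words in verbalizer.items():
        scores[label] = sum(mask_word_probs.get(w, 0.0) for w in words) / len(words)
    return max(scores, key=scores.get)

# Template (conceptually): "A [MASK] article: <short text>".
# The MLM's distribution over words at [MASK] (hypothetical values):
mask_word_probs = {"sports": 0.30, "football": 0.25,
                   "politics": 0.05, "election": 0.04}

# Verbalizer: each class is mapped to an expanded set of label words.
verbalizer = {
    "Sports": ["sports", "football"],
    "Politics": ["politics", "election"],
}

print(cloze_predict(mask_word_probs, verbalizer))  # -> Sports
```

A richer verbalizer (as in the paper's five expansion strategies) would draw label words from multiple sources and aggregate them, rather than relying on a single hand-picked word per class.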
ISSN: 0957-4174
eISSN: 1873-6793
DOI: 10.1016/j.eswa.2024.123248