Asymmetric Short-Text Clustering via Prompt

Bibliographic details
Published in:	New Generation Computing 2024, Vol. 42 (4), p. 599-615
Main authors:	Wang, Zhi; Zhu, Yi; Li, Yun; Qiang, Jipeng; Yuan, Yunhao; Zhang, Chaowei
Format:	Article
Language:	English
Subjects:	Artificial Intelligence Building codes Clustering Computer Hardware Computer Science Computer Systems Organization and Communication Networks Labels Learning Representations Software Engineering/Programming and Operating Systems Source code
Online access:	Full text

Description
Abstract:	Short-text clustering, which has attracted much attention with the rapid development of social media in recent decades, is highly challenging because of feature sparsity, high ambiguity, and massive data volume. Recently, methods based on pre-trained language models (PLMs) have achieved fairly good results on this task. However, two main problems remain unresolved: (1) the significant gap between the objective forms used in pre-training and fine-tuning, which prevents methods from taking full advantage of the knowledge in PLMs; and (2) most existing methods require a post-processing operation for clustering label learning, potentially leading to label estimation errors under different data distributions. To address these problems, we propose Asymmetric Short-Text Clustering via Prompt (ASTCP); the features learned with ASTCP are denser and more compact for clustering. Specifically, a subset of the corpus texts is first selected by an asymmetric prompt-tuning network, which aims to obtain predicted labels that serve as clustering centers. Then, by propagating the predicted-label information, a fine-tuned model is designed for representation learning. Finally, a clustering module, such as K-means, is built on top of these representations to directly output clustering labels. Extensive experiments on three datasets demonstrate that ASTCP significantly and consistently outperforms other state-of-the-art (SOTA) clustering methods. The source code is available at https://github.com/zhuyi_yzu/ASTCP .
ISSN:	0288-3635 1882-7055
DOI:	10.1007/s00354-024-00244-7
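
Illustrative sketch (not part of the catalog record and not the authors' implementation): the abstract describes using predicted labels for a subset of texts as clustering centers and then running a clustering module such as K-means on top of learned representations. The Python snippet below is one possible way to picture that last step, under loud assumptions: the bag-of-words `embed` function is a stand-in for PLM representations, the `seed_labels` dictionary is a hypothetical substitute for the output of the prompt-tuning network, and all function names and example texts are invented for illustration.

```python
from collections import Counter
import math


def embed(text, vocab):
    """Stand-in representation: L2-normalised bag-of-words counts (NOT a PLM)."""
    counts = Counter(text.lower().split())
    vec = [float(counts[w]) for w in vocab]
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]


def mean(vectors):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def seeded_kmeans(vectors, seed_labels, k, iters=20):
    """K-means whose initial centroids come from pseudo-labelled seed points.

    seed_labels maps a point index to a hypothetical predicted label in [0, k).
    Every cluster needs at least one seed; iters is assumed to be >= 1.
    """
    centroids = []
    for c in range(k):
        members = [vectors[i] for i, lab in seed_labels.items() if lab == c]
        if not members:
            raise ValueError(f"cluster {c} has no seed point")
        centroids.append(mean(members))

    labels = None
    for _ in range(iters):
        # Assignment step: each point goes to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: sq_dist(v, centroids[c])) for v in vectors
        ]
        if new_labels == labels:  # assignments stopped changing
            break
        labels = new_labels
        # Update step: recompute each centroid from its current members.
        for c in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:
                centroids[c] = mean(members)
    return labels


# Toy corpus (invented for illustration): three travel texts, three programming texts.
texts = [
    "cheap flights to paris",
    "book hotel and flights",
    "paris hotel deals",
    "python list sort error",
    "fix python import error",
    "sort a list in java",
]
# Hypothetical predicted labels for a small subset of the corpus.
seed_labels = {0: 0, 3: 1}

vocab = sorted({w for t in texts for w in t.lower().split()})
vectors = [embed(t, vocab) for t in texts]
labels = seeded_kmeans(vectors, seed_labels, k=2)
print(labels)
```

In this toy setting the seeds only fix the initial centroids; the remaining texts are assigned by ordinary K-means iterations. A real pipeline would replace `embed` with representations from a fine-tuned language model.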