Learning to Perturb for Contrastive Learning of Unsupervised Sentence Representations
Saved in:
Published in: IEEE/ACM Transactions on Audio, Speech, and Language Processing, 2023-01, Vol. 31, p. 1-10
Main authors: , , ,
Format: Article
Language: English
Subjects:
Online access: Order full text
Abstract: Recently, contrastive learning has been shown to be effective in fine-tuning pre-trained language models (PLMs) to learn sentence representations; it incorporates perturbations into unlabeled sentences to augment semantically related positive examples for training. However, previous works mostly adopt heuristic perturbation methods that are independent of the sentence representations. Since the perturbations are unaware of the goal or process of sentence representation learning during training, they are likely to yield sub-optimal augmentations for contrastive learning. To address this issue, we propose a new framework, L2P-CSR, that adopts a learnable perturbation strategy for improving contrastive learning of sentence representations. In L2P-CSR, we design a safer perturbation mechanism that only weakens the influence of tokens and features on the sentence representation, which avoids dramatically changing the semantics of the sentence. Besides, we devise a gradient-based algorithm to generate adaptive perturbations tailored to the dynamically updated sentence representations during training. This approach is better able to augment high-quality examples that guide sentence representation learning. Extensive experiments on diverse sentence-related tasks show that our approach outperforms competitive baselines.
ISSN: 2329-9290, 2329-9304
DOI: 10.1109/TASLP.2023.3304485
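The abstract describes the gradient-based, adaptive perturbation idea only at a high level. The following PyTorch sketch is a hypothetical illustration of that general technique, not the paper's exact L2P-CSR mechanism: the names `encode_fn`, `info_nce_loss`, and the hyperparameters `top_k`, `scale`, and `temperature` are assumptions introduced for illustration. The gradient of a standard contrastive (InfoNCE) loss with respect to the token embeddings is used to decide which tokens to down-weight, so that the augmented view weakens, rather than removes, the most influential tokens.

```python
import torch
import torch.nn.functional as F

def info_nce_loss(z1, z2, temperature=0.05):
    """InfoNCE loss over a batch of paired sentence representations."""
    z1 = F.normalize(z1, dim=-1)
    z2 = F.normalize(z2, dim=-1)
    logits = z1 @ z2.t() / temperature                    # (B, B) similarity logits
    labels = torch.arange(z1.size(0), device=z1.device)   # positives on the diagonal
    return F.cross_entropy(logits, labels)

def gradient_guided_weakening(encode_fn, token_embeds, anchor_repr,
                              top_k=2, scale=0.5):
    """Hypothetical sketch: down-weight the tokens whose gradients most
    affect the contrastive loss, producing a "weakened" augmented view.

    encode_fn    : callable mapping token embeddings (B, L, D) to sentence
                   representations (B, D) -- an assumption for illustration
    token_embeds : token embeddings of the input batch, shape (B, L, D)
    anchor_repr  : sentence representations of the unperturbed batch, (B, D)
    """
    token_embeds = token_embeds.clone().detach().requires_grad_(True)
    view_repr = encode_fn(token_embeds)
    loss = info_nce_loss(anchor_repr.detach(), view_repr)

    # Gradient of the contrastive loss w.r.t. the token embeddings indicates
    # which tokens the current representation is most sensitive to.
    (grad,) = torch.autograd.grad(loss, token_embeds)
    sensitivity = grad.norm(dim=-1)                        # (B, L) per-token score

    # Weaken (scale down) the top-k most influential tokens instead of
    # deleting them, so the sentence semantics are not changed drastically.
    topk_idx = sensitivity.topk(top_k, dim=-1).indices     # (B, top_k)
    weights = torch.ones_like(sensitivity)
    weights.scatter_(-1, topk_idx, scale)
    return (token_embeds * weights.unsqueeze(-1)).detach()
```

In use, the perturbed embeddings returned here would be re-encoded to form the positive view in the contrastive objective. The paper's mechanism additionally perturbs feature dimensions and regenerates the perturbations as the representations are updated during training, which this sketch does not reproduce.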