Focusing on differences! Sample framework enhances semantic textual similarity with external knowledge
Published in: | Expert Systems with Applications, 2024-12, Vol. 255, p. 124462, Article 124462 |
---|---|
Main authors: | , , , , , , |
Format: | Article |
Language: | English |
Online access: | Full text |
Abstract: | Recently, the widespread application of pre-trained language models (PLMs) such as BERT and RoBERTa has significantly enhanced performance on text semantic similarity tasks. However, methods based solely on PLMs inadequately account for the differential information between sentence pairs, thus underestimating the importance of this information in sentence matching. In this paper, we propose the enriching Differential information with External Knowledge framework (DEK), an approach that explicitly extracts differential information and enriches semantics using external knowledge. Specifically, we devise a module for extracting differential words from sentence pairs, obtain synonyms of the differential words from WordNet, and construct a differential information graph. We employ Graph Convolutional Networks (GCNs) to extract features from this graph and subsequently integrate this information into the sentence embeddings. In this work, we demonstrate that incorporating differential information enables PLM-based methods to better focus on the differing aspects of sentences. Moreover, DEK adapts seamlessly to contrastive sentence-embedding models, including SimCSE and PromptBert, among others. Compared to the baselines, our method improves Spearman correlation by 0.22 to 0.64 points, yielding competitive results in the experiments. |
---|---|
ISSN: | 0957-4174; 1873-6793 |
DOI: | 10.1016/j.eswa.2024.124462 |
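The first two stages the abstract describes (extracting differential words from a sentence pair, then linking them to synonyms in a differential information graph) can be illustrated with a minimal sketch. The synonym table below is a hypothetical stand-in for the paper's WordNet lookup, and the GCN feature-extraction and embedding-integration steps are omitted; this is an illustration of the idea, not the authors' implementation.

```python
# Toy sketch of DEK's differential-information extraction and graph construction.
# SYNONYMS is a hypothetical stand-in for WordNet; the paper's GCN step is omitted.

def differential_words(sent_a, sent_b):
    """Words appearing in exactly one of the two sentences (case-insensitive)."""
    tokens_a = set(sent_a.lower().split())
    tokens_b = set(sent_b.lower().split())
    return sorted(tokens_a ^ tokens_b)  # symmetric difference

# Hypothetical synonym table standing in for WordNet lookups.
SYNONYMS = {
    "quick": ["fast", "speedy"],
    "slow": ["sluggish"],
}

def differential_graph(sent_a, sent_b, synonyms=SYNONYMS):
    """Adjacency list linking each differential word to its synonyms,
    i.e. a minimal 'differential information graph'."""
    return {word: synonyms.get(word, [])
            for word in differential_words(sent_a, sent_b)}

if __name__ == "__main__":
    g = differential_graph("a quick brown fox", "a slow brown fox")
    print(g)  # {'quick': ['fast', 'speedy'], 'slow': ['sluggish']}
```

In the paper's pipeline, a graph like this would then be encoded by a GCN and the resulting features fused with the PLM sentence embeddings.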