A Novel Hybrid Methodology of Measuring Sentence Similarity

Bibliographic Details
Published in:	Symmetry (Basel) 2021-08, Vol.13 (8), p.1442
Main Authors:	Yoo, Yongmin; Heo, Tak-Sung; Park, Yeongjoon; Kim, Kyungsun
Format:	Article
Language:	English
Subjects:	Big Data; Correlation coefficients; Data processing; Deep learning; Language; lexical relationship; Measurement methods; Methodology; Methods; Natural language processing; Neural networks; sentence similarity; Sentences; Similarity; Similarity measures; Words (language)
Online Access:	Full text

Description
Summary:	Measuring sentence similarity is an essential problem in natural language processing, and the similarity between sentences must be measured accurately. Sentence similarity measurement is the task of finding semantic symmetry between two sentences, regardless of word order and the context of the words. There are many approaches to measuring sentence similarity. Deep learning methods achieve state-of-the-art performance in many areas of natural language processing and are widely used for sentence similarity measurement. However, in natural language processing it is also important to consider the structure of the sentence and of the words that make it up. In this study, we propose a methodology that combines a deep learning approach with a method that considers lexical relationships. Our evaluation metrics are the Pearson and Spearman correlation coefficients. The proposed method outperforms current approaches on KorSTS, a standard Korean benchmark dataset, and improves on a deep-learning-only approach by up to 65%. Experiments show that the proposed method generally performs better than a deep learning model alone.
ISSN:	2073-8994
DOI:	10.3390/sym13081442
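
The summary names the Pearson and Spearman correlation coefficients as the evaluation metrics. As a rough illustration of how such metrics compare predicted similarity scores with human-annotated ones, here is a minimal sketch in plain Python. The function names and the sample scores are hypothetical and are not taken from the article; the article's own evaluation code is not shown in this record.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


def average_ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the rank vectors."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical gold and predicted similarity scores (illustrative only).
gold = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
predicted = [0.1, 0.8, 2.5, 2.9, 4.2, 4.9]

print("Pearson: ", pearson(gold, predicted))
print("Spearman:", spearman(gold, predicted))
```

Pearson measures how linearly the predictions track the gold scores, while Spearman depends only on their ordering, so a model that ranks every sentence pair correctly reaches a Spearman value of 1 even if its raw scores are miscalibrated.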