Interpretable semantic textual similarity of sentences using alignment of chunks with classification and regression
The proposed work is focused on establishing an interpretable Semantic Textual Similarity (iSTS) method for a pair of sentences, which can clarify why two sentences are completely or partially similar or have some variations. This proposed interpretable approach is a pipeline of five modules that be...
Gespeichert in:
Veröffentlicht in: | Applied intelligence (Dordrecht, Netherlands) Netherlands), 2021-10, Vol.51 (10), p.7322-7349 |
---|---|
Hauptverfasser: | , , , |
Format: | Artikel |
Sprache: | eng |
Schlagworte: | |
Online-Zugang: | Volltext |
Tags: |
Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
|
Zusammenfassung: | The proposed work is focused on establishing an interpretable Semantic Textual Similarity (iSTS) method for a pair of sentences, which can clarify why two sentences are completely or partially similar or have some variations. This proposed interpretable approach is a pipeline of five modules that begins with the pre-processing and chunking of text. Further chunks of two sentences are aligned using a one–to–multi (1:M) chunk aligner. Thereafter, support vector, Gaussian Naive Bayes and k–Nearest Neighbours classifiers are then used to create a multiclass classification algorithm, and different class labels are used to define an alignment type. At last, a multivariate regression algorithm is developed to find the semantic equivalence of an alignment with a score (that ranges from 0 to 5). The efficiency of the proposed method is verified on three different datasets and also compared to other state–of–the–art interpretable STS (iSTS) methods. The evaluated results show that the proposed method performs better than other iSTS methods. Most importantly, the modules of the proposed iSTS method are used to develop a Textual Entailment (TE) method. It is found that, when we combined chunk level, alignment, and sentence level features the entailment results significantly improves. |
---|---|
ISSN: | 0924-669X 1573-7497 |
DOI: | 10.1007/s10489-020-02144-x |