Semantic similarity analysis method based on text clustering

Bibliographic details
Main authors: SI PENGJU, LI XIN, GONG FAMING, MA YUHUI, TANG YURUN
Format: Patent
Language: chi; eng
Description
Abstract: The invention discloses a semantic similarity analysis method based on text clustering. The method comprises the following steps: taking unprocessed text data as input; performing word frequency statistics on the preprocessed texts, adding the word frequency statistics to text clustering as prior knowledge, proposing a posterior judgment criterion, and performing unsupervised clustering with the word frequency statistics serving as a classifier, so as to improve the accuracy and timeliness of the text clustering result; performing synonym disambiguation on the processed text and carrying out semantic role labeling; and generating a semantic vector fused with the context features, processing the text sequences with two LSTMs of identical structure and parameters, adding the product and variance of their outputs to amplify the similarities and differences between the texts, and calculating the final result of the similarity analysis. The method c
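
The final step of the abstract (two LSTMs sharing structure and parameters, followed by a product and "variance" of their outputs) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the patent: the class TinyLSTM, the functions fuse and similarity, the random untrained weights, the reading of "variance" as the element-wise squared difference, the choice to concatenate the two feature groups rather than add them, and the fixed +1/-1 readout that turns them into a score.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


class TinyLSTM:
    """Hypothetical single-layer LSTM encoder; returns the final hidden state."""

    def __init__(self, input_dim, hidden_dim, seed=0):
        rng = random.Random(seed)
        self.hidden_dim = hidden_dim
        cols = input_dim + hidden_dim
        # One weight matrix and bias per gate: input (i), forget (f),
        # output (o) and candidate (g). Weights are random and untrained.
        self.W = {g: [[rng.uniform(-0.5, 0.5) for _ in range(cols)]
                      for _ in range(hidden_dim)] for g in "ifog"}
        self.b = {g: [0.0] * hidden_dim for g in "ifog"}

    def _gate(self, name, z, activation):
        return [activation(sum(w * v for w, v in zip(row, z)) + bias)
                for row, bias in zip(self.W[name], self.b[name])]

    def encode(self, sequence):
        h = [0.0] * self.hidden_dim
        c = [0.0] * self.hidden_dim
        for x in sequence:
            z = list(x) + h  # concatenate current input and previous hidden state
            i = self._gate("i", z, sigmoid)
            f = self._gate("f", z, sigmoid)
            o = self._gate("o", z, sigmoid)
            g = self._gate("g", z, math.tanh)
            c = [fj * cj + ij * gj for fj, cj, ij, gj in zip(f, c, i, g)]
            h = [oj * math.tanh(cj) for oj, cj in zip(o, c)]
        return h


def fuse(h1, h2):
    """Element-wise product (shared traits) followed by squared difference."""
    product = [a * b for a, b in zip(h1, h2)]
    squared_diff = [(a - b) ** 2 for a, b in zip(h1, h2)]
    return product + squared_diff  # list concatenation, not addition


def similarity(encoder, seq_a, seq_b):
    # The same encoder object handles both texts, so the two branches
    # have identical structure and parameters.
    h1 = encoder.encode(seq_a)
    h2 = encoder.encode(seq_b)
    features = fuse(h1, h2)
    n = len(h1)
    # Assumed readout: reward agreement, penalise disagreement.
    return sigmoid(sum(features[:n]) - sum(features[n:]))


# Usage with made-up 2-dimensional token vectors.
encoder = TinyLSTM(input_dim=2, hidden_dim=3)
text_a = [[0.1, 0.3], [0.5, -0.2]]
text_b = [[0.4, -0.1], [0.0, 0.9], [0.2, 0.2]]
score = similarity(encoder, text_a, text_b)
```

Passing one encoder to both branches is what makes the arrangement Siamese: identical texts map to identical hidden states, so their squared difference is exactly zero. In a trained system the fixed readout would be replaced by learned weights, and the token vectors would come from the context-fused semantic vectors the abstract describes.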