Multi-version file comparison method, device and system and storage medium

Main authors: CHEN ZHOU, SHEN YUN, ZHU BIN, CAO LIBIN, LU WEIDONG, ZHANG ZHIHENG, LI QIANG, HE YONGLONG, HUANG SULONG, LU JIAN
Format: Patent
Language: chi; eng
Description
Abstract: The invention discloses a multi-version file comparison method, device, and system, and a storage medium, and relates to the technical field of big data information processing. The method comprises the following steps: acquiring original text data and preprocessing it to obtain preprocessed text data; representing the preprocessed text data as word vectors using a word vector model to obtain word vector text data; processing the word vector text data with a text structure analysis algorithm to extract structured information from the text; calculating the similarity among the multi-version files based on the word vector text data and the structured information; and setting a similarity threshold and judging whether the multi-version files are similar by comparing the calculated similarity against that threshold. According to the multi-version file comparison method provided by the embodiment of the invention, semanti
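
The steps listed in the abstract can be illustrated with a minimal Python sketch. The record does not say which word vector model, structure analysis algorithm, similarity formula, or threshold the patent uses, so everything below is an assumption made for illustration: a hashed bag-of-words vector stands in for the word vector model, numbered headings stand in for the structured information, and the function names, the weight `alpha`, and the default `threshold` are hypothetical.

```python
import hashlib
import math
import re

DIM = 64  # hypothetical vector size, not taken from the patent


def preprocess(text):
    """Preprocessing step: lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def embed(tokens):
    """Stand-in for a word vector model: a hashed bag-of-words vector."""
    vec = [0.0] * DIM
    for tok in tokens:
        idx = int(hashlib.md5(tok.encode("utf-8")).hexdigest(), 16) % DIM
        vec[idx] += 1.0
    return vec


def extract_structure(text):
    """Stand-in for structure analysis: collect the titles of numbered headings."""
    headings = set()
    for line in text.splitlines():
        m = re.match(r"\s*(\d+(?:\.\d+)*)[.)]?\s+(\S.*)", line)
        if m:
            headings.add(m.group(2).strip().lower())
    return headings


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    if na == 0.0 or nb == 0.0:
        return 0.0
    return dot / (na * nb)


def jaccard(a, b):
    """Jaccard overlap of two sets (1.0 when both are empty)."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def similarity(text_a, text_b, alpha=0.7):
    """Combine vector similarity and structural overlap; alpha is an assumed weight."""
    sem = cosine(embed(preprocess(text_a)), embed(preprocess(text_b)))
    struct = jaccard(extract_structure(text_a), extract_structure(text_b))
    return alpha * sem + (1 - alpha) * struct


def are_similar(text_a, text_b, threshold=0.8):
    """Threshold step: two versions count as similar at or above the threshold."""
    return similarity(text_a, text_b) >= threshold
```

Calling `are_similar(version_1, version_2)` on two revisions of a document returns a boolean, and `similarity` exposes the underlying score. A real system would replace `embed` with a trained word vector model and `extract_structure` with a proper document parser.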