Data storage method, system and equipment based on multi-source heterogeneity and storage medium

Bibliographic details
Main authors: XIAO FANG, GAN ZAOBIN, FAN XIN, GUO JIAJING, ZHUO YINGZHONG, LUO MIN, SONG JIAO
Format: Patent
Language: chi; eng
Description
Summary: The invention provides a multi-source heterogeneous data storage method, system, device and storage medium. The method comprises: obtaining a plurality of literature contents according to a plurality of metadata in a database, and obtaining a plurality of feature values corresponding to each literature content; processing the feature values with a text similarity algorithm and a term frequency-inverse document frequency (TF-IDF) weighting algorithm to obtain a fingerprint value corresponding to each literature content; comparing, by Hamming distance, the fingerprint value of each literature content with each fingerprint value in the fingerprint value set of the existing literature to obtain a comparison result; and, when the comparison result meets a preset condition, judging that the literature content corresponding to the fingerprint value is not a duplicate and storing the metadata corresponding to that literature content in a storage system. According to the method,
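
The steps named in the summary (feature values, TF-IDF weighting, fingerprint, Hamming distance comparison, conditional storage) can be sketched roughly as follows. This is an illustrative sketch only, not the patented implementation. The record does not name the text similarity algorithm, so a SimHash-style fingerprint is assumed here; the 64-bit width, the threshold of 3, the smoothed IDF formula, and every function name are hypothetical choices made for the example.

```python
import hashlib
import math
import re
from collections import Counter

FINGERPRINT_BITS = 64  # assumed fingerprint width, not stated in the record


def tokenize(text):
    """Split a literature content into lowercase word tokens (the feature values)."""
    return re.findall(r"\w+", text.lower())


def tfidf_weights(tokens, doc_freq, n_docs):
    """Weight each term by TF times a smoothed IDF (the smoothing is an assumption)."""
    counts = Counter(tokens)
    total = len(tokens)
    return {
        term: (count / total)
        * (math.log((1 + n_docs) / (1 + doc_freq.get(term, 0))) + 1)
        for term, count in counts.items()
    }


def term_hash(term):
    """Map a term to a stable 64-bit integer using the first 8 bytes of its MD5 digest."""
    digest = hashlib.md5(term.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def fingerprint(weights):
    """SimHash-style fingerprint: weighted per-bit vote over all term hashes."""
    acc = [0.0] * FINGERPRINT_BITS
    for term, weight in weights.items():
        h = term_hash(term)
        for bit in range(FINGERPRINT_BITS):
            acc[bit] += weight if (h >> bit) & 1 else -weight
    value = 0
    for bit, score in enumerate(acc):
        if score > 0:
            value |= 1 << bit
    return value


def hamming_distance(a, b):
    """Number of bit positions in which two fingerprints differ."""
    return bin(a ^ b).count("1")


def is_new(fp, existing_fps, threshold=3):
    """Assumed preset condition: farther than `threshold` bits from every stored fingerprint."""
    return all(hamming_distance(fp, old) > threshold for old in existing_fps)


def store_if_new(records, existing_fps, storage, doc_freq, n_docs, threshold=3):
    """Append metadata to `storage` only when its content is judged not to be a duplicate."""
    for metadata, content in records:
        fp = fingerprint(tfidf_weights(tokenize(content), doc_freq, n_docs))
        if is_new(fp, existing_fps, threshold):
            existing_fps.add(fp)
            storage.append(metadata)
```

In this sketch `records` is a list of `(metadata, content)` pairs, `existing_fps` is a set standing in for the fingerprint value set of the existing literature, and `storage` is a plain list standing in for the storage system. A larger threshold treats more near-identical documents as duplicates, at the cost of rejecting some distinct ones.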