Data storage method, system and equipment based on multi-source heterogeneity and storage medium
Main authors: | , , , , , , |
---|---|
Format: | Patent |
Language: | chi ; eng |
Subjects: | |
Online access: | Order full text |
Summary: | The invention provides a multi-source heterogeneous data storage method, system, device, and storage medium. The method comprises: obtaining a plurality of literature contents according to a plurality of metadata in a database, and obtaining a plurality of feature values corresponding to each literature content; computing over the feature values with a text similarity algorithm and a term frequency-inverse document frequency (TF-IDF) weighting algorithm to obtain a fingerprint value corresponding to each literature content; comparing, by Hamming distance, the fingerprint value of each literature content against each fingerprint value in the fingerprint value set of the existing literature to obtain a comparison result; and, when the comparison result meets a preset condition, judging that the literature content corresponding to the fingerprint value is not a duplicate and storing the metadata corresponding to that literature content in a storage system. According to the method, |
---|
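The deduplication steps in the summary can be illustrated with a minimal Python sketch. It is not the patented implementation. It assumes that the unnamed "text similarity algorithm" is SimHash, that feature values are whitespace-separated tokens, that the IDF term uses a common smoothed variant, and that the "preset condition" is a Hamming distance above a fixed threshold. The fingerprint width, the threshold, and all function names are hypothetical choices made for illustration.

```python
import hashlib
import math
from collections import Counter

FINGERPRINT_BITS = 64        # assumed width; the summary does not state one
MAX_DUPLICATE_DISTANCE = 3   # assumed threshold; at or below this counts as a duplicate


def tokenize(text):
    """Assumed feature extraction: lowercase whitespace-separated tokens."""
    return text.lower().split()


def tf_idf_weights(tokens, doc_freq, n_docs):
    """Weight each distinct term by TF-IDF (smoothed IDF, an assumed variant)."""
    counts = Counter(tokens)
    total = len(tokens)
    weights = {}
    for term, count in counts.items():
        tf = count / total
        idf = math.log((1 + n_docs) / (1 + doc_freq.get(term, 0))) + 1.0
        weights[term] = tf * idf
    return weights


def simhash(weights):
    """Fold weighted term hashes into one fingerprint (SimHash is an assumption)."""
    acc = [0.0] * FINGERPRINT_BITS
    for term, weight in weights.items():
        digest = hashlib.md5(term.encode("utf-8")).digest()
        term_hash = int.from_bytes(digest[:8], "big")
        for bit in range(FINGERPRINT_BITS):
            if (term_hash >> bit) & 1:
                acc[bit] += weight
            else:
                acc[bit] -= weight
    fingerprint = 0
    for bit in range(FINGERPRINT_BITS):
        if acc[bit] > 0:
            fingerprint |= 1 << bit
    return fingerprint


def hamming_distance(a, b):
    """Number of bit positions in which two fingerprints differ."""
    return bin(a ^ b).count("1")


def is_new(fingerprint, existing):
    """Assumed preset condition: farther than the threshold from every known fingerprint."""
    return all(
        hamming_distance(fingerprint, known) > MAX_DUPLICATE_DISTANCE
        for known in existing
    )


def store_unique(documents):
    """documents: list of (metadata, content). Returns metadata of non-duplicates."""
    tokenized = [tokenize(content) for _, content in documents]
    doc_freq = Counter()
    for tokens in tokenized:
        doc_freq.update(set(tokens))
    stored, fingerprints = [], []
    for (metadata, _), tokens in zip(documents, tokenized):
        fingerprint = simhash(tf_idf_weights(tokens, doc_freq, len(documents)))
        if is_new(fingerprint, fingerprints):
            fingerprints.append(fingerprint)
            stored.append(metadata)  # stands in for writing to the storage system
    return stored
```

Calling `store_unique` with two records whose content is identical keeps only the first record's metadata, since identical content produces identical fingerprints and a Hamming distance of zero. In practice the threshold would be tuned against the collection, because a larger value also rejects near-duplicates that differ in a few terms.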