Data compression by Hamming distance categorization
Main authors: | , |
---|---|
Format: | Patent |
Language: | eng |
Subjects: | |
Summary: | Data is compressed based on non-identical similarity between a first data set and a second data set. A representation of the differences is used to represent one of the data sets. For example, a probabilistically unique value may be generated as a new block label. Probabilistic comparison of the new block label with a plurality of training labels associated with training blocks produces a plurality of training labels that are potentially similar to the new block label. The Hamming distance between each potentially similar training label and the new block label is determined to select the training label with the smallest calculated Hamming distance from the new block label. A bitmap of differences between the new block and the training block associated with the selected training label is compressed and stored as a compressed representation of the new block. |
---|
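
The selection and difference-encoding steps in the summary can be illustrated with a minimal Python sketch. It is not the patented implementation, and several parts are assumptions made only for illustration: the record does not say how labels are generated, so the hypothetical `make_label` simply samples evenly spaced bytes; the probabilistic pre-filtering step is omitted, so every training label is treated as a candidate; blocks are assumed to have a fixed, equal size; the difference bitmap is taken to be a bytewise XOR; and `zlib` stands in for whatever compressor is actually used.

```python
import zlib


def hamming_distance(a: bytes, b: bytes) -> int:
    """Count the differing bits between two equal-length byte strings."""
    if len(a) != len(b):
        raise ValueError("inputs must have equal length")
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))


def make_label(block: bytes, label_len: int = 8) -> bytes:
    """Hypothetical label: sample up to label_len evenly spaced bytes.

    Assumption for illustration only; the source does not define the label.
    """
    stride = max(1, len(block) // label_len)
    return block[::stride][:label_len]


def compress_block(new_block: bytes, training: dict) -> tuple:
    """Pick the nearest training label and compress the XOR difference.

    `training` maps each training label to its training block.
    """
    new_label = make_label(new_block)
    # Assumption: no probabilistic pre-filter, so all labels are candidates.
    best_label = min(training, key=lambda lbl: hamming_distance(lbl, new_label))
    base = training[best_label]
    if len(base) != len(new_block):
        raise ValueError("blocks must have equal length")
    # Difference bitmap: bits set wherever the two blocks disagree.
    diff = bytes(x ^ y for x, y in zip(new_block, base))
    return best_label, zlib.compress(diff)


def decompress_block(best_label: bytes, payload: bytes, training: dict) -> bytes:
    """Rebuild the block by XOR-ing the stored difference onto the base block."""
    base = training[best_label]
    diff = zlib.decompress(payload)
    return bytes(x ^ y for x, y in zip(diff, base))


# Usage: two training blocks and a new block that nearly matches the first.
training_blocks = [bytes(64), bytes([0xFF]) * 64]
training = {make_label(b): b for b in training_blocks}
new_block = bytearray(64)
new_block[8] = 0x01
label, payload = compress_block(bytes(new_block), training)
restored = decompress_block(label, payload, training)
```

When the new block is close to the selected training block, the XOR difference is mostly zero bytes, which is why a general-purpose compressor can shrink it well. The stored representation is then the selected label plus the compressed difference.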