Accelerated and memory efficient similarity matching

Bibliographic Details
Main Authors:	VanderSpek, Adrian T, Smith, Stephen A, Watkins, Peter, Arruda, Luis, Poirier, Jamey C, Zieber, Raz
Format:	Patent
Language:	English
Subjects:	CALCULATING; COMPUTING; COUNTING; ELECTRIC DIGITAL DATA PROCESSING; PHYSICS

Description
Summary:	A method, a system, and a computer program product for performing accelerated and memory efficient similarity matching. A data stream having a plurality of data zones is received. Each zone includes a zone identifier. A plurality of hashing values for each zone are generated. Each hashing value is generated based on a portion of a zone. A storage structure having a plurality of storage containers is generated. Each storage container stores one or more hashing values associated with each respective storage container and a plurality of zone identifiers referencing the associated hashing values. At least one storage container includes a listing of zone identifiers stored in each storage container. Using the storage structure, the received data stream is deduplicated.
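
The structure outlined in the summary can be illustrated with a minimal Python sketch. It is not the patented implementation: the record does not say how portions are chosen, how hashing values are assigned to containers, or how a match is selected. The fixed portion size, the modulo-based container assignment, the vote-counting match, and all names (`zone_hashes`, `Container`, `SimilarityIndex`, `deduplicate`) are assumptions made for illustration only.

```python
import hashlib
from collections import Counter, defaultdict

PORTION_SIZE = 8     # assumption: fixed-size portions of a zone
NUM_CONTAINERS = 4   # assumption: hashing values are spread by modulo


def zone_hashes(data: bytes) -> set:
    """Return one hashing value per fixed-size portion of a zone."""
    return {
        int.from_bytes(hashlib.sha1(data[i:i + PORTION_SIZE]).digest()[:8], "big")
        for i in range(0, len(data), PORTION_SIZE)
    }


class Container:
    """One storage container: hashing values plus the zone ids that reference them."""

    def __init__(self):
        self.hash_to_zones = defaultdict(list)  # hashing value -> zone ids
        self.zone_ids = set()                   # listing of zone ids in this container


class SimilarityIndex:
    """Hypothetical storage structure made of several containers."""

    def __init__(self):
        self.containers = [Container() for _ in range(NUM_CONTAINERS)]

    def _container_for(self, h: int) -> Container:
        return self.containers[h % NUM_CONTAINERS]

    def best_match(self, hashes):
        """Return the stored zone id sharing the most hashing values, or None."""
        votes = Counter()
        for h in hashes:
            for zone_id in self._container_for(h).hash_to_zones.get(h, ()):
                votes[zone_id] += 1
        return votes.most_common(1)[0][0] if votes else None

    def add(self, zone_id, hashes):
        """Record a zone's hashing values in their containers."""
        for h in hashes:
            container = self._container_for(h)
            container.hash_to_zones[h].append(zone_id)
            container.zone_ids.add(zone_id)


def deduplicate(zones):
    """Map each zone id to the most similar earlier zone id (None if there is none).

    A real deduplicator would delta-compress a zone against its match;
    this sketch only performs the similarity lookup.
    """
    index = SimilarityIndex()
    matches = {}
    for zone_id, data in zones:
        hashes = zone_hashes(data)
        matches[zone_id] = index.best_match(hashes)
        index.add(zone_id, hashes)
    return matches
```

For example, given three zones where the third shares two portions with the first, `deduplicate` would be expected to pair the third zone with the first and leave the other two unmatched.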