Toward more efficient locality‐sensitive hashing via constructing novel hash function cluster
Published in: | Concurrency and computation 2021-10, Vol.33 (20), p.n/a |
---|---|
Main authors: | , , , , , |
Format: | Article |
Language: | English |
Subjects: | |
Online access: | Full text |
Summary: | Locality‐sensitive hashing (LSH) is widely used for nearest neighbor search in large‐scale, high‐dimensional data. However, LSH methods suffer from a serious imbalance between the efficiency of index structure construction and query accuracy. In this article, a novel higher‐entropy‐hyperplane clusters LSH (HEHC‐LSH) algorithm is proposed. We improve vector quantization to preprocess the data, which greatly shortens the preprocessing time. We integrate the maximum entropy principle into the distribution estimation algorithm to construct a novel hash function cluster method, incorporate bootstrap aggregating from ensemble learning, and adopt a parallel index dictionary to improve the generalization performance of the index structure. In the query stage, we filter the index set comprehensively using an ensemble learning approach, which not only avoids many distance calculations but also improves the quality of the query results. We also analyze the rationality and effectiveness of the proposed method. Finally, extensive experimental results show that HEHC‐LSH achieves higher precision and higher efficiency simultaneously compared with current methods, and is robust across different datasets. |
---|---|
ISSN: | 1532-0626 1532-0634 |
DOI: | 10.1002/cpe.6355 |
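
The summary describes indexing data with a cluster of hash functions and filtering candidates in an ensemble fashion before any distances are computed. As a rough illustration of that general idea only, the sketch below uses classical random‐hyperplane LSH with several independent hash tables and a vote threshold. It is not the HEHC‐LSH algorithm from the article: the class name `VotingHyperplaneLSH` and the parameters `n_tables`, `n_bits`, and `min_votes` are hypothetical, the hyperplanes are drawn at random rather than chosen by a maximum entropy criterion, and all input vectors are assumed to be non-zero.

```python
from collections import Counter, defaultdict

import numpy as np


class VotingHyperplaneLSH:
    """Random-hyperplane LSH with vote-based candidate filtering (illustrative only)."""

    def __init__(self, dim, n_tables=8, n_bits=12, seed=0):
        rng = np.random.default_rng(seed)
        # One set of n_bits random hyperplanes (through the origin) per table.
        self.planes = rng.standard_normal((n_tables, n_bits, dim))
        self.tables = [defaultdict(list) for _ in range(n_tables)]
        self.data = None

    def _keys(self, x):
        # Sign pattern of x against each table's hyperplanes: shape (n_tables, n_bits).
        bits = (self.planes @ x) > 0
        return [row.tobytes() for row in bits]

    def fit(self, data):
        self.data = np.asarray(data, dtype=float)
        for i, x in enumerate(self.data):
            for table, key in zip(self.tables, self._keys(x)):
                table[key].append(i)
        return self

    def query(self, q, k=5, min_votes=2):
        q = np.asarray(q, dtype=float)
        # Each table "votes" for the points that share the query's bucket.
        votes = Counter()
        for table, key in zip(self.tables, self._keys(q)):
            votes.update(table.get(key, ()))
        # Keep only candidates that enough tables agree on.
        cand = np.array([i for i, v in votes.items() if v >= min_votes], dtype=int)
        if cand.size == 0:
            return []
        # Exact cosine distance is computed for the filtered candidates only.
        x = self.data[cand]
        cos = (x @ q) / (np.linalg.norm(x, axis=1) * np.linalg.norm(q))
        dists = 1.0 - cos
        order = np.argsort(dists)[:k]
        return [(int(cand[j]), float(dists[j])) for j in order]


if __name__ == "__main__":
    rng = np.random.default_rng(42)
    points = rng.standard_normal((1000, 32))
    index = VotingHyperplaneLSH(dim=32).fit(points)
    # Query with a slightly perturbed copy of a stored point.
    print(index.query(points[10] + 0.01 * rng.standard_normal(32), k=3))
```

Raising `min_votes` shrinks the candidate set, so fewer exact distances are computed, at the risk of missing true neighbors; lowering it does the opposite. This is the same efficiency versus accuracy trade-off the summary refers to, shown here in its simplest form.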