Locality Sensitive Hashing with Extended Partitioning Boundaries

Abstract:

Locality-sensitive hashing is a technique for fast approximate nearest neighbor search over large volumes of data. Binary-code locality-sensitive hashing distributes a data set into buckets labeled with binary codes, where the codes are determined by a set of hash functions. The binary hash codes thus partition the data space into subspaces. When close neighbors of a query lie near subspace boundaries, they may fall into a different bucket and the search can fail to locate them, so neighboring buckets must also be checked when finding nearest neighbors. This paper presents a technique that enhances search performance by introducing the notion of an extended boundary. It reduces potential misses and search overhead, especially for regions located at the double-napped corners.

Keywords: locality sensitive hashing, data search, hashing, data analysis
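The boundary problem described in the abstract can be illustrated with a minimal sketch of generic binary-code hashing. It is not the paper's extended-boundary method, whose details the abstract does not give. It assumes random-hyperplane hash functions (one bit per hyperplane) and handles boundary misses in the plain way, by also probing the buckets whose codes differ from the query's code in one bit. All function names here are hypothetical.

```python
import random
from collections import defaultdict


def make_hyperplanes(num_bits, dim, seed=0):
    """Assumed hash family: one random hyperplane through the origin per bit."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(num_bits)]


def binary_code(point, hyperplanes):
    """Bit i records which side of hyperplane i the point lies on."""
    return tuple(
        1 if sum(w * x for w, x in zip(plane, point)) >= 0 else 0
        for plane in hyperplanes
    )


def build_index(points, hyperplanes):
    """Map each binary code (bucket label) to the indices of its points."""
    buckets = defaultdict(list)
    for idx, point in enumerate(points):
        buckets[binary_code(point, hyperplanes)].append(idx)
    return buckets


def neighboring_codes(code):
    """Codes at Hamming distance 1: the subspaces across one boundary."""
    for i in range(len(code)):
        yield code[:i] + (1 - code[i],) + code[i + 1:]


def query(q, points, buckets, hyperplanes, probe_neighbors=True):
    """Return the index of the closest candidate, or None if no bucket matched."""
    code = binary_code(q, hyperplanes)
    candidates = list(buckets.get(code, []))
    if probe_neighbors:
        # Extra work needed to catch neighbors just across a boundary.
        for other in neighboring_codes(code):
            candidates.extend(buckets.get(other, []))
    if not candidates:
        return None
    return min(
        candidates,
        key=lambda i: sum((a - b) ** 2 for a, b in zip(points[i], q)),
    )
```

For instance, with the axis-aligned hyperplanes [[1.0, 0.0], [0.0, 1.0]], the points (0.1, 1.0) and (-3.0, 2.0), and the query (-0.1, 1.0), the query shares a bucket only with the distant second point. The nearby first point sits just across the boundary and is found only when neighboring buckets are probed. Each query then inspects one extra bucket per bit, which is the kind of search overhead the abstract refers to.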

Pages: 804-807

Online since: June 2013

[1] P. Indyk and R. Motwani, Approximate Nearest Neighbors: Towards Removing the Curse of Dimensionality, Proc. of STOC 1998 (1998)

DOI: 10.1145/276698.276876

[2] K. M. Lee, Locality-sensitive Hashing Techniques for Nearest Neighbor Search, Int. J. of Fuzzy Logic and Intell. Syst., 12(4) (2012)

DOI: 10.5391/ijfis.2012.12.4.300

[3] M. Datar, N. Immorlica, P. Indyk, and V. S. Mirrokni, Locality-sensitive Hashing Scheme based on p-stable Distribution, Symp. on Computational Geometry, pp. 253–262 (2004)

DOI: 10.1145/997817.997857

[4] R. R. Salakhutdinov and G. E. Hinton, Semantic Hashing, Proc. of SIGIR 2007 (2007)

[5] A. Gionis, P. Indyk, and R. Motwani, Similarity Search in High Dimensions via Hashing, Proc. of VLDB 1999 (1999)
