
Real-Valued Embeddings and Sketches for Fast Distance and Similarity Estimation

  • NEW TOOLS OF CYBERNETICS, INFORMATICS, COMPUTER ENGINEERING, AND SYSTEMS ANALYSIS
  • Published in: Cybernetics and Systems Analysis

Abstract

This survey considers methods and algorithms for the fast estimation of data distance and similarity measures from compact real-valued vectors (sketches) of small dimension. The methods do not use learning and rely mainly on random projection and sampling. The input data are mostly high-dimensional vectors compared by various distance measures (Euclidean, Manhattan, statistical, etc.) and similarity measures (dot product, etc.). Vector representations of non-vector data are also considered. The resulting vectors can further be used in similarity search algorithms, machine learning, and related applications.
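
The random-projection idea mentioned in the abstract admits a brief illustration. The following Python sketch is not taken from the article; the function names, parameter values, and the choice of a Gaussian projection matrix are illustrative assumptions. It shows how a scaled random matrix maps high-dimensional vectors to short real-valued sketches from which Euclidean distances and dot products can be estimated.

import numpy as np

def make_projection(d, k, seed=None):
    # Gaussian random projection matrix with entries N(0, 1/k), so squared
    # Euclidean distances and dot products are preserved in expectation
    # (Johnson-Lindenstrauss-style dimensionality reduction).
    rng = np.random.default_rng(seed)
    return rng.standard_normal((k, d)) / np.sqrt(k)

# Illustrative usage: sketch two high-dimensional vectors and compare estimates.
rng = np.random.default_rng(0)
d, k = 10000, 256                      # original and reduced dimensions (arbitrary choices)
x, y = rng.standard_normal(d), rng.standard_normal(d)
R = make_projection(d, k, seed=1)
sx, sy = R @ x, R @ y                  # k-dimensional real-valued sketches

print(np.linalg.norm(x - y), np.linalg.norm(sx - sy))  # distance vs. its estimate
print(x @ y, sx @ sy)                                   # dot product vs. its estimate

With this scaling, the relative error of the distance estimate decays roughly as 1/sqrt(k), so a few hundred sketch coordinates already give useful accuracy regardless of the original dimension d; the dot-product estimate has higher relative variance, especially for nearly orthogonal vectors.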




Author information

Corresponding author

Correspondence to D. A. Rachkovskij.

Additional information

Translated from Kibernetika i Sistemnyi Analiz, No. 6, November–December, 2016, pp. 156–180.

About this article

Cite this article

Rachkovskij, D.A. Real-Valued Embeddings and Sketches for Fast Distance and Similarity Estimation. Cybern Syst Anal 52, 967–988 (2016). https://doi.org/10.1007/s10559-016-9899-x
