Fast shortest path distance estimation in large networks

Authors:
Michalis Potamias (Boston University, Boston, USA)
Francesco Bonchi (Yahoo! Research, Barcelona, Spain)
Carlos Castillo (Yahoo! Research, Barcelona, Spain)
Aristides Gionis (Yahoo! Research, Barcelona, Spain)

CIKM '09: Proceedings of the 18th ACM conference on Information and knowledge management, November 2009, Pages 867–876
https://doi.org/10.1145/1645953.1646063

Published: 02 November 2009

ABSTRACT

In this paper we study approximate landmark-based methods for point-to-point distance estimation in very large networks. These methods involve selecting a subset of nodes as landmarks and computing offline the distances from each node in the graph to those landmarks. At runtime, when the distance between a pair of nodes is needed, it can be estimated quickly by combining the precomputed distances. We prove that selecting the optimal set of landmarks is an NP-hard problem, and thus heuristic solutions need to be employed. We therefore explore theoretical insights to devise a variety of simple methods that scale well in very large networks. The efficiency of the suggested techniques is tested experimentally using five real-world graphs having millions of edges. While theoretical bounds support the claim that random landmarks work well in practice, our extensive experimentation shows that smart landmark selection can yield dramatically more accurate results: for a given target accuracy, our methods require as much as 250 times less space than selecting landmarks at random. In addition, we demonstrate that at a very small accuracy loss our techniques are several orders of magnitude faster than the state-of-the-art exact methods. Finally, we study an application of our methods to the task of social search in large graphs.
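
The offline/online split described above can be illustrated with a minimal sketch. It assumes an unweighted, undirected graph, and it combines the precomputed distances through the triangle inequality, which is one common way to do so: for any landmark u, |d(s,u) - d(u,t)| <= d(s,t) <= d(s,u) + d(u,t). The function names and the highest-degree selection heuristic are illustrative assumptions, not the strategies evaluated in the paper.

```python
from collections import deque


def bfs_distances(adj, source):
    """Hop distances from source to every reachable node (unweighted graph)."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nbr in adj[node]:
            if nbr not in dist:
                dist[nbr] = dist[node] + 1
                queue.append(nbr)
    return dist


def pick_landmarks_by_degree(adj, k):
    """Illustrative heuristic (an assumption): the k highest-degree nodes."""
    return sorted(adj, key=lambda n: (-len(adj[n]), n))[:k]


def build_index(adj, landmarks):
    """Offline step: one BFS per landmark, stored as landmark -> distances."""
    return {u: bfs_distances(adj, u) for u in landmarks}


def estimate_distance(index, s, t):
    """Online step: triangle-inequality bounds (lower, upper) on d(s, t)."""
    lower, upper = 0, float("inf")
    for dist in index.values():
        if s in dist and t in dist:
            lower = max(lower, abs(dist[s] - dist[t]))
            upper = min(upper, dist[s] + dist[t])
    return lower, upper


# Toy undirected graph: the path 0-1-2-3-4 plus a spur 2-5.
edges = [(0, 1), (1, 2), (2, 3), (3, 4), (2, 5)]
adj = {}
for a, b in edges:
    adj.setdefault(a, set()).add(b)
    adj.setdefault(b, set()).add(a)

landmarks = pick_landmarks_by_degree(adj, 2)
index = build_index(adj, landmarks)
print(landmarks, estimate_distance(index, 0, 4))
```

In this toy graph the upper bound for the pair (0, 4) is exact because one of the chosen landmarks lies on the shortest path between them. That is the intuition behind preferring well-placed landmarks to random ones. The index costs one stored distance per node per landmark, so needing fewer landmarks for a given accuracy translates directly into less space.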

Index Terms

Information systems > World Wide Web > Web applications > Internet communications tools

Published in
CIKM '09: Proceedings of the 18th ACM conference on Information and knowledge management
November 2009
2162 pages
ISBN: 9781605585123
DOI: 10.1145/1645953
General Chairs: David Cheung (University of Hong Kong, Hong Kong), Il-Yeol Song (Drexel University, USA)
Program Chairs: Wesley Chu (UCLA, USA), Xiaohua Hu (Drexel University, USA), Jimmy Lin (University of Maryland, USA)
Copyright © 2009 ACM
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
graphs
landmarks methods
shortest-paths