
MapReduce based location selection algorithm for utility maximization with capacity constraints

Computing

Abstract

Given a set of facility objects and a set of client objects, where each client is served by her nearest facility and each facility is constrained by a service capacity, we study how to find all locations at which establishing a new facility with a given capacity maximizes the number of served clients (in other words, maximizes the utility of the facilities). This problem is intrinsically difficult. An existing algorithm has exponential complexity, so it does not scale and cannot handle large data sets. We therefore propose to solve the problem through parallel computing, in particular using MapReduce. We propose an arc-based method to divide the search space into disjoint partitions. For load balancing, we propose a dynamic strategy that assigns partitions to reducers so that the estimated load difference stays within a threshold. We conduct extensive experiments on large real and synthetic data sets. The results demonstrate the efficiency and scalability of the algorithm.
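To make the objective concrete, the following is a minimal brute-force sketch in Python. It is not the arc-based MapReduce algorithm of the paper. It assumes that a facility serves at most as many clients as its capacity allows among the clients for which it is the nearest facility, that distances are Euclidean, and that the new facility is chosen from a finite, non-empty list of candidate points. The function names `total_utility` and `best_locations` and the candidate list are hypothetical and introduced only for illustration.

```python
from collections import Counter
from math import dist


def total_utility(facilities, clients):
    """Count served clients under the assumed capacity model.

    facilities: list of ((x, y), capacity) pairs
    clients:    list of (x, y) points
    Each client goes to her nearest facility (ties go to the earlier
    facility in the list); a facility serves at most `capacity` of them.
    """
    load = Counter()
    for client in clients:
        nearest = min(
            range(len(facilities)),
            key=lambda i: dist(facilities[i][0], client),
        )
        load[nearest] += 1
    # Clients beyond a facility's capacity are counted as unserved.
    return sum(min(cap, load[i]) for i, (_, cap) in enumerate(facilities))


def best_locations(facilities, clients, candidates, new_capacity):
    """Return the best utility and all candidate points that achieve it.

    Assumes `candidates` is a non-empty list of (x, y) tuples.
    """
    scores = {
        point: total_utility(facilities + [(point, new_capacity)], clients)
        for point in candidates
    }
    best = max(scores.values())
    return best, [point for point in candidates if scores[point] == best]
```

For example, with one existing facility at (0, 0) of capacity 1, clients at (0, 1), (1, 0), (10, 10) and (10, 11), and a new facility of capacity 2, the candidates (10, 10) and (5, 5) both raise the utility from 1 to 3, while (-5, -5) attracts no client and leaves it at 1. The sketch returns every maximizing candidate, mirroring the "find all locations" wording of the problem statement.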


Notes

  1. http://www.chorochronos.org/.


Acknowledgments

This work is supported by the Australian Research Council (ARC) Discovery Project DP130104587. Dr. Rui Zhang is supported by the ARC Future Fellowships Project FT120100832. Dr. Yueguo Chen is partially supported by the National Science Foundation of China under Grant No. 61003085. Dr. Xiaoyong Du is partially supported by the National Science Foundation of China under Grant No. 61170010.

Author information

Corresponding author

Correspondence to Yu Sun.

About this article

Cite this article

Sun, Y., Qi, J., Zhang, R. et al. MapReduce based location selection algorithm for utility maximization with capacity constraints. Computing 97, 403–423 (2015). https://doi.org/10.1007/s00607-013-0382-5

  • DOI: https://doi.org/10.1007/s00607-013-0382-5
