research-article

AutoPlacer: Scalable Self-Tuning Data Placement in Distributed Key-Value Stores

Published: 08 December 2014

Abstract

This article addresses the problem of self-tuning the data placement in replicated key-value stores. The goal is to automatically optimize replica placement in a way that leverages locality patterns in data accesses, such that internode communication is minimized. Doing this efficiently is extremely challenging: one must not only find lightweight and scalable ways to identify the right assignment of data replicas to nodes but also preserve fast data lookup. The article introduces new techniques that address both challenges. The first is addressed by optimizing, in a decentralized way, the placement of the objects that generate the largest number of remote operations for each node. The second is addressed by combining consistent hashing with a novel data structure that provides efficient probabilistic data placement. These techniques have been integrated into a popular open-source key-value store. The performance results show that the throughput of the optimized system can be six times higher than that of a baseline system employing the widely used static placement based on consistent hashing.
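The two ideas in the abstract can be illustrated with a minimal Python sketch. It is not the article's implementation. The abstract does not describe the novel data structure, so a plain Bloom filter backed by a dictionary stands in for it here, and the sketch runs in a single process rather than in a decentralized way. All names (`HashRing`, `BloomFilter`, `TunedPlacement`, `record_access`, `rebalance`, `top_k`) are hypothetical. Keys default to consistent-hashing placement, and each node's most remotely accessed keys are relocated to that node.

```python
import bisect
import hashlib
from collections import Counter


def _h(value: str, seed: int = 0) -> int:
    """Deterministic 64-bit hash, stable across runs."""
    digest = hashlib.sha256(f"{seed}:{value}".encode()).digest()
    return int.from_bytes(digest[:8], "big")


class HashRing:
    """Static placement: a key belongs to the next virtual node on the ring."""

    def __init__(self, nodes, vnodes=64):
        self._points = sorted((_h(f"{n}#{i}"), n) for n in nodes for i in range(vnodes))
        self._hashes = [p for p, _ in self._points]

    def owner(self, key):
        idx = bisect.bisect_right(self._hashes, _h(key)) % len(self._points)
        return self._points[idx][1]


class BloomFilter:
    """Small Bloom filter: no false negatives, occasional false positives."""

    def __init__(self, bits=4096, hashes=4):
        self._bits, self._k, self._field = bits, hashes, 0

    def _positions(self, key):
        return [_h(key, seed) % self._bits for seed in range(1, self._k + 1)]

    def add(self, key):
        for pos in self._positions(key):
            self._field |= 1 << pos

    def __contains__(self, key):
        return all((self._field >> pos) & 1 for pos in self._positions(key))


class TunedPlacement:
    """Assumed design: relocated hot keys override the ring placement."""

    def __init__(self, ring):
        self._ring = ring
        self._moved = BloomFilter()   # cheap "was this key relocated?" test
        self._overrides = {}          # relocated key -> new owner
        self._remote = Counter()      # (node, key) -> remote access count

    def owner(self, key):
        # Most keys fail the filter test and go straight to the ring.
        if key in self._moved:
            # A false positive falls back to the ring, so lookups stay correct.
            return self._overrides.get(key, self._ring.owner(key))
        return self._ring.owner(key)

    def record_access(self, node, key):
        if self.owner(key) != node:
            self._remote[(node, key)] += 1

    def rebalance(self, top_k=1):
        """Move each node's top_k most remotely accessed keys to that node."""
        per_node = {}
        for (node, key), count in self._remote.items():
            per_node.setdefault(node, Counter())[key] = count
        # Simplification: if two nodes claim one key, the later node wins.
        for node in sorted(per_node):
            for key, _ in per_node[node].most_common(top_k):
                self._overrides[key] = node
                self._moved.add(key)
        self._remote.clear()
```

A caller would record accesses as they happen and call `rebalance` periodically. A key that was never relocated resolves through the ring exactly as it would under static placement.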


Published in

ACM Transactions on Autonomous and Adaptive Systems, Volume 9, Issue 4
January 2015
137 pages
ISSN: 1556-4665
EISSN: 1556-4703
DOI: 10.1145/2695594

Copyright © 2014 ACM

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

Publisher

Association for Computing Machinery, New York, NY, United States

Publication History

• Published: 8 December 2014
• Revised: 1 June 2014
• Accepted: 1 June 2014
• Received: 1 January 2014

Qualifiers

• research-article
• Research
• Refereed
