research-article

AutoPlacer: Scalable Self-Tuning Data Placement in Distributed Key-Value Stores

Authors:
João Paiva, INESC-ID, Instituto Superior Técnico, Universidade de Lisboa, Portugal
Pedro Ruivo, Red Hat, Inc., London, United Kingdom
Paolo Romano, INESC-ID, Instituto Superior Técnico, Universidade de Lisboa, Portugal
Luís Rodrigues, INESC-ID, Instituto Superior Técnico, Universidade de Lisboa, Portugal

ACM Transactions on Autonomous and Adaptive Systems, Volume 9, Issue 4, Article No. 19, pp. 1–30. https://doi.org/10.1145/2641573

Published: 08 December 2014

Abstract

This article addresses the problem of self-tuning the data placement in replicated key-value stores. The goal is to automatically optimize replica placement in a way that leverages locality patterns in data accesses, such that internode communication is minimized. To do this efficiently is extremely challenging, as one needs not only to find lightweight and scalable ways to identify the right assignment of data replicas to nodes but also to preserve fast data lookup. The article introduces new techniques that address these challenges. The first challenge is addressed by optimizing, in a decentralized way, the placement of the objects generating the largest number of remote operations for each node. The second challenge is addressed by combining the usage of consistent hashing with a novel data structure, which provides efficient probabilistic data placement. These techniques have been integrated in a popular open-source key-value store. The performance results show that the throughput of the optimized system can be six times better than a baseline system employing the widely used static placement based on consistent hashing.
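The two ideas named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not describe the novel data structure, so this sketch assumes one Bloom filter per destination node as a stand-in probabilistic structure, with consistent hashing as the fallback for keys that were never moved. All class and function names (`ConsistentHashRing`, `BloomFilter`, `Placement`, `hottest_remote_keys`) are hypothetical.

```python
import bisect
import hashlib
from collections import Counter


def _hash(value: str, seed: int = 0) -> int:
    """Stable 64-bit hash (the built-in hash() is randomized per process)."""
    digest = hashlib.sha256(f"{seed}:{value}".encode()).digest()
    return int.from_bytes(digest[:8], "big")


class ConsistentHashRing:
    """Static placement: a key belongs to the first ring point clockwise of its hash."""

    def __init__(self, nodes, vnodes=64):
        ring = sorted((_hash(f"{n}#{i}"), n) for n in nodes for i in range(vnodes))
        self._points = [point for point, _ in ring]
        self._owners = [node for _, node in ring]

    def owner(self, key):
        idx = bisect.bisect_right(self._points, _hash(key)) % len(self._points)
        return self._owners[idx]


class BloomFilter:
    """Compact membership test: no false negatives, occasional false positives."""

    def __init__(self, bits=1024, hashes=3):
        self._bits, self._hashes, self._field = bits, hashes, 0

    def _positions(self, key):
        return [_hash(key, seed) % self._bits for seed in range(self._hashes)]

    def add(self, key):
        for pos in self._positions(key):
            self._field |= 1 << pos

    def __contains__(self, key):
        return all((self._field >> pos) & 1 for pos in self._positions(key))


class Placement:
    """Assumed design: per-node Bloom filters record relocated keys;
    every other key falls back to consistent hashing."""

    def __init__(self, nodes):
        self.nodes = sorted(nodes)
        self.ring = ConsistentHashRing(self.nodes)
        self.moved_to = {node: BloomFilter() for node in self.nodes}

    def relocate(self, key, node):
        self.moved_to[node].add(key)

    def owner(self, key):
        # A false positive here misroutes a lookup; that is the price of compactness.
        for node in self.nodes:
            if key in self.moved_to[node]:
                return node
        return self.ring.owner(key)


def hottest_remote_keys(remote_accesses, k):
    """Return the k keys that generated the most remote operations on this node."""
    return [key for key, _ in Counter(remote_accesses).most_common(k)]


if __name__ == "__main__":
    placement = Placement(["n1", "n2", "n3"])
    for key in hottest_remote_keys(["a", "b", "a", "c", "a", "b"], k=2):
        placement.relocate(key, "n1")  # pull the hottest remote keys to this node
        print(key, "->", placement.owner(key))
```

In this sketch each node only tracks its own most remotely accessed keys, so the optimization input stays small, and lookups stay local because every node can evaluate the filters and the ring without contacting a directory.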

Index Terms

1. Computer systems organization
  1. Architectures
    1. Distributed architectures
2. Software and its engineering
  1. Software organization and properties
    1. Software system structures
      1. Distributed systems organizing principles

Published in
ACM Transactions on Autonomous and Adaptive Systems Volume 9, Issue 4
January 2015
137 pages
ISSN:1556-4665
EISSN:1556-4703
DOI:10.1145/2695594
Editors:
Manish Parashar, Rutgers University, USA
Franco Zambonelli, University of Modena e Reggio Emilia, Italy
Copyright © 2014 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]
Publisher
Association for Computing Machinery
New York, NY, United States
Publication History
- Published: 8 December 2014
- Revised: 1 June 2014
- Accepted: 1 June 2014
- Received: 1 January 2014
Author Tags
Distributed data management
data placement
machine learning
probabilistic algorithms
Qualifiers
- research-article
- Research
- Refereed