Abstract
An increasing number of distributed data-driven applications are moving into shared public clouds. By sharing resources and operating at scale, public clouds promise higher utilization and lower costs than private clusters. To achieve high utilization, however, cloud providers inevitably allocate virtual machine instances noncontiguously, i.e., instances of a given application may end up on physically distant machines in the cloud. This allocation strategy can lead to large differences in average latency between instances. For a large class of applications, this difference can result in significant performance degradation, unless care is taken in how application components are mapped to instances.
In this paper, we propose ClouDiA, a general deployment advisor that selects application node deployments minimizing either (i) the largest latency between application nodes, or (ii) the longest critical path among all application nodes. ClouDiA employs mixed-integer programming and constraint programming techniques to efficiently search the space of possible mappings of application nodes to instances. Through experiments with synthetic and real applications in Amazon EC2, we show that our techniques yield a 15% to 55% reduction in time-to-solution or service response time, without any need for modifying application code.
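The two objectives can be illustrated with a small sketch. It is not ClouDiA's solver: the abstract describes mixed-integer and constraint programming, whereas this example simply enumerates every placement, which is only feasible for tiny inputs. The function names, the latency matrix, and the assumption that the communication graph is a DAG whose edges `(u, v)` satisfy `u < v` are all hypothetical choices made for illustration.

```python
from itertools import permutations


def longest_link(edges, latency, placement):
    """Objective (i): largest latency over all communicating node pairs."""
    return max(latency[placement[u]][placement[v]] for u, v in edges)


def longest_critical_path(edges, latency, placement):
    """Objective (ii): largest summed latency along any directed path.

    Assumes (for this sketch only) that every edge (u, v) has u < v, so
    processing edges sorted by source visits nodes in topological order.
    """
    dist = [0] * len(placement)
    for u, v in sorted(edges):
        hop = latency[placement[u]][placement[v]]
        dist[v] = max(dist[v], dist[u] + hop)
    return max(dist)


def best_placement(num_nodes, latency, edges, objective):
    """Exhaustively try every injective node-to-instance mapping."""
    best_cost, best_perm = None, None
    for perm in permutations(range(len(latency)), num_nodes):
        cost = objective(edges, latency, perm)
        if best_cost is None or cost < best_cost:
            best_cost, best_perm = cost, perm
    return best_cost, best_perm


# Hypothetical input: 3 application nodes in a chain, 4 allocated instances.
edges = [(0, 1), (1, 2)]
# Made-up pairwise latencies between instances, in microseconds.
latency = [
    [0, 200, 900, 300],
    [200, 0, 800, 700],
    [900, 800, 0, 400],
    [300, 700, 400, 0],
]

print(best_placement(3, latency, edges, longest_link))
print(best_placement(3, latency, edges, longest_critical_path))
```

The returned tuple gives the objective value and the chosen instance for each application node. Over-allocating instances (four instances for three nodes here) is what gives the search room to avoid the slowest links.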