Abstract
The backend database system is often the performance bottleneck when running web applications. A common approach to scale the database component is query result caching, but it faces the challenge of maintaining a high cache hit rate while efficiently ensuring cache consistency as the database is updated. In this paper we introduce Ferdinand, the first proxy-based cooperative query result cache with fully distributed consistency management. To maintain a high cache hit rate, Ferdinand uses both a local query result cache on each proxy server and a distributed cache. Consistency management is implemented with a highly scalable publish/subscribe system. We implement a fully functioning Ferdinand prototype and evaluate its performance compared to several alternative query-caching approaches, showing that our high cache hit rate and consistency management are both critical for Ferdinand's performance gains over existing systems.
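To make the architecture described above concrete, the following is a minimal single-process sketch, not Ferdinand's actual implementation. It assumes a two-tier lookup (local cache, then a shared cache standing in for the distributed cache, then the database) and table-level invalidation topics. The names `PubSub`, `Proxy`, `query`, and `update` are hypothetical, and the in-memory dictionaries are placeholders for real network components.

```python
from collections import defaultdict


class PubSub:
    """Toy in-process publish/subscribe bus (assumed stand-in for a scalable one)."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, topic, callback):
        self._subscribers[topic].append(callback)

    def publish(self, topic, message):
        for callback in list(self._subscribers[topic]):
            callback(message)


class Proxy:
    """Hypothetical proxy with a local cache backed by a shared cache."""

    def __init__(self, shared_cache, bus, database):
        self.local = {}             # per-proxy query result cache
        self.shared = shared_cache  # dict standing in for the distributed cache
        self.bus = bus
        self.db = database          # dict of tables: {table: {key: value}}
        self.subscribed = set()

    def _subscribe(self, table):
        # Subscribe once per table so later updates invalidate local results.
        if table not in self.subscribed:
            self.bus.subscribe(table, self._invalidate)
            self.subscribed.add(table)

    def _invalidate(self, table):
        # Drop every locally cached result that depends on the updated table.
        for q in [q for q in self.local if q[0] == table]:
            del self.local[q]

    def query(self, table, key):
        """Return (result, source), checking local, shared, then the database."""
        q = (table, key)
        if q in self.local:
            return self.local[q], "local"
        self._subscribe(table)
        if q in self.shared:
            self.local[q] = self.shared[q]
            return self.local[q], "distributed"
        result = self.db[table].get(key)
        self.shared[q] = result
        self.local[q] = result
        return result, "database"

    def update(self, table, key, value):
        """Apply an update, purge shared entries, and publish an invalidation."""
        self.db[table][key] = value
        for q in [q for q in self.shared if q[0] == table]:
            del self.shared[q]
        self.bus.publish(table, table)
```

In this sketch, a query first served from the database populates both caches, a second proxy can then answer the same query from the shared cache, and an update through any proxy removes the stale results everywhere before the next read. Table-level invalidation is chosen only for brevity; it is coarser than necessary and is an assumption of this example.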
Index Terms
- Scalable query result caching for web applications