ABSTRACT
Functions as a Service (FaaS, also called serverless computing) promises to revolutionize how applications use cloud resources. However, functions suffer from cold-start problems due to the overhead of initializing their code and data dependencies before they can start executing. Keeping functions alive and warm after they have finished execution can alleviate the cold-start overhead. Keep-alive policies must decide which functions to keep alive based on their resource and usage characteristics, which is challenging due to the diversity in FaaS workloads.
Our insight is that keep-alive is analogous to caching. Our caching-inspired Greedy-Dual keep-alive policy can be effective in reducing the cold-start overhead by more than 3× compared to current approaches. Caching concepts such as reuse distances and hit-ratio curves can also be used for auto-scaled server resource provisioning, which can reduce the resource requirement of FaaS providers by 30% for real-world dynamic workloads. We implement caching-based keep-alive and resource provisioning policies in our FaasCache system, which is based on OpenWhisk. We hope that our caching analogy opens the door to more principled and optimized keep-alive and resource provisioning techniques for future FaaS workloads and platforms.
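To make the caching analogy concrete, the following is a minimal sketch of a Greedy-Dual style keep-alive pool. It is not the FaasCache implementation. It assumes a priority of the Greedy-Dual-Size-Frequency form (clock + frequency × cost / size), which the abstract does not spell out, and it assumes one warm container per function. The names `WarmFunction`, `GreedyDualKeepAlive`, and `invoke`, as well as the example workload, are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class WarmFunction:
    name: str
    mem_mb: float        # memory held by the warm container (the "size")
    cold_start_s: float  # initialization time avoided on a warm start (the "cost")
    freq: int = 0        # invocations seen while the function is resident
    priority: float = 0.0


class GreedyDualKeepAlive:
    """Keep-alive treated as a cache: evict the lowest-priority warm function."""

    def __init__(self, capacity_mb):
        self.capacity_mb = capacity_mb
        self.clock = 0.0  # priority of the most recently evicted function
        self.warm = {}    # function name -> WarmFunction

    def used_mb(self):
        return sum(f.mem_mb for f in self.warm.values())

    def invoke(self, name, mem_mb, cold_start_s):
        """Record one invocation; return True for a warm start (cache hit)."""
        hit = name in self.warm
        if not hit:
            if mem_mb > self.capacity_mb:
                return False  # too large to ever keep alive on this server
            # Terminate the lowest-priority warm functions until the new one fits.
            while self.used_mb() + mem_mb > self.capacity_mb:
                victim = min(self.warm.values(), key=lambda f: f.priority)
                self.clock = victim.priority  # "ages" all remaining entries
                del self.warm[victim.name]
            self.warm[name] = WarmFunction(name, mem_mb, cold_start_s)
        fn = self.warm[name]
        fn.freq += 1
        # Assumed priority: frequent, slow-to-start, small functions stay longest.
        fn.priority = self.clock + fn.freq * fn.cold_start_s / fn.mem_mb
        return hit


# Hypothetical workload on a server with 512 MB reserved for warm functions.
pool = GreedyDualKeepAlive(capacity_mb=512)
results = [
    pool.invoke("thumbnail", 256, 2.0),  # cold start
    pool.invoke("thumbnail", 256, 2.0),  # warm start
    pool.invoke("ml-infer", 256, 8.0),   # cold start, still fits
    pool.invoke("etl", 256, 0.5),        # cold start, forces one eviction
]
print(results, sorted(pool.warm))
```

In this sketch the last invocation evicts `thumbnail` rather than `ml-infer`: although `thumbnail` was used twice, its priority (2 × 2.0 / 256) is still lower than that of the function with the more expensive cold start (1 × 8.0 / 256). A plain time-to-live or LRU policy would not take initialization cost or memory footprint into account in this way.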
Index Terms
- FaasCache: keeping serverless computing alive with greedy-dual caching