Context
Serverless computing, also known as function-as-a-service, improves upon cloud computing by enabling programmers to develop and scale their applications without worrying about infrastructure management [1, 2]. It involves breaking an application into small functions that can be executed and scaled automatically, offering applications high elasticity, cost efficiency, and easy deployment [3, 4].
Serverless computing is a key platform for building next-generation web services, which are typically realized by running distributed machine learning (ML) and deep learning (DL) applications. Indeed, 50% of AWS customers were reported to be using serverless computing as of 2020 [5]. Significant efforts have focused on deploying and optimizing ML applications on homogeneous clouds: enabling fast storage services to share data between stages [6], mitigating the cold-start problem (the latency of launching an appropriate container to perform a given function) when scaling resources [7], proposing lightweight runtimes to efficiently execute serverless workflows on GPUs [8], and building simulators to evaluate resource allocation and task scheduling policies [9]. However, few efforts have focused on deploying serverless computing in the Edge-Cloud Continuum, where resources are heterogeneous and have limited compute and storage capacity [10], or have addressed the simultaneous deployment of multiple applications.
References:
[1] Shadi Ibrahim, Omer Rana, Olivier Beaumont, Xiaowen Chu (2025). Serverless Computing, in IEEE Internet Computing, vol. 28, no. 6, pp. 5-7, Nov.-Dec. 2024, doi: 10.1109/MIC.2024.3524507.
[2] Vincent Lannurien, Laurent d’Orazio, Olivier Barais, Stephane Paquelet, Jalil Boukhobza. (2023). Serverless Cloud Computing: State of the Art and Challenges. In Serverless Computing: Principles and Paradigms. Lecture Notes on Data Engineering and Communications Technologies, vol 162. Springer.
[3] Zijun Li, Linsong Guo, Jiagan Cheng, Quan Chen, Bingsheng He, and Minyi Guo. The Serverless Computing Survey: A Technical Primer for Design Architecture. ACM Comput. Surv. 54, 10s, Article 220 (January 2022), 34 pages.
[4] Mohammad Shahrad, Rodrigo Fonseca, Inigo Goiri, Gohar Chaudhry, Paul Batum, Jason Cooke, Eduardo Laureano, Colby Tresness, Mark Russinovich, and Ricardo Bianchini. Serverless in the wild: characterizing and optimizing the serverless workload at a large cloud provider. In Proceedings of the USENIX Annual Technical Conference, pages 205–218, 2020.
[5] AWS Insider. Report: AWS Lambda Popular Among Enterprises, Container Users. 2020. https://awsinsider.net/articles/2020/02/04/aws-lambda-usage-profile.aspx
[6] Hao Wu, Junxiao Deng, Hao Fan, Shadi Ibrahim, Song Wu, Hai Jin. QoS-Aware and Cost-Efficient Dynamic Resource Allocation for Serverless ML Workflows. In 2023 IEEE International Parallel and Distributed Processing Symposium (IPDPS). 2023.
[7] Anup Mohan, Harshad Sane, Kshitij Doshi, Saikrishna Edupuganti, Naren Nayak, and Vadim Sukhomlinov. Agile cold starts for scalable serverless. In 11th USENIX Workshop on Hot Topics in Cloud Computing (HotCloud 19). 2019.
[8] Hao Wu, Yue Yu, Junxiao Deng, Shadi Ibrahim, Song Wu, Hao Fan, Ziyue Cheng, Hai Jin. StreamBox: A Lightweight GPU SandBox for Serverless Inference Workflow. In 2024 USENIX Annual Technical Conference (USENIX ATC 24). 2024. pp. 59-73.
[9] Lannurien, V., d’Orazio, L., Barais, O., Paquelet, S. and Boukhobza, J., 2024. HeROsim: An Allocation and Scheduling Simulator for Evaluating Serverless Orchestration Policies. IEEE Internet Computing.
[10] S. Moreschini, F. Pecorelli, X. Li, S. Naz, D. Hästbacka and D. Taibi, "Cloud Continuum: The Definition," in IEEE Access.