Scheduling Methods to Reduce Response Latency of Function as a Service
Function as a Service (FaaS) permits cloud customers to deploy to cloud individual functions, in contrast to complete virtual machines or Linux containers. All major cloud providers offer FaaS products (Amazon Lambda, Google Cloud Functions, Azure Serverless); there are also popular open-source implementations (Apache OpenWhisk) with commercial offerings (Adobe I/O Runtime, IBM Cloud Functions). A new feature of FaaS is function composition: a function may (sequentially) call another function, which, in turn, may call yet another function - forming a chain of invocations. From the perspective of the infrastructure, a composed FaaS is less opaque than a virtual machine or a container. We show that this additional information enables the infrastructure to reduce the response latency. In particular, knowing the sequence of future invocations, the infrastructure can schedule these invocations along with environment preparation. We model resource management in FaaS as a scheduling problem combining (1) sequencing of invocations, (2) deploying execution environments on machines, and (3) allocating invocations to deployed environments. For each aspect, we propose heuristics. We explore their performance by simulation on a range of synthetic workloads. Our results show that if the setup times are long compared to invocation times, algorithms that use information about the composition of functions consistently outperform greedy, myopic algorithms, leading to significant decrease in response latency.
READ FULL TEXT