On the Power-of-d-choices with Least Loaded Server Selection
Motivated by distributed schedulers that combine the power-of-d-choices with late binding and by systems that use replication with cancellation-on-start, we study the performance of the LL(d) policy, which assigns a job to the server that currently has the least workload among d randomly selected servers, in large-scale homogeneous clusters. We consider general service time distributions and propose a partial integro-differential equation to describe the evolution of the system. This equation relies on the previously proven ansatz for LL(d), which asserts that the workloads of any finite set of queues become independent as the number of servers tends to infinity. Based on this equation, we propose a fixed point iteration for the limiting workload distribution and study its convergence. For exponential job sizes, we present a simple closed-form expression for the limiting workload distribution that is valid for any work-conserving service discipline, as well as for the limiting response time distribution in the case of first-come-first-served scheduling. We further show that for phase-type distributed job sizes, the limiting workload and response time distributions can be expressed via the unique solution of a simple set of ordinary differential equations. Numerical and analytical results that compare the response time of the classic power-of-d-choices algorithm and the LL(d) policy are also presented, and the accuracy of the limiting response time distribution for finite systems is illustrated using simulation.
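To make the LL(d) assignment rule concrete, the following is a minimal simulation sketch, not the authors' code. It assumes Poisson arrivals, exponential job sizes with mean 1, first-come-first-served servers, and d distinct servers sampled per job; the function name `simulate_ll_d` and all of its parameters are hypothetical choices made for illustration.

```python
import random


def simulate_ll_d(n_servers=100, d=2, load=0.7, n_jobs=50_000, seed=0):
    """Estimate the mean response time under LL(d) in a finite system.

    Assumptions (illustrative, not taken from the paper): Poisson arrivals
    at total rate load * n_servers, exponential job sizes with mean 1,
    FCFS service, and d distinct servers sampled uniformly per arrival.
    """
    rng = random.Random(seed)
    # free_at[i] is the time at which server i finishes all work assigned
    # so far, so its workload at time t is max(free_at[i] - t, 0).
    free_at = [0.0] * n_servers
    t = 0.0
    total_response = 0.0

    for _ in range(n_jobs):
        # Next arrival of the aggregate Poisson stream.
        t += rng.expovariate(load * n_servers)

        # LL(d): probe d random servers and pick the least workload.
        candidates = rng.sample(range(n_servers), d)
        chosen = min(candidates, key=lambda i: max(free_at[i] - t, 0.0))

        # Under FCFS the job waits for the current workload, then is served.
        size = rng.expovariate(1.0)
        start = max(free_at[chosen], t)
        free_at[chosen] = start + size
        total_response += free_at[chosen] - t

    return total_response / n_jobs


if __name__ == "__main__":
    for d in (1, 2, 5):
        print(f"d={d}: mean response time ~ {simulate_ll_d(d=d):.3f}")
```

With d = 1 the rule reduces to purely random assignment, so each server behaves like an M/M/1 queue and the estimate should be close to 1/(1 - load); larger d should give noticeably smaller values. The output is a finite-system estimate that includes the start-up transient from an empty system, so it only approximates the limiting quantities studied in the paper.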