The Greatest Teacher, Failure is: Using Reinforcement Learning for SFC Placement Based on Availability and Energy Consumption
Software-defined networking (SDN) and network functions virtualisation (NFV) are making networks programmable and, consequently, far more flexible and agile. To meet service level agreements, achieve greater utilisation of legacy networks, accelerate service deployment, and reduce expenditure, telecommunications operators are deploying increasingly complex service function chains (SFCs). Notwithstanding the benefits of SFCs, increasing heterogeneity and dynamism from the cloud to the edge introduce significant SFC placement challenges, not least adding or removing network functions while maintaining availability and quality of service and minimising cost. In this paper, an availability- and energy-aware solution based on reinforcement learning (RL) is proposed for dynamic SFC placement. Two policy-gradient RL algorithms, Advantage Actor-Critic (A2C) and Proximal Policy Optimisation (PPO2), are compared using simulations of a ground-truth network topology based on the Rede Nacional de Ensino e Pesquisa (RNP) Network, Brazil's National Teaching and Research Network backbone. The simulation results show that PPO2 generally outperformed both A2C and a greedy approach in terms of acceptance rate and energy consumption; A2C outperformed PPO2 only in the scenario where network servers had more computing resources.
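The paper's implementation is not included on this page. As a rough illustration of the setup the abstract describes, the sketch below trains both algorithms on a toy SFC-placement environment, assuming OpenAI Gym and the Stable Baselines library (whose A2C and PPO2 class names match the abstract). The SFCPlacementEnv class, reward weights, and network sizes are hypothetical stand-ins, not the authors' actual model or topology.

```python
"""Hypothetical sketch: comparing A2C and PPO2 on a toy SFC-placement task.

Assumes OpenAI Gym and Stable Baselines; the environment, reward shaping,
and sizes are illustrative, not the paper's implementation.
"""
import gym
import numpy as np
from gym import spaces
from stable_baselines import A2C, PPO2


class SFCPlacementEnv(gym.Env):
    """Toy environment: place one VNF per step onto one of N servers.

    Reward favours accepted placements and penalises powering on extra
    servers, a crude proxy for energy consumption.
    """

    def __init__(self, n_servers=8, server_capacity=10, chain_length=5):
        super().__init__()
        self.n_servers = n_servers
        self.server_capacity = server_capacity
        self.chain_length = chain_length
        # Action: index of the server chosen to host the current VNF.
        self.action_space = spaces.Discrete(n_servers)
        # Observation: remaining CPU per server plus the current VNF demand.
        self.observation_space = spaces.Box(
            low=0.0, high=float(server_capacity),
            shape=(n_servers + 1,), dtype=np.float32)
        self.reset()

    def _obs(self):
        return np.append(self.remaining, self.demand).astype(np.float32)

    def reset(self):
        self.remaining = np.full(self.n_servers, float(self.server_capacity))
        self.active = np.zeros(self.n_servers, dtype=bool)  # powered-on servers
        self.placed = 0
        self.demand = float(np.random.randint(1, 4))  # CPU demand of next VNF
        return self._obs()

    def step(self, action):
        if self.remaining[action] >= self.demand:
            self.remaining[action] -= self.demand
            # Energy penalty only when a sleeping server must be switched on.
            reward = 1.0 - (0.5 if not self.active[action] else 0.0)
            self.active[action] = True
            self.placed += 1
            done = self.placed >= self.chain_length  # chain fully placed
        else:
            reward, done = -1.0, True  # insufficient capacity: request rejected
        self.demand = float(np.random.randint(1, 4))
        return self._obs(), reward, done, {}


if __name__ == "__main__":
    env = SFCPlacementEnv()
    for algo in (A2C, PPO2):
        model = algo("MlpPolicy", env, verbose=0)
        model.learn(total_timesteps=50_000)
```

In this toy formulation, acceptance rate corresponds to how often an episode ends with the full chain placed, and the power-on penalty pushes the agent to consolidate VNFs on already-active servers; the paper's actual reward design and topology model may differ.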