Scalable Deep Reinforcement Learning for Routing and Spectrum Access in Physical Layer
This paper proposes a novel and scalable reinforcement learning approach for simultaneous routing and spectrum access in wireless ad-hoc networks. In most previous works on reinforcement learning for network optimization, routing and spectrum access are tackled as separate tasks; further, the wireless links in the network are assumed to be fixed, and a different agent is trained for each transmission node – this limits scalability and generalizability. In this paper, we account for the inherent signal-to-interference-plus-noise ratio (SINR) in the physical layer and propose a more scalable approach in which a single agent is associated with each flow. Specifically, a single agent makes all routing and spectrum access decisions as it moves along the frontier nodes of each flow. The agent is trained according to the physical layer characteristics of the environment using the future bottleneck SINR as a novel reward definition. This allows a highly effective routing strategy based on the geographic locations of the nodes in the wireless ad-hoc network. The proposed deep reinforcement learning strategy is capable of accounting for the mutual interference between the links. It learns to avoid interference by intelligently allocating spectrum slots and making routing decisions for the entire network in a scalable manner.
READ FULL TEXT