Deep Reinforcement Learning-Aided Random Access
We consider a system model comprising an access point (AP) and K Internet of Things (IoT) nodes that sporadically become active in order to send data to the AP. The AP is assumed to have N time-frequency resource blocks, where N < K, that it can allocate to the IoT nodes that wish to send data. The main problem is how to allocate these N resource blocks to the IoT nodes in each time slot such that the average packet rate is maximized. For this problem, we propose a deep reinforcement learning (DRL)-aided random access (RA) scheme, in which an intelligent DRL agent at the AP learns to predict the activity of the IoT nodes in each time slot and grants resource blocks to the nodes predicted as active. The IoT nodes that are misclassified as non-active by the DRL agent, as well as unseen or newly arrived nodes in the cell, then employ the standard RA scheme to obtain resource blocks. We leverage expert knowledge for faster training of the DRL agent. Our numerical results show significant improvements in average packet rate when the proposed DRL-aided RA scheme is used compared to the standard RA scheme, the existing solution used in practice.
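The two-phase allocation described above can be illustrated with a toy, single-slot simulation. This is only a sketch under stated assumptions, not the scheme evaluated in the paper: the DRL agent is replaced by a given list of activity scores, the "standard RA scheme" is modeled as one-shot slotted contention over the resource blocks left ungranted, and the names `simulate_slot`, `scores`, and `threshold` are hypothetical.

```python
import random


def simulate_slot(scores, active, n_blocks, threshold=0.5, rng=None):
    """Return the number of packets delivered in one time slot.

    scores[k]  -- predicted activity score of node k (stand-in for the DRL agent)
    active[k]  -- True if node k actually has a packet in this slot
    n_blocks   -- N, the number of time-frequency resource blocks (N < K)
    """
    rng = rng or random.Random()
    num_nodes = len(scores)

    # Phase 1: grant blocks to nodes predicted active, highest score first.
    ranked = sorted(range(num_nodes), key=lambda k: -scores[k])
    predicted = [k for k in ranked if scores[k] >= threshold]
    granted = predicted[:n_blocks]
    # A grant delivers a packet only if the node is really active.
    delivered = sum(1 for k in granted if active[k])

    # Phase 2 (assumption): active nodes without a grant contend for the
    # leftover blocks; a block succeeds only if exactly one node picks it.
    leftover = n_blocks - len(granted)
    missed = [k for k in range(num_nodes) if active[k] and k not in granted]
    if leftover > 0 and missed:
        picks = [rng.randrange(leftover) for _ in missed]
        delivered += sum(1 for b in range(leftover) if picks.count(b) == 1)
    return delivered


# Node 0 is granted a block; node 1 is missed and wins the one leftover block.
print(simulate_slot([0.9, 0.1, 0.1], [True, True, False], n_blocks=2))  # 2
```

Setting every score below `threshold` reduces the sketch to pure contention over all N blocks, which gives a rough baseline for comparing against prediction-based grants in this toy model.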