Throughput Maximization for Ambient Backscatter Communication: A Reinforcement Learning Approach
Ambient backscatter (AB) communication is an emerging wireless communication technology that enables wireless devices (WDs) to communicate without requiring active radio transmission. In an AB communication system, a WD switches between communication and energy harvesting modes. The harvested energy powers the device's operations, e.g., circuit power consumption and sensing. In this paper, we focus on maximizing the throughput of an AB communication system by adaptively selecting the operating mode in a fading channel environment. We model the problem as an infinite-horizon Markov decision process (MDP) and, given the channel distributions, obtain the optimal mode-switching policy by the value iteration algorithm. When knowledge of the channel distribution is absent, a Q-learning (QL) method is applied to learn a suboptimal strategy through the device's repeated interaction with the environment. Finally, our simulations show that the proposed QL method achieves close-to-optimal throughput performance and significantly outperforms the other representative benchmark methods.
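To make the Q-learning idea concrete, the following is a minimal sketch of tabular Q-learning for mode selection on a toy model. It is not the paper's system model. The state (battery level, channel state), the two-state i.i.d. channel, the per-slot rates, the one-unit energy cost and gain, and every name and parameter value (`B`, `RATES`, `P_GOOD`, `step`, `q_learning`, `greedy_policy`) are assumptions chosen for illustration only. The learner never reads `P_GOOD` directly; it only observes sampled transitions, which mirrors the setting where the channel distribution is unknown.

```python
import random

# All constants below are illustrative assumptions, not values from the paper.
B = 3                      # battery capacity in energy units
RATES = (0.2, 1.0)         # throughput per slot in a bad / good channel
P_GOOD = 0.5               # probability that the channel is good in a slot
HARVEST, BACKSCATTER = 0, 1


def step(battery, channel, action, rng):
    """Toy environment: return (reward, next_battery, next_channel)."""
    if action == BACKSCATTER and battery > 0:
        # Communicating spends one energy unit and earns the current rate.
        reward, battery = RATES[channel], battery - 1
    elif action == HARVEST:
        # Harvesting earns no throughput but adds one energy unit (capped).
        reward, battery = 0.0, min(B, battery + 1)
    else:
        # Backscatter attempted with an empty battery: nothing happens.
        reward = 0.0
    next_channel = 1 if rng.random() < P_GOOD else 0
    return reward, battery, next_channel


def q_learning(steps=200_000, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
    """Tabular Q-learning with epsilon-greedy exploration."""
    rng = random.Random(seed)
    q = {(b, c): [0.0, 0.0] for b in range(B + 1) for c in (0, 1)}
    battery, channel = 0, 0
    for _ in range(steps):
        state = (battery, channel)
        if rng.random() < epsilon:
            action = rng.choice((HARVEST, BACKSCATTER))
        else:
            action = max((HARVEST, BACKSCATTER), key=lambda a: q[state][a])
        reward, battery, channel = step(battery, channel, action, rng)
        # Standard Q-learning update toward the bootstrapped target.
        target = reward + gamma * max(q[(battery, channel)])
        q[state][action] += alpha * (target - q[state][action])
    return q


def greedy_policy(q):
    """Map each state to the action with the largest learned Q-value."""
    return {s: max((HARVEST, BACKSCATTER), key=lambda a: q[s][a]) for s in q}
```

Calling `greedy_policy(q_learning())` returns a dictionary from `(battery, channel)` to a mode. In this toy setting, the same transition model could also be fed to value iteration when `P_GOOD` is known, which would give a reference policy against which the learned one can be compared.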