Reinforcement-Learning-based Adaptive Optimal Control for Arbitrary Reference Tracking
Model-free control based on the idea of Reinforcement Learning is a promising approach that has recently gained extensive attention. However, most Reinforcement-Learning-based control methods solely focus on the regulation problem or learn to track a reference generated by a time-invariant exo-system. In order to overcome these limitations, we develop a new Reinforcement-Learning-based adaptive optimal control method that generalizes to arbitrary reference trajectories. To this end, we propose a novel Q-function that incorporates a given reference trajectory on a moving horizon. We show that only this Q-function needs to be determined in order to solve the optimal tracking problem. The analytical solution of our Q-function provides insights into its structure and allows us to choose basis functions suited for Q-function approximation. Based on this, the optimal solution to the moving-horizon linear-quadratic tracking problem with arbitrary reference trajectories is learned by means of a temporal difference learning method without knowledge of the system dynamics. Furthermore, we prove that our algorithm converges to the optimal Q-function as well as the optimal control law. Finally, simulation examples demonstrate the effectiveness of the developed method.
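To make the idea concrete, the following is a minimal sketch (not the authors' implementation) of Q-function-based policy iteration for linear-quadratic tracking with a reference preview window, learned from data via a least-squares temporal-difference fit. All quantities (plant matrices A and B, weights Qc and R, window length N, discount factor, noise levels) are assumptions chosen for illustration; the plant matrices are used only to simulate data and remain unknown to the learning algorithm.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated plant (unknown to the learner; used only to generate data)
A = np.array([[0.9, 0.1], [0.0, 0.8]])
B = np.array([[0.0], [0.1]])
n, m = A.shape[0], B.shape[1]
N = 5                          # assumed length of the reference preview window
Qc, R = np.eye(n), 0.1 * np.eye(m)
gamma = 0.9                    # discount factor (assumed) to keep the sketch well-posed

def stage_cost(x, u, r):
    e = x - r
    return float(e @ Qc @ e + u @ R @ u)

dim_z = n + m + n * N          # z = [x; u; reference window]
iu = np.triu_indices(dim_z)

def features(z):
    """Quadratic basis: upper-triangular entries of z z^T, since Q is quadratic in z."""
    return np.outer(z, z)[iu]

def greedy_u(theta, x, r_win):
    """Policy improvement: minimize the learned quadratic Q over u in closed form."""
    H = np.zeros((dim_z, dim_z))
    H[iu] = theta
    H = 0.5 * (H + H.T)                                   # recover symmetric kernel
    Huu = H[n:n + m, n:n + m]
    Hux = np.hstack([H[n:n + m, :n], H[n:n + m, n + m:]])
    xr = np.concatenate([x, r_win])
    return -np.linalg.solve(Huu + 1e-8 * np.eye(m), Hux @ xr)

theta = np.zeros(iu[0].size)
ref = np.sin(0.05 * np.arange(400))[:, None] * np.ones(n)   # an arbitrary reference trajectory
for sweep in range(10):                                      # policy-iteration sweeps
    Phi, targets = [], []
    x = rng.standard_normal(n)
    for k in range(350):
        r_win = ref[k:k + N].ravel()
        u = greedy_u(theta, x, r_win) + 0.1 * rng.standard_normal(m)  # exploration noise
        x_next = A @ x + B @ u                                        # data from the plant
        r_win_next = ref[k + 1:k + 1 + N].ravel()
        u_next = greedy_u(theta, x_next, r_win_next)                  # successor action of current policy
        z = np.concatenate([x, u, r_win])
        z_next = np.concatenate([x_next, u_next, r_win_next])
        # Temporal-difference relation in feature space: Q(z) = c + gamma * Q(z')
        Phi.append(features(z) - gamma * features(z_next))
        targets.append(stage_cost(x, u, ref[k]))
        x = x_next
    theta, *_ = np.linalg.lstsq(np.asarray(Phi), np.asarray(targets), rcond=None)
```

The key point mirrored from the abstract is that the Q-function is defined over the state, the input, and a moving window of the reference, so the learned quadratic kernel directly yields a tracking controller for arbitrary reference trajectories without identifying the system.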