Optimal Energy Management of Plug-in Hybrid Vehicles Through Exploration-to-Exploitation Ratio Control in Ensemble Reinforcement Learning
Intelligent energy management systems that are both adaptable and high-performing are essential for hybrid electric vehicles (HEVs). This paper proposes an ensemble learning scheme built on a learning automata module (LAM) to improve vehicle energy efficiency. Two parallel base learners, each following a different exploration-to-exploitation (E2E) ratio method, are used to generate an optimal solution, and the LAM jointly determines the final action using three ensemble methods. Two decay functions, reciprocal function-based decay (RBD) and step-based decay (SBD), are proposed to generate E2E ratio trajectories, building on the conventional exponential decay (EXD) function used in reinforcement learning. Furthermore, because the three decay functions perform differently, an optimal combination of RBD, SBD, and EXD is employed to determine the final action. Experiments are carried out in software-in-the-loop (SiL) and hardware-in-the-loop (HiL) settings to validate the energy-saving potential under four predefined driving cycles. The SiL test shows that the ensemble learning system with the optimal combination achieves 1.09% higher vehicle energy efficiency than a single Q-learning strategy with the EXD function. In the HiL test, the ensemble learning system with the optimal combination saves more than 1.04% energy under the predefined real-world driving condition compared with the single Q-learning scheme based on the EXD function.
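To make the three E2E ratio trajectories concrete, the following is a minimal Python sketch of what exponential, reciprocal, and step-based decay schedules could look like. The abstract does not give the functional forms or parameter values, so the formulas, the function names (`exd`, `rbd`, `sbd`), and all defaults (`eps0`, `eps_min`, `rate`, `drop`, `step`) are illustrative assumptions, not the paper's definitions.

```python
import math

# Illustrative exploration-ratio schedules. The exact forms and parameters
# used in the paper are not stated in the abstract; everything below is an
# assumed, textbook-style parameterisation.

def exd(episode, eps0=1.0, eps_min=0.05, rate=0.05):
    """Exponential decay: eps0 * exp(-rate * episode), floored at eps_min."""
    return max(eps_min, eps0 * math.exp(-rate * episode))

def rbd(episode, eps0=1.0, eps_min=0.05, rate=0.05):
    """Reciprocal decay (assumed form): eps0 / (1 + rate * episode)."""
    return max(eps_min, eps0 / (1.0 + rate * episode))

def sbd(episode, eps0=1.0, eps_min=0.05, drop=0.5, step=20):
    """Step decay (assumed form): multiply by `drop` every `step` episodes."""
    return max(eps_min, eps0 * drop ** (episode // step))

if __name__ == "__main__":
    # Print the three trajectories side by side for a few episodes.
    for ep in (0, 10, 20, 40, 80):
        print(ep, round(exd(ep), 3), round(rbd(ep), 3), round(sbd(ep), 3))
```

Under these assumed forms, the reciprocal schedule falls more slowly than the exponential one at late episodes, while the step schedule holds the ratio constant within each block of episodes. Differences of this kind are one plausible reason the three trajectories could lead to different learning behavior.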