Reinforcement learning with distance-based incentive/penalty (DIP) updates for highly constrained industrial control systems

11/22/2020
by Hyungjun Park, et al.

Typical reinforcement learning (RL) methods have limited applicability to real-world industrial control problems because industrial systems involve various constraints and simultaneously require continuous and discrete control. To overcome these challenges, we devise a novel RL algorithm that enables an agent to handle a highly constrained action space. This algorithm has two main features. First, we devise a distance-based incentive/penalty (DIP) update technique comprising two distance-based Q-value update schemes, the incentive update and the penalty update, which enable the agent to select discrete and continuous actions within the feasible region and to update the values of both types of actions. Second, we propose a method for defining the penalty cost as a shadow price-weighted penalty. Compared with previous methods, this approach offers two advantages in efficiently inducing the agent to avoid infeasible actions. We apply our algorithm to an industrial control problem, microgrid system operation, and the experimental results demonstrate its superiority.
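
The abstract does not give the formula for the shadow price-weighted penalty, so the following is only a minimal sketch of one plausible reading: each constraint violation is weighted by that constraint's shadow price (its dual value) and the weighted sum is subtracted from the reward. The function names, the argument layout, and the assumption that shadow prices are already available (for example, from a solved optimization model of the system) are all hypothetical and are not taken from the paper.

```python
from typing import Sequence


def shadow_price_weighted_penalty(
    violations: Sequence[float],
    shadow_prices: Sequence[float],
) -> float:
    """Hypothetical penalty: sum of constraint violations weighted by shadow prices.

    violations[i] is the signed amount by which constraint i is exceeded
    (a value <= 0 means the constraint is satisfied and contributes nothing).
    shadow_prices[i] is the assumed-known dual value of constraint i.
    """
    if len(violations) != len(shadow_prices):
        raise ValueError("violations and shadow_prices must have the same length")
    # Only positive violations are penalized; satisfied constraints are clipped to zero.
    return sum(p * max(0.0, v) for v, p in zip(violations, shadow_prices))


def penalized_reward(
    reward: float,
    violations: Sequence[float],
    shadow_prices: Sequence[float],
) -> float:
    """Hypothetical reward signal: raw reward minus the weighted penalty."""
    return reward - shadow_price_weighted_penalty(violations, shadow_prices)


if __name__ == "__main__":
    # Three constraints: the first and third are violated, the second is satisfied.
    print(penalized_reward(10.0, [2.0, -1.0, 0.5], [3.0, 10.0, 4.0]))
```

Under this reading, a violation of a constraint with a high shadow price costs the agent more than an equal violation of a constraint with a low one, which is one way a penalty could steer the agent away from the most costly infeasible actions.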

research
09/16/2022

Optimizing Industrial HVAC Systems with Hierarchical Reinforcement Learning

Reinforcement learning (RL) techniques have been developed to optimize i...
research
12/22/2020

Dynamic penalty function approach for constraints handling in reinforcement learning

Reinforcement learning (RL) is attracting attentions as an effective way...
research
02/22/2021

Escaping from Zero Gradient: Revisiting Action-Constrained Reinforcement Learning via Frank-Wolfe Policy Optimization

Action-constrained reinforcement learning (RL) is a widely-used approach...
research
10/19/2021

Continuous Control with Action Quantization from Demonstrations

In Reinforcement Learning (RL), discrete actions, as opposed to continuo...
research
01/02/2020

Continuous-Discrete Reinforcement Learning for Hybrid Control in Robotics

Many real-world control problems involve both discrete decision variable...
research
07/22/2022

Learn Continuously, Act Discretely: Hybrid Action-Space Reinforcement Learning For Optimal Execution

Optimal execution is a sequential decision-making problem for cost-savin...
research
05/31/2022

Timing is Everything: Learning to Act Selectively with Costly Actions and Budgetary Constraints

Many real-world settings involve costs for performing actions; transacti...