Categorical semantics of compositional reinforcement learning

08/29/2022
by   Georgios Bakirtzis, et al.

Reinforcement learning (RL) often requires decomposing a problem into subtasks and composing the behaviors learned on those subtasks. Compositionality in RL has the potential to create modular subtask units that interface with other system capabilities. However, generating compositional models requires characterizing the minimal assumptions under which composition remains robust. We develop a framework for a compositional theory of RL from a categorical point of view. Given the categorical representation of compositionality, we investigate sufficient conditions under which learning-by-parts yields the same optimal policy as learning on the whole. In particular, our approach introduces a category 𝖬𝖣𝖯, whose objects are Markov decision processes (MDPs) acting as models of tasks. We show that 𝖬𝖣𝖯 admits natural compositional operations, such as certain fiber products and pushouts. These operations make compositional phenomena in RL explicit and unify existing constructions, such as puncturing hazardous states in composite MDPs and incorporating state-action symmetry. We also model sequential task completion by introducing the language of zig-zag diagrams, which is an immediate application of the pushout operation in 𝖬𝖣𝖯.
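The idea of composing task MDPs via pushouts can be sketched concretely: glue two MDPs along a shared interface state, so that completing the first task begins the second. The `MDP` representation and `glue` function below are illustrative assumptions (deterministic transitions, state identification via a renaming map), not the paper's actual categorical construction.

```python
# Hypothetical sketch: sequential task composition by gluing two MDPs
# along an interface state, in the spirit of a pushout in the category MDP.
from dataclasses import dataclass

@dataclass(frozen=True)
class MDP:
    states: frozenset
    actions: frozenset
    # transitions[(s, a)] = next state (deterministic, for simplicity)
    transitions: dict

def glue(m1: MDP, m2: MDP, interface: dict) -> MDP:
    """Glue m2 onto m1 by identifying states of m2 (keys of `interface`)
    with states of m1 (values), analogous to a pushout along the
    shared interface."""
    rename = lambda s: interface.get(s, s)
    states = m1.states | frozenset(rename(s) for s in m2.states)
    actions = m1.actions | m2.actions
    transitions = dict(m1.transitions)
    for (s, a), t in m2.transitions.items():
        transitions[(rename(s), a)] = rename(t)
    return MDP(states, actions, transitions)

# Task 1: from s0, action "a" reaches goal1.
m1 = MDP(frozenset({"s0", "goal1"}), frozenset({"a"}),
         {("s0", "a"): "goal1"})
# Task 2: from start2, action "b" reaches goal2.
m2 = MDP(frozenset({"start2", "goal2"}), frozenset({"b"}),
         {("start2", "b"): "goal2"})

# Identify m2's start with m1's goal: finishing task 1 begins task 2.
seq = glue(m1, m2, {"start2": "goal1"})
print(sorted(seq.states))               # ['goal1', 'goal2', 's0']
print(seq.transitions[("goal1", "b")])  # goal2
```

In categorical terms, the two inclusions of the shared interface into `m1` and `m2` play the role of the span whose pushout is the composite task; the sketch collapses this to a plain state-renaming for readability.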

