Diverse Policies Converge in Reward-free Markov Decision Processes

08/23/2023
by Fanqi Lin, et al.

Reinforcement learning has achieved great success in many decision-making tasks, and traditional reinforcement learning algorithms are mainly designed to obtain a single optimal solution. Recent work, however, shows the importance of developing diverse policies, making this an emerging research topic. Despite the variety of diversity reinforcement learning algorithms that have emerged, none of them theoretically answers how the algorithm converges or how efficient it is. In this paper, we provide a unified diversity reinforcement learning framework and investigate the convergence of training diverse policies. Under this framework, we also propose a provably efficient diversity reinforcement learning algorithm. Finally, we verify the effectiveness of our method through numerical experiments.

research
09/23/2020

CertRL: Formalizing Convergence Proofs for Value and Policy Iteration in Coq

Reinforcement learning algorithms solve sequential decision-making probl...
research
07/12/2022

DGPO: Discovering Multiple Strategies with Diversity-Guided Policy Optimization

Recent algorithms designed for reinforcement learning tasks focus on fin...
research
10/15/2021

Effects of Different Optimization Formulations in Evolutionary Reinforcement Learning on Diverse Behavior Generation

Generating various strategies for a given task is challenging. However, ...
research
11/10/2018

Diversity-Driven Extensible Hierarchical Reinforcement Learning

Hierarchical reinforcement learning (HRL) has recently shown promising a...
research
03/15/2023

Smoothed Q-learning

In Reinforcement Learning the Q-learning algorithm provably converges to...
research
01/10/2023

Mastering Diverse Domains through World Models

General intelligence requires solving tasks across many domains. Current...
research
09/06/2021

Method for making multi-attribute decisions in wargames by combining intuitionistic fuzzy numbers with reinforcement learning

Researchers are increasingly focusing on intelligent games as a hot rese...
