We study the problem of computing an optimal policy of an infinite-horiz...
We examine online safe multi-agent reinforcement learning using constrai...
We study sequential decision making problems aimed at maximizing the exp...
We examine global non-asymptotic convergence properties of policy gradie...
We study the Safe Reinforcement Learning (SRL) problem using the Constra...
For a class of nonsmooth composite optimization problems with linear equ...
We consider a distributed multi-agent policy evaluation problem in reinf...