research ∙ 02/21/2018
    Convergent Actor-Critic Algorithms Under Off-Policy Training and Function Approximation
We present the first class of policy-gradient algorithms that work with ...
          
research ∙ 01/08/2014