Robotic Knee Tracking Control to Mimic the Intact Human Knee Profile Based on Actor-critic Reinforcement Learning

by Ruofan Wu, et al.

We present a state-of-the-art reinforcement learning (RL) control approach that automatically configures robotic prosthesis impedance parameters to enable end-to-end, continuous locomotion for transfemoral amputee subjects. Specifically, our actor-critic based RL provides tracking control of a robotic knee prosthesis so that it mimics the intact knee profile. This is a significant advance over our previous RL-based automatic tuning of prosthesis control parameters, which has centered on regulation control with a designer-prescribed robotic knee profile as the target. In addition to presenting the complete tracking control algorithm based on direct heuristic dynamic programming (dHDP), we provide an analytical framework for the tracking controller with constrained inputs. We show that the proposed tracking control possesses several important properties: weight convergence of the learning networks, Bellman (sub)optimality of the cost-to-go value function and control input, and practical stability of the human-robot system under input constraints. We further provide a systematic simulation of the proposed tracking control using a realistic human-robot system simulator, OpenSim, to emulate how dHDP enables level-ground walking as well as walking on different terrains and at different paces. These results show that the proposed dHDP-based tracking control is not only theoretically suitable but also practically useful.
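
As a rough illustration of the ingredients named in the abstract (a tracking cost against the intact knee, a critic that estimates the cost-to-go, and an actor whose output is constrained), the following is a minimal sketch and not the authors' implementation. The linear actor and critic, the quadratic cost, the tanh saturation, the generic temporal-difference update, and all function and parameter names are assumptions made for illustration only; the paper's dHDP networks and update rules may differ.

```python
import math


def stage_cost(prosthetic_angle, intact_angle):
    """Quadratic tracking cost (assumed form): zero when the two knee angles match."""
    error = prosthetic_angle - intact_angle
    return error * error


def bounded_action(actor_weights, state, max_adjustment):
    """Hypothetical linear actor squashed by tanh.

    The output is an impedance-parameter adjustment that always stays
    within [-max_adjustment, +max_adjustment], i.e. a constrained input.
    """
    pre_activation = sum(w * s for w, s in zip(actor_weights, state))
    return max_adjustment * math.tanh(pre_activation)


def critic_value(critic_weights, features):
    """Hypothetical linear critic: estimated cost-to-go for a feature vector."""
    return sum(w * f for w, f in zip(critic_weights, features))


def critic_td_step(critic_weights, features, cost, next_value,
                   discount=0.95, learning_rate=0.1):
    """One generic temporal-difference update of the linear critic.

    td_error = cost + discount * next_value - current_value
    The weights move so the current estimate approaches the target.
    Returns the updated weights and the TD error.
    """
    current_value = critic_value(critic_weights, features)
    td_error = cost + discount * next_value - current_value
    new_weights = [w + learning_rate * td_error * f
                   for w, f in zip(critic_weights, features)]
    return new_weights, td_error
```

In this sketch the actor's saturation is what keeps the control input bounded regardless of how large the learned weights become, and the critic's TD error shrinks as its cost-to-go estimate becomes consistent with the observed tracking cost.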


A Data-Driven Reinforcement Learning Solution Framework for Optimal and Adaptive Personalization of a Hip Exoskeleton

Robotic exoskeletons are exciting technologies for augmenting human mobi...

Reinforcement Learning Control of Robotic Knee with Human in the Loop by Flexible Policy Iteration

This study is motivated by a new class of challenging control problems d...

Online Reinforcement Learning Control by Direct Heuristic Dynamic Programming: from Time-Driven to Event-Driven

In this paper time-driven learning refers to the machine learning method...

Actor-Critic with variable time discretization via sustained actions

Reinforcement learning (RL) methods work in discrete time. In order to a...

Epersist: A Self Balancing Robot Using PID Controller And Deep Reinforcement Learning

A two-wheeled self-balancing robot is an example of an inverse pendulum ...

Resource-Constrained Station-Keeping for Helium Balloons using Reinforcement Learning

High altitude balloons have proved useful for ecological aerial surveys,...
