Reinforcement Teaching

04/25/2022
by Alex Lewandowski, et al.

We propose Reinforcement Teaching: a framework for meta-learning in which a teaching policy is learned, through reinforcement, to control a student's learning process. The student's learning process is modelled as a Markov reward process and the teacher, with its action-space, interacts with the induced Markov decision process. We show that, for many learning processes, the student's learnable parameters form a Markov state. To avoid having the teacher learn directly from parameters, we propose the Parameter Embedder that learns a representation of a student's state from its input/output behaviour. Next, we use learning progress to shape the teacher's reward towards maximizing the student's performance. To demonstrate the generality of Reinforcement Teaching, we conducted experiments in which a teacher learns to significantly improve supervised and reinforcement learners by using a combination of learning progress reward and a Parameter Embedded state. These results show that Reinforcement Teaching is not only an expressive framework capable of unifying different approaches, but also provides meta-learning with the plethora of tools from reinforcement learning.
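The interaction described above can be illustrated with a toy sketch. It is not the authors' implementation, and every name in it (`student_loss`, `student_step`, `run_teaching_episode`, `STEP_SIZES`) is hypothetical. It assumes a one-parameter linear-regression student trained by gradient descent and a teacher whose action selects the student's step size. The state shown to the teacher is the raw parameter `w`, which stands in for the Parameter Embedder, and the reward is learning progress, taken here as the drop in the student's loss after each update.

```python
# Toy sketch of a teaching MDP. All names are illustrative assumptions,
# not taken from the paper.

# Hypothetical discrete action set: each action picks a student step size.
STEP_SIZES = [0.01, 0.1]


def student_loss(w, data):
    """Mean squared error of the one-parameter student y_hat = w * x."""
    return sum((w * x - y) ** 2 for x, y in data) / len(data)


def student_step(w, data, step_size):
    """One gradient-descent update of the student's parameter."""
    grad = sum(2 * (w * x - y) * x for x, y in data) / len(data)
    return w - step_size * grad


def run_teaching_episode(choose_action, data, horizon=20):
    """Roll out one episode of the teacher interacting with the student.

    choose_action(state, t) returns an index into STEP_SIZES.
    The state is the student's parameter w (an assumption made for
    simplicity; the paper proposes a learned embedding instead).
    """
    w = 0.0
    prev_loss = student_loss(w, data)
    total_reward = 0.0
    for t in range(horizon):
        action = choose_action(w, t)
        w = student_step(w, data, STEP_SIZES[action])
        loss = student_loss(w, data)
        # Learning-progress reward: improvement in student performance.
        total_reward += prev_loss - loss
        prev_loss = loss
    return w, prev_loss, total_reward


if __name__ == "__main__":
    data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]  # targets follow y = 2x
    # A fixed (not learned) teacher that always picks the larger step size.
    w, final_loss, ret = run_teaching_episode(lambda state, t: 1, data)
    print(w, final_loss, ret)
```

Because the per-step rewards telescope, the undiscounted return in this sketch equals the initial loss minus the final loss, so a teacher that maximizes it also minimizes the student's final loss. A learned teacher would replace the fixed `choose_action` with a policy trained by any standard reinforcement learning method.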
