Video Segmentation using Teacher-Student Adaptation in a Human Robot Interaction (HRI) Setting

by Mennatullah Siam, et al.

Video segmentation is a challenging task that has many applications in robotics. Learning segmentation online from few examples is important for robots operating in unstructured environments. The total number of objects in the real world, and their variation, is intractable, but for a specific task the robot deals with only a small subset. Our network is taught by a human moving a hand-held object through different poses. A novel two-stream motion and appearance "teacher" network provides pseudo-labels, which are used to adapt an appearance "student" network. The resulting segmentation can support a variety of robot vision functionality, such as grasping or affordance segmentation. We propose different variants of motion adaptation training and compare extensively against state-of-the-art methods. We collected a carefully designed dataset in the human-robot interaction (HRI) setting, which we denote (L)ow-shot (O)bject (R)ecognition, (D)etection and (S)egmentation using HRI (LORDS-HRI). The dataset contains teaching videos of different hand-held objects undergoing translation, scale change and rotation, as well as kitchen manipulation tasks performed by humans and robots. Our proposed method outperforms the state-of-the-art on DAVIS and FBMS with 7 [...] In our more challenging LORDS-HRI dataset, our approach achieves significantly better performance with 46.7 [...] baseline.
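The teacher-student idea in the abstract can be illustrated with a toy sketch. This is not the paper's networks or training procedure: the teacher's output is stood in for by a hypothetical probability map, the student is a per-pixel logistic regression rather than a segmentation network, and all names (`pseudo_labels`, `adapt_student`, `make_toy_frame`) and parameter values are assumptions made for illustration only.

```python
import numpy as np


def pseudo_labels(teacher_prob, threshold=0.5):
    """Binarize teacher foreground probabilities into a 0/1 pseudo-label mask."""
    return (teacher_prob >= threshold).astype(np.float64)


def adapt_student(features, labels, lr=0.5, steps=300):
    """Fit a per-pixel logistic-regression 'student' to the pseudo-labels.

    features: (H, W, C) appearance features; labels: (H, W) mask of 0/1.
    Returns (weights, bias) after plain gradient descent on cross-entropy.
    """
    x = features.reshape(-1, features.shape[-1])  # (pixels, channels)
    y = labels.reshape(-1)
    weights = np.zeros(x.shape[1])
    bias = 0.0
    for _ in range(steps):
        pred = 1.0 / (1.0 + np.exp(-(x @ weights + bias)))
        grad = pred - y  # gradient of cross-entropy w.r.t. each pixel's logit
        weights -= lr * (x.T @ grad) / len(y)
        bias -= lr * grad.mean()
    return weights, bias


def student_mask(features, weights, bias):
    """Predict a 0/1 foreground mask from appearance features alone."""
    return ((features @ weights + bias) >= 0.0).astype(np.float64)


def iou(pred, target):
    """Intersection-over-union between two binary masks."""
    inter = np.logical_and(pred > 0, target > 0).sum()
    union = np.logical_or(pred > 0, target > 0).sum()
    return inter / union if union else 1.0


def make_toy_frame(size=16):
    """Hypothetical frame: a square 'object' with two appearance channels.

    The teacher probabilities are a stand-in for the output of a
    motion-and-appearance teacher; no real network is involved.
    """
    mask = np.zeros((size, size))
    mask[5:11, 5:11] = 1.0
    features = np.stack([mask, 1.0 - mask], axis=-1)
    teacher_prob = np.where(mask > 0, 0.9, 0.1)
    return features, teacher_prob, mask
```

A possible use: call `make_toy_frame()`, pass the teacher probabilities through `pseudo_labels`, fit the student with `adapt_student`, and compare `student_mask` against the ground-truth mask with `iou`. The point of the sketch is only that the student never sees ground truth during adaptation; it is trained on the teacher's thresholded output.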




