Prediction of Manipulation Actions

by Cornelia Fermuller, et al.

By looking at a person's hands, one can often tell what the person is going to do next, how the hands are moving, and where they will be, because an actor's intentions shape the movement kinematics during action execution. Similarly, active systems with real-time constraints cannot rely on passive classification of video segments alone; they have to continuously update their estimates and predict future actions. In this paper, we study the prediction of dexterous actions. We recorded subjects performing different manipulation actions on the same object, such as "squeezing", "flipping", "washing", "wiping" and "scratching" with a sponge. In psychophysical experiments, we evaluated human observers' skill in predicting actions from video sequences of different lengths, depicting the hand movement in the preparation and execution of actions before and after contact with the object. We then developed a recurrent neural network based method for action prediction that takes image patches around the hand as input. We also used the same formalism to predict the forces on the fingertips, using synchronized video and force data streams for training. Evaluations on two new datasets show that our system closely matches human performance in the recognition task and demonstrate the ability of our algorithm to predict which dexterous action is performed and how it is performed.
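The idea of continuously updating an action estimate as frames arrive, rather than classifying a finished video segment, can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the abstract does not specify the network architecture, so this uses a plain Elman-style recurrence with untrained random weights, and the names `TinyRecurrentPredictor` and `predict_over_time`, the feature size, and the hidden size are hypothetical. Each frame is assumed to be already reduced to a small feature vector standing in for the hand patch.

```python
import math
import random

# Action labels taken from the abstract.
ACTIONS = ["squeezing", "flipping", "washing", "wiping", "scratching"]


def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def matvec(matrix, vector):
    """Multiply a list-of-rows matrix by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


class TinyRecurrentPredictor:
    """Hypothetical Elman-style recurrent cell with random, untrained weights.

    This is NOT the paper's model; it only shows how a hidden state lets
    the prediction be refreshed after every new frame.
    """

    def __init__(self, feature_dim, hidden_dim, seed=0):
        rng = random.Random(seed)

        def init(rows, cols):
            return [[rng.uniform(-0.5, 0.5) for _ in range(cols)]
                    for _ in range(rows)]

        self.w_in = init(hidden_dim, feature_dim)    # input -> hidden
        self.w_rec = init(hidden_dim, hidden_dim)    # hidden -> hidden
        self.w_out = init(len(ACTIONS), hidden_dim)  # hidden -> action scores
        self.hidden = [0.0] * hidden_dim

    def step(self, features):
        """Consume one frame's features and return updated action probabilities."""
        drive = matvec(self.w_in, features)
        memory = matvec(self.w_rec, self.hidden)
        self.hidden = [math.tanh(d + m) for d, m in zip(drive, memory)]
        return softmax(matvec(self.w_out, self.hidden))


def predict_over_time(frames, feature_dim=4, hidden_dim=8):
    """Return one probability vector per frame, in arrival order."""
    model = TinyRecurrentPredictor(feature_dim, hidden_dim)
    return [model.step(f) for f in frames]


if __name__ == "__main__":
    rng = random.Random(1)
    # Stand-in for per-frame hand-patch features (assumed, not real data).
    frames = [[rng.random() for _ in range(4)] for _ in range(6)]
    for t, probs in enumerate(predict_over_time(frames)):
        best = ACTIONS[probs.index(max(probs))]
        print(f"frame {t}: {best} ({max(probs):.2f})")
```

Because the weights are random, the printed labels carry no meaning; the point is only that a prediction is available after every frame, which is what allows an estimate to be made before the action is complete.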




Forecasting Action through Contact Representations from First Person Video

Human actions involving hand manipulations are structured according to t...

Predicting Human Intentions from Motion Only: A 2D+3D Fusion Approach

In this paper, we address the new problem of the prediction of human int...

Channel Decomposition into Painting Actions

This work presents a method to decompose a convolutional layer of the de...

Surprisingly Robust In-Hand Manipulation: An Empirical Study

We present in-hand manipulation skills on a dexterous, compliant, anthro...

Event-based Vision for Early Prediction of Manipulation Actions

Neuromorphic visual sensors are artificial retinas that output sequences...

Channel Decomposition on Generative Networks

This work presents a method to decompose a layer of the generative netwo...

Long Activity Video Understanding using Functional Object-Oriented Network

Video understanding is one of the most challenging topics in computer vi...
