Human Hands as Probes for Interactive Object Understanding

12/16/2021
by Mohit Goyal, et al.
Interactive object understanding, or what we can do to objects and how, is a long-standing goal of computer vision. In this paper, we tackle this problem through observation of human hands in in-the-wild egocentric videos. We demonstrate that observing what human hands interact with, and how, can provide both the relevant data and the necessary supervision. Attending to hands readily localizes and stabilizes active objects for learning and reveals places where interactions with objects occur. Analyzing the hands shows what we can do to objects and how. We apply these basic principles to the EPIC-KITCHENS dataset and successfully learn state-sensitive features and object affordances (regions of interaction and afforded grasps), purely by observing hands in egocentric videos.
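
To make the idea of "attending to hands" concrete, the sketch below shows one simple way a detected hand box could be turned into a crop window that is likely to contain the object being handled. This is an illustration only, not the authors' method: the function name `crop_window_around_hand`, the `(x_min, y_min, x_max, y_max)` box convention, and the `scale` parameter are all assumptions introduced here, and the hand box is assumed to come from some external hand detector.

```python
from typing import Tuple

# Hypothetical box convention: (x_min, y_min, x_max, y_max) in pixels.
Box = Tuple[int, int, int, int]


def crop_window_around_hand(
    hand: Box, frame_w: int, frame_h: int, scale: float = 2.0
) -> Box:
    """Expand a hand box about its center and clip it to the frame.

    The expanded window is a crude guess at where the object held or
    touched by the hand is located. `scale` is an illustrative parameter.
    """
    x0, y0, x1, y1 = hand
    # Center of the hand box.
    cx, cy = (x0 + x1) / 2.0, (y0 + y1) / 2.0
    # Half extents of the enlarged window.
    half_w = (x1 - x0) * scale / 2.0
    half_h = (y1 - y0) * scale / 2.0
    # Clip the window to the frame boundaries.
    left = max(0, int(round(cx - half_w)))
    top = max(0, int(round(cy - half_h)))
    right = min(frame_w, int(round(cx + half_w)))
    bottom = min(frame_h, int(round(cy + half_h)))
    return (left, top, right, bottom)


if __name__ == "__main__":
    # A 100x100 hand box in a 640x480 frame, enlarged by a factor of 2.
    print(crop_window_around_hand((100, 100, 200, 200), 640, 480))
```

Applying such a window frame by frame would keep the crop centered on the hand, which is one plausible reading of how hand-centered crops "stabilize" the active object across a video.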

