Adding Knowledge to Unsupervised Algorithms for the Recognition of Intent

by Stuart Synakowski, et al.

Computer vision algorithms perform near or above human level on visual problems including object recognition (especially of fine-grained categories), segmentation, and 3D object reconstruction from 2D views. Humans are, however, capable of higher-level image analyses. A clear example, involving theory of mind, is our ability to determine whether a perceived behavior or action was performed intentionally or not. In this paper, we derive an algorithm that can infer whether the behavior of an agent in a scene is intentional or unintentional based on its 3D kinematics, using knowledge of self-propelled motion, Newtonian motion, and their relationship. We show how the addition of this basic knowledge leads to a simple, unsupervised algorithm. To test the derived algorithm, we constructed three dedicated datasets, ranging from abstract geometric animations to realistic videos of agents performing intentional and non-intentional actions. Experiments on these datasets show that our algorithm can recognize whether an action is intentional or not, even without training data. Quantitatively, its performance is comparable to various supervised baselines; qualitatively, it produces sensible intentionality segmentations.
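As a rough illustration of the distinction between Newtonian and self-propelled motion mentioned above, the sketch below flags trajectory samples whose acceleration departs from free fall under gravity. This is not the paper's algorithm. It is a toy example under stated assumptions: the function names, the gravity vector `G` (z axis up), and the tolerance `tol` are all hypothetical, and contact forces such as a supporting surface are ignored.

```python
# Toy illustration only (not the paper's method): flag samples whose
# acceleration departs from free fall. All names and the threshold are
# hypothetical choices made for this sketch.
G = (0.0, 0.0, -9.81)  # assumed gravity vector, z axis up, in m/s^2


def accelerations(positions, dt):
    """Central second differences of a list of (x, y, z) positions."""
    acc = []
    for i in range(1, len(positions) - 1):
        prev, cur, nxt = positions[i - 1], positions[i], positions[i + 1]
        acc.append(tuple((n - 2.0 * c + p) / (dt * dt)
                         for p, c, n in zip(prev, cur, nxt)))
    return acc


def non_newtonian_mask(positions, dt, tol=1.0):
    """True where acceleration differs from gravity by more than tol (m/s^2)."""
    mask = []
    for a in accelerations(positions, dt):
        residual = sum((ai - gi) ** 2 for ai, gi in zip(a, G)) ** 0.5
        mask.append(residual > tol)
    return mask


dt = 0.01
# A point in free fall: z = 2 - 0.5 * 9.81 * t^2
falling = [(0.0, 0.0, 2.0 - 0.5 * 9.81 * (i * dt) ** 2) for i in range(50)]
# A point holding its height in mid-air, which needs a force opposing gravity
hovering = [(0.0, 0.0, 1.0)] * 50

print(any(non_newtonian_mask(falling, dt)))
print(all(non_newtonian_mask(hovering, dt)))
```

In this sketch the free-fall trajectory produces no flagged samples, while every sample of the hovering trajectory is flagged. Going from such a flag to a judgment about intent is the part the paper addresses and is not attempted here.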




Hierarchical Modeling for Task Recognition and Action Segmentation in Weakly-Labeled Instructional Videos

This paper focuses on task recognition and action segmentation in weakly...

Distillation of Human-Object Interaction Contexts for Action Recognition

Modeling spatial-temporal relations is imperative for recognizing human ...

Learning by Asking Questions for Knowledge-based Novel Object Recognition

In real-world object recognition, there are numerous object classes to b...

Follow the Attention: Combining Partial Pose and Object Motion for Fine-Grained Action Detection

Activity recognition in shopping environments is an important and challe...

Egocentric Hand Track and Object-based Human Action Recognition

Egocentric vision is an emerging field of computer vision that is charac...

PEARL: Parallelized Expert-Assisted Reinforcement Learning for Scene Rearrangement Planning

Scene Rearrangement Planning (SRP) is an interior task proposed recently...
