Forecasting Hands and Objects in Future Frames

05/20/2017
by Chenyou Fan, et al.

This paper presents an approach for forecasting the future presence and location of human hands and objects. Given an image frame, the goal is to predict which objects will appear in a future frame (e.g., 5 seconds later) and where they will be located, even when they are not visible in the current frame. The key idea is that (1) an intermediate representation of a convolutional object recognition model abstracts the scene information in its frame, and (2) we can predict (i.e., regress) the representations corresponding to future frames from that of the current frame. We design a new two-stream convolutional neural network (CNN) architecture for videos by extending the state-of-the-art convolutional object detection network, and present a new fully convolutional regression network for predicting future scene representations. Our experiments confirm that combining the regressed future representation with our detection network allows reliable estimation of future hands and objects in videos. We obtain much higher accuracy than the state-of-the-art future object presence forecast method on a public dataset.
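
To make the regression idea concrete, here is a minimal sketch in plain Python. It is not the paper's network: it assumes the "fully convolutional regression" can be illustrated by a single 1x1 convolution (one linear map shared across all spatial cells) trained with an L2 loss to map a current-frame feature map to a future-frame feature map. All names (`apply_1x1`, `l2_loss`, `train`, `TRUE_W`, `TRUE_B`) and the synthetic data are hypothetical and chosen only for illustration.

```python
import random

def apply_1x1(feature_map, weights, bias):
    """Apply one shared linear map at every spatial cell (a 1x1 convolution)."""
    return [
        [
            [sum(w * v for w, v in zip(w_row, cell)) + b
             for w_row, b in zip(weights, bias)]
            for cell in row
        ]
        for row in feature_map
    ]

def l2_loss(pred, target):
    """Mean squared error over every channel of every cell."""
    sq = [(p - t) ** 2
          for p_row, t_row in zip(pred, target)
          for p_cell, t_cell in zip(p_row, t_row)
          for p, t in zip(p_cell, t_cell)]
    return sum(sq) / len(sq)

def train(pairs, channels, lr=0.5, steps=500):
    """Full-batch gradient descent on the L2 loss between predicted and true future maps."""
    weights = [[0.0] * channels for _ in range(channels)]
    bias = [0.0] * channels
    n = sum(len(cur) * len(cur[0]) * channels for cur, _ in pairs)
    for _ in range(steps):
        grad_w = [[0.0] * channels for _ in range(channels)]
        grad_b = [0.0] * channels
        for cur, fut in pairs:
            pred = apply_1x1(cur, weights, bias)
            for c_row, p_row, f_row in zip(cur, pred, fut):
                for c_cell, p_cell, f_cell in zip(c_row, p_row, f_row):
                    for i in range(channels):
                        err = 2.0 * (p_cell[i] - f_cell[i]) / n
                        grad_b[i] += err
                        for j in range(channels):
                            grad_w[i][j] += err * c_cell[j]
        for i in range(channels):
            bias[i] -= lr * grad_b[i]
            for j in range(channels):
                weights[i][j] -= lr * grad_w[i][j]
    return weights, bias

# Synthetic stand-in for (current, future) feature-map pairs: the "future"
# map is an assumed fixed linear function of the current one.
rng = random.Random(0)
TRUE_W = [[0.5, -0.2], [0.1, 0.8]]
TRUE_B = [0.1, -0.3]

def random_map(height=4, width=4, channels=2):
    return [[[rng.uniform(-1, 1) for _ in range(channels)]
             for _ in range(width)] for _ in range(height)]

pairs = []
for _ in range(8):
    current = random_map()
    pairs.append((current, apply_1x1(current, TRUE_W, TRUE_B)))

weights, bias = train(pairs, channels=2)
final_loss = sum(l2_loss(apply_1x1(c, weights, bias), f)
                 for c, f in pairs) / len(pairs)
print("final L2 loss:", final_loss)
```

In the setting the abstract describes, the regressed future representation would then be passed to the detection network to estimate future hands and objects; that step is omitted here because the abstract gives no details of it.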

research
03/03/2017

Learning Robot Activities from First-Person Human Videos Using Convolutional Future Regression

We design a new approach that allows robot learning of new activities fr...
research
06/16/2021

Unsupervised Video Prediction from a Single Frame by Estimating 3D Dynamic Scene Structure

Our goal in this work is to generate realistic videos given just one ini...
research
04/10/2017

ClusterNet: Detecting Small Objects in Large Scenes by Exploiting Spatio-Temporal Information

Object detection in wide area motion imagery (WAMI) has drawn the attent...
research
04/24/2019

Segmenting the Future

Predicting the future is an important aspect for decision-making in robo...
research
04/29/2015

Anticipating Visual Representations from Unlabeled Video

Anticipating actions and objects before they start or appear is a diffic...
research
02/02/2023

Dynamic Atomic Column Detection in Transmission Electron Microscopy Videos via Ridge Estimation

Ridge detection is a classical tool to extract curvilinear features in i...
research
06/13/2016

Visual-Inertial-Semantic Scene Representation for 3-D Object Detection

We describe a system to detect objects in three-dimensional space using ...
