PIP: Physical Interaction Prediction via Mental Imagery with Span Selection

by Jiafei Duan, et al.
Nanyang Technological University
Agency for Science, Technology and Research
Singapore University of Technology and Design

To align advanced artificial intelligence (AI) with human values and promote safe AI, it is important for AI to predict the outcome of physical interactions. Although how humans predict the outcomes of physical interactions among real-world objects is still debated, several works attempt to tackle this task with cognitively inspired AI approaches. However, there is still a lack of AI approaches that mimic the mental imagery humans use to predict physical interactions in the real world. In this work, we propose a novel scheme, PIP: Physical Interaction Prediction via Mental Imagery with Span Selection. PIP first uses a deep generative model to output future frames of physical interactions among objects. It then applies span selection to focus on the salient frames and extract the information that is crucial for predicting the interaction outcome. To evaluate our model, we propose SPACE+, a large-scale dataset of synthetic video frames covering three physical interaction events in a 3D environment. Our experiments show that PIP outperforms both baselines and human performance in physical interaction prediction for seen and unseen objects. Furthermore, PIP's span selection scheme can effectively identify the frames within the generated sequence where physical interactions among objects occur, which adds interpretability.
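The abstract does not specify how span selection is computed. As a rough illustration only, the sketch below assumes a scheme in the style of extractive question answering: each generated frame receives a hypothetical "start" score and "end" score, and the selected span is the contiguous run of frames that maximizes their sum. The function name `select_span`, the score lists, and the `max_len` cap are all assumptions made for this example and are not taken from the paper.

```python
from typing import List, Tuple


def select_span(
    start_scores: List[float],
    end_scores: List[float],
    max_len: int,
) -> Tuple[int, int, float]:
    """Return (start, end, score) of the best contiguous frame span.

    Assumption: a span's score is start_scores[s] + end_scores[e],
    with s <= e and at most max_len frames in the span.
    """
    if not start_scores or len(start_scores) != len(end_scores):
        raise ValueError("score lists must be non-empty and of equal length")
    if max_len < 1:
        raise ValueError("max_len must be at least 1")

    n = len(start_scores)
    best_start, best_end = 0, 0
    best_score = float("-inf")
    for s in range(n):
        # Only consider end frames at or after the start frame,
        # and keep the span within the length cap.
        for e in range(s, min(n, s + max_len)):
            score = start_scores[s] + end_scores[e]
            if score > best_score:
                best_start, best_end, best_score = s, e, score
    return best_start, best_end, best_score


# Hypothetical per-frame scores for four generated frames.
start = [0.1, 2.0, 0.3, 0.2]
end = [0.0, 0.5, 1.5, 3.0]
s, e, score = select_span(start, end, max_len=2)
print(f"salient frames: {s}..{e} (score {score})")
```

In a setup like this, the returned indices would mark the frames passed on to the outcome predictor, and they could also be inspected directly to see where the model believes the interaction takes place.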



