Combining learned and analytical models for predicting action effects
One of the most basic skills a robot should possess is predicting the effect of physical interactions with objects in the environment. This enables optimal action selection to reach a certain goal state. Traditionally, these dynamics are described by physics-based analytical models, which may however be very hard to find for complex problems. More recently, we have seen learning approaches that can predict the effect of more complex physical interactions directly from sensory input. However, it is an open question how far these models generalize beyond their training data. In this work, we analyse how analytical and learned models can be combined to leverage the best of both worlds. As physical interaction task, we use planar pushing, for which there exists a well-known analytical model and a large real-world dataset. We propose to use a neural network to convert the raw sensory data into a suitable representation that can be consumed by the analytical model and compare this approach to using neural networks for both, perception and prediction. Our results show that the combined method outperforms the purely learned version in terms of accuracy and generalization to push actions not seen during training. It also performs comparable to the analytical model applied on ground truth input values, despite using raw sensory data as input.
READ FULL TEXT