A Framework for Learning Invariant Physical Relations in Multimodal Sensory Processing

by Du Xiaorui, et al.

Perceptual learning enables humans to recognize and represent stimuli in a way that is invariant to various transformations, and to build a consistent representation of the self and the physical world. Such representations preserve the invariant physical relations among the multiple perceived sensory cues. This work is an attempt to exploit these principles in an engineered system. We design a novel neural network architecture capable of learning, in an unsupervised manner, relations among multiple sensory cues. The system combines computational principles, such as competition, cooperation, and correlation, in a neurally plausible computational substrate. It does so through a parallel and distributed processing architecture in which the relations among the multiple sensory quantities are extracted from time-sequenced data. We describe the core system functionality when learning arbitrary non-linear relations in low-dimensional sensory data. Here, an initial benefit arises from the fact that such a network can be engineered in a relatively straightforward way, without prior information about the sensors and their interactions. Moreover, by alleviating the need for tedious modelling and parametrization, the network converges to a consistent description of an arbitrary high-dimensional multisensory setup. We demonstrate this on a real-world learning problem in which, from standard RGB camera frames, the network learns the relations between physical quantities describing a visual scene, such as light intensity, spatial gradient, and optical flow. Overall, the benefits of such a framework lie in its capability to learn non-linear pairwise relations among sensory streams in an architecture that is stable under noise and missing sensor input.
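As a rough illustration of the correlation principle mentioned in the abstract, the following minimal sketch learns a pairwise non-linear relation between two scalar sensory streams with a Hebbian co-activation matrix. It is not the authors' architecture: the fixed Gaussian population code stands in for the learned (competition and cooperation) stage, and the names `population_code`, `learn_relation`, `infer`, the cubic relation, and all parameter values are assumptions chosen for illustration only.

```python
import numpy as np


def population_code(value, centres, sigma=0.05):
    """Encode a scalar as normalised Gaussian tuning-curve activations."""
    act = np.exp(-0.5 * ((value - centres) / sigma) ** 2)
    return act / act.sum()


def learn_relation(xs, ys, centres, eta=1.0):
    """Accumulate Hebbian co-activations over a sequence of paired samples."""
    hebb = np.zeros((centres.size, centres.size))
    for x, y in zip(xs, ys):
        # Units that fire together strengthen their cross-modal link.
        hebb += eta * np.outer(population_code(x, centres),
                               population_code(y, centres))
    return hebb


def infer(x, hebb, centres):
    """Estimate the second quantity from the first via the learned links."""
    scores = population_code(x, centres) @ hebb
    # Activation-weighted mean of the preferred values of the target units.
    return float(scores @ centres / scores.sum())


# Hypothetical setup: two streams related by y = x**3 (assumed, not from the paper).
centres = np.linspace(-1.0, 1.0, 50)   # preferred values shared by both populations
xs = np.linspace(-1.0, 1.0, 2001)      # deterministic sweep standing in for a time sequence
ys = xs ** 3

hebb = learn_relation(xs, ys, centres)
y_hat = infer(0.8, hebb, centres)
print(f"estimate for x=0.8: {y_hat:.3f} (true value {0.8 ** 3:.3f})")
```

No model of the relation is supplied to the sketch; the mapping is recovered only from co-occurring activations, which is the sense in which such a scheme needs no prior information about how the sensors interact.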



