Achieving mouse-level strategic evasion performance using real-time computational planning
Planning is an extraordinary ability in which the brain imagines, evaluates, and then enacts possible futures. Using traditional planning models, computer scientists have attempted to replicate this capacity with some success, but they ultimately face a recurring limitation: as a plan grows in steps, the number of possible futures makes it intractable to determine the right sequence of actions to reach a goal state. Based on prior theoretical work on how the ecology of an animal governs the value of spatial planning, we developed a more efficient, biologically inspired planning algorithm, TLPPO. This algorithm allows us to achieve mouse-level predator evasion performance with orders of magnitude less computation than a widespread algorithm for planning under the partial observability that typifies predator-prey interactions. We compared the performance of a real-time agent using TLPPO against the performance of live mice, all tasked with evading a robot predator. We anticipate these results will be helpful to users and developers of planning algorithms, as well as to areas of neuroscience where robot-animal interaction can provide a useful approach to studying the basis of complex behaviors.
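The intractability described above can be illustrated with a minimal sketch. It is not TLPPO and not the comparison algorithm; it only counts the candidate action sequences an exhaustive planner would face. The function name, the choice of six actions per step, and the horizons are hypothetical values chosen for illustration.

```python
from itertools import product


def count_action_sequences(num_actions: int, horizon: int) -> int:
    """Count the distinct action sequences of length `horizon`.

    Assumes (hypothetically) that the same `num_actions` choices are
    available at every step, so the count is num_actions ** horizon.
    """
    return num_actions ** horizon


def enumerate_action_sequences(num_actions: int, horizon: int):
    """Explicitly list every sequence; only feasible for tiny inputs."""
    return list(product(range(num_actions), repeat=horizon))


if __name__ == "__main__":
    # Illustrative assumption: 6 possible moves per step.
    for horizon in (1, 5, 10, 20):
        total = count_action_sequences(6, horizon)
        print(f"horizon={horizon:2d}  sequences={total}")
```

Because the count grows exponentially with the number of steps, explicit enumeration stops being practical after only a handful of steps, which is the limitation that motivates more efficient planners.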