Instance-Level Semantic Maps for Vision Language Navigation

05/21/2023
by Laksh Nanwani, et al.

Humans have a natural ability to form semantic associations with the objects around them. These associations let them build a mental map of the environment, which they can navigate on demand when given a linguistic instruction. A natural goal in Vision Language Navigation (VLN) research is to endow autonomous agents with similar capabilities. The recently introduced VL Maps <cit.> take a step towards this goal by creating a semantic spatial map representation of the environment without any labelled data. However, their representation is of limited practical use because it does not distinguish between different instances of the same object. In this work, we address this limitation by integrating instance-level information into the spatial map representation using a community detection algorithm, and by utilizing the word ontology learned by large language models (LLMs) to perform open-set semantic associations in the map representation. The resulting map representation improves navigation performance two-fold (233%) over VL Maps on realistic language commands with instance-specific descriptions. We validate the practicality and effectiveness of our approach through extensive qualitative and quantitative experiments.
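
To make the instance-level grounding concrete, the minimal sketch below shows one way the points of a single semantic class in a top-down map could be split into object instances with community detection: same-class map points are linked when they are spatially close, and each detected community becomes one instance. The function name split_into_instances, the distance threshold, and the choice of the Louvain algorithm from networkx are illustrative assumptions, not the exact pipeline used in the paper.

```python
# Illustrative sketch (not the authors' exact pipeline): split the points of
# one semantic class in a top-down map into object instances by running
# community detection on a spatial proximity graph.
import numpy as np
import networkx as nx
from networkx.algorithms.community import louvain_communities


def split_into_instances(points_xy: np.ndarray, radius: float = 0.3) -> np.ndarray:
    """Assign an instance id to every map point of a single semantic class.

    points_xy -- (N, 2) array of map coordinates (metres) sharing one class label
    radius    -- points closer than this are connected in the proximity graph
    """
    n = len(points_xy)
    graph = nx.Graph()
    graph.add_nodes_from(range(n))
    for i in range(n):
        # Link point i to every later point within `radius`; closer points get
        # heavier edges so community detection prefers tight spatial clusters.
        dists = np.linalg.norm(points_xy - points_xy[i], axis=1)
        for j in np.where((dists < radius) & (np.arange(n) > i))[0]:
            graph.add_edge(i, int(j), weight=1.0 / (dists[j] + 1e-6))
    # Each detected community is treated as one object instance.
    instance_ids = np.zeros(n, dtype=int)
    for inst_id, members in enumerate(louvain_communities(graph, weight="weight", seed=0)):
        instance_ids[list(members)] = inst_id
    return instance_ids


# Example: two clusters of "chair" points roughly 2 m apart become two instances.
chair_points = np.array([[0.0, 0.0], [0.1, 0.05], [0.05, 0.1],
                         [2.0, 2.0], [2.1, 2.05]])
print(split_into_instances(chair_points))  # e.g. [0 0 0 1 1]
```

Under these assumptions, the per-instance clusters can then be labelled with instance-specific descriptions, which is what allows a command such as "go to the second chair" to be resolved to a particular object rather than to the class as a whole.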


