Predicting Semantic Map Representations from Images using Pyramid Occupancy Networks

03/30/2020
by Thomas Roddick, et al.

Autonomous vehicles commonly rely on highly detailed bird's-eye-view maps of their environment, which capture both static elements of the scene, such as road layout, and dynamic elements, such as other cars and pedestrians. Generating these map representations on the fly is a complex multi-stage process that incorporates many important vision-based elements, including ground plane estimation, road segmentation and 3D object detection. In this work we present a simple, unified approach for estimating maps directly from monocular images using a single end-to-end deep learning architecture. For the maps themselves we adopt a semantic Bayesian occupancy grid framework, allowing us to trivially accumulate information over multiple cameras and timesteps. We demonstrate the effectiveness of our approach by evaluating against several challenging baselines on the NuScenes and Argoverse datasets, and show that we are able to achieve a relative improvement of 9.1% compared to the best-performing existing method.
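
To illustrate why a Bayesian occupancy grid makes accumulation over cameras and timesteps straightforward, the following is a minimal sketch of the standard log-odds update for a single semantic class. It is not the authors' implementation: the function names (`logit`, `sigmoid`, `fuse_occupancy`), the uniform prior of 0.5, the use of `None` for unobserved cells, and the assumption that observations are conditionally independent are all illustrative choices.

```python
import math

EPS = 1e-6  # assumed clamp so probabilities of exactly 0 or 1 stay finite


def logit(p):
    """Convert a probability to log-odds, clamping away from 0 and 1."""
    p = min(max(p, EPS), 1.0 - EPS)
    return math.log(p / (1.0 - p))


def sigmoid(log_odds):
    """Convert log-odds back to a probability."""
    return 1.0 / (1.0 + math.exp(-log_odds))


def fuse_occupancy(observations, prior=0.5):
    """Fuse per-cell occupancy probabilities from several views.

    observations: list of grids (lists of rows) of equal shape. Each cell
    holds a probability in [0, 1], or None if that view did not observe it.
    Returns a grid of fused probabilities with the same shape.
    """
    rows, cols = len(observations[0]), len(observations[0][0])
    prior_log_odds = logit(prior)
    fused = [[prior_log_odds] * cols for _ in range(rows)]
    for grid in observations:
        for r in range(rows):
            for c in range(cols):
                p = grid[r][c]
                if p is None:
                    continue  # unobserved cells leave the estimate unchanged
                # Independent evidence adds in log-odds space.
                fused[r][c] += logit(p) - prior_log_odds
    return [[sigmoid(v) for v in row] for row in fused]


# Two hypothetical 1x3 observations, e.g. two cameras or two timesteps.
view_a = [[0.8, 0.8, None]]
view_b = [[0.8, 0.2, None]]
fused_map = fuse_occupancy([view_a, view_b])
```

In this sketch, two agreeing observations reinforce each other, two contradicting observations cancel out, and a cell that no view observed stays at the prior. Under the stated independence assumption, a multi-class semantic grid could apply the same update to each class channel separately.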
