Multi-camera Bird's Eye View Perception for Autonomous Driving

09/16/2023
by David Unger, et al.

Most automated driving systems comprise a diverse sensor set, including several cameras, Radars, and LiDARs, ensuring complete 360° coverage in near and far regions. Unlike Radar and LiDAR, which measure directly in 3D, cameras capture a 2D perspective projection with inherent depth ambiguity. However, it is essential to produce perception outputs in 3D to enable spatial reasoning about other agents and structures for optimal path planning. The 3D space is typically simplified to the bird's eye view (BEV) space by omitting the less relevant Z-coordinate, which corresponds to the height dimension.

The most basic approach to obtaining the desired BEV representation from a camera image is inverse perspective mapping (IPM), which assumes a flat ground surface. Surround-view systems, now common in new vehicles, use the IPM principle to generate a BEV image and display it to the driver. However, this approach is not suited for autonomous driving, since the overly simplistic flat-ground transformation severely distorts anything that rises above the ground plane. More recent approaches use deep neural networks that output directly in BEV space. These methods transform camera images into BEV space by applying geometric constraints implicitly or explicitly within the network. Because a CNN has access to richer context and a learnable transformation can adapt to the image content, deep-learning-based methods set the benchmark for BEV transformation and achieve state-of-the-art performance.

First, this chapter discusses contemporary trends in multi-camera deep neural network (DNN) models that output object representations directly in BEV space. Then, we discuss how this approach extends to effective sensor fusion and to coupling downstream tasks such as situation analysis and prediction. Finally, we present challenges and open problems in BEV perception.
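To make the IPM baseline mentioned above concrete, the sketch below warps a camera image onto a metric ground-plane grid via a single homography. It is a minimal sketch, assuming a flat ground plane (Z = 0) and known intrinsics and extrinsics; all function names, coordinate conventions, and numeric defaults are illustrative assumptions, not taken from the chapter.

```python
# Minimal IPM sketch: flat ground plane (Z = 0), known calibration.
# All names and numeric defaults are illustrative, not from the paper.
import cv2
import numpy as np

def ground_to_image_homography(K, R, t):
    """Homography mapping ground-plane coords (X, Y, 1) to image pixels.

    With world-to-camera extrinsics p_cam = R @ p_world + t, a ground
    point (X, Y, 0) projects as p ~ K @ [r1 | r2 | t] @ [X, Y, 1]^T,
    where r1, r2 are the first two columns of R.
    """
    return K @ np.column_stack((R[:, 0], R[:, 1], t))

def warp_to_bev(image, K, R, t, x_max=40.0, y_max=10.0, px_per_m=10.0):
    """Resample a camera image onto a metric BEV grid on the ground plane."""
    h = int(x_max * px_per_m)        # rows cover 0 .. x_max metres ahead
    w = int(2 * y_max * px_per_m)    # cols cover -y_max .. +y_max laterally
    s = 1.0 / px_per_m
    # Affine map from a BEV pixel (u, v) to ground coords (X, Y):
    # the top row is farthest away, the left column is Y = +y_max.
    A = np.array([[0.0,  -s, x_max],
                  [ -s, 0.0, y_max],
                  [0.0, 0.0,   1.0]])
    H = ground_to_image_homography(K, R, t) @ A   # BEV pixel -> image pixel
    # With WARP_INVERSE_MAP, H maps destination (BEV) pixels to source pixels.
    return cv2.warpPerspective(image, H, (w, h),
                               flags=cv2.INTER_LINEAR | cv2.WARP_INVERSE_MAP)
```

Because every pixel is assumed to lie on the ground, anything with height (vehicles, pedestrians, barriers) gets smeared away from the camera in the resulting BEV image, which is precisely the distortion that motivates the learned alternatives discussed above.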
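For the learned counterpart, many methods that apply geometric constraints explicitly follow the same projection idea but in feature space: each BEV grid cell is projected into the image, and CNN features are bilinearly sampled there. The following is a minimal single-camera PyTorch sketch under assumed conventions (ground-level grid, intrinsics scaled to the feature-map resolution); production models add depth reasoning, multi-camera aggregation, or attention on top of this lifting step.

```python
# Minimal sketch of explicit geometric BEV feature lifting (single camera).
# Shapes, conventions, and defaults are illustrative assumptions.
import torch
import torch.nn.functional as F

def lift_features_to_bev(feats, K, R, t, x_max=40.0, y_max=10.0, n=100):
    """Sample image features onto an n x n BEV grid on the ground plane.

    feats: (1, C, Hf, Wf) CNN feature map from one camera.
    K:     (3, 3) intrinsics, scaled to the feature-map resolution.
    R, t:  world-to-camera extrinsics, p_cam = R @ p_world + t.
    """
    # Metric BEV grid on the ground plane (Z = 0): rows run far -> near,
    # columns run left -> right, matching a top-down map of the scene.
    xs = torch.linspace(x_max, 0.0, n)
    ys = torch.linspace(y_max, -y_max, n)
    X, Y = torch.meshgrid(xs, ys, indexing="ij")
    pts = torch.stack([X, Y, torch.zeros_like(X)], dim=-1).reshape(-1, 3)

    # Project every grid cell into the camera.
    cam = pts @ R.T + t                       # (n*n, 3) camera-frame points
    valid = (cam[:, 2] > 0.1).float()         # drop cells behind the camera
    pix = cam @ K.T
    pix = pix[:, :2] / pix[:, 2:3].clamp(min=1e-6)

    # Normalize pixel coordinates to [-1, 1] as required by grid_sample.
    _, _, Hf, Wf = feats.shape
    u = pix[:, 0] / (Wf - 1) * 2.0 - 1.0
    v = pix[:, 1] / (Hf - 1) * 2.0 - 1.0
    grid = torch.stack([u, v], dim=-1).reshape(1, n, n, 2)

    # Out-of-view cells receive zeros (padding_mode="zeros" is the default).
    bev = F.grid_sample(feats, grid, align_corners=True)  # (1, C, n, n)
    return bev * valid.reshape(1, 1, n, n)
```

Learned variants replace the fixed ground-plane assumption, e.g. by predicting per-pixel depth distributions or per-cell attention weights, which is what lets the transformation adapt to image content as described above.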
